Week 01 Laboratory Exercises

Objectives

to explore input/output functions from the stdio library
to explore the Unix Programmer's Manual
to use the make, autotest and time commands
to perform simple performance analysis

Admin

Demo: in this week's lab, or next week's lab
Deadline: submit by Wednesday 01 January 00:00

Background

Unix/Linux has a large number of useful commands that act as filters: they read an input byte- or character-stream and copy it to an output byte- or character-stream, generally with some changes. The simplest filter is the cat(1) command, which is effectively the identity filter: it copies standard input to standard output without making any changes at all. For example, if used as

cat < /etc/passwd

it will copy the contents of the /etc/passwd file onto the screen.

You can also use the cat(1) command by giving one or more file names, e.g.

cat f1 f2 f3

This will (attempt to) open each of the named files in turn and copy their contents to the screen. If a file does not exist or is not readable, an error message will be written to the screen instead.

Unix/Linux has a large library of functions related to input/output (the standard i/o or stdio.h library). Many functions provide similar kinds of behaviour (e.g., reading a character). The aim in this lab is to write several different versions of a simple cat(1)-like command, using different i/o functions and determining which is the most efficient.

Setting Up

To keep your files manageable, you should do each lab exercise in a separate directory (folder). In your home directory, we would suggest creating a subdirectory called cs1521, and then creating a subdirectory under that called labs, and then subdirectories week01, week02, etc. — something like:

└── cs1521/
    └── labs/
        ├── week01/
        ├── week02/
        ├── week03/
        ├── week04/
        ├── week05/
        ├── week06/
        ├── week07/
        ├── week08/
        ├── week09/
        └── week10/

We will assume that the directory you set up for this lab is lab01dir; you should change into that directory now.

Run the following command(s):

1521 fetch mycat

If you've done the above correctly, you should now find the following files in the directory:

Makefile: a set of dependencies used to control compilation
cat1.c: a stub program for copying input to output
cat2.c: a stub program for copying input to output
cat3.c: a stub program for copying input to output
cat4.c: a stub program for copying input to output

Have a quick look at the cat?.c programs. They're all the same: they provide some boilerplate into which you can embed your code, consisting of a main() function which simply invokes a copy() function to read from an input source and write to an output destination. The first three programs will be used simply to copy the standard input stream to the standard output stream.

Once you've looked at the programs, the next thing to do is to run the make(1) command, which will compile all the code. make(1) prints out the commands it runs, so you should see something like:

make
gcc -Wall -Werror -std=c99   -c -o cat1.o cat1.c
gcc   cat1.o   -o cat1
gcc -Wall -Werror -std=c99   -c -o cat2.o cat2.c
gcc   cat2.o   -o cat2
gcc -Wall -Werror -std=c99   -c -o cat3.o cat3.c
gcc   cat3.o   -o cat3
gcc -Wall -Werror -std=c99   -c -o cat4.o cat4.c
gcc   cat4.o   -o cat4

Since none of the programs do anything yet, you may as well clean them up by running

make clean
rm -f cat1 cat1.o
rm -f cat2 cat2.o
rm -f cat3 cat3.o
rm -f cat4 cat4.o

This removes all of the files created by the compiler and our test system. Look at the manual entry for rm(1) to find out what the -f option does.

Text of the form somename(N) refers to the somename entry in section N of the Unix manual. Unix manual entries are sometimes referred to as man pages, because the command to view them is man(1).

The Unix manual has several sections; common ones we'll see are 1, for 'normal' programs like rm(1); 2, for system calls; and 3, for library functions like fopen(3). There are several other sections — see if you can find out more about them.

On the course website, manual entries link to an online man-page viewer. At the command line, you can use the man command:

man rm

Most of the time, this will pick the right section for you. Sometimes, it won't, and you'll need to choose a specific section:

man 3 printf

We'll be using the manual a lot during this course, and it will likely be available to you during exams. You should definitely become familiar with reading manual entries!

It's worth taking a look at the Makefile to see if you can work out what it's doing. Don't worry if you don't understand it all; you'll see lots of examples of Makefiles in lectures and labs. You will need to run make to recompile the system each time you make changes to the source code files and are ready to test the program again.

Exercise

There are five tasks in this lab. For most of them, you'll need to consult the Unix manual to find out how to use the required functions or commands.

In cat1.c, implement a version of the copy() function that uses fscanf(3) to read one character at a time from the input, and fprintf(3) to print one character at a time to the output.

Hint:
```
man 3 fscanf
man 3 fprintf
```
You can check whether your program is behaving correctly by using the autotest command as follows:
```
1521 autotest mycat cat1
```
In cat2.c, implement a version of the copy() function that uses fgetc(3) to read one character at a time from the input, and fputc(3) to print one character at a time to the output.

Hint:
```
man 3 fgetc
man 3 fputc
```
You can check whether your program is behaving correctly by using the autotest command as follows:
```
1521 autotest mycat cat2
```
In cat3.c, implement a version of the copy() function that uses fgets(3) to read one line at a time from the input, and fputs(3) to print one line at a time to the output. Since fgets(3) requires a buffer, make sure it's a large one (e.g., using the constant BUFSIZ defined in stdio.h).

Hint:
```
man 3 fgets
man 3 fputs
```
You can check whether your program is behaving correctly by using the autotest command as follows:
```
1521 autotest mycat cat3
```
Once your cat1, cat2, and cat3 commands are working, use the time(1) command to test them on a large input to find out which is the most efficient. A large input file is available at
```
/web/cs1521/19T3/activities/mycat/WarAndPeace.txt
```
If you want to check that your programs can handle this file, try the following commands:
```
./cat1 < /web/cs1521/19T3/activities/mycat/WarAndPeace.txt > /tmp/WP.out
# there should be no output here
diff /web/cs1521/19T3/activities/mycat/WarAndPeace.txt /tmp/WP.out
# if your program is correct, there should be no output here
rm /tmp/WP.out
```
Since you know that your programs are working (don't you?), you don't need to see the large output that will come from the large input; one way to avoid the output is to redirect it to the data sink /dev/null. Run your timing tests via
```
time ./cat1 < /web/cs1521/19T3/activities/mycat/WarAndPeace.txt > /dev/null
```
which will produce output that looks something like
```
real   0m0.920s
user   0m0.908s
sys    0m0.008s
```
Note that you will almost certainly not get numbers anything like these.

The time(1) output reports:
- real time, the total elapsed (wall-clock) time the command ran for; this is affected by the load on the machine from other processes, and can vary quite a lot from one run to the next;
- user time, the CPU time spent executing the program's own code; this varies less across runs, but will still show some variation; and
- sys time, the CPU time the operating system spent doing work on behalf of your code; this will also vary a bit over multiple runs.
Also, the times will vary significantly depending on the machine you run the program on. You should do all of the comparative testing on a single machine.

To get a good sense of the relative efficiency of the various versions of cat?, you should run the timing test several times and take an average. Once you have reached a conclusion about which approach is most efficient, use that approach to implement cat4.c.

Copy the most efficient version of copy() into cat4.c, and then modify the main() function to behave as follows:

if there are no command-line arguments,
    call copy with stdin and stdout;
otherwise,
    for each command-line argument,
        attempt to open the named file for reading;
        if it fails to open,
            print "Can't read name-of-file",
        otherwise,
            call copy with the open file and stdout,
            close the file

Use the fopen(3) and fclose(3) functions to access the file given by the command-line argument.

Submission

When you have finished each exercise, make sure you submit your work by running give. You cannot obtain marks by e-mailing work to tutors or lecturers; you must submit via give.

You can run give multiple times; only your last submission will be marked. You can check the files you have submitted here.

give cs1521 lab01_mycat

Remember, you have until Wednesday 01 January 00:00 to submit your work.

Automarking will be run by the lecturer several days after the submission deadline for the lab, using different test cases to those available via autotest. You should do your own testing as well as running autotest.

After automarking is run, you can view the results here; the resulting mark will also be available via the give web interface.

Once all components of a lab are automarked, you should be able to view the marks via the give web interface or by running this command on a CSE machine:

1521 classrun sturec