Week 10 Tutorial Questions

  1. How is assignment 2 going?

    Do students who've made progress with the assignment have advice for students not so far along?

    Do students have questions that others may be able to answer?

  2. Why was a call to setlinebuf(stdout) added to main in the supplied code for assignment 2 after the assignment was released?

  3. What is wrong with this line of code written for assignment 2 by a student:
    printf("%s: command not found\n");
    
    return?
  4. Why does the supplied assignment code have this if statement:
        if (strrchr(program, '/') == NULL) {
            // ...
        }
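
    As a reminder of what the test reports, here is a minimal sketch of the same check wrapped in a function. The name is_bare_command_name is hypothetical and is not part of the supplied assignment code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

// Hypothetical helper (not from the assignment code):
// true if program contains no '/' character,
// e.g. "ls" rather than "./ls" or "/bin/ls".
bool is_bare_command_name(const char *program) {
    // strrchr returns a pointer to the last '/' in program,
    // or NULL if program contains no '/' at all
    return strrchr(program, '/') == NULL;
}
```

    For example, is_bare_command_name("ls") is true, while is_bare_command_name("./ls") is false.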
    
  5. Consider the (imaginary) URL

    https://www.cse.unsw.edu.au:443/cs9999/18s2/showMarks?item=quiz2
    

    Identify each of the following components in this URL:

    1. protocol

    2. host

    3. port

    4. path

    5. query
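
    A program can pull the same components apart mechanically. The sketch below assumes the URL always has the exact shape protocol://host:port/path?query; the struct url_parts type and the split_url function are hypothetical names used only for illustration, and real URLs need far more careful parsing.

```c
#include <stdio.h>
#include <string.h>

// Hypothetical container for the pieces of a URL.
struct url_parts {
    char protocol[16];
    char host[128];
    char port[8];
    char path[256];    // stored without its leading '/'
    char query[256];
};

// Split a URL of the exact form protocol://host:port/path?query.
// Returns 1 if all five parts were found, 0 otherwise.
int split_url(const char *url, struct url_parts *p) {
    // each %[...] conversion reads characters up to the next separator
    int n = sscanf(url, "%15[^:]://%127[^:/]:%7[0-9]/%255[^?]?%255s",
                   p->protocol, p->host, p->port, p->path, p->query);
    return n == 5;
}
```

    Given "http://example.org:8080/docs/index.html?lang=en", this fills in "http", "example.org", "8080", "docs/index.html" and "lang=en".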

  6. Discuss the code in web_server.c supplied for the serve_web_pages lab exercise.
    // A simple Web server
    
    #include <string.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/types.h>
    #include <sys/socket.h>
    #include <netdb.h>
    #include <unistd.h>
    
    #include "serve_web_page.h"
    
    char *extract_pathname_http_request(char *http_request);
    void handle_request(int client_fd);
    
    int main(int argc, char **argv) {
    
        // create an IPv4 TCP/IP socket
        int sock_fd = socket(AF_INET, SOCK_STREAM, 0);
    
        // reuse the socket immediately if we restart the server
        int option = 1;
        setsockopt(sock_fd, SOL_SOCKET, SO_REUSEPORT, &option, sizeof(option));
    
        struct addrinfo hints = {
            .ai_family = AF_INET,
            .ai_socktype = SOCK_STREAM,
            .ai_flags = AI_PASSIVE,
        };
    
        // construct a port number unlikely to be in use by another COMP1521 student
        // and convert it to a struct addrinfo for the local machine
        struct addrinfo *a;
        char port[6];
        snprintf(port, sizeof port, "%d", 32000 + geteuid() % 32768);
        int s = getaddrinfo(NULL, port, &hints, &a);
        if (s != 0) {
            fprintf(stderr, "getaddrinfo: %s\n", gai_strerror(s));
            exit(1);
        }
    
        // attach the address to the socket
        if (bind(sock_fd, a->ai_addr, a->ai_addrlen) != 0) {
            perror("bind()");
            return 1;
        }
    
        // specify the maximum number of connections that can be queued for the socket
        if (listen(sock_fd, 8) != 0) {
            perror("listen()");
            return 1;
        }
    
        printf("\nFrom your web browser test these URLs:\n");
        printf("http://localhost:%s/example.html\n", port);
        printf("http://localhost:%s/bird.html\n\n", port);
    
        int client_fd;
        while ((client_fd = accept(sock_fd, NULL, NULL)) >= 0) {
            handle_request(client_fd);
        }
    
        close(sock_fd);
    
        return 0;
    }
    
    // handle 1 request for a web page
    void handle_request(int client_fd) {
        // a real server might spawn a client process here to handle the connection
        // so it can accept another connection immediately
    
        char http_request[4096];
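        int n_bytes = read(client_fd, http_request, (sizeof http_request) - 1);
        if (n_bytes < 0) {
            // read failed, so there is no request to handle
            perror("read");
            close(client_fd);
            return;
        }
        http_request[n_bytes] = '\0';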
        printf("Received this HTTP request:\n%s", http_request);
    
        // create a stdio stream to client
        FILE *client_stream = fdopen(client_fd, "r+");
        if (client_stream == NULL) {
            perror("fdopen");
            close(client_fd);
            return;
        }
    
        char *pathname = extract_pathname_http_request(http_request);
    
        if (pathname == NULL) {
            // not a request we can handle
            fputs(HEADER_400, client_stream);
        } else {
            printf("calling serve_web_page(\"%s\")\n", pathname);
            serve_web_page(pathname, client_stream);
            free(pathname);
        }
    
        fclose(client_stream);
    }
    
    // return a malloced string containing pathname from http request
    // NULL is returned if not a GET request
    // NULL is returned if .. in pathname
    // NULL is returned if pathname ends with .c
    char *extract_pathname_http_request(char *http_request) {
    
        char *prefix = "GET /";
        if (strncmp(http_request, prefix, strlen(prefix)) != 0) {
            return NULL;
        }
    
        char *path_start = http_request + strlen(prefix);
        char *following_space = strchr(path_start, ' ');
        if (following_space == NULL) {
            return NULL;
        }
        int pathname_length = following_space - path_start;
        if (pathname_length < 1) {
            return NULL;
        }
    
        char *pathname = strndup(path_start, pathname_length);
    
        if (pathname == NULL) {
            return NULL;
        }
    
        if (strstr(pathname, "..") != NULL) {
            // prevent directory traversal
            free(pathname);
            return NULL;
        }
    
        if (pathname_length > 1 && strcmp(pathname + pathname_length - 2, ".c") == 0) {
            // prevent C source being served
            free(pathname);
            return NULL;
        }
    
        return pathname;
    }
    
  7. The include file serve_web_page.h supplied for the serve_web_pages lab exercise contains these lines:
    #define HEADER_200 "HTTP/1.1 200 OK\r\nContent-type: text/html\r\n\r\n"
    #define HEADER_400 "HTTP/1.1 400 Bad request\r\n\r\n"
    #define HEADER_404 "HTTP/1.1 404 Not Found\r\n\r\n"
    
    void serve_web_page(char *url, FILE *client_stream);
    
    Discuss what they are and when they might be useful.
  8. Why does an HTTP response header contain a Content-type line, for example:

    Content-type: text/html
    

    The file /etc/mime.types contains lines like this:

    video/quicktime                 qt mov
    
    Discuss how a web server might use this information.
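
    One possible use, sketched under assumptions: given a single line in the /etc/mime.types format and a file extension, report the matching type. The function name mime_type_from_line is hypothetical, and a real server would more likely read the whole file once and build a lookup table.

```c
#include <stdio.h>
#include <string.h>

// Hypothetical helper: if mime_line (in /etc/mime.types format) lists
// extension, copy its MIME type into type and return 1; otherwise return 0.
int mime_type_from_line(const char *mime_line, const char *extension,
                        char *type, size_t type_size) {
    char buffer[512];
    // skip comment lines and lines too long for the local buffer
    if (mime_line[0] == '#' || strlen(mime_line) >= sizeof buffer) {
        return 0;
    }
    strcpy(buffer, mime_line);   // strtok modifies its argument, so work on a copy

    // the first word is the type, every later word is an extension
    char *line_type = strtok(buffer, " \t\n");
    if (line_type == NULL) {
        return 0;
    }
    for (char *ext = strtok(NULL, " \t\n"); ext != NULL; ext = strtok(NULL, " \t\n")) {
        if (strcmp(ext, extension) == 0) {
            snprintf(type, type_size, "%s", line_type);
            return 1;
        }
    }
    return 0;
}
```

    With the line shown above and the extension "mov", the function stores "video/quicktime" in type.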
  9. The file /etc/services (on Linux/Unix systems) contains a list of network services and their associated standard port numbers. Using this file, determine which services the following port numbers are associated with:

    1. 21

    2. 22

    3. 25

    4. 80

    5. 101

    6. 443
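
    The lookup can also be done by a program. The sketch below assumes each entry has the form "name port/protocol aliases..." and scans a stream for a given port; the function name find_service is hypothetical. The standard library also offers getservbyport(3) for the same job.

```c
#include <stdio.h>
#include <string.h>

// Hypothetical helper: scan a stream in /etc/services format for the first
// entry with the given port and protocol (e.g. "tcp").
// On success copy the service name into name and return 1; otherwise return 0.
int find_service(FILE *services, int port, const char *protocol,
                 char *name, size_t name_size) {
    char line[512];
    while (fgets(line, sizeof line, services) != NULL) {
        char entry_name[64];
        char entry_protocol[16];
        int entry_port;
        if (line[0] == '#') {
            continue;   // comment line
        }
        // a typical line looks like:  http   80/tcp   www
        if (sscanf(line, "%63s %d/%15s", entry_name, &entry_port, entry_protocol) != 3) {
            continue;   // blank line or unexpected format
        }
        if (entry_port == port && strcmp(entry_protocol, protocol) == 0) {
            snprintf(name, name_size, "%s", entry_name);
            return 1;
        }
    }
    return 0;
}
```

    A caller would typically pass the result of fopen("/etc/services", "r") as the stream.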

  10. To connect a socket to a remote service, we also need to know where to connect to. Programming with magic numbers for addresses and ports isn't especially fun; it would be useful to have a way to map symbolic names to these addresses and port numbers, and we've already talked about the mechanisms we can use to do this. Rather than trying to talk DNS or parse services(5) ourselves, we can use functions like getaddrinfo(3) to get all this information in a standard way.

    The getaddrinfo(3) function allows us to work out the arguments to give to the socket(2) and connect(2)/bind(2) functions. A simplified interface to getaddrinfo(3) is:

    int getaddrinfo (char *Hostname, char *Service,
        struct addrinfo *Hints, struct addrinfo **Results);
    

    The function returns a list of struct addrinfo values, where each of these looks like (once again, simplified somewhat):

    struct addrinfo {
        int ai_family;            // protocol family for socket
        int ai_socktype;          // socket type
        int ai_protocol;          // protocol
        socklen_t ai_addrlen;     // address length
        struct sockaddr *ai_addr; // address
        struct addrinfo *ai_next; // next `struct addrinfo'
    };
    

    Which of the socket functions would each of the fields be used for?
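
    To see the list structure in action, here is a minimal sketch that calls getaddrinfo(3) and walks the results through ai_next. It assumes a numeric IPv4 address and a numeric port, so it needs no DNS lookup; the function name resolved_port is hypothetical.

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>
#include <netinet/in.h>
#include <arpa/inet.h>

// Hypothetical helper: look up a numeric IPv4 address and numeric port,
// and return the port stored in the first IPv4 result, or -1 on failure.
int resolved_port(const char *address, const char *port) {
    struct addrinfo hints = {
        .ai_family = AF_INET,
        .ai_socktype = SOCK_STREAM,
        .ai_flags = AI_NUMERICHOST,   // address must be numeric, so no DNS lookup
    };

    struct addrinfo *results;
    if (getaddrinfo(address, port, &hints, &results) != 0) {
        return -1;
    }

    int found = -1;
    // the results form a linked list joined by ai_next
    for (struct addrinfo *a = results; a != NULL; a = a->ai_next) {
        if (a->ai_family == AF_INET) {
            // for AF_INET, ai_addr points to a struct sockaddr_in
            struct sockaddr_in *sin = (struct sockaddr_in *)a->ai_addr;
            found = ntohs(sin->sin_port);   // port is stored in network byte order
            break;
        }
    }

    freeaddrinfo(results);   // the list is allocated by getaddrinfo
    return found;
}
```

    For example, resolved_port("127.0.0.1", "8080") returns 8080.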

  11. When setting up a socket, we first call socket(2), which has prototype

    int socket (int Domain, int Type, int Protocol);
    

    What do Domain, Type, and Protocol mean? What possible values can each of these parameters take?

  12. The CURL library aims to make the task of fetching web pages simpler. The following code uses the CURL API to fetch the HTTP response from an arbitrary website, given a URL:

    // Fetch data from a URL, separating header and body
    // Adapted from an example on https://curl.haxx.se/
    // Written by Daniel Stenberg
    // Modified by John Shepherd
    
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>
    
    #include <curl/curl.h>
    
    static size_t write_data (void *, size_t, size_t, void *);
    
    int main (int argc, char *argv[])
    {
    	// check that they gave us a URL
    	if (argc < 2) {
    		fprintf (stderr, "Usage: %s URL\n", argv[0]);
    		exit (1);
    	}
    
    	// init the curl session
    	curl_global_init (CURL_GLOBAL_ALL);
    	CURL *cp = curl_easy_init ();
    	// set URL to get
    	curl_easy_setopt (cp, CURLOPT_URL, argv[1]);
    	// no progress meter please
    	curl_easy_setopt (cp, CURLOPT_NOPROGRESS, 1L);
    	// send all data to this function
    	curl_easy_setopt (cp, CURLOPT_WRITEFUNCTION, write_data);
    	// we want the headers to be written to stdout
    	curl_easy_setopt (cp, CURLOPT_HEADERDATA, stdout);
    	// we want the body to be written to stdout
    	curl_easy_setopt (cp, CURLOPT_WRITEDATA, stdout);
    	// get it!
    	curl_easy_perform (cp);
    	// cleanup curl stuff
    	curl_easy_cleanup (cp);
    
    	return 0;
    }
    
    static size_t
    write_data (void *ptr, size_t size, size_t nmemb, void *stream)
    {
    	size_t written = fwrite (ptr, size, nmemb, (FILE *) stream);
    	return written;
    }
    

    Modify it so that the HTTP header and the body of the response are placed in separate files (called head.txt and body.txt respectively).

  13. The Practice Prac Exam involves two programming questions: complete one MIPS program, and complete one C program.

    1. How should you read each question?

    2. How should you read the supplied code?

    3. How much time should you spend on each question?

    4. What happens if my program doesn't pass any/all check tests?

  14. Under what scenarios would the following transport-layer protocols be suitable:

    1. TCP, Transmission Control Protocol

    2. UDP, User Datagram Protocol

  15. What are the IP address(es) of the following hosts:

    1. www.cse.unsw.edu.au

    2. williams.cse.unsw.edu.au

    3. moss.stanford.edu

    4. oxford.ac.uk

    5. gaia.cs.umass.edu

    6. kora01.orchestra.cse.unsw.edu.au

    7. kora02.orchestra.cse.unsw.edu.au

    8. drum01.orchestra.cse.unsw.edu.au

    9. drum02.orchestra.cse.unsw.edu.au

    Can you spot any patterns in the addressing of CSE lab machines?
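
    Commands such as host(1) can answer this, and so can a short program. The sketch below assumes IPv4 only and reports just the first address returned; the function name first_ipv4_address is hypothetical.

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>
#include <netinet/in.h>
#include <arpa/inet.h>

// Hypothetical helper: store the first IPv4 address of hostname as
// dotted-decimal text in text; return 1 on success, 0 on failure.
int first_ipv4_address(const char *hostname, char *text, size_t text_size) {
    struct addrinfo hints = {
        .ai_family = AF_INET,
        .ai_socktype = SOCK_STREAM,
    };

    struct addrinfo *results;
    if (getaddrinfo(hostname, NULL, &hints, &results) != 0) {
        return 0;
    }

    // on success the list has at least one entry
    struct sockaddr_in *address = (struct sockaddr_in *)results->ai_addr;
    // inet_ntop converts the binary address to text such as "127.0.0.1"
    int ok = inet_ntop(AF_INET, &address->sin_addr, text, text_size) != NULL;

    freeaddrinfo(results);
    return ok;
}
```

    A buffer of INET_ADDRSTRLEN bytes is large enough for any IPv4 address in text form.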

  16. What is the purpose of name resolution on the Internet, and how is it accomplished?

  17. Consider the following (working) MIPS code to find the maximum value in an array:

        # clean up stack frame
        lw    $s3, -20($fp)
        lw    $s2, -16($fp)
        lw    $s1, -12($fp)
        lw    $s0, -8($fp)
        lw    $ra, -4($fp)
        la    $sp, 4($fp)
        lw    $fp, ($fp)
        jr    $ra           # BUG: try removing this
    

    You can grab a copy of this code as max.s.

    The part that you are required to write (i.e., would not be part of the supplied code) is highlighted in the code.

    This implements the following C algorithm:

    int max (int a[], int n)
    {
    	int big = -1;
    	for (int i = 0; i < n; i++) {
    		if (a[i] > big) big = a[i];
    	}
    	return big;
    }
    

    Change the code to make it incorrect. Run the code using the command:

    spim -file max.s
    

    to see what errors it produces. Then use QtSPIM to identify the location where the code "goes wrong".