Puzzle 1: Socket programming - Solution

This document gives you some advice on things you may have done wrong in the first homework. It does not contain complete source code for the proxy. Use what you find here as a guideline for cleaning up your proxy, if needed. A complete dummy proxy with all error checks should be around 250 lines of C code.

Strings and buffers

The main problems students had were with things other than socket programming. First of all, people had problems writing C/C++ code. A common mistake was to use strlen to determine how much of the buffer to write to the other socket. This is wrong for several reasons. The function strlen locates the first '\0' in a buffer and returns the distance to that character. However, read does not put a '\0' character at the end of the buffer, so strlen may run past the data that was actually read. Some documents, such as images, contain '\0' characters as normal data, so strlen may also stop too early. Finally, strlen is unnecessary here, since read returns how much data was put into the buffer.
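The difference is easy to see with a few bytes of binary data. The following is a minimal sketch, not part of the proxy: the helper name compare_read_and_strlen and the sample bytes are made up for illustration, and a pipe stands in for a socket.

```c
#include <string.h>
#include <unistd.h>

/* Hypothetical demo helper: sends 8 bytes containing embedded '\0'
 * characters through a pipe, reads them back, and reports both the
 * value returned by read and the value computed by strlen.
 * Returns 0 on success, -1 on failure. */
int compare_read_and_strlen(ssize_t *n_read, size_t *n_strlen)
{
    static const char data[8] = { 'G', 'I', 'F', '\0', '8', '9', 'a', '\0' };
    char buf[64];
    int fds[2];
    ssize_t n;

    if (pipe(fds) < 0)
        return -1;

    n = write(fds[1], data, sizeof data);
    close(fds[1]);
    if (n != (ssize_t)sizeof data) {
        close(fds[0]);
        return -1;
    }

    /* Zero the buffer only so that strlen is well defined in this demo;
     * read itself never adds a terminating '\0'. */
    memset(buf, 0, sizeof buf);
    n = read(fds[0], buf, sizeof buf);
    close(fds[0]);
    if (n < 0)
        return -1;

    *n_read = n;              /* the number of bytes actually received */
    *n_strlen = strlen(buf);  /* stops at the first '\0' in the data */
    return 0;
}
```

For this input read reports 8 bytes while strlen reports 3, so a proxy that forwards strlen(buf) bytes would silently drop the rest of the data.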

Break your code up into smaller pieces

Another common mistake was poor software engineering. In the guidelines for the assignment we recommended that you break the programming into 3 phases: server, client and finally the proxy. Socket programming is simple, but it requires some discipline and it is easy to make mistakes. The best way of writing a complicated socket program is to break it into smaller pieces and make sure each piece works as it is supposed to. This of course requires good testing of each component. Too often, people had problems in the server part of the proxy even though they claimed to have programmed the server separately first.

In the next assignment we will provide source code for a server and a client.

The select loop

Many students had severe problems implementing the select loop. It seemed that many did not fully understand what happens in each of the three system calls used: select, read, and write. Whenever select determines that read (in this case) would no longer block on at least one socket, it returns a value larger than 0 (the number of sockets that would not block). Some of you used timeouts; on a timeout select returns 0.

Then all you need to do is check which sockets are ready by using the FD_ISSET macro. You then read all the data on that socket, and whatever you read you have to write to the other socket. If read read something, it returns a value larger than 0. On "end-of-file" or "end-of-connection" read returns 0. On error it returns -1 and sets errno. One error is special here: EWOULDBLOCK. It tells you that read failed only because the socket is nonblocking and no more data is available right now; that's fine. On all other errors you simply break out of the select loop. To make the code clearer we have a function write_n_bytes that tries to write len bytes of the buffer. Upon an error that function returns -1.

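// Make the sockets nonblocking (O_NONBLOCK is the POSIX name for FNDELAY).
// Needs <fcntl.h>, <errno.h>, <unistd.h> and <sys/select.h>.
fcntl( fdWebServerSocket, F_SETFL, O_NONBLOCK );
fcntl( fdBrowserSocket, F_SETFL, O_NONBLOCK );

// The select loop
int len;
int done = 0;   // set once either connection is closed or fails
// select expects the highest descriptor number plus one
int maxFd = (fdWebServerSocket > fdBrowserSocket ? fdWebServerSocket : fdBrowserSocket) + 1;
char szBuffer[1024];
fd_set fdsRead;
FD_ZERO( &fdsRead );
FD_SET( fdWebServerSocket, &fdsRead );
FD_SET( fdBrowserSocket, &fdsRead );
while ( !done && select( maxFd, &fdsRead, NULL, NULL, NULL ) > 0 ) {
    // Browser -> web server
    if ( FD_ISSET( fdBrowserSocket, &fdsRead ) ) {
        while ( (len = read(fdBrowserSocket, szBuffer, sizeof(szBuffer))) > 0 ) {
            if ( write_n_bytes( fdWebServerSocket, szBuffer, len ) < 0 ) {
                done = 1;   // write error: leave the select loop as well
                break;
            }
        }
        // End of connection, or a real read error (not just "no data right now")
        if ( (len == 0) || ((len < 0) && (errno != EWOULDBLOCK)) )
            done = 1;
    }
    // Web server -> browser
    if ( !done && FD_ISSET( fdWebServerSocket, &fdsRead ) ) {
        while ( (len = read(fdWebServerSocket, szBuffer, sizeof(szBuffer))) > 0 ) {
            if ( write_n_bytes( fdBrowserSocket, szBuffer, len ) < 0 ) {
                done = 1;
                break;
            }
        }
        if ( (len == 0) || ((len < 0) && (errno != EWOULDBLOCK)) )
            done = 1;
    }
    // select modifies the set, so rebuild it before the next call
    FD_ZERO( &fdsRead );
    FD_SET( fdWebServerSocket, &fdsRead );
    FD_SET( fdBrowserSocket, &fdsRead );
}
close( fdBrowserSocket );
close( fdWebServerSocket );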
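The loop above relies on write_n_bytes, which is not listed in this document. The following is one possible sketch, not the reference implementation. Only the name, the argument order and the -1 error return come from the text; returning 0 on success, the size_t length type, and waiting with select when a nonblocking socket is temporarily full are our assumptions.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/select.h>
#include <unistd.h>

/* Sketch of write_n_bytes: keeps calling write until all len bytes of buf
 * have been written to fd. Returns 0 on success (an assumption) and -1 on
 * error, as described in the text. */
int write_n_bytes(int fd, const char *buf, size_t len)
{
    size_t done = 0;

    while (done < len) {
        ssize_t n = write(fd, buf + done, len - done);

        if (n > 0) {
            done += (size_t)n;      /* partial writes are normal on sockets */
            continue;
        }
        if (n < 0 && errno == EINTR)
            continue;               /* interrupted by a signal: retry */
        if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK)) {
            /* Nonblocking descriptor is full: wait until it is writable. */
            fd_set wfds;
            FD_ZERO(&wfds);
            FD_SET(fd, &wfds);
            if (select(fd + 1, NULL, &wfds, NULL, NULL) < 0 && errno != EINTR)
                return -1;
            continue;
        }
        return -1;                  /* any other result is treated as an error */
    }
    return 0;
}
```

The length is passed explicitly rather than computed with strlen, so buffers containing '\0' bytes are forwarded intact.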

Testing

Finally, we would like to point out some problems you had testing your code. Most students seemed to test their proxy only by opening as many pages as possible. Up to a point this is fine as a first test. It was clear that not all pages would work, since we did not change the request (this was supposed to make your life simpler, but I now think we made a mistake here). Think about what opening a page with many images is testing. First you get the page. Then the browser opens a connection for each image on the page. Some of the images may be located elsewhere, possibly on a server that does not work with our dummy proxy. So all you are testing is opening many connections. If that is what you wanted to test, that is fine, but many students used this as a general test for the proxy.

When you design a test for your program you should plan carefully what you are testing. Here are some suggestions on what to test in a proxy like this. First, just open a single page. Then reload that page repeatedly to test opening one page after another (it does not matter that it is the same page you are loading all the time). Load binary data such as an image; this should catch errors like using strlen (see above). Then load huge images, preferably from a slow server. This tests your select loop. Then, while loading the image, close the connection in the browser. This tests exception handling in your select loop.

Use debuggers (like gdb) to track down errors. Also step once through your code in a debugger when you think it is working correctly.

What to do before doing the next assignment

In the next assignment you will port your proxy to a network simulator. If your code is correct this should be easy and should not take much time. The main goal of the assignment is to learn how to use the simulator. The following assignments will all be done in the simulator. If you think your proxy is too unorganized or complicated, you should spend some time cleaning up your code. That will make all error tracking much easier.


This page was last updated on  01/01/02 06:40 PM