CS 6410: Advanced Systems (Fall 2016)
MP 1 - TCP proxy

Introduction

The first programming project is meant to introduce you to socket programming, as well as the Unix software development environment (gcc, make, etc.) and pthread programming. You may find Beej's Guide to Network Programming a useful reference for socket programming.

Your task will be to write a TCP Proxy. You'll learn how to write both client and server code in this mini-project.

A TCP proxy server is a server that acts as an intermediary between a client and another server, called the destination server. Clients establish connections to the TCP proxy server, which then establishes a connection to the destination server. The proxy server sends data received from the client to the destination server and forwards data received from the destination server to the client. Interestingly, the TCP proxy server is actually both a server and a client. It is a server to its client and a client to its destination server.
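The forwarding step described above can be sketched as a small helper that copies one chunk of data from one descriptor to another. This is only an illustrative sketch, not the skeleton code: the names relay_once and write_all and the 4096-byte buffer are assumptions, and the descriptors are assumed to be blocking. A proxy would call it once for the client-to-server direction and once for the server-to-client direction.

```c
#include <errno.h>
#include <unistd.h>

/* Write exactly len bytes to fd, retrying on short writes and EINTR.
 * Returns 0 on success, -1 on error. Assumes fd is a blocking descriptor. */
int write_all(int fd, const char *buf, size_t len)
{
    size_t off = 0;

    while (off < len) {
        ssize_t n = write(fd, buf + off, len - off);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        off += (size_t)n;
    }
    return 0;
}

/* Forward one chunk from from_fd to to_fd.
 * Returns the number of bytes forwarded, 0 on end of stream, -1 on error. */
ssize_t relay_once(int from_fd, int to_fd)
{
    char buf[4096];  /* hypothetical chunk size */
    ssize_t n;

    do {
        n = read(from_fd, buf, sizeof buf);
    } while (n < 0 && errno == EINTR);

    if (n <= 0)
        return n;
    if (write_all(to_fd, buf, (size_t)n) < 0)
        return -1;
    return n;
}
```

Because both calls may block, a helper like this is only suitable for understanding the data flow; a peer that stops reading would stall the caller.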

A TCP proxy server can be useful for getting around services that restrict connections based on network address. For example, the web page http://fireless.cs.cornell.edu/courses/2016fa/cs6410/restricted/ is only accessible from Fractus hosts. If you try to access it from elsewhere, you will receive an access denied error. However, you can view this page from a web browser anywhere on the Internet by running a proxy server on one of the Fractus instances. The web server will think it is serving the data to a web client on the machine running the proxy. However, the proxy is forwarding the data out of the class network, thus subverting the protection mechanism.

The assignment

The proxy server you will build for this mini-project will be invoked at the command line as follows:

# ./tcp-proxy destination-host destination-port listen-port

For example, to redirect all connections to port 3000 on your local machine to Yahoo's web server, run:

# ./tcp-proxy www.yahoo.com 80 3000 

As another example, to view the restricted web page mentioned above, you might run the following command on your Fractus VM:

# ./tcp-proxy fireless.cs.cornell.edu 80 4000 
Then in another terminal of your Fractus VM, you can view the restricted web page by running
# curl http://localhost:4000/courses/2016fa/cs6410/restricted/ 
You can also type the URL http://128.84.105.XXX:4000/courses/2016fa/cs6410/restricted/ into your browser window, provided that 128.84.105.XXX is the public IP address of your Fractus instance, and that you have authorized network access on the proxy listen-port (-p 4000).

Note: Fractus is behind the firewall of the CS Department. You need to use the Cisco VPN if you are outside the Cornell network.
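The command line above takes two port numbers as strings, so the proxy has to validate them before use. Here is a minimal sketch of one way to do that with strtol; the function name parse_port is hypothetical and is not part of the skeleton code.

```c
#include <errno.h>
#include <stdlib.h>

/* Parse a decimal TCP port in the range 1..65535.
 * Returns the port, or -1 if the string is not a valid port. */
int parse_port(const char *s)
{
    char *end;
    long v;

    if (s == NULL || *s == '\0')
        return -1;
    errno = 0;
    v = strtol(s, &end, 10);
    /* Reject overflow, trailing garbage such as "80x", and out-of-range values. */
    if (errno != 0 || *end != '\0' || v < 1 || v > 65535)
        return -1;
    return (int)v;
}
```

With this helper, main could reject a bad destination-port or listen-port argument and print a usage message instead of passing an unchecked value to the socket calls.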

Milestone (Due Friday, Sep 9, 11:59PM)

Submit a single-threaded version that accepts a single connection from a client and forwards it over a single connection to the server. While that connection is active, the proxy need not accept any other connections. This version will not be graded, but we will give you some feedback.
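For the server half of the milestone, the proxy needs a socket listening on listen-port. The following is a sketch under a few assumptions: it is IPv4-only, the function name open_listener is hypothetical, and the skeleton code may already set up its listening socket differently.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Create an IPv4 TCP socket listening on all interfaces at the given port.
 * Returns the listening descriptor, or -1 on error. */
int open_listener(uint16_t port)
{
    struct sockaddr_in addr;
    int on = 1;
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;
    /* Allow the proxy to be restarted on the same port right away. */
    if (setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &on, sizeof on) < 0)
        goto fail;

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);

    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) < 0)
        goto fail;
    if (listen(fd, SOMAXCONN) < 0)
        goto fail;
    return fd;

fail:
    close(fd);
    return -1;
}
```

A single-threaded proxy would then call accept on the returned descriptor, connect to the destination server, and forward data between the two sockets until either side closes.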

Final Submission (Due Friday, Sep 16, 11:59PM)

The proxy server must accept connections from multiple clients and forward them using multiple connections to the server. No client or server should be able to hang the proxy server by refusing to read or write data on its connection. For instance, if one client suddenly stops reading from its socket to the proxy, other clients should not notice any interruption of service through the proxy. To achieve this, you will need to use multiple pthreads. However, in this mini-project, the number of threads you can use is limited to five (one main thread that accepts connections, and four worker threads that handle active connections). You will therefore need to use select or poll within each thread to handle multiple connections.
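One way to keep a stalled peer from hanging a worker thread is to make every socket non-blocking and buffer data per direction, asking poll for readability only while the buffer has room and for writability only while it holds data. The sketch below illustrates that idea for a single direction. All names (struct chan, chan_want, chan_step, set_nonblocking, CHAN_CAP) are hypothetical, and this is not the required design.

```c
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <string.h>
#include <unistd.h>

#define CHAN_CAP 4096  /* hypothetical per-direction buffer size */

/* One direction of a proxied connection: bytes read from src wait in buf
 * until dst is ready to take them. */
struct chan {
    int src, dst;
    size_t len;          /* bytes currently buffered */
    char buf[CHAN_CAP];
};

/* Put fd into non-blocking mode so read/write never stall the thread. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/* Fill pfd[0] (src) and pfd[1] (dst): ask to read only while the buffer
 * has room, and to write only while it holds data. */
void chan_want(const struct chan *c, struct pollfd pfd[2])
{
    pfd[0].fd = c->src;
    pfd[0].events = (c->len < CHAN_CAP) ? POLLIN : 0;
    pfd[0].revents = 0;
    pfd[1].fd = c->dst;
    pfd[1].events = (c->len > 0) ? POLLOUT : 0;
    pfd[1].revents = 0;
}

static int would_block(void)
{
    return errno == EAGAIN || errno == EWOULDBLOCK || errno == EINTR;
}

/* Act on the results of poll(). Returns 0 to continue, 1 when src reached
 * end of stream (buf may still hold unsent bytes), -1 on error. */
int chan_step(struct chan *c, const struct pollfd pfd[2])
{
    ssize_t n;

    if (c->len > 0 && (pfd[1].revents & POLLOUT)) {
        n = write(c->dst, c->buf, c->len);
        if (n < 0 && !would_block())
            return -1;
        if (n > 0) {
            /* Keep whatever the peer did not take at the front of buf. */
            memmove(c->buf, c->buf + n, c->len - (size_t)n);
            c->len -= (size_t)n;
        }
    }
    if (c->len < CHAN_CAP && (pfd[0].revents & (POLLIN | POLLHUP))) {
        n = read(c->src, c->buf + c->len, CHAN_CAP - c->len);
        if (n == 0)
            return 1;
        if (n < 0 && !would_block())
            return -1;
        if (n > 0)
            c->len += (size_t)n;
    }
    return 0;
}
```

A worker thread would build one pollfd array covering all of its channels, call poll once, and then step each channel. When a client stops reading, its buffer fills up and the proxy simply stops reading from that server connection, while the other channels in the same thread keep moving.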

Specifications

The proxy must behave as follows: Lastly, you must carefully handle memory operations. There must be no memory leaks, dangling pointers, buffer overflows, or any other vulnerabilities that C might introduce. Any bug related to memory operations will count against your grade.

Fetching and building the source

You should re-use the VM created for mp0. To check those VMs that you have created, use the following command:

euca-describe-instances --filter key-name=kp-netid-xxx
Before booting up the VM again, we first need to change its instance type to m3.xlarge (with 4 CPUs and 2GB memory), because by default its instance type is m1.small (with 1 CPU, 4GB memory). To change its instance type, use the following command:
euca-modify-instance-attribute -t m3.xlarge i-xxxxxxx
Here i-xxxxxxx is your instance ID, which can be obtained by using the euca-describe-instances command. After changing its type, you can start the VM again:
euca-start-instances i-xxxxxxx

Remember to mark on the shared Google doc that you are now using your VM.

Start by downloading the skeletal code from CMS (tcp-proxy.tar.gz), then copy it (using scp/rsync) to your home directory on your Fractus VM (recall that prompt> denotes your own machine, while # denotes the Fractus instance). You should be able to get the files from your local box to your instance and build like this:
prompt> scp -i ~/.euca/id-rsa-kp-zs272-test tcp-proxy.tar.gz root@128.84.105.XXX:~/
prompt> ssh -i ~/.euca/id-rsa-kp-zs272-test root@128.84.105.XXX
# tar --no-same-owner -xzf tcp-proxy.tar.gz
# cd tcp-proxy
# make
gcc -pthread -o tcp-proxy tcp-proxy.c 
If you work as root, it is very important that you pass the --no-same-owner flag to tar. You are acting as root on the Fractus machine, and without this flag tar will change the ownership of the extracted files (sadly propagating up the entire /root/tcp-proxy path) to the user ID of your local machine (the machine the scp command was initiated from), which does not necessarily exist on the Fractus instance (unless you fancy being root on your box). As a result, subsequent SSH connections will fail, since the /root home folder of user root is now owned by a rogue user ID.

Make sure that you save your work before shutting down your instance, either by placing it in a bucket, using a version control system (CVS, SVN, Git, darcs, etc.), or simply fetching it back to your machine. To fetch it back to your machine, follow these steps:
# cd ~/tcp-proxy
# make dist
prompt> scp -i ~/.euca/id-rsa-kp-zs272-test root@128.84.105.XXX:tcp-proxy/tcp-proxy.tar.gz .

Do not include any files associated with version control systems in your tarball.

That's it! You've now built tcp-proxy. To test it, type, for example:
# ./tcp-proxy www.yahoo.com 80 1234
Now you should test your program using telnet. In a new window, run:
# telnet localhost 1234
Trying ::1...
telnet: connect to address ::1: Connection refused
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
Connection closed by foreign host.
# 
The message "Connected to localhost" says that your proxy accepted a TCP connection, but then immediately closed it, since the proxy is not fully implemented. You must finish implementing the proxy. You are free to use any basic C library, or you can design your own data structures. It is possible to complete the assignment without using any other external libraries or data structures.

Simple Tests

You should test your proxy to make sure that it continues to forward data even when some connections aren't responding. Here's one test you should be able to pass.

First, run the proxy and point it at fireless.cs.cornell.edu's HTTP port:

# ./tcp-proxy fireless.cs.cornell.edu 80 1234
Now, in another window, use telnet to fetch /big through the proxy:
# telnet localhost 1234
Trying ::1...
telnet: connect to address ::1: Connection refused
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
GET /courses/2016fa/cs6410/mps/big
Watch the data go by for a while, then interrupt the output by typing control-], after which telnet should stop and print telnet>. Now check that the proxy has not hung because telnet is no longer reading data: suspend your telnet by typing "z RETURN", wait for 10 seconds, and fetch something else:
telnet> z

Suspended
# telnet localhost 1234
Trying ::1...
telnet: connect to address ::1: Connection refused
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
GET /courses/2016fa/cs6410/mps/small
You got it!
Connection closed by foreign host.
If you see "You got it!," your program passes the test.

Now try to access the restricted page from your web browser with a URL like http://128.84.105.XXX:1234/courses/2016fa/cs6410/restricted/. Again, make sure 128.84.105.XXX is the IP address of your Fractus instance running your tcp-proxy, and that you have authorized access on port 1234 to it.

Next, lower the maximum number of allowed concurrent proxied connections to something like 2, and test by pointing your proxy to fireless.cs.cornell.edu, port 80, just like in the first test. Start by opening 3 telnet connections, but without issuing the HTTP GET. The third connection should not be accepted. Now issue GET /small in one of your two connected telnet prompts. Once you have received the response from the server, your third connection should be accepted, because HTTP web servers close the connection in an orderly way to indicate that the end of the file was reached.
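The connection limit used in this test can be sketched as a small counter shared between the main thread, which accepts connections, and the worker threads, which close them. The names below (struct conn_limit, limit_init, limit_try_acquire, limit_release) are hypothetical; the skeleton code may express the limit differently.

```c
#include <pthread.h>
#include <stddef.h>

/* Tracks how many proxied connections are active, up to a fixed maximum. */
struct conn_limit {
    pthread_mutex_t lock;
    int active;
    int max;
};

void limit_init(struct conn_limit *l, int max)
{
    pthread_mutex_init(&l->lock, NULL);
    l->active = 0;
    l->max = max;
}

/* Reserve a slot for a new connection.
 * Returns 1 if a slot was reserved, 0 if the limit has been reached. */
int limit_try_acquire(struct conn_limit *l)
{
    int ok = 0;

    pthread_mutex_lock(&l->lock);
    if (l->active < l->max) {
        l->active++;
        ok = 1;
    }
    pthread_mutex_unlock(&l->lock);
    return ok;
}

/* Give a slot back once a proxied connection has been closed. */
void limit_release(struct conn_limit *l)
{
    pthread_mutex_lock(&l->lock);
    if (l->active > 0)
        l->active--;
    pthread_mutex_unlock(&l->lock);
}
```

In this sketch the main thread would call accept only after limit_try_acquire succeeds, and a worker would call limit_release after closing both sockets of a finished connection, which is the moment the waiting third connection can be accepted.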

Test with Concurrent Connections

There are a lot of tools for testing network performance using TCP. Do some research to find your own tools. As an example, here we use iperf, a benchmarking tool for the TCP and UDP protocols. You can check out this website for more details. Installing iperf is as easy as running the following commands:
    # apt-get update
    # apt-get install -y iperf
Start the iperf server:
    # iperf -s -p 5001
And then in another terminal of your VM, start your TCP proxy:
    # ./tcp-proxy localhost 5001 80
Now open another terminal, start the iperf client and test the performance:
    # iperf -c localhost -p 80 -t 10 -P <thread_number>
An example with 4 threads:
    # iperf -c localhost -p 80 -t 10 -P 4
    ------------------------------------------------------------
    Client connecting to localhost, TCP port 80
    TCP window size:  648 KByte (default)
    ------------------------------------------------------------
    [  6] local 127.0.0.1 port 50190 connected with 127.0.0.1 port 80
    [  5] local 127.0.0.1 port 50188 connected with 127.0.0.1 port 80
    [  4] local 127.0.0.1 port 50186 connected with 127.0.0.1 port 80
    [  3] local 127.0.0.1 port 50184 connected with 127.0.0.1 port 80
    [ ID] Interval       Transfer     Bandwidth
    [  6]  0.0-10.0 sec  10.2 GBytes  8.73 Gbits/sec
    [  5]  0.0-10.0 sec  10.6 GBytes  9.14 Gbits/sec
    [  4]  0.0-10.0 sec  10.9 GBytes  9.36 Gbits/sec
    [  3]  0.0-10.0 sec  10.7 GBytes  9.23 Gbits/sec
    [SUM]  0.0-10.0 sec  42.4 GBytes  36.5 Gbits/sec
On the server side, we can see this output:
    # iperf -s -p 5001
    ------------------------------------------------------------
    Server listening on TCP port 5001
    TCP window size: 85.3 KByte (default)
    ------------------------------------------------------------
    [  8] local 127.0.0.1 port 5001 connected with 127.0.0.1 port 40240
    [  4] local 127.0.0.1 port 5001 connected with 127.0.0.1 port 40242
    [  5] local 127.0.0.1 port 5001 connected with 127.0.0.1 port 40244
    [  6] local 127.0.0.1 port 5001 connected with 127.0.0.1 port 40246
    [  4]  0.0-10.0 sec  10.9 GBytes  9.34 Gbits/sec
    [  6]  0.0-10.0 sec  10.2 GBytes  8.71 Gbits/sec
    [  8]  0.0-10.0 sec  10.7 GBytes  9.21 Gbits/sec
    [  5]  0.0-10.0 sec  10.6 GBytes  9.13 Gbits/sec
    [SUM]  0.0-10.0 sec  42.4 GBytes  36.4 Gbits/sec

We can see that iperf reports the throughput for each thread, and the total throughput. You may notice that the throughput reported on the server side differs slightly from that on the client side. This is because there is a TCP proxy in between, which does some buffering.
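To sanity-check the numbers iperf prints, the Bandwidth column can be recomputed from the Transfer column. The sketch below assumes, based on the figures shown above, that iperf counts a GByte as 2^30 bytes and a Gbit as 10^9 bits; the function name gbits_per_sec is hypothetical.

```c
/* Convert an iperf "Transfer" figure in GBytes over an interval in seconds
 * into Gbits/sec. Assumes 1 GByte = 2^30 bytes and 1 Gbit = 10^9 bits. */
double gbits_per_sec(double gbytes, double seconds)
{
    const double bytes = gbytes * 1073741824.0;  /* 2^30 bytes per GByte */

    return bytes * 8.0 / seconds / 1e9;
}
```

For the summed line above, gbits_per_sec(42.4, 10.0) gives roughly 36.4, which is in line with the reported totals given that the Transfer column is itself rounded.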

Play with different numbers of threads to see if your TCP proxy can handle it efficiently.


How/What to hand in

TCP proxy

You should submit two things: the tcp-proxy.tar.gz distribution and a readme.txt file. Build the software distribution with the make dist command, as follows:
# cd tcp-proxy
# make dist
rm -fr .DS_Store *.tar.gz *.ps *.pdf *.o *.dSYM *~ tcp-proxy test-tcpproxy
tar -czf tcp-proxy.tar.gz ../tcp-proxy --exclude=tcp-proxy.tar.gz --exclude=".svn" 
tar: Removing leading `../' from member names
# tar -tzf tcp-proxy.tar.gz 
tcp-proxy/
tcp-proxy/list.h
tcp-proxy/Makefile
tcp-proxy/tcp-proxy.c
# md5sum tcp-proxy.tar.gz
0cedb78a282b2543cdd412061cac6894  tcp-proxy.tar.gz
The last command computes the MD5 checksum of your tar file, which can be used to verify your submission (CMS provides the MD5 checksum after you submit a file). To turn in your distribution, upload the tcp-proxy.tar.gz and readme.txt files on CMS.

If you have any problems with your submission, please contact the TAs.


Useful tips


This page was originally created by Tudor Marian.