The first programming project is meant to introduce you to some programming tools you'll be using for the rest of the course, particularly the Unix software development environment (gcc, make, etc.) and, to a lesser extent, the C asynchronous event library (indirectly a wrapper on top of select). You may find it useful to refer to this tutorial.
Your task will be to write a TCP Proxy using the C asynchronous library. You'll learn how to write both client and server code in this lab.
A TCP proxy server is a server that acts as an intermediary between a client and another server, called the destination server. Clients establish connections to the TCP proxy server, which then establishes a connection to the destination server. The proxy server sends data received from the client to the destination server and forwards data received from the destination server to the client. Interestingly, the TCP proxy server is actually both a server and a client. It is a server to its client and a client to its destination server.
A TCP proxy server can be useful for getting around services that restrict connections based on network addresses. For example, the web page http://fireless.cs.cornell.edu/courses/2013fa/cs6410/restricted/ is only accessible from EC2 XXX.compute-1.amazonaws.com hosts. If you try to access it from elsewhere, you will receive an access denied error. However, you can view this page from a web browser anywhere on the Internet by running a proxy server on one of the EC2 instance machines. The web server will think it is serving the data to a web client on the machine running the proxy. However, the proxy is forwarding the data out of the class network, thus subverting the protection mechanism.
The proxy server you will build for this lab will be invoked at the command line as follows:
# ./tcp-proxy destination-host destination-port listen-port
For example, to redirect all connections to port 3000 on your local machine to Yahoo's web server, run:
# ./tcp-proxy www.yahoo.com 80 3000
As another example, to view the restricted web page mentioned above, you might run the following command on your EC2 machine:
# ./tcp-proxy fireless.cs.cornell.edu 80 4000
Then you can view the restricted web page by typing the URL http://ec2-75-101-184-233.compute-1.amazonaws.com:4000/courses/2013fa/cs6410/restricted/ into your browser window, provided that ec2-75-101-184-233.compute-1.amazonaws.com is the public DNS name returned by the ec2-runinstances command, and that you have authorized network access on the proxy listen-port (-p 4000).
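The two port arguments on the command line have to be validated before they are used. The following is a minimal sketch of one way to do that, not part of the provided skeleton; the helper name parse_port is hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse a decimal TCP port from a command-line
 * argument. Returns the port (1..65535) on success, or -1 if the
 * string is empty, has trailing garbage, or is out of range.
 */
int parse_port(const char *s)
{
    char *end;
    long value;

    assert(s != NULL);
    errno = 0;
    value = strtol(s, &end, 10);
    if (errno != 0 || end == s || *end != '\0')
        return -1;              /* not a clean decimal number */
    if (value < 1 || value > 65535)
        return -1;              /* outside the valid port range */
    return (int)value;
}
```

With this helper, main could reject a bad destination-port or listen-port with a usage message instead of passing a nonsensical value to bind or connect.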
The proxy server will accept connections from multiple clients and forward them using multiple connections to the server. No client or server should be able to hang the proxy server by refusing to read or write data on its connection. For instance, if one client suddenly stops reading from the socket to the proxy, other clients should not notice interruptions of service through the proxy. You will need asynchronous behavior, described in "Using TCP Through Sockets".
The proxy must also handle hung clients and servers. In particular, if one end keeps transmitting data but the other stops reading, the proxy must not buffer an unlimited amount of data. Once the amount of buffered data in a given direction reaches some high water mark (e.g., 8K), the proxy must stop reading in that direction until the buffer drains.
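One possible way to enforce the high water mark is a fixed-size buffer per direction. The sketch below is an illustration only: the struct and function names are hypothetical, and 8192 is simply the 8K example value from the text.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define HIGH_WATER_MARK 8192    /* assumed value, taken from the 8K example */

/* Hypothetical buffer for data flowing in one direction. */
struct relay_buf {
    char   data[HIGH_WATER_MARK];
    size_t len;                 /* bytes currently waiting to be written */
};

/* Nonzero while the proxy may keep reading from the source socket. */
int relay_want_read(const struct relay_buf *b)
{
    return b->len < HIGH_WATER_MARK;
}

/* Append up to n bytes; returns how many were actually accepted. */
size_t relay_append(struct relay_buf *b, const char *src, size_t n)
{
    size_t room = HIGH_WATER_MARK - b->len;

    if (n > room)
        n = room;
    memcpy(b->data + b->len, src, n);
    b->len += n;
    return n;
}

/* Drop n bytes from the front after they were written to the destination. */
void relay_consume(struct relay_buf *b, size_t n)
{
    assert(n <= b->len);
    memmove(b->data, b->data + n, b->len - n);
    b->len -= n;
}
```

When relay_want_read returns zero, the proxy would stop watching the source socket for readability, and start watching it again once relay_consume has freed some room. In practice the read call would be sized to the remaining room so that relay_append never has to truncate.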
The proxy must handle end-of-file conditions as transparently as possible. If it reads end-of-file from one socket, it should pass the condition along to the other socket (using shutdown) after writing any remaining buffered data. However, the proxy should continue to forward data in the other direction. The proxy should terminate a connection pair and close the file descriptors under either of the following two circumstances:
The proxy will enforce an upper limit on the number of active connections. Once this limit is reached, no new connections are accepted. When a connection closes, a pending connection (if any) is accepted.
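The end-of-file rule described earlier (pass the condition along with shutdown, but only after the remaining buffered data has been written) can be sketched as a small check run after every read or write in one direction. The struct fields and the function name below are hypothetical; only shutdown with SHUT_WR is the standard sockets call.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/socket.h>

/* Hypothetical state for one direction of a proxied connection pair. */
struct half {
    int    dst_fd;      /* socket this direction writes to */
    size_t buffered;    /* bytes read from the source but not yet written */
    int    src_eof;     /* nonzero once a read on the source returned 0 */
    int    shut_sent;   /* nonzero once shutdown(SHUT_WR) was issued */
};

/*
 * Propagate end-of-file to the destination once the buffer is empty.
 * Returns 1 if this call issued the shutdown, 0 if nothing was due,
 * and -1 if shutdown() failed.
 */
int half_maybe_shutdown(struct half *h)
{
    assert(h != NULL);
    if (!h->src_eof || h->buffered > 0 || h->shut_sent)
        return 0;
    if (shutdown(h->dst_fd, SHUT_WR) == -1)
        return -1;
    h->shut_sent = 1;   /* only the write side is closed; reads still work */
    return 1;
}
```

Because SHUT_WR closes only the sending side of dst_fd, the proxy can keep forwarding data that arrives from dst_fd in the other direction, as the text requires.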
prompt> denotes your own machine, while # denotes the EC2 instance). You may use any libevent-supported image, including Ubuntu, Fedora, and other similar images. You can find free AMIs here and here (login instructions are still the same, although some details might be slightly different; for example, Ubuntu uses "ubuntu" as the default username). You may, however, have to manually install the libevent package (sudo apt-get install build-essential libevent-dev on apt-get systems, or sudo yum groupinstall "Development Tools" followed by sudo yum install libevent-devel on yum systems). When you run the instance, please don't forget to mark your usage in the shared document from lab 0 (now there is a column for lab 1). You should be able to get
the files from your local box to Amazon and build like this:
prompt> scp -i ~/.aws/id-rsa-kp-el378-lab1 tcp-proxy.tar.gz ec2-user@ec2-75-101-184-233.compute-1.amazonaws.com:~/
prompt> ssh -i ~/.aws/id-rsa-kp-el378-lab1 ec2-user@ec2-75-101-184-233.compute-1.amazonaws.com
# tar --no-same-owner -xzf tcp-proxy.tar.gz
# cd tcp-proxy
# make
gcc tcp-proxy.c -levent -o tcp-proxy
If you work as root rather than ec2-user, it is very important that you use the --no-same-owner flag for tar. When run as root on the EC2 machine, tar otherwise changes the ownership of the extracted files (sadly propagating up the entire /root/tcp-proxy path) to the user ID of your local machine (the machine the scp command was initiated from), which does not necessarily exist on the EC2 instance (unless you fancy being root on your box). As a result, subsequent SSH connections will fail, since the /root home folder of user root is now owned by a rogue user ID.
# cd ~/tcp-proxy
# make dist
prompt> scp -i ~/.aws/id-rsa-kp-el378-lab1 ec2-user@ec2-75-101-184-233.compute-1.amazonaws.com:~/tcp-proxy/tcp-proxy.tar.gz .
tcp-proxy. To test it, type, for example:
# ./tcp-proxy www.yahoo.com 80 1234
Now test your program using telnet, which is not installed on the instance. You may open a new window and install it using the following command. If you are logged in as root, you can omit sudo.
# sudo yum install telnet
You may install any other packages that you think you need for this lab.
# telnet localhost 1234
Trying ::1...
telnet: connect to address ::1: Connection refused
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
Connection closed by foreign host.
#
The message "Connected to localhost" says that your proxy accepted a TCP connection, but then immediately closed it, since the proxy is not fully implemented. You must finish implementing the proxy. You are free to use any basic C/C++ library (the STL, for example), or you can design your own data structures. It is possible to complete the assignment without using any other external libraries or data structures.
You should test your proxy to make sure that it continues to forward data even when some connections aren't responding. Here's one test you should be able to pass.
First, run the proxy and point it at fireless.cs.cornell.edu's HTTP port:
# ./tcp-proxy fireless.cs.cornell.edu 80 1234
Now, in another window, use telnet to fetch /big through the proxy:
# telnet localhost 1234
Trying ::1...
telnet: connect to address ::1: Connection refused
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
GET /courses/2013fa/cs6410/labs/big
Watch the data go by for a while, then interrupt the output by typing control-], after which telnet should stop and print telnet>. Now check that the proxy hasn't been hung because telnet isn't reading data; suspend your telnet by typing "z RETURN" and fetch something else:
telnet> z
Suspended
# telnet localhost 1234
Trying ::1...
telnet: connect to address ::1: Connection refused
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
GET /courses/2013fa/cs6410/labs/small
You got it!
Connection closed by foreign host.
If you see "You got it!", your program passes the test.
Now try to access the restricted page from your web browser with a URL like http://ec2-75-101-184-233.compute-1.amazonaws.com:1234/courses/2013fa/cs6410/restricted/. Again, make sure ec2-75-101-184-233.compute-1.amazonaws.com is your EC2 machine running your tcp-proxy, and that you have authorized access on port 1234 to it.
Next, lower the maximum number of allowed concurrent proxied connections to something like 2, and test by pointing your proxy to fireless.cs.cornell.edu, port 80, just like in the first test. Start by opening 3 telnet connections, but without issuing the HTTP GET. The third connection should not be accepted. Now issue GET /small in one of your two connected telnet prompts. Once you have received the response from the server, your third connection should be accepted, because HTTP web servers terminate the connection in an orderly fashion to indicate that the end of file was reached.
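The connection limit exercised by this test could be tracked with a simple counter that tells the proxy when to stop and when to resume calling accept. This is a sketch under assumptions: the struct and function names are hypothetical, and how the listen socket is actually paused depends on the event library you use.

```c
#include <assert.h>

/* Hypothetical bookkeeping for the connection limit. */
struct conn_gate {
    int active;     /* connection pairs currently being proxied */
    int limit;      /* maximum allowed, e.g. 2 for the test above */
};

/* Nonzero if the proxy may accept another client right now. */
int gate_can_accept(const struct conn_gate *g)
{
    return g->active < g->limit;
}

/* Record a newly accepted connection pair. */
void gate_opened(struct conn_gate *g)
{
    assert(g->active < g->limit);
    g->active++;
}

/*
 * Record a closed connection pair. Returns nonzero if the proxy was at
 * its limit before the close, meaning the caller should start watching
 * the listen socket again so a pending connection gets accepted.
 */
int gate_closed(struct conn_gate *g)
{
    int was_full;

    assert(g->active > 0);
    was_full = (g->active == g->limit);
    g->active--;
    return was_full;
}
```

Note that the kernel may still complete the TCP handshake for a connection waiting in the listen backlog, so the third telnet can report "Connected" before the proxy has actually called accept on it. What the test checks is that the proxy does not serve that connection until one of the first two closes.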
# make dist
rm -fr .DS_Store *.tar.gz *.ps *.pdf *.o *.dSYM *~ tcp-proxy test-tcpproxy
tar -czf tcp-proxy.tar.gz ../ --exclude=tcp-proxy.tar.gz --exclude=".svn"
tar: Removing leading `..' from member names
tar: Removing leading `../' from member names
#
To turn in your distribution, upload the tcp-proxy.tar.gz file to CMS.
If you have any problems with submission, please contact the TA.
ulimit command before you will see the core files being created (e.g. ulimit -c unlimited). You can examine the core files with gdb in order to learn what went wrong; this is an invaluable tool. You can start by typing gdb program program.core, and then typing the gdb command bt or backtrace. GDB will in turn print a trace pointing to where your program has crashed.
This page was originally created by Tudor Marian.