Programming Assignment 6: Distributed Image Renderer
Overview
For this assignment, you will write a distributed rendering system. Writing a complete rendering system is well beyond the scope of this course, so specialized rendering servers are provided. A rendering server program is installed on each of the CSUG lab machines. Each server program listens on a dedicated network port for incoming instructions. Your job is to write a client program that connects to several rendering servers and issues the appropriate instructions to each of them to complete a demonstration image.
System Architecture
Your client will have three threads: a work-manager thread, a communications thread, and an image-assembler thread. These threads will communicate through a number of thread-safe queues.

The work manager is the main control thread for the client. It is responsible for generating work requests for the servers, and placing these requests in a send queue. It receives as feedback completed work units in the receive queue. As it receives completed work units, it places these results on the results queue for the image assembler to process.
The communications thread is responsible for establishing communication with all the rendering servers and handling lower-level network I/O to and from the connected servers. It takes requests off the send queue and sends them over the network to the appropriate servers. At the same time, it receives responses from the servers and places them on the receive queue for the work manager to process.
The image assembler has been provided to you and is responsible for taking completed work units off the results queue and assembling them into an image.
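As a rough illustration of the kind of queue these threads share, here is a minimal sketch of a blocking, thread-safe FIFO built on a pthread mutex and condition variable. All names (tsqueue_t, tsqueue_put, tsqueue_get) are hypothetical; the interface you must actually implement is the one declared in queue.h, which may differ.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical names throughout; the real interface is whatever queue.h declares. */
typedef struct node {
    void *item;
    struct node *next;
} node_t;

typedef struct {
    node_t *head, *tail;
    pthread_mutex_t lock;      /* protects head and tail */
    pthread_cond_t nonempty;   /* signalled whenever an item is added */
} tsqueue_t;

void tsqueue_init(tsqueue_t *q) {
    assert(q != NULL);
    q->head = q->tail = NULL;
    pthread_mutex_init(&q->lock, NULL);
    pthread_cond_init(&q->nonempty, NULL);
}

/* Append an item and wake one waiting consumer.
 * Returns 0 on success, -1 if memory could not be allocated. */
int tsqueue_put(tsqueue_t *q, void *item) {
    node_t *n = malloc(sizeof *n);
    if (n == NULL)
        return -1;
    n->item = item;
    n->next = NULL;
    pthread_mutex_lock(&q->lock);
    if (q->tail != NULL)
        q->tail->next = n;
    else
        q->head = n;
    q->tail = n;
    pthread_cond_signal(&q->nonempty);
    pthread_mutex_unlock(&q->lock);
    return 0;
}

/* Block until an item is available, then remove and return it. */
void *tsqueue_get(tsqueue_t *q) {
    pthread_mutex_lock(&q->lock);
    while (q->head == NULL)    /* loop guards against spurious wakeups */
        pthread_cond_wait(&q->nonempty, &q->lock);
    node_t *n = q->head;
    q->head = n->next;
    if (q->head == NULL)
        q->tail = NULL;
    pthread_mutex_unlock(&q->lock);
    void *item = n->item;
    free(n);
    return item;
}
```

A blocking get suits the work manager and the image assembler, which have nothing else to do while their queue is empty. The communications thread also has to watch its sockets, so it cannot simply block on a condition variable.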
The simplest load-balancing approach is to have the work manager enqueue all requests at the beginning of the program. You will find that this will not be a good approach, as many servers will finish early, while a few will be left working alone on the harder parts of the image. Instead, you might consider using the completed work units as a feedback mechanism for deciding how to distribute the workload more evenly across the servers.
Network Protocol
The protocol is described in detail at the bottom of these instructions. In brief, the server understands five commands:
- Hello: Resets the server and registers your client for further communications.
- Start Thread: Starts a rendering thread on the server. This thread will be dedicated to your client.
- Send Work: Sends a list of work units to a specific rendering thread on the server.
- Kill Thread: Kills a specific rendering thread on the server.
- Kill All Threads: Kills all active rendering threads on the server.
The server will return to your client two types of responses:
- Work response: Sent when the server has finished rendering some pixels.
- Error response: Sent when the server has detected an invalid command. The server will then close the network connection to your client.
Work Units
The server divides the image evenly into work units. Each work unit is a contiguous square block of pixels. A work unit of size n has n² pixels. A work unit of size four, for example, would be a square 16-pixel block, four pixels on a side. With a work-unit size of 32, a 128x128-pixel image would have 16 such work units.
0 | 1 | 2 | 3 |
4 | 5 | 6 | 7 |
8 | 9 | 10 | 11 |
12 | 13 | 14 | 15 |
Work units are numbered row by row with their IDs, starting with work-unit 0 at the upper-left corner of the image. As an example, the diagram above shows the work-unit ID assignment in an image with 16 work units.
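The numbering scheme can be expressed as a little arithmetic. The sketch below assumes the image width and height are exact multiples of the work-unit size, as the "divides the image evenly" wording suggests. The function names are hypothetical.

```c
#include <assert.h>

/* Number of work units in a width x height image, assuming both
 * dimensions are exact multiples of unit_size. */
int work_unit_count(int width, int height, int unit_size) {
    assert(unit_size > 0 && width % unit_size == 0 && height % unit_size == 0);
    return (width / unit_size) * (height / unit_size);
}

/* Pixel coordinates of the upper-left corner of work unit `id`,
 * with IDs assigned row by row starting from 0 at the upper left. */
void work_unit_origin(int id, int width, int unit_size, int *x, int *y) {
    int per_row = width / unit_size;    /* work units in one row */
    *x = (id % per_row) * unit_size;
    *y = (id / per_row) * unit_size;
}
```

For the 128x128 image with 32-pixel work units, work_unit_count gives 16, and work unit 5 (second row, second column in the diagram) has its upper-left corner at pixel (32, 32).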
When the client sends work to a server, it sends a list of work-unit IDs to the server, packed in an array. The server computes the appropriate pixel values for each work unit and returns a large array containing each work unit ID and the computed pixel values. The job of your client will be to create requests to render all the work units in the image and gather all the completed requests. The code we provide includes methods for assembling a complete image from work units.
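Numbers sent over the network must be in network byte order. As a rough sketch of packing a list of IDs into a byte buffer, the code below uses the standard htonl and ntohl calls. The 32-bit field width and the function names are assumptions made for illustration. The real message layout is the one given in the protocol specification, and messages.c already provides conversion helpers you may prefer.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Write n IDs into buf as big-endian 32-bit integers (an assumed width).
 * buf must have room for 4 * n bytes. Returns the number of bytes written. */
size_t pack_ids(const uint32_t *ids, size_t n, unsigned char *buf) {
    assert(buf != NULL);
    for (size_t i = 0; i < n; i++) {
        uint32_t be = htonl(ids[i]);
        memcpy(buf + 4 * i, &be, sizeof be);  /* memcpy avoids unaligned stores */
    }
    return 4 * n;
}

/* Read the index-th packed ID back out in host byte order. */
uint32_t unpack_id(const unsigned char *buf, size_t index) {
    uint32_t be;
    memcpy(&be, buf + 4 * index, sizeof be);
    return ntohl(be);
}
```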
Client Skeleton
As a starting point, we have provided you with a code skeleton for your client. In it, you will find several C source files, a sample workers.conf, a makemake script, a Makefile.am, and six other text files. You can ignore these extra text files, but do not delete them. They are needed for the GNU build system mentioned below.
While you may edit any part of the skeleton as you see fit, it is highly suggested that you adhere to it as much as possible. At the very least, your client must support the same command-line options as the skeleton, and you must implement the architecture described above.
Obtaining the code
The code is available on the CSUG machines under /usr/local/cs316/pa6. To copy it into your home directory on the CSUG machines, type:
cp -R /usr/local/cs316/pa6 ~
If you prefer to use your own machine, you can scp it from the CSUG machines into your home directory:
scp -r netid@csug01.csuglab.cornell.edu:/usr/local/cs316/pa6 ~
We will not provide support for developing on non-CSUG machines. You are responsible for ensuring that your project will compile and run correctly on CSUG.
Compiling the project
This project uses the GNU configure and build system to automatically detect dependencies between the source files and generate an appropriate Makefile. The details of how this works are unimportant; simply run the makemake script to create a Makefile. Once you have a Makefile, you can use make as usual to compile your project.
If you create any new .c or .h files you should edit the "drt_SOURCES" line in Makefile.am to include these new files and re-run makemake. You should also re-run makemake if you change the dependencies between your source files (i.e., add or remove any #includes that reference other files in your project). If you fail to do any of this, your project will still compile, but you will have to "make clean" more often, as make's incremental build functionality will not know about your new files/dependencies.
Running the client
Your client will be compiled into a binary called drt. Running drt with no arguments will start the client with the default configuration. The client supports several configuration options for adjusting the image resolution, the work-unit size, and the image quality. To see what options are accepted by the client, run drt -h.
Running the client skeleton will generate an image containing the SMPTE color bars. As the image assembler receives completed work in the results queue, sections of this image will be replaced with the rendered image.
Worker host configuration (workers.conf)
Your group will be assigned a unique clientID to use to identify yourselves to the servers. (See the HELLO message specification below.) The client reads a worker host configuration file called workers.conf to learn its clientID as well as information about each server available to the client: hostname, port number, and maximum number of threads allowed. This file is formatted as follows:
<clientID> <number of servers> <hostname1> <port1> <max threads 1> <hostname2> <port2> <max threads 2> ...
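To make the format concrete, here is a rough sketch of reading such a file with fscanf. It is illustrative only: the skeleton has its own code for loading workers.conf, and the names here (worker_t, parse_workers) are hypothetical. It assumes the fields are separated by whitespace and reads the clientID as an opaque token of at most 63 characters.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical record for one server entry in workers.conf. */
typedef struct {
    char host[256];
    int port;
    int max_threads;
} worker_t;

/* Read the clientID, the server count, and one entry per server.
 * Returns the number of servers read, or -1 if the input is malformed
 * or lists more than `cap` servers. */
int parse_workers(FILE *f, char client_id[64], worker_t *workers, int cap) {
    int count;
    assert(f != NULL);
    if (fscanf(f, "%63s %d", client_id, &count) != 2)
        return -1;
    if (count < 0 || count > cap)
        return -1;
    for (int i = 0; i < count; i++) {
        if (fscanf(f, "%255s %d %d", workers[i].host,
                   &workers[i].port, &workers[i].max_threads) != 3)
            return -1;
    }
    return count;
}
```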
It is your responsibility to keep your workers.conf in sync with the server configuration posted here. We will send out e-mail when we update the server configuration.
Code layout
The code should be fairly well-commented. While you may edit anything in the project, you should only have to edit the following six files:
- queue.h and queue.c
- messages.h and messages.c
- comm.c
- manager.c
The files in the skeleton are as follows:
- main.c: This contains the main() function as well as code for parsing the command-line options.
- opts.h: This contains the global options variable that stores the client's configuration parameters, parsed from both the command line and workers.conf.
- manager.c: This contains the manager_main() function that will be called to enter into the manager's main loop.
- comm_helper.h: This declares several functions that should prove useful when implementing your communications thread. It also defines a global sigmask variable. Use this as the last argument to pselect.
- comm.c: This sets up the send and receive queues and defines the functions the work manager will use to interact with the communications thread.
  It also contains code for setting up the process-signal masks and handlers, as well as sending the communications thread a process signal when a request is enqueued. This process signal will unblock the communications thread's pselect call, resulting in pselect returning -1 with an EINTR error code. (See pselect's man page for details.) Note that process signals are different from condition variables.
  It is highly recommended that you leave the existing code as is and simply start your communications thread at the end of comm_init().
- messages.c and messages.h: messages.h defines skeletons of structs for each message type. messages.c defines several functions you will find useful for converting between network-endian numbers and host-endian numbers.
- queue.c and queue.h: queue.h contains the interface to the thread-safe queue implementation and a skeleton for the queue struct. queue.c contains skeleton functions for implementing the queue interface.
- assem.h: This contains the interface to the image assembler. Note that the internals of the image assembler use the thread-safe queue that you will be implementing.
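The pselect behaviour described for comm.c means a return value of -1 with errno set to EINTR is a wake-up, not a failure. The sketch below shows one way to tell the outcomes apart for a single descriptor. The function name and the WAIT_* constants are hypothetical, and a real communications loop would watch every server socket rather than one fd. In the client, the mask argument would be the sigmask variable from comm_helper.h.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/select.h>
#include <time.h>
#include <unistd.h>   /* read(), which the caller uses once the fd is ready */

/* Hypothetical result codes for this sketch. */
enum { WAIT_ERROR = -1, WAIT_SIGNALLED = 0, WAIT_READY = 1, WAIT_TIMEOUT = 2 };

/* Wait until fd is readable, a signal arrives, or the timeout expires.
 * A NULL timeout blocks indefinitely; a NULL mask leaves the signal
 * mask unchanged for the duration of the call. */
int wait_for_event(int fd, const sigset_t *mask, const struct timespec *timeout) {
    fd_set readfds;
    assert(fd >= 0 && fd < FD_SETSIZE);
    FD_ZERO(&readfds);
    FD_SET(fd, &readfds);
    int rc = pselect(fd + 1, &readfds, NULL, NULL, timeout, mask);
    if (rc == -1)
        return errno == EINTR ? WAIT_SIGNALLED : WAIT_ERROR;
    if (rc == 0)
        return WAIT_TIMEOUT;
    return WAIT_READY;    /* fd has data to read, or the peer has closed */
}
```

On WAIT_SIGNALLED the communications thread would check the send queue for new requests. On WAIT_READY it would read from the socket.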
Debugging Tools
netcat
netcat is a tool that essentially hooks up your console directly to the network: anything read from the network is printed to stdout; anything provided as input on stdin is sent over the network. It is installed on the CSUG machines as nc. You can run it in one of three modes: client, server, and tunnelling.
- Client mode: to connect to www.google.com:80, type
nc -x www.google.com 80
- Server mode: to listen for connections on port 1234, type
nc -x -l -p 1234
- Tunnelling mode: netcat acts as an intermediary between a client and a server, that is, as a tunnelling proxy. It sends everything it receives from one host to the other, printing to stdout everything it hears. To listen for connections on port 1234 and tunnel to www.google.com:80, type
nc -x -L www.google.com:80 -p 1234
Valgrind
Valgrind is a tool for memory debugging and memory leak detection. It detects and reports memory access violations as they occur when a program is running. To use Valgrind, you need to compile your program unoptimized and with debugging symbols. To do this, edit the Makefile that's generated by makemake and change the CFLAGS variable to read:
CFLAGS = -g -O0 ← that's an "oh-zero"
Re-compile with "make clean all" and run your program as usual, except add "valgrind" before the program name:
valgrind drt <options to drt>
If you'd like detailed reporting on memory leaks, run:
valgrind --leak-check=full drt <options to drt>
How to Get Started
This project is larger than the others and involves lower-level systems programming than you are likely to be familiar with. To get started, it's best to have a plan of attack.
- Tackle threading before networking. To start, you should implement the thread-safe queue. There is a skeleton in the code provided. The queue is the basis of the client's inter-thread communication and debugging the queue once the rest of the client is written will be difficult. Design your queue and test it as a separate unit first. This will also familiarize you with blocking, locking and threads.
- Talk to netcat before talking to the servers. Write the basic outline for the communications loop, but leave out the receiving part. Write the code to establish a connection and to send data. Focus first on sending well-formed messages and debug this with netcat before attempting to talk to the servers and reading responses.
  Although it is not required, we suggest writing a short work manager that hard-codes the messages needed to send a single work request to a single server and then waits to receive the response. Once you have this working, you will know that your basic network I/O is working.
- Leave the work manager for last. The work manager should be easy once everything else is working.
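When you write the sending code, keep in mind that a single send call may transmit only part of a buffer. The sketch below loops until the whole message has gone out and retries if a signal interrupts the call. The name send_all is hypothetical, and the sketch assumes a blocking socket. With non-blocking sockets you would also need to handle EAGAIN.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Send exactly len bytes from buf on a connected, blocking socket.
 * Returns 0 once everything has been sent, -1 on error (errno is set). */
int send_all(int fd, const void *buf, size_t len) {
    const char *p = buf;
    assert(buf != NULL || len == 0);
    while (len > 0) {
        ssize_t n = send(fd, p, len, 0);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal: try again */
            return -1;
        }
        p += n;                 /* skip past the bytes already sent */
        len -= (size_t)n;
    }
    return 0;
}
```

Pointing the client at a netcat listener is a convenient way to check that the bytes arriving on the other end match the message you intended to send.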
What to Submit
Submit all of the code for your client, in a zip file, to CMS by the due date. In the final exam period, we will run your code to produce an image. For this test, we will use the entire rendering cluster. You should expect your client to complete an image in less than 10 minutes. During this process you will discuss your code's design and explain the details of your implementation.
Help and Hints
Ask the TAs and consultants for help. You can contact us through the course staff mailing list or the class newsgroup. We expect to see most students in office hours during the course of the project. Extra hours will be scheduled as needed.
If you suspect a bug in the servers or in the client framework, ask the course staff for help.