CS 3410 Programming Assignment 4 FAQ

Instructor: Kavita Bala

PA4: Distributed Renderer

Due: Monday, December 15th, 2008, 5:00 pm

 


December 12th, 9:45AM:  In the analysis, the required image sizes were accidentally omitted for all parts save the first.  The assignment is now corrected.  Use 128x128 for parts 2 and 3 and 512x512 for parts 1, 4, and 5. Also, for parts 4 and 5, use 64 threads and an initial block size of 16x16.
December 10th, 12:00AM:  There have been several questions posed to the course staff recently:

Q. Does render (x1,y1)-(x2,y2) include (x2,y2) or exclude (x2,y2)?
A.  It includes (x2,y2); both corners are part of the block. The PA4 description is correct. Blame Saikat for the misleading comment in the source and for what he said in section.

Q. Can the stack be initialized with all the space it will ever need pre-allocated?
A.  No. Your stack should grow and shrink dynamically as elements are added and removed. You may, however, choose to build your stack out of individual elements that are allocated and deallocated, or out of arrays that are resized dynamically using realloc.
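
For the realloc option, here is a minimal sketch of an array-backed stack that doubles when full and halves when it drops to a quarter full. The names (block, work_stack, stack_push, stack_pop) and the growth policy are assumptions made for illustration, not part of the PA4 code, and there is no locking here.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical work item: one image block, corners inclusive. */
typedef struct { int x1, y1, x2, y2; } block;

typedef struct {
    block  *items;  /* heap array, resized with realloc */
    size_t  count;  /* elements currently stored */
    size_t  cap;    /* elements the array can hold */
} work_stack;

static void stack_init(work_stack *s) {
    assert(s != NULL);
    s->items = NULL;
    s->count = 0;
    s->cap = 0;
}

/* Push one block, doubling the array when it is full.
   Returns 0 on success, -1 if realloc fails. */
static int stack_push(work_stack *s, block b) {
    if (s->count == s->cap) {
        size_t new_cap = s->cap ? s->cap * 2 : 4;
        block *p = realloc(s->items, new_cap * sizeof *p);
        if (p == NULL) return -1;       /* old array is still valid */
        s->items = p;
        s->cap = new_cap;
    }
    s->items[s->count++] = b;
    return 0;
}

/* Pop one block into *out. Returns 0 on success, -1 if the stack is empty.
   Frees the array when empty, halves it when at most a quarter is in use. */
static int stack_pop(work_stack *s, block *out) {
    if (s->count == 0) return -1;
    *out = s->items[--s->count];
    if (s->count == 0) {
        free(s->items);
        s->items = NULL;
        s->cap = 0;
    } else if (s->count <= s->cap / 4) {
        size_t new_cap = s->cap / 2;
        block *p = realloc(s->items, new_cap * sizeof *p);
        if (p != NULL) { s->items = p; s->cap = new_cap; }
    }
    return 0;
}
```

Shrinking at a quarter rather than a half avoids resizing on every push/pop pair near a boundary.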

Q. Should my process be consuming 100% CPU?
A.  No. If you do it right, your process shouldn't consume much CPU at all while it has nothing to render. Remember, polling your stack in a tight loop is very inefficient; in fact, polling in a tight loop may prevent other threads from doing useful work on the CPU and result in your process exceeding the allotted time. You should have your threads wait for work, and have them signal each other when more work is made available in the stack. Think condition variables.
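
A rough sketch of waiting instead of polling, assuming a linked-list stack of block ids. All names here (push_work, pop_work_blocking, consumer) are made up for the example and are not the PA4 interfaces.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

typedef struct node { int block_id; struct node *next; } node;

static node *stack_top = NULL;
static pthread_mutex_t stack_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  work_ready = PTHREAD_COND_INITIALIZER;

/* Add one item and wake one sleeping thread. */
static void push_work(int block_id) {
    node *n = malloc(sizeof *n);
    assert(n != NULL);
    n->block_id = block_id;
    pthread_mutex_lock(&stack_lock);
    n->next = stack_top;
    stack_top = n;
    pthread_cond_signal(&work_ready);
    pthread_mutex_unlock(&stack_lock);
}

/* Sleep (no CPU use) until an item is available, then take it. */
static int pop_work_blocking(void) {
    pthread_mutex_lock(&stack_lock);
    while (stack_top == NULL)               /* loop: wakeups can be spurious */
        pthread_cond_wait(&work_ready, &stack_lock);
    node *n = stack_top;
    stack_top = n->next;
    pthread_mutex_unlock(&stack_lock);
    int id = n->block_id;
    free(n);
    return id;
}

/* Demo consumer (hypothetical): pops three items and adds up their ids. */
static void *consumer(void *arg) {
    long *sum = arg;
    for (int i = 0; i < 3; i++)
        *sum += pop_work_blocking();
    return NULL;
}
```

pthread_cond_wait releases the mutex while the thread sleeps and reacquires it before returning, which is why the emptiness check sits in a while loop under the lock.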

Q. I had a bad nightmare about pthread_mutex_lock and pthread_cond_wait. HALP!
A.  We noticed a couple of you are trying to be very conservative with locks (in your worker.c methods) by locking at a very fine granularity. This much concurrency will introduce a whole host of race conditions and nightmares that you'll need to reason about. A far better option is to have giant critical sections (and I am talking whole functions, including looking for work or finding someone to cancel, here), and to relinquish the lock only around blocking calls (e.g. to render). Don't worry about efficiency. By Amdahl's law, render takes so long that optimizing the rest of your code will not make the program noticeably faster. In fact, as Knuth famously said, "premature optimization is the root of all evil." So basically, get rid of all races first before thinking about optimizations.
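
To make the "one giant critical section" shape concrete, here is a sketch under heavy assumptions: slow_render is a stand-in for the real render call, and the shared state is reduced to two counters and a result array. None of these names come from worker.c.

```c
#include <assert.h>
#include <pthread.h>

#define NUM_BLOCKS 8

static pthread_mutex_t big_lock = PTHREAD_MUTEX_INITIALIZER;
static int next_block = 0;       /* shared: next block to hand out */
static int blocks_done = 0;      /* shared: blocks fully recorded */
static int result[NUM_BLOCKS];   /* shared: one value per block */

/* Stand-in for the slow render call (hypothetical). */
static int slow_render(int block) { return block * block; }

static void *worker(void *arg) {
    (void)arg;
    pthread_mutex_lock(&big_lock);           /* hold the lock by default */
    while (next_block < NUM_BLOCKS) {
        int mine = next_block++;             /* claim work under the lock */

        pthread_mutex_unlock(&big_lock);     /* drop it only for the slow call */
        int value = slow_render(mine);
        pthread_mutex_lock(&big_lock);       /* retake before touching shared state */

        assert(blocks_done < NUM_BLOCKS);
        result[mine] = value;
        blocks_done++;
    }
    pthread_mutex_unlock(&big_lock);
    return NULL;
}
```

Everything except the slow call runs with the lock held, so there is only one place where another thread can change shared state underneath you.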

Q. Why does my program hang after computing all pixels?
A.  The main method dumps the image to disk after do_work returns. Thus, your do_work method must not kill the main thread (by calling pthread_exit), and it must not return to main before the whole image has been rendered. Also, if some of your threads are waiting on a condition variable for more work, remember to tell them the party is over before the last working thread exits.
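
One possible way to end the party cleanly, sketched with invented names (do_work_sketch, pending, all_done) and with the rendering itself left out: the thread that finishes the last block sets a flag and broadcasts, and the function only returns after joining every worker.

```c
#include <assert.h>
#include <pthread.h>

#define NUM_THREADS  4
#define TOTAL_BLOCKS 6

static pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  wake = PTHREAD_COND_INITIALIZER;
static int pending  = 0;   /* blocks still waiting to be picked up */
static int finished = 0;   /* blocks fully rendered */
static int all_done = 0;   /* set once, when the whole image is complete */

static void *worker(void *arg) {
    (void)arg;
    pthread_mutex_lock(&m);
    for (;;) {
        while (pending == 0 && !all_done)
            pthread_cond_wait(&wake, &m);
        if (pending == 0) break;             /* only reachable when all_done */
        pending--;
        pthread_mutex_unlock(&m);
        /* the blocking render call would go here */
        pthread_mutex_lock(&m);
        if (++finished == TOTAL_BLOCKS) {
            all_done = 1;
            pthread_cond_broadcast(&wake);   /* wake every sleeper so it can exit */
        }
    }
    pthread_mutex_unlock(&m);
    return NULL;
}

/* Returns to the caller only after every worker has exited. */
static int do_work_sketch(void) {
    pthread_t t[NUM_THREADS];
    pthread_mutex_lock(&m);
    pending = TOTAL_BLOCKS;
    pthread_mutex_unlock(&m);
    for (int i = 0; i < NUM_THREADS; i++) {
        int rc = pthread_create(&t[i], NULL, worker, NULL);
        assert(rc == 0);
    }
    for (int i = 0; i < NUM_THREADS; i++)
        pthread_join(t[i], NULL);
    return finished;
}
```

Using pthread_cond_broadcast rather than pthread_cond_signal matters here, since more than one thread may be asleep when the last block completes.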

Q. What is out_pixels in the call to render and who allocates it?
A.  It is something you must allocate to temporarily store the rendered pixels. You will then call image_set_pixel for each pixel rendered.
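
A sketch of that allocate, render, copy pattern. The real signatures of render and image_set_pixel are in the PA4 source; the pixel and image types, the two _stub functions, and the row-major layout of out_pixels below are all assumptions made so the example compiles on its own. Note the +1 in the block dimensions, since (x2,y2) is included.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct { float r, g, b; } pixel;                    /* hypothetical */
typedef struct { int width, height; pixel *data; } image;   /* hypothetical */

/* Stand-in for render: fills out_pixels row by row (assumed layout). */
static void render_stub(int x1, int y1, int x2, int y2, pixel *out_pixels) {
    int w = x2 - x1 + 1;
    for (int y = y1; y <= y2; y++)
        for (int x = x1; x <= x2; x++) {
            pixel p = { (float)x, (float)y, 0.0f };
            out_pixels[(y - y1) * w + (x - x1)] = p;
        }
}

/* Stand-in for image_set_pixel. */
static void image_set_pixel_stub(image *img, int x, int y, pixel p) {
    assert(x >= 0 && x < img->width && y >= 0 && y < img->height);
    img->data[y * img->width + x] = p;
}

/* Render one block into a temporary buffer, then copy it into the image.
   Returns the number of pixels written, or -1 if allocation fails. */
static int render_block(image *img, int x1, int y1, int x2, int y2) {
    int w = x2 - x1 + 1;   /* inclusive bounds */
    int h = y2 - y1 + 1;
    pixel *out_pixels = malloc((size_t)w * (size_t)h * sizeof *out_pixels);
    if (out_pixels == NULL) return -1;

    render_stub(x1, y1, x2, y2, out_pixels);
    for (int y = y1; y <= y2; y++)
        for (int x = x1; x <= x2; x++)
            image_set_pixel_stub(img, x, y, out_pixels[(y - y1) * w + (x - x1)]);

    free(out_pixels);      /* the buffer is only temporary */
    return w * h;
}
```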

Q. What does "for each thread and each block size" mean?
A.  You should store the average block render time for each block size, aggregated across all threads. The average should not be kept per thread, but rather should be computed globally.
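
One way to keep such a global average, assuming square blocks keyed by side length and times measured in seconds. The names record_block_time and average_block_time, and the MAX_SIDE limit, are invented for this sketch.

```c
#include <assert.h>
#include <pthread.h>

#define MAX_SIDE 512   /* assumed largest block side length */

static pthread_mutex_t stats_lock = PTHREAD_MUTEX_INITIALIZER;
static double total_seconds[MAX_SIDE + 1];  /* summed over all threads */
static long   samples[MAX_SIDE + 1];        /* blocks timed, per side length */

/* Called by any thread after it finishes rendering a side x side block. */
static void record_block_time(int side, double seconds) {
    assert(side >= 1 && side <= MAX_SIDE);
    pthread_mutex_lock(&stats_lock);
    total_seconds[side] += seconds;
    samples[side]++;
    pthread_mutex_unlock(&stats_lock);
}

/* Global average for one block size; 0.0 if nothing has been recorded. */
static double average_block_time(int side) {
    assert(side >= 1 && side <= MAX_SIDE);
    pthread_mutex_lock(&stats_lock);
    double avg = samples[side] ? total_seconds[side] / (double)samples[side] : 0.0;
    pthread_mutex_unlock(&stats_lock);
    return avg;
}
```

Storing the running total and the sample count, rather than the average itself, keeps each update to two additions under the lock.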

Page maintained by Kavita Bala