CS 414

Homework 3 -- Locking granularities

Assigned: October 17, 2002

Due: Wednesday, October 30, NOON

Assignment outline:

For this assignment you will be experimenting with different granularities of locking on a data structure that is shared among many threads.

The data structure you will be experimenting with is a simple n x n array of numbers. The invariant is that the sum of all entrees in the grid remain the same.

n will range from 1 to 10.

You will have many threads executing in your program.

The life cycle of each thread is:
	for i = 1 to NO_SWAPS
		cell1 = random cell
		cell2 = random cell
		swap(cell1, cell2)
	end for

In order to allow more than 1 thread to execute this code concurrently on the data structure, there must be some locking to prevent corruption of data.

Since this is a gridded data structure, it is natural to think of four different granularities of locking:

GRID level - obtain a lock for the entire grid before doing the swap. Only one thread can access the grid at a time.
ROW level - obtain a row level lock for both of the cells the thread is going to swap. Only one thread can access any single row at a time, but this allows more threads to access the grid simultaneously.
CELL level - obtain a cell level lock for both of the cells the thread is going to swap. Only one thread can access any single cell at a time. This is the finest level of granularity.
NONE - for comparison. Expect corruption.

Program usage:

gridapp gridsize numthreads locking_granularity

Examples:

gridapp 8 3 -cell

- executes gridapp with an 8 x 8 array, 3 threads running, with cell level locking

gridapp 10 100 -grid

- 10 x 10 array, 100 threads running, grid level locking

gridapp 10 100 -row

same as above, row level locking

gridapp 10 100 -none

same as above, NO LOCKING *** this is for testing, and data *should* be corrupted at the end.

Program output:

Print initial grid

Print sum of all cells in initial grid

Execute life cycle of all the threads

Print final grid

Print sum of final grid

Print elapsed time in seconds

The initial and final sums should match if there is no data corruption (i.e. the synchronization code is correct).

Here are the steps for your assignment:

First examine the the code for example_thread.c
Note how threads are created and mutex and condition variables are used for synchronization.
Compile and execute this code, examine results:
gcc -o example_thread example_thread.c -lpthread
./example_thread
Next, look at the skeleton code we have provided for gridapp.c
Note that grid initialization, printing, and summing functions are provided.
We have also provided the thread starting code and the thread's function body.
Note the "sleep(1)" hidden in the middle of the swap operation. You will investigate this later in the assignment.
We have provided everything you need except for the locking data structures and locking code. (Don't forget threads_left! )
You should be able to fill in the necessary code after examining example_thread.c (You are not required to use condition variables.)
Fill this code in. NOTE THAT YOU WILL NEED A DEADLOCK PREVENTION SCHEME. You can add code elsewhere if necessary like global variable declarations, etc.
Compile gridapp.c:
gcc -o gridapp gridapp.c -lpthread
First verify that the concurrency is working correctly:
Run ./gridapp 10 10 -cell
./gridapp 10 10 -row
./gridapp 10 10 -grid

All of these should result in the same initial and final sum.
There should also be significant differences in time elapsed. Note these results.
Now, try running without locking
Run ./gridapp 10 10 -none
The initial and final sum should be different.
Also note how the elapsed time compares to the other cases.
Experiment with different grid sizes, thread counts, and granularities of locking.
a.) At grid size 10, graph the time elapsed on the y axis and thread count from 1 to 10 on the x axis. Make this graph for all four types of granularity (-none, -cell, -row, -grid). (Optional: What is the maximum number of threads you can create with pthread_create failing?)
b.) Now, leave the thread count at 10, graph the time elapsed on the y axis and the grid size from 2 to 10 on the x axis. Make this graph for all four types of granularity.
Now, you could run all 76 tests one at a time, and copy and paste the results into MS Excel spreadsheets to graph the results.
Or:
For EXTRA CREDIT:
Write scripts and/or small C programs to automate this process of generating data into CSV files that are easily imported and graphed by Excel or another spreadsheet program.
For MORE EXTRA CREDIT:
When running performance tests, you usually want to execute multiple runs and take the average so as to get more accurate performance figures. So, extend your scripts to perform multiple executions (say 10) of each data point and calculate means and variances of the data, outputting these into the CSV file instead of the raw data.
What can you conclude about how grid size, number of threads, and locking granularity affect performance?
What general results can you conclude from this exercise? When does fine granularity locking have the biggest effect? What costs does it impose?
Note the "sleep(1)" command in the swapping code.
Why do you think it is there? Try commenting it out or moving it to a different location in the block of swapping code. Try moving it to just before the first line of the swapping code but after you have obtained your lock.
What effects does this have on performance and correctness of final sum? Try experimenting with all levels of granularity on a 10x10 grid with 10 threads.
Anything surprising?
What can you say about why this sleep(1) is there and where it is placed?
If we didn't have this sleep(1) there, how would this change the assignment? What does this say about when multithreading and synchronization make sense

What to submit:

A single gzipped tar file containing one directory with the following files:

README
README file should contain names/netIds and a guide to the rest of the files. In particular say very clearly the format of your writeup. Also describe any optional features implemented.

Makefile and (Visual Studio project files if applicable) for compiling

gridapp.c

Writeup and Data
You may submit your Writeup and data as a Micrsoft Word document with analysis and graphs and an Excel file with your data. You could also submit your writeup and data as text files and your graphs as gifs. Basically, you have quite a bit of flexibility, but your README should clearly direct the graders to the relevant files and their purposes. Also any format you use must be easily displayable in the CSUG lab with software standardly installed there.

Discuss your locking and deadlock prevention strategies (I recommend breaking one of the 4 necessay conditions for deadlock not implementing the banker's algorithm!)
8 graphs or maybe 2 graphs with 4 lines each ( I highly recommend drawing what you expect the graphs to look like before you ever generate them. This gives you a chance to sanity check your expectation with reality rather than just jumping right into "explaining" what you see.) Graphs should clearly label axes and quantities being graphed!
Analysis
1. Effect of grid size
2. Effect of # threads
3. Effect of granularity of locking
4. Effect of sleep(1)

Extra Credit -- source files for scripts/C programs that generate automated CSV data

You may do this assignment individually or in groups of two. If you work alone, you should use the pthread interface. If you work in a team, you should write a single program that can be compile to run either on Linux or on Windows. In either case, you should program in C. In Windows, you should use the Win32 API not use pthreads :-) (Less portable I know but the point is to give you a different experience with Windows primitives like CreateThread, WaitForSingleObject, WaitForMultipleObject, etc.)