Assignment 1

DUE 11/2/04, 23:59pm

This assignment is concerned with low-level vision. Note that we recommend you to do this assignment in groups of two. However, you can also do it by yourself if you prefer.

In this assignment you will implement two different energy minimization algorithms for stereo. We will give you images with known ground truth, which will allow you not only to compute the energy of your solution but also to compute its accuracy.

It is important to note that the assignment is somewhat under-specified, in that you have some freedom both in terms of the design of the energy function and in terms of the energy minimization algorithm. However, it is important that you design choices be documented, so that we can understand them, and that they be deterministic so that they are reproducible. Therefore, any functions that make use of a random number generator will need to take a seed value as an input, and for any output that you show you must provide the seed value that generated it.

The test images provided are in pgm format. Your output should follow these same standards. To simplify the grading process, your code must run under Windows2000. However, you will write straightforward C or C++ code, and can therefore do development on any platform that you wish. But the code you turn in must generate the results you show under Windows2000.

What to turn in: a) your source code for all problems b) Windows executables c) a write-up explaining how to run your program and an explanation of your design choices and results. You will submit these using CMS, the course management system.

A. Design of the Potts model Energy function

The energy function that you should use is

where f(p) is the disparity of pixel p, and D(p, f(p)) is the cost of assigning pixel p disparity f(p).

The first part of this assignment is to design a good energy function. In particular, you should make the following choices, and explain your design decisions:

· What is D? A typical choice is the truncated L2 distance, which is min[(I_L(p) – I_R(p+d))^2,K] for some constant K. However, the Birchfield-Tomasi sampling measure, described in lecture, can also be incorporated.

· What is the regularization parameter l? Usual values are around 10.

· What is the neighborhood system N? Typically the 4- or 8-connected neighborhood is used, but larger neighborhoods are also possible.

· What is the form of V? For the Potts model V is defined to be 1 if its arguments differ, otherwise 0. However, you will get better results if you make use of static clues, as described in lecture.

Finally, your method should detect occlusions somehow. You will need to add an additional label that represents a pixel being occluded, and your energy function will need to handle this. You will need to document how you do this, and how D and V behave if their arguments are occluded.

Datasets

Images from the University of Tsukuba:

Left Image- scene_l.pgm
Right Image- scene_r.pgm
Initial disparity map (from a local algorithm) - scene_corr.pgm

The disparity range for this pair is [-15, 0]. Note that only a few of these disparities are actually present in the image.

This pair also has ground truth - t.pgm (the image is actually shifted by (18, 18) pixels and centered).Note that on this image, occlusions are displayed as 255. You can also get a full-size version of the ground truth without occlusions, where the disparities are multiplied by 8, in truedisp.row3.col3.pgm (on this images, pixels where the ground truth is not known are labeled as 0)

B. Energy Minimization

The most important part of this assignment is to actually minimize the energy. You will do this using two algorithms, annealing and graph cuts. For each minimization algorithm, you must include in your write-up all your choices and why you made them. You should plot a graph of the energy as a function of time, and provide images showing the output both halfway through and at the end of the algorithm.

In addition you will need to plot the accuracy (computed using the ground truth). You should ignore pixels that are not given a label by the ground truth, and you should only count errors that are off by more than ±1. However, you should also make use of occlusions: a pixel that your algorithm says is occluded and which the ground truth says is not occluded counts as an error, as does the opposite case. Note that you will need to compute the set of occluded pixels from the ground truth.

B.1 Simulated annealing

Once you have designed your energy function, you should minimize it via simulated annealing. There are many aspects of Annealing and Metropolis which are up to you. For example, you must decide what schedule you are going to use to decrease the temperature. Also you must decide for how long the Metropolis algorithm will run before changing the temperature. You must also choose a initial temperature. Choose these parameters as you wish, but be sure to document your choices.

B.2. Graph cuts

You will also need to minimize the energy using graph cuts. Implement the expansion move algorithm as described in class. You are strongly recommended to make use of Vladimir Kolmogorov’s energy minimization software, http://www.cs.cornell.edu/People/vnk/software/energy-v1.0.src.tar.gz, which makes it simple to implement this algorithm. The only design choice is the order in which expansion moves are selected; be sure to document your choice.