CS6640 Computational Photography

Cornell University

Tuesday & Thursday, 10:10am, Hollister 401

From Courses of Study: A course on the emerging applications of computation in photography. Likely topics include digital photography, unconventional cameras and optics, light field cameras, image processing for photography, techniques for combining multiple images, advanced image editing algorithms, and projector-camera systems. Course work includes implementing several algorithms and a final project.

Schedule

This is a new course, so all future dates and topics are highly tentative and certain to change.

date		topic	assignments
23	Aug	introduction slides slides
Digital Photography
28	Aug	camera basics slides	HW1 out
30	Aug	color science slides
4	Sep	image sensors slides	HW1 due, HW2 out
6	Sep	camera color processing
11	Sep	camera photofinishing pipeline slides
13	Sep	gaussian optics slides	HW2 due
18	Sep	gaussian optics	HW3 out, PR1 out
20	Sep	practical camera optics slides
25	Sep	Ivo Boyadzhiev: Color2Gray David Thomason: Homogeneity-directed Demosaicking	HW3 due
27	Sep	Shuang Zhao: Radiometric Self Calibration Long Wei: Tone Reproduction for Realistic Images
Cameras with modified optics
2	Oct	light field photography slides	PR1 due
4	Oct	multidimensional signal processing slides
9	Oct	—Fall Break—
11	Oct	multidimensional signal processing slides	HW4 out
16	Oct	light field manipulations slides
18	Oct	Scott Wehrwein: Fourier Slice Photography Pramook Khungurn: Coded Aperture Photography	HW4 due, PR2 out
23	Oct	Youyou Yang: Multi-Aperture Photography Daniel Schroeder: Dual Photography
25	Oct	Josh Vekhter: General Linear Cameras Steve Marschner: Dappled Photography
Tools for editing images
30	Oct	gradient domain methods slides
1	Nov	graph cuts and Markov random fields slides
6	Nov	bilateral filter and HDR tone mapping slides	PR2 due
8	Nov	applications Won Jun Jang: Seam Carving
Fixing photographers’ problems
13	Nov	matting and compositing slides
15	Nov	camera shake removal slides
20	Nov	video stabilization slides	final proposal due 11/16
22	Nov	—Thanksgiving break—
27	Nov	Mevlana Gemici: Content Preserving Warps and Subspace Video Stabilization Sean Bell: GrabCut	final milestone due 11/21
29	Nov	Gokhan Arikan: Single Image Deblurring Zhongyang Fu: Flash and No-flash
Final projects
13	Dec	Final project demos

Homeworks

There will be short written homeworks, roughly weekly, to give you practice with the material we cover in lecture. Generally the results of your homework will be reported by posting to the CS6640 wiki, and for most homeworks, we will discuss the results in class on the due date.

Homework 1 Camera response nonlinearity

Do this project in groups of two. You can use Piazza to help pair yourselves up.

Cameras produce pixel values that are related to the intensity of light in the scene they are looking at. You can learn about the function that maps scene intensity to pixel value by taking an exposure series— a series of images looking at a static scene where the exposure varies. Since the amount of light getting to the sensor is proportional to the exposure, this gives us (up to some normalization factor) information about the pixel values that will be caused by different amounts of light.

To do this homework you need a camera that lets you control the exposure. Depending on software this might even be your cell phone, but most often it is probably a standalone camera of one sort or another. You can always use one of the course's cameras for this if you like.

The first part of the assignment is to take an exposure series. Choose a nice scene—we will be using these as a shared pool of input data for the next assignment. Still lifes, landscapes, architectural subjects, … anything that stays put (and that can be counted on not to change brightness while you are photographing; watch out for partly-cloudy days where the illumination may change rapidly). Take about 20 or so images, with shutter speeds ranging from way too dark (as in, the whole image looks pretty much black) to way too bright (most of the image completely saturated white). Convert them to grayscale, and resize them to at most 2000 pixels (that is, shrink if needed so that the maximum of width and height is 2000). Put a .zip or .tgz archive of your images, encoded as 8-bit PNGs, labeled with a 600-pixel thumbnail of a reasonable exposure, at the appropriate place in the course's project wiki.
The second part is to estimate the response curve. Choose a small patch of image that has near-constant intensity, and compute the average value of that patch in all the exposures. Make a log-log plot of pixel value vs. exposure. Does it look smooth? If not, be sure your camera settings are staying constant, and that you have the exposure times right. Fit something to the points to define a smooth curve (a good option might be a low-order polynomial, but use whatever you see fit) and make a clear and well labeled plot showing the data points and the curve. Post the plot to the wiki, next to your images, along with a precise specification of your fitted response function, and explain the relevant details about the capture (i.e. what kind of camera, and what relevant settings you used).
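As a rough illustration of the fitting step, here is a minimal sketch that fits a straight line in log-log space, which is the degree-1 case of the low-order polynomial suggested above. The exposure times and patch averages are synthetic, and the function name `fit_power_law` is made up for this example; a real camera response will usually need a higher-order fit, and black or saturated points should be left out before fitting.

```python
import math

def fit_power_law(exposures, pixel_values):
    """Least-squares line fit in log-log space: log(v) = slope * log(t) + intercept."""
    xs = [math.log(t) for t in exposures]
    ys = [math.log(v) for v in pixel_values]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Synthetic patch averages (pixel values scaled to 0..1) that follow
# v = 0.5 * t**0.45 exactly; real measurements will scatter around the fit.
exposures = [1 / 500, 1 / 250, 1 / 125, 1 / 60, 1 / 30, 1 / 15]
patch_means = [0.5 * t ** 0.45 for t in exposures]

slope, intercept = fit_power_law(exposures, patch_means)
print(f"fitted response: v = {math.exp(intercept):.3f} * t^{slope:.3f}")
```

The fitted slope plays the role of the exponent of the response curve, and `exp(intercept)` is the scale factor; both are the kind of "precise specification" that can be posted next to the plot.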

Homework 2 High dynamic range imaging

Do this project in groups of two. You can use Piazza to help pair yourselves up.

For this homework, we'll use the data from the previous homework to build high dynamic range images—images that contain an unusually wide range of intensities. The idea is to treat the pixels in an exposure series as a large set of observations of an underlying image; for a given pixel we have several measurements, each contaminated by noise and possibly saturation, and we can use all this information to derive a high-quality estimate of the true intensity at that pixel.

The input to this process is an exposure series: a set of low dynamic range (LDR) images. The term “dynamic range,” which can be applied to any system that handles some kind of signal (microphones and audio amplifiers as well as cameras and displays), simply means the ratio between the highest signal that can be measured and the lowest signal that can be meaningfully distinguished from zero. In the LDR image that comes from a single exposure in the camera, the dynamic range is not too large; maybe 100:1 or 1000:1. (Fundamentally this ratio has to do with the ratio of full well capacity to dark current and readout noise in the sensor of the camera.) LDR images are normally encoded using 8- to 14-bit fixed point numbers that represent numbers in the range 0 to 1. We want to extend the dynamic range by using many different exposures: the bright parts of the scene are measured well by the shortest exposures, and the deepest shadows are measured well by the longest exposures.

The output we are computing is a single high dynamic range (HDR) image, which has a larger dynamic range. For instance, the “piano” scene has its lowest intensities around 0.07 in a 15-second exposure and its highest intensities around 0.95 in a 1/4000 second exposure. If we normalize these to a one-second exposure, these intensities (if we could measure them) are in the ratio (4000*0.95) : (0.07/15), or about 800,000:1. HDR images are normally encoded using some kind of floating-point numbers, and their values take on a range that is in principle unbounded.

Once we have built an HDR image, displaying it on the screen requires extracting an LDR image somehow, since the screen again has a fixed and limited dynamic range. (For a screen the dynamic range is called “contrast ratio,” and 1000:1 would be a high-quality display). This process is called tone mapping, and the simplest tone mapping model is to use the log-linear equation: \[a = k I^\gamma\] where \(I\) is the scene intensity (the number stored in the HDR image), \(a\) is the pixel value (between 0 and 1), \(k\) is the display exposure, so called because its effect on the displayed image is much like the effect of camera exposure settings on a captured image, and the exponent \(\gamma\) determines the contrast of the image. (Some people would call this linear tone mapping, but it's not linear, it's a power law, so I like to call it log-linear.) Because of display nonlinearities that are codified in color space standards, values of \(\gamma\) around \(1/2\) produce contrast ratios on the display that are similar to the contrast ratios in the original scene. However, you are free to adjust \(\gamma\) to a larger value to get snazzy high-contrast images or to a lower value to fit a wider range of intensities onto the screen.
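The power-law mapping \(a = k I^\gamma\) is short enough to write down directly. The sketch below is only an illustration: the function name `tone_map`, the clamping to [0, 1], and the sample intensities are assumptions made for this example, not part of the assignment.

```python
def tone_map(intensity, k=1.0, gamma=0.5):
    """Log-linear tone mapping a = k * I**gamma, clamped to the displayable range [0, 1]."""
    intensity = max(intensity, 0.0)      # guard against slightly negative HDR values
    a = k * intensity ** gamma
    return min(max(a, 0.0), 1.0)

# A made-up row of HDR intensities spanning a wide range.
hdr_row = [0.0, 0.01, 0.25, 1.0, 4.0]
ldr_row = [tone_map(i, k=0.5, gamma=0.5) for i in hdr_row]
print(ldr_row)
```

Raising `k` brightens the whole image, while changing `gamma` stretches or compresses the spacing between dark and bright values.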

The first step is to transform the input images into linear measurements. To do this, you will have to invert the response function that you fit in the previous assignment. So that others don't have to worry about the analytic form of your particular function, generate a 256-entry table that contains the real-world intensity corresponding to each of the pixel values from 0 to 255 in your exposure sequences.
You can check the results by comparing two linearized images: select, say, 10,000 pixels and make a scatter plot of the linearized intensity in exposures \(t_1\) against the linearized intensity in exposure \(t_2\) multiplied by \((t_1/t_2)\) (where this ratio is maybe between 2 and 5). Since these are two measurements of the same number, the result should fit a line of slope 1, with no discernible systematic structure. From this experiment you can determine what pixel values should be considered to be saturated.
Post the result on the wiki with your response function, as a text file with 256 decimal numbers separated by whitespace. Indicate pixel values that are saturated by the special value −1. This part is due on Tuesday 11 September.
The second step is a brief paper-and-pencil exercise. Let the exposure times in the exposure series be \(t_1 \ldots t_n\), and call the linearized intensities \(y_1 \ldots y_n\). Provided the measurements are accurate, \(y_k/t_k\) is an estimate of the true pixel value, with some amount of noise. Assuming an additive noise model that looks like this: \[y_k = t_k y + n_g + n_p\]
where the true intensity \(y\) is scaled up by the exposure time and then contaminated by both gaussian read noise \(n_g\), which has signal-independent variance \(\sigma_g^2\), and gaussian shot noise \(n_p\), which has signal-dependent variance \(\sigma_p^2 t_k y\) (approximating the true Poisson noise; see lecture), what is the right formula to compute a maximum likelihood estimate of the true value from the observations? (See this page on weighted averaging.) Your answer will depend on the relative sizes of the read noise and the shot noise (that is, on \(\sigma_g\) and \(\sigma_p\)), and will have a surprisingly simple form when either \(\sigma_p=0\) or \(\sigma_g=0\). One question to consider is how to get a reasonable value of \(y\) to use in modeling the shot noise; feel free to make approximations that seem reasonable to you, and we'll discuss the various choices after the homework is handed in. If you find yourself facing equations that don't appear to have a simple analytic solution, you might consider approximating the value of \(y\). Hand this part in as a PDF in CMS.
The third step is to actually make HDR images. Write code that uses the formula you derived to make HDR images. Compute HDR images for all the class's exposure sequences, and write the results to files in the OpenEXR format. Compute the ratio of the 99th to the 1st percentile of pixel values in each image to get an idea of the dynamic range.
The last step is to apply simple tone mapping for display. Using a log-linear tone mapping curve, map each HDR image into an LDR image; set contrast appropriately for each example so that you can see detail in both dark and bright regions. You may want to experiment with introducing a shoulder in the response curve; we'll come back to that in the next assignment. Tone-map all your HDR images and post the tonemapped images to the wiki as high quality JPEGs, with information about the dynamic range of each image and the exponent you used to tone-map it.
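The table-building and dynamic-range measurements in the steps above can be sketched as follows. This assumes, purely for illustration, a fitted response of the power-law form \(v/255 = c E^g\); the constants, the saturation threshold of 250, and the function names are hypothetical, and the percentile lookup is a simple index into the sorted values rather than any particular library's definition.

```python
SATURATED = -1.0  # special marker for saturated pixel values, as in the assignment text

def build_inverse_table(c, g, saturation_level=250):
    """256-entry table: 8-bit pixel value -> relative scene intensity.

    Assumes a hypothetical fitted response v / 255 = c * E**g.
    """
    table = []
    for v in range(256):
        if v >= saturation_level:
            table.append(SATURATED)
        else:
            table.append(((v / 255.0) / c) ** (1.0 / g))
    return table

def dynamic_range(values):
    """Ratio of the 99th to the 1st percentile of a list of positive intensities."""
    ordered = sorted(values)
    n = len(ordered)
    low = ordered[int(round(0.01 * (n - 1)))]
    high = ordered[int(round(0.99 * (n - 1)))]
    return high / low  # undefined if the 1st percentile is zero

table = build_inverse_table(c=0.9, g=0.45)

# Whitespace-separated text, the format requested for the wiki post.
table_text = " ".join(f"{x:.6g}" for x in table)

# Made-up HDR pixel values just to exercise the percentile ratio.
ratio = dynamic_range([float(i) for i in range(1, 102)])
print(len(table), ratio)
```

Linearizing an image is then a lookup of each pixel value in `table`, skipping entries marked with the saturation value when the exposures are combined.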

Homework 3 Gaussian optics

Do this project alone or in groups of two, as you prefer.

In this homework you will work with actual lens construction data for two camera lenses. The first lens is the Leica SUMMILUX-M 35mm f:1.4 (a normal-to-wide lens for 35mm rangefinder cameras), described in this patent, this data file, and this marketing material (the patent differs slightly from the current product, with the aspherical surfaces on different elements). The second lens is the Nikon MF-Nikkor 24mm f:2.8 (a wide-angle lens for SLR cameras), described in this patent and this data file; more interesting information about this lens (and many others) can be found here. As are all wide-angle lenses for SLRs, this second lens is a retrofocus design, meaning its rear principal point is well outside the lens, to allow clearance for the mirror box between the lens and the sensor.

For each lens, compute the 2x2 matrix describing its action on rays from a reference plane at the front vertex of the lens to a reference plane at the back vertex.
For each lens, compute the effective focal length, the back and front focal lengths, and locate the principal planes relative to the back focal plane.
With each lens focused at infinity, what is the physical distance from the rear element of the lens to the sensor plane? How about when it is focused at one meter from the sensor plane?
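For the matrix questions above, a minimal sketch of the ray-transfer calculation may help. It uses one common convention, which may differ from the one used in lecture: rays are (height, paraxial angle) column vectors, light travels left to right, and a surface radius is positive when its center of curvature lies to the right. The single biconvex element below is a made-up test lens, not either of the assigned lenses.

```python
def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def translate(d):
    """Free propagation over distance d inside one medium."""
    return [[1.0, d], [0.0, 1.0]]

def refract(n1, n2, radius):
    """Paraxial refraction at a spherical surface going from index n1 to n2."""
    return [[1.0, 0.0], [-(n2 - n1) / (n2 * radius), n1 / n2]]

def system_matrix(elements):
    """Compose elements listed in the order the light meets them."""
    m = [[1.0, 0.0], [0.0, 1.0]]
    for e in elements:
        m = matmul(e, m)  # later elements multiply on the left
    return m

# Hypothetical biconvex element: R1 = 100, R2 = -100, thickness 10, n = 1.5 (units: mm).
lens = [refract(1.0, 1.5, 100.0), translate(10.0), refract(1.5, 1.0, -100.0)]
(A, B), (C, D) = system_matrix(lens)

efl = -1.0 / C   # effective focal length (lens in air)
bfl = -A / C     # back vertex to rear focal point
ffl = -D / C     # front focal point to front vertex
print(f"EFL = {efl:.3f}, BFL = {bfl:.3f}, FFL = {ffl:.3f}")
```

The rear principal plane sits at `bfl - efl` from the back vertex (negative means inside the lens), and the same composition extends to a full prescription by listing one `refract` per surface with a `translate` between each pair.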

For the next few parts, consider a scene with an array of 5 bright points all at the same distance from the camera. If I draw a picture of the set of rays that are lit up, using the \((q,p)\) coordinates referenced to the plane of the points, I get something like the image at right. It shows with arrows that each bright point lights up all directions for a given position.

Draw a similar illustration showing what set of rays is lit up at the first principal plane, the second principal plane (assuming the aperture is in the principal planes and simply removes all rays with \(p\) larger than a given value), at the image plane when the camera is focused on the points, at the image plane when the camera is focused a bit closer than the points, and at the image plane when the camera is focused a bit behind the points.
Repeat the previous question, but this time with the points positioned in a line that is not perpendicular to the optical axis. Draw your illustrations for reference planes at the distance of the center point, at the second principal plane, and at the image plane with the camera focused on the closest point, then the middle point, then the farthest point.

These illustrations do not need to be to scale for any particular lens; just be sure things slope the right way and are vertical or horizontal at the right places.

Due by 11:59pm on 25 September.

Homework 4 Sampling theory

Do this project alone or in groups of two, as you prefer.

This homework is about resampling an image that has been blurred by various processes, and the interpretation of what is going on in the frequency domain.

You'll need mathematical software capable of doing discrete Fourier transforms (probably using the FFT) and simple image manipulations. I used Matlab but there are many other environments or libraries you could use.

Suppose we have a camera in which the combination of lens, antialiasing filter, and pixel area produces a band-limited image at the sensor plane, where there is a square grid of pixels (assume grayscale for this homework). If we take a photo with this camera, the image is well sampled and aliasing is not introduced. If we subsample this image by a factor of 4 in \(x\) and \(y\) (just by dropping pixels, without any attention to antialiasing) and then reconstruct back to the full resolution, we will find we've introduced aliasing. Sketch out what is going on in the frequency domain when this happens.

Back up your sketch with an experiment: using this image as your test data, subsample by a factor of 4 in each dimension and look at the effect on the discrete Fourier transform of the image. (To keep the units consistent between your full-res and subsampled images, subsample by zeroing out the pixels you are dropping, rather than moving the subsampled pixels to a smaller array, and multiply the remaining values by 16 to keep the overall energy the same. This is a discrete approximation of multiplying by an impulse grid.) Then reconstruct by convolving with this reconstruction filter, which is a reconstruction filter with reasonable properties sampled at 4 points per input sample. (Construct your 2D filter as a separable product).
Now suppose the camera was out of focus when the image was taken. It has a circular aperture (and let's further assume we are photographing a planar object) so that a single point would be imaged as a constant disk of radius 4 pixels. Now what happens when we subsample in the same way as in the previous question? Explain in words and math how the signal and its Fourier transform change from the in-focus case, and make an updated frequency domain sketch of what happens during subsampling and reconstruction. Also extend the computational experiment to confirm the sketch, using this data to save yourself the trouble of rasterizing a disk.
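The "zero out the dropped pixels and rescale" trick from the first experiment can be checked in one dimension, where the scale factor is 4 rather than 16. The sketch below uses a plain \(O(N^2)\) DFT and a made-up test signal; it confirms that multiplying by the impulse grid replaces the spectrum by the sum of four copies of itself shifted by multiples of \(N/4\). The function names are invented for this example.

```python
import cmath
import math

def dft(signal):
    """Direct discrete Fourier transform (fine for short signals)."""
    n = len(signal)
    return [sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def subsample_in_place(signal, factor):
    """Keep every factor-th sample, zero the rest, and scale the kept samples by factor."""
    return [factor * x if t % factor == 0 else 0.0 for t, x in enumerate(signal)]

n, factor = 32, 4
# Two cosines at 3 and 5 cycles per signal length; 5 is above the new Nyquist limit of 4.
signal = [math.cos(2 * math.pi * 3 * t / n) + 0.5 * math.cos(2 * math.pi * 5 * t / n)
          for t in range(n)]

spectrum = dft(signal)
sub_spectrum = dft(subsample_in_place(signal, factor))

# Prediction: the subsampled spectrum is the sum of shifted copies of the original.
replicated = [sum(spectrum[(k - m * n // factor) % n] for m in range(factor))
              for k in range(n)]
max_error = max(abs(a - b) for a, b in zip(sub_spectrum, replicated))
print(max_error)
```

In this test signal the 5-cycle component lands on top of the 3-cycle component after replication, which is the aliasing the frequency-domain sketch should show; the 2D case works the same way along each axis.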

You should be able to reconstruct an image barely distinguishable from the blurred image, without introducing artifacts.

Extra credit: Can you manage the same quality with a bit fewer samples by using a hexagonal grid?

Now suppose the image is blurry for a different reason. It is in focus, but it was taken out the window of a moving train (we'll continue to assume the scene is planar—maybe this is a billboard at the train station?), and the image shifted by 8 pixels during the exposure. Again explain and sketch the effect on the Fourier transform of the image, and confirm with an experiment (here is a centered line). You can safely sample this image using one fourth the samples—where should we place them? Sketch and confirm by subsampling and reconstructing and showing the effects in the frequency domain.
Now (I know, it's getting to be a stretch) further suppose the motion-blurred image was taken with the camera tilted 28 degrees from the horizontal, still moving at the same speed. This produces a diagonally oriented blur. I claim you can still represent this image with one-fourth the samples, on an axis-aligned grid. You should find that simply repeating the same process we have been using will result in noticeable aliasing artifacts. What can we change to enable successful reconstruction, without tilting the sampling grid to align with the motion direction?

As usual, back up your claims with frequency domain sketches and an experiment. Here is a tilted line for you.

Projects

There will be 3 or 4 projects during the first part of the semester; the tentative topic plans are:

Digital camera photofinishing pipeline
Light field camera image processing
Gradient-domain image editing
Video stabilization

Each project will lead into a brief peer evaluation and a discussion of the results achieved by the class and what makes the best images good.

You will also design your own final project that you'll present at the end of the course.

For all the projects, we have a small stable of cameras available for loan, including some DSLRs with various lenses, and light field cameras. You can also use your own equipment.

For all projects please include a brief writeup on the wiki page you create for the project that discusses:

Which parts of the project were the most challenging.
Interesting design decisions that you made (simplicity/efficiency tradeoffs; selection of implementation languages, libraries, etc; choice of algorithms, …)
Noteworthy issues that you ran into while designing, implementing, or testing your submission.
Known problems with your implementation, such as missing functionality or bugs.
How you verified the correctness of your implementation.

I also welcome any comments or criticism to improve the projects in future years, which might be better delivered by email.

Project 1 Digital camera raw conversion

Do this project in groups of two. You can use Piazza to help pair yourselves up.

For this project, you'll build a complete basic pipeline for processing raw camera images. It will include color balancing, demosaicking, color correction, and tone mapping, using simple methods for all four stages.

Part 1 of the project is to supply test images. Each group should take two images with a camera of your choice, in raw mode (if you don't have a camera that can provide raw images, you can borrow one of the class's EOS cameras). Make images that look nice and also explore some of the following extremes:

high key (e.g. white dishes on a white tablecloth)
low key (e.g. black cat in a coal mine)
high contrast (e.g. people in harsh sunlight, or buildings against bright sky)
low contrast (e.g. scenes in misty morning weather, with no very dark or very bright areas)
fine-scale natural detail (e.g. tree branches against sky with features just a few pixels across)
fine-scale repeating structures (e.g. venetian blinds, high-contrast woven cloth, black-on-white text, with repeating structures just a few pixels across)
highly saturated color (e.g. crayons or paints)
non-gray average color (e.g. person on colored background)
noisy image (taken at high ISO)
overexposed
underexposed

For each test image, also include a known-neutral object: a gray card if you have one, or a piece of white paper if you don't.

Next, bring your camera to class and photograph a ColorChecker color chart in raw mode, and include this image with your test images. Also measure the raw RGB values corresponding to the 24 squares of the chart and put them into a space-delimited text file with 3 numbers per line (in English reading order ending with black).

Convert your raw images to the DNG (“digital negative”) format. Your camera may already store raw files in this format, in which case you are done; most do not, but you can use the free Adobe DNG Converter to convert them. When you do this, navigate through the UI to Preferences, Compatibility, Custom…, and select “Uncompressed.” This will make it easier for people to read your images.

Finally, post the DNGs to the course wiki (using Confluence to store large files), along with the ColorChecker measurements. Part 1 is due by 27 September.

Part 2 of the project is to put together the basic pipeline. If you work in Matlab, see this post about how to read DNGs (which are just special TIFF files); if you implement your own reader see the DNG specification, and here is a copy of the TIFF/EP specification (readable to class only) for your reference.

Linearization. Read the raw data from the DNG file, apply the linearization curve stored in the file, if there is one, and subtract the black level, if there is one. (In the absence of a linearization curve, the data should be linear, and in the absence of black level information, the data should already be corrected to a black level of zero.)
Color balancing. Color balance the image using the white or gray reference in the image.
Demosaicking. Demosaic the image using the “edge based” method presented in class. Yes, it will produce some artifacts.
Color correction. Derive a color matrix for each camera based on the ColorChecker RGB values. Color balance them using the gray patches, and fit a 3x3 transformation that approximately maps them to the known XYZ values subject to the constraint that [1,1,1] maps to [1,1,1]. It might be interesting to compare your result to the matrices stored in the DNG file.
Tone mapping. Map the final image from the linear space in which you computed it to a tonemapped image in the sRGB color space. Use the Reinhard tone mapping function: \[L_d = L \, \frac{1 + L/L_\text{white}^2}{1 + L}, \qquad L = \frac{a}{\bar{L_w}} L_w\] where \(L\) is normalized luminance, \(L_w\) is scene luminance (that is, the linear pixel value), \(\bar{L_w}\) is the log-average scene luminance, and \(L_d\) is display luminance (what you store in the output file). There are two parameters to this curve: the key \(a\) of the image (lower than 0.18 for low-key images, higher for high-key images) and the white level \(L_\text{white}\), which is the normalized luminance that will map to maximum display luminance. Be sure to correctly encode into the sRGB color space.
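The tone mapping stage above can be sketched as three small functions. This is an illustration under stated assumptions rather than a reference implementation: the small offset `eps` inside the log-average is added here only to avoid taking the log of zero, the default parameter values are arbitrary, and the function names are invented. The sRGB encoding uses the standard piecewise transfer function.

```python
import math

def log_average(luminances, eps=1e-6):
    """Log-average luminance; eps (an assumption of this sketch) avoids log(0)."""
    return math.exp(sum(math.log(eps + l) for l in luminances) / len(luminances))

def reinhard(l_w, l_w_bar, key=0.18, l_white=4.0):
    """Display luminance L_d from scene luminance L_w, following the formula above."""
    l = key / l_w_bar * l_w
    return l * (1.0 + l / l_white ** 2) / (1.0 + l)

def srgb_encode(c):
    """Encode a linear value in [0, 1] with the sRGB transfer function."""
    c = min(max(c, 0.0), 1.0)
    if c <= 0.0031308:
        return 12.92 * c
    return 1.055 * c ** (1.0 / 2.4) - 0.055

# Made-up scene luminances for a quick check.
scene = [0.02, 0.1, 0.5, 2.0, 8.0]
l_bar = log_average(scene)
display = [srgb_encode(reinhard(l, l_bar, key=0.18, l_white=4.0)) for l in scene]
print([round(d, 3) for d in display])
```

For a color image, one option is to compute \(L_d\) from luminance and scale the linear RGB channels by \(L_d / L_w\) before encoding; that choice is left open by the text above.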

To demonstrate your success in Part 2, process all the class's images and post 600-pixel thumbnails, with links to full-res images, to the wiki. (It's OK to skip a few of the images if the files have some kind of characteristics that prevent your processing them, but do include any just plain bad results, as these are at least as interesting as the good results.) Choose the tone mapping parameters according to your judgment for each image.

Part 3 of the project is optional: extend one of the stages a bit. Some examples:

Implement a smarter demosaicking algorithm.
Implement an automatic white-balance algorithm.
Create some knobs for tone mapping, color processing, and sharpening giving the user the ability to push for more vivid, higher contrast results.
Implement some fun Instagram-like post-processing filters.

If you do extend your converter, post examples next to the standard-pipeline results from Part 2.

Project 2 Light field photography

Do this project in groups of two. You can use Piazza to help pair yourselves up.

In this project you will work with data from the Lytro light field camera. This camera is a microlens array based light field camera, with the microlenses focused on the main lens aperture (“Plenoptic 1.0” architecture). Your mission is to write software that can take the raw sensor data from this camera and process it to create images focused at different distances and with cameras located in different positions from the actual camera.

The following steps will help you develop a piece of software that does this. I found it easier to work in C++ than Matlab for this project, because the resampling calculations are easier to express with code in the innermost loop than as matrix operations. I used the OpenCV library to read and display images, but again, many options are equally good for this.

To get at the data, you can use the lfpsplitter tool on the .lfp files you will find sitting around after downloading images using Lytro's software. (On the Mac they end up in ~/Pictures/Lytro.lytrolib/; I'm sure you can find them in the Windows version too.) This will produce assorted metadata along with a raw dump of 16-bit pixels which you can demosaic using your code from Project 1 or any other tool you like (I used the demosaic function in the Matlab image processing toolbox).

For this assignment I will use the convention that \(u\) and \(v\) parameterize the plane nearer to the camera and \(s\) and \(t\) parameterize the plane nearer to the scene.

The Lytro camera has a hexagonal array of microlenses, which doesn't introduce any fundamental difficulties but does make the resampling code a bit more complex. One way to deal with a hex grid is to think of the centers of the microlens images as a 2d lattice spanned by two basis vectors that are approximately \([1,0]^T\) and \([1/2,\sqrt{3}/2]^T\) times the spacing between lenses. With the origin at one of the microlens centers, the centers of the others are all the integer linear combinations of these vectors. Once you have this lattice basis you are a linear transformation away from figuring out which three points are the corners of the triangle surrounding any given point on the \((s,t)\) plane.
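Here is one way the lattice lookup described above might look in code. It is a sketch under assumptions: the basis vectors are roughly 60 degrees apart as in the text (so each lattice cell is split along its shorter diagonal), the lens spacing of 10 pixels is made up, and the function name `enclosing_triangle` is invented for this example.

```python
import math

def enclosing_triangle(point, b1, b2):
    """Return lattice indices of the three microlens centers around `point`,
    together with barycentric weights for interpolating between them."""
    # Solve a * b1 + b * b2 = point by Cramer's rule.
    det = b1[0] * b2[1] - b2[0] * b1[1]
    a = (point[0] * b2[1] - b2[0] * point[1]) / det
    b = (b1[0] * point[1] - point[0] * b1[1]) / det
    i, j = math.floor(a), math.floor(b)
    fa, fb = a - i, b - j
    if fa + fb <= 1.0:
        corners = [(i, j), (i + 1, j), (i, j + 1)]
        weights = [1.0 - fa - fb, fa, fb]
    else:
        corners = [(i + 1, j + 1), (i + 1, j), (i, j + 1)]
        weights = [fa + fb - 1.0, 1.0 - fb, 1.0 - fa]
    return corners, weights

spacing = 10.0  # hypothetical microlens pitch in pixels
b1 = (spacing, 0.0)
b2 = (spacing / 2.0, spacing * math.sqrt(3.0) / 2.0)

point = (12.0, 3.0)
corners, weights = enclosing_triangle(point, b1, b2)

# Sanity check: the weighted corner positions should reproduce the query point.
rebuilt = (sum(w * (ci * b1[0] + cj * b2[0]) for w, (ci, cj) in zip(weights, corners)),
           sum(w * (ci * b1[1] + cj * b2[1]) for w, (ci, cj) in zip(weights, corners)))
print(corners, weights, rebuilt)
```

With a measured basis for a particular camera, the same function gives both the three neighboring microlens images and the weights for blending them.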

Step 0. Borrow one of our Lytro cameras from Randy, take some pictures with it, and select one to submit as test data for the project. Think about creating images with significant depth range, macro images, or images illustrating unusual optical effects. Post your results on the course wiki as for the other assignments. Also determine an image-space basis for the microlens grid for the camera you used and post it with your image.

We have about 1 camera per 2.5 project groups, so please keep the camera for only one day and/or coordinate passing the cameras from group to group (but keep Randy apprised of who has them).

Hand in Step 0 by Thursday 25 October.

Step 1. Write a program that can extract slices of constant \((u,v)\) from the light field. Because the microlens spacing is not a multiple of the pixel spacing on the sensor, this involves a resampling operation, both in \((s,t)\)—corresponding to interpolating over the hex grid—and in \((u,v)\)—corresponding to interpolating between pixels in a microlens image.

To demonstrate your success at Step 1, make an image like Figure 3.5 in Ren Ng's thesis. For the Lytro camera it will of course show a different large-scale structure, because of the hex grid and because the microlens images are packed in considerably tighter in the commercial product than in his prototype. You might enjoy also making a simple UI that lets you explore these \((u,v)\) slices interactively, or even making stereograms using images extracted from different \(u\) coordinates.

Step 1a. Complete this step if you have a vision or graphics background. Modify your program so that, rather than shifting the camera with the image plane staying fixed, the camera orbits around the point at the center of the \((s,t)\) plane. This entails working out the homography (aka. projective map) between the rotated image plane and the \((s,t)\) plane and composing it with the mapping you are already doing. (It might be easier to do it as a second resampling step, which would also be fine.) Generate images at the extremes of the movement range with and without the reprojection.

Step 2. Extend your program to extract slices from the light field corresponding to different camera positions that are not on the \((u,v)\) plane. As we discussed in lecture, this operation can be done based on the same light field resampling code, but looking up different linear subspaces of the 4D light field. You can find the requisite formulas for locating the right subspace, in one form or another, in [Isaksen et al. 2000], Ren Ng's thesis, or in your notes from lecture.

Demonstrate your results by generating images at the extremes of forward and backward motion along the optical axis that can be done without really losing the corners of the image.

Step 3. Extend your program to integrate over a circular region in the \((u,v)\) plane (that is, the camera's aperture) to produce less-noisy images with shallow depth of field. As we discussed in lecture, this operation corresponds to integrating along a 2D subspace for each pixel. The subspaces can be calculated using the same equations, moving the \((s,t)\) plane rather than the \((u,v)\) plane. Include an adjustment that controls the size of the circular region, so that the camera can be virtually stopped down continuously from the full aperture all the way to a point sample (which corresponds to a single subaperture image as in step 1).
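To make the aperture integration concrete, here is a "flatland" sketch with one spatial coordinate \(s\) and one aperture coordinate \(u\), so the circular aperture becomes an interval. It is a toy under assumptions, not the formulas from lecture or the references: the single `shear` parameter stands in for the choice of focal plane, the light field is a synthetic point source, and the function names are invented.

```python
import math

def sample(row, x):
    """Linearly interpolate `row` at fractional position x; zero outside the row."""
    i = math.floor(x)
    f = x - i

    def at(k):
        return row[k] if 0 <= k < len(row) else 0.0

    return (1.0 - f) * at(i) + f * at(i + 1)

def refocus(light_field, us, shear, aperture_radius):
    """Average sheared subaperture images over the aperture |u| <= aperture_radius.

    light_field[n][s] is the subaperture image for aperture coordinate us[n].
    """
    chosen = [n for n, u in enumerate(us) if abs(u) <= aperture_radius]
    width = len(light_field[0])
    return [sum(sample(light_field[n], s + shear * us[n]) for n in chosen) / len(chosen)
            for s in range(width)]

# Synthetic scene: one bright point whose image moves 2 pixels per unit of u.
us = [-2, -1, 0, 1, 2]
light_field = [[1.0 if s == 10 + 2 * u else 0.0 for s in range(21)] for u in us]

focused = refocus(light_field, us, shear=2.0, aperture_radius=2)    # point is sharp
unfocused = refocus(light_field, us, shear=0.0, aperture_radius=2)  # point is spread out
pinhole = refocus(light_field, us, shear=0.0, aperture_radius=0)    # single subaperture image
print(focused[10], unfocused[10], pinhole[10])
```

Shrinking `aperture_radius` to zero leaves only the central subaperture image, which matches the point-sample limit described in the step above.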

Demonstrate your results by generating images focused at different distances and with different aperture sizes.

To hand in, write a complete report following the guidelines above, including all the images you generated. Use a variety of test images, and show results from more than one image whenever it is illustrative. The writeup will have a lot of images, and you may want to lay it out using smaller thumbnails inline that link to the full-res images. For consistency make the images 600 by 600 pixels.

Presentations

Each student will present and lead discussion of a research paper from the recent computational photography literature. These presentations will be scattered throughout the semester.

About CS6640

Questions, help, discussion: The instructor is available to answer questions, advise on projects, or just to discuss interesting topics related to the class at office hours (see my web page) and by informally scheduled meetings as needed. For electronic communication we are using Piazza (handy link also at the top of this page). Please sign yourself up for the class! Piazza has been a great medium for getting questions answered quickly.

Academic integrity: I assume the work you hand in is your own, and the results you hand in are generated by your program. You're welcome to read whatever you want to learn what you need to do the work, but I do expect you to build your own implementations of the methods we are studying. If you're ever in doubt, just include a citation in your code or report indicating where some idea came from, whether it be a classmate, a web site, another piece of software, or anything—this always maintains your honesty, whether the source was used in a good way or not.

Bibliography

A very under-construction list of the books, research papers, and other places where you can read more about the material we have been or will soon be covering.

Adams, Ansel. The Camera. Little, Brown, 1980.

Adams, Ansel. The Negative. Little, Brown, 1980.

Adams, Ansel. The Print. Little, Brown, 1980.

Burch, J.M. Matrix Methods in Optics. Dover, 1975. —a somewhat old-fashioned book but with a nice explanation of gaussian optics from a matrix standpoint.

Kingslake, Rudolf. Optics in Photography. SPIE Press, 1992. —a nice readable account of traditional lens design.

London, Stone, and Upton. Photography, 10e. Prentice Hall, 2011. —a good general text on photography, including how to use cameras and other tools effectively and how to make compelling images.

Ng, Ren. Digital Light Field Photography. PhD thesis, Stanford University. —a very well written and illustrated discussion of light field cameras and many of the implications of this technology.

Szeliski, Richard. Computer Vision: Algorithms and Applications. Springer, 2010. (also online) —a good general reference on computer vision methods, with particularly good coverage of topics relevant to photography.