CS4670/5670: Computer Vision, Fall 2013
Project 3: Autostitch

Brief

Assigned: Friday, October 18, 2013
Code Due: Sunday, November 3, 2013 (by 11:59pm)
Artifact Due: Monday, November 4, 2013 (by 11:59pm)
This assignment should be done in teams of 2 students.
For the Monday demos, please download this set of files: pano_demo_tests.zip.

Synopsis

In this project, you will implement a system to combine a series of photographs into a 360 degree panorama (see panorama above). You will first detect discriminating features in the images and find the best matching features in the other images, using your code from Project 2 (or SIFT). For this project, you will then automatically align the photographs (determine their overlap and relative positions) and then blend the resulting photos into a single seamless panorama. You will then be able to view the resulting panorama inside an interactive Web viewer. To start your project, you will be supplied with some test images and skeleton code you can use as the basis of your project and instructions on how to use the viewer.

The project will consist of a pipeline of command line EXE programs (Features.exe and Panorama.exe) that will operate on images or intermediate results to produce the final panorama output. You should already be familiar with Features.exe, so we focus on Panorama.exe here.

The steps required to create a panorama are listed below. You will implement two ways to stitch a panorama: using translations (which requires spherically warping the input images first) and using homographies (which align the input images directly). The steps in square brackets are only used with the spherical warping route:

	Step	EXE
1.	Take pictures on a tripod (or handheld)
2.	[Warp to spherical coordinates]	(Panorama.exe)
3.	Extract features	(Features.exe)
4.	Match features	(Features.exe)
5.	Align neighboring pairs using RANSAC	(Panorama.exe)
6.	Write out list of neighboring translations	(Panorama.exe)
7.	Correct for drift	(Panorama.exe)
8.	Read in [warped] images and blend them	(Panorama.exe)
9.	Crop the result and import into a viewer

Downloads

Skeleton code (via git):

>> git clone http://www.cs.cornell.edu/courses/cs4670/2013fa/projects/p3/skeleton.git

To fetch later updates to the skeleton, run the following inside the cloned directory:

>> git pull

For those who are already using git to work in groups, you can still share code with your partner by adding multiple remotes to your local repository (one being this original repository and the other some remote service like github where you host the code you are working on); here's a reference with more information.

Solution executables for: Windows, Linux, Mac
Test sets: yosemite (see stitch4.txt), test1 (includes a script for processing the images), test2

Getting Things to Run

Sample solution

Panorama.exe is a command line program that requires arguments to work properly. Thus you need to run it from the command line, or from a shortcut to the executable that has the arguments specified in the "Target" field of the shortcut properties. (Unlike Features.exe from last time, Panorama.exe has no GUI mode.)

From the command line

To run from the command line, click the Windows Start button and select "Run". Then enter "cmd" in the "Run" dialog and click "OK". A command window will pop up where you can type DOS commands. Use the DOS "cd" (change directory) command to navigate to the directory where Features.exe or Panorama.exe is located. Then type "Features" or "Panorama" followed by your arguments. If you do not supply any arguments, the program will print out information on what arguments it expects or open the UI in the case of Features.exe.

From a shortcut

Another way to pass arguments to a program is to create a shortcut to it. To create a shortcut, right-click on the executable and drag to the location where you wish to place the shortcut. A menu will pop up when you let go of the mouse button. From the menu, select "Create Shortcut Here". Now right-click on the shortcut you've created and select "Properties". In the properties dialog select the "Shortcut" tab and add your arguments after the text in the "Target" field. Your arguments must be outside of the quotation marks and separated with spaces.

Skeleton program

You can run the skeleton program from inside Visual Studio. However, you will need to tell Visual Studio what arguments to pass. Here's how:

Select the "ImageLib" project in the Solution Explorer (do NOT select the "Panorama" project, for some reason this won't work).
From the "Project" menu choose "Properties" to bring up the "Property Pages" dialog.
Select the "Debugging" Property page.
Enter your arguments in the "Command Arguments" field.
Click "Ok".
Now when you execute your program from within Visual Studio the arguments you entered will be passed to it automatically.
Repeat the above steps for the solution for the Features component.

Panorama Mosaic Stitching

You will use the feature detection and matching component to combine a series of photographs into a 360 degree panorama. Your software will automatically align the photographs (determine their overlap and relative positions) and then blend the resulting photos into a single seamless panorama. You will then be able to view the resulting panorama inside an interactive Web viewer. To start this component, you will be supplied with some test images and skeleton code. We also provide a Makefile so you can compile the code under Linux and Mac.

Taking the Pictures

Take a series of images with a digital camera mounted on a tripod or a handheld camera. For best results, overlap each image by 50% with the previous one, and keep the camera level. You can use your own camera for this or get one from us. Some cameras have a "stitch assist" mode you can use to overlap your images correctly, which only works in regular landscape mode. In order to use your camera, you have to estimate the focal length. The simplest way to do this is through the EXIF tags of the images, as described here. Alternatively, you can use a camera calibration toolkit to get more precise focal length and radial distortion coefficients. Finally, Brett Allen describes one creative way to measure rough focal length using just a book and a box.
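
As a rough illustration of the EXIF route, the focal length in pixels can be obtained from the focal length in millimetres, the physical sensor width, and the image width. The sketch below assumes you have looked up the sensor width for your camera yourself (it is camera-specific) and that the width in pixels is measured along the same side of the sensor. The function names are hypothetical and are not part of the skeleton code.

```cpp
#include <cassert>

// Convert a focal length in millimetres (e.g. read from the EXIF tags)
// into pixels. sensorWidthMm is the physical sensor width (an assumption
// you must look up for your camera); imageWidthPx is the image size in
// pixels along that same side of the sensor.
double focalLengthInPixels(double focalMm, double sensorWidthMm, int imageWidthPx)
{
    assert(sensorWidthMm > 0.0 && imageWidthPx > 0);
    return focalMm / sensorWidthMm * imageWidthPx;
}

// Rescale a focal length (in pixels) after resizing the image:
// f scales linearly with the image size.
double rescaleFocalLength(double fPixels, int oldWidthPx, int newWidthPx)
{
    assert(oldWidthPx > 0 && newWidthPx > 0);
    return fPixels * newWidthPx / oldWidthPx;
}
```

For example, halving the image resolution halves the focal length in pixels, which is why f must be rescaled whenever you change the image size.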

Make sure the images are right side up (rotate the images by 90 degrees if you took them in landscape mode), and reduce them to a more workable size (480x640 recommended). You can use external software such as Photoshop or the Microsoft Photo Editor to do this. Or you may want to set the camera to 640x480 resolution from the start.

ToDo

Note: The skeleton code includes an image library, ImageLib, that is fairly general and complex. It is NOT necessary for you to peek extensively into this library! We have created some notes for you here.

Warp each image into spherical coordinates. (file: WarpSpherical.cpp, routine: warpSphericalField)

[TODO] Compute the inverse map to warp the image by filling in the skeleton code in the warpSphericalField routine to:

convert the given spherical image coordinate into the corresponding planar image coordinate using the coordinate transformation equation from the lecture notes
apply radial distortion using the equation from the lecture notes

(Note: You will have to use the focal length f estimates for the half-resolution images provided above (you can either take pictures and save them in small files or save them in large files and reduce them afterwards). If you use a different image size, remember to scale f according to the image size.)

(Note 2: This step is not used when estimating homographies between images, only translations.)
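
The lecture notes are not reproduced on this page, so the following is only a sketch of the inverse map under a commonly used formulation: treat the spherical image coordinate as a pair of angles, lift it to a point on the unit sphere, project onto the plane z = 1, and then apply the radial distortion polynomial 1 + k1 r^2 + k2 r^4. The function name, the PlanarCoord type, and the choice of (width/2, height/2) as the image centre are assumptions made for illustration; check them against the lecture notes and the skeleton before relying on them.

```cpp
#include <cassert>
#include <cmath>

struct PlanarCoord { double x, y; };  // hypothetical helper type

// Map a pixel (xs, ys) of the spherical (output) image to the pixel of the
// planar (input) image that it should be sampled from.
PlanarCoord sphericalToPlanar(double xs, double ys,
                              double width, double height,
                              double f, double k1, double k2)
{
    assert(f > 0.0);
    const double xc = 0.5 * width;   // assumed image centre
    const double yc = 0.5 * height;

    // Spherical pixel -> angles (radians).
    const double theta = (xs - xc) / f;
    const double phi   = (ys - yc) / f;

    // Angles -> point on the unit sphere.
    const double xh = std::sin(theta) * std::cos(phi);
    const double yh = std::sin(phi);
    const double zh = std::cos(theta) * std::cos(phi);

    // Perspective projection onto the plane z = 1.
    const double xn = xh / zh;
    const double yn = yh / zh;

    // Radial distortion in normalized coordinates.
    const double r2 = xn * xn + yn * yn;
    const double d  = 1.0 + k1 * r2 + k2 * r2 * r2;

    // Back to pixel coordinates of the planar image.
    return { f * xn * d + xc, f * yn * d + yc };
}
```

A quick sanity check is that the image centre maps to itself, and that with k1 = k2 = 0 a point on the horizontal centre line lands at f tan(theta) from the centre.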

Compute the alignment of the images in pairs. (file: FeatureAlign.cpp, routines: alignPair, countInliers, and leastSquaresFit)

To do this, you will have to implement feature-based motion estimation for both translations and homographies. The skeleton for this code is provided in FeatureAlign.cpp. The main routines that you will be implementing are:

CTransform3x3 ComputeHomography(const FeatureSet &f1, const FeatureSet &f2, const vector<FeatureMatch> &matches);

int alignPair(const FeatureSet &f1, const FeatureSet &f2, const vector<FeatureMatch> &matches, MotionModel m, float f, int nRANSAC, double RANSACthresh, CTransform3x3& M);

int countInliers(const FeatureSet &f1, const FeatureSet &f2, const vector<FeatureMatch> &matches, MotionModel m, float f, CTransform3x3 M, double RANSACthresh, vector<int> &inliers);

int leastSquaresFit(const FeatureSet &f1, const FeatureSet &f2, const vector<FeatureMatch> &matches, MotionModel m, float f, const vector<int> &inliers, CTransform3x3& M);

ComputeHomography takes two feature sets, f1 and f2, and a list of feature matches, and estimates a homography from the matches.

alignPair takes two feature sets, f1 and f2, the list of feature matches obtained from the feature detecting and matching component (described in the first part of the project), and a motion model (described below), and estimates an inter-image transform matrix M. For this project, the enum MotionModel takes two possible values: eTranslate and eHomography.

alignPair uses RANSAC (RANdom SAmple Consensus) to pull out a minimal set of feature matches (one match for the case of translations, four for homographies), estimates the corresponding motion (alignment), and then invokes countInliers to count how many of the feature matches agree with the current motion estimate. After repeated trials, the inliers of the motion estimate with the largest number of inliers are used to compute a least squares estimate for the motion, which is then returned in M.

countInliers computes the number of matches whose distance is below RANSACthresh. It also returns a list of inlier match ids.

leastSquaresFit computes a least squares estimate for the translation or homography using all of the matches previously estimated as inliers. It returns the resulting translation or homography in the output transform M.

[TODO] You will have to fill in the missing code in alignPair to:

Randomly select a matching pair (or pairs) of features and compute the translation (or homography) that relates the minimal sample of feature locations.
Call countInliers to count how many matches agree with this estimate.
Repeat the above random selection nRANSAC times and keep the estimate with the largest number of inliers.
Write the body of countInliers to count the number of feature matches where the SSD distance after applying the estimated transform (i.e., the distance from the match to its correct position in the image) is below the threshold. Don't forget to create the list of inlier ids.
Write the body of leastSquaresFit, which for the simple translational case is just the average displacement between the matching feature positions; for homographies you will write the following:
Fill in the missing pieces of ComputeHomography, which will be called from several places in your code.
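
The steps above can be sketched for the translational case as follows. This is a minimal, self-contained illustration and not the skeleton's code: the Pt, Match, and Translation types and all function names are hypothetical stand-ins for FeatureSet, FeatureMatch, and CTransform3x3, and std::rand is used only to keep the example short.

```cpp
#include <cassert>
#include <cstdlib>
#include <vector>

struct Pt { double x, y; };
struct Match { Pt a, b; };            // hypothetical: a in image 1 matches b in image 2
struct Translation { double dx, dy; };

// Count matches whose residual after applying t is below thresh,
// and record their indices in inliers.
int countInliersT(const std::vector<Match>& m, Translation t,
                  double thresh, std::vector<int>& inliers)
{
    inliers.clear();
    for (std::size_t i = 0; i < m.size(); ++i) {
        const double ex = m[i].a.x + t.dx - m[i].b.x;
        const double ey = m[i].a.y + t.dy - m[i].b.y;
        if (ex * ex + ey * ey < thresh * thresh)   // compare squared distances
            inliers.push_back(static_cast<int>(i));
    }
    return static_cast<int>(inliers.size());
}

// Least squares translation: the mean displacement over the inliers.
Translation leastSquaresT(const std::vector<Match>& m, const std::vector<int>& inliers)
{
    assert(!inliers.empty());
    Translation t = { 0.0, 0.0 };
    for (int i : inliers) {
        t.dx += m[i].b.x - m[i].a.x;
        t.dy += m[i].b.y - m[i].a.y;
    }
    t.dx /= inliers.size();
    t.dy /= inliers.size();
    return t;
}

// RANSAC: sample one match per trial, keep the largest inlier set,
// then refit on that set.
Translation ransacTranslation(const std::vector<Match>& m, int nRANSAC, double thresh)
{
    assert(!m.empty() && nRANSAC > 0 && thresh > 0.0);
    std::vector<int> best, current;
    for (int trial = 0; trial < nRANSAC; ++trial) {
        const Match& s = m[std::rand() % m.size()];
        const Translation t = { s.b.x - s.a.x, s.b.y - s.a.y };
        if (countInliersT(m, t, thresh, current) > static_cast<int>(best.size()))
            best = current;
    }
    return leastSquaresT(m, best);
}
```

The homography case follows the same loop, except that each trial samples four matches and fits a homography to them instead of taking a single displacement.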

(Note 3: In ComputeHomography, you will compute the best-fit homography using the Singular Value Decomposition. From lecture 11: "the solution h is the eigenvector of A'A with smallest eigenvalue." Recall that the SVD decomposes a matrix as A = USV', where the columns of U and V are the left and right singular vectors, and S is a diagonal matrix of singular values, conventionally ordered from largest to smallest. Furthermore, there is a very strong connection between singular vectors and eigenvectors. Consider: A'A = (VSU')(USV') = V(S^2)V', since U'U = I. That is, the right singular vectors of A are eigenvectors of A'A, and the eigenvalues of A'A are the squares of the singular values of A. Returning to the problem, this means that the solution h is the right singular vector corresponding to the smallest singular value. For more details, the Wikipedia article on the SVD is very good.)
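
The matrix A itself is not written out on this page. One common way to build it (the direct linear transform) contributes two rows per correspondence, obtained by multiplying out x' = (h1 x + h2 y + h3) / (h7 x + h8 y + h9) and the matching equation for y'. The sketch below only assembles A; the SVD step is left to whatever linear algebra routine the skeleton provides. The Corr and Row9 types and the function names are hypothetical, and the row layout should be checked against the lecture notes.

```cpp
#include <array>
#include <cassert>
#include <vector>

struct Corr { double x, y, xp, yp; };   // (x, y) in image 1 maps to (xp, yp) in image 2
using Row9 = std::array<double, 9>;

// Build the 2n x 9 system A such that A h = 0 for the true homography
// h = (h1, ..., h9), stored row-major.
std::vector<Row9> buildHomographySystem(const std::vector<Corr>& c)
{
    assert(c.size() >= 4);               // four matches is the minimal sample
    std::vector<Row9> A;
    A.reserve(2 * c.size());
    for (const Corr& p : c) {
        const Row9 rowX = { p.x, p.y, 1.0, 0.0, 0.0, 0.0,
                            -p.xp * p.x, -p.xp * p.y, -p.xp };
        const Row9 rowY = { 0.0, 0.0, 0.0, p.x, p.y, 1.0,
                            -p.yp * p.x, -p.yp * p.y, -p.yp };
        A.push_back(rowX);
        A.push_back(rowY);
    }
    return A;
}

// Residual of one row for a candidate h; zero for an exact fit.
double rowResidual(const Row9& row, const Row9& h)
{
    double s = 0.0;
    for (int i = 0; i < 9; ++i) s += row[i] * h[i];
    return s;
}
```

A useful check while debugging is that every row has zero residual when h is the homography that generated the correspondences, for instance a pure translation.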

Stitch and crop the resulting aligned images. (file: BlendImages.cpp, routines: BlendImages, AccumulateBlend, NormalizeBlend)

[TODO] Given the warped images and their relative displacements, figure out how large the final stitched image will be and the absolute displacement of each image in the panorama (BlendImages).

[TODO] Then, resample each image to its final location and blend it with its neighbors (AccumulateBlend, NormalizeBlend). Try a simple feathering function as your weighting function (see mosaics lecture slide on "feathering") (this is a simple 1-D version of the distance map described in [Szeliski & Shum]). For extra credit, you can try other blending functions or figure out some way to compensate for exposure differences. In NormalizeBlend, remember to set the alpha channel of the resultant panorama to opaque!
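
One possible reading of "simple feathering" is a 1-D weight that ramps up linearly from the left and right edges of each image over blendWidth pixels, with colors accumulated as weighted sums and divided by the total weight at the end. The sketch below illustrates that idea only; the AccPixel and Rgba types and the function names are hypothetical and do not match the skeleton's AccumulateBlend and NormalizeBlend signatures.

```cpp
#include <algorithm>
#include <cassert>

// Feather weight for column x of an image `width` pixels wide: grows
// linearly with the distance to the nearer vertical edge and saturates
// at 1 after blendWidth pixels. The +1 keeps edge columns from getting
// zero weight (a design choice, not something the handout specifies).
double featherWeight(int x, int width, double blendWidth)
{
    assert(width > 0 && blendWidth > 0.0);
    const double distToEdge = std::min(x, width - 1 - x) + 1.0;
    return std::max(0.0, std::min(1.0, distToEdge / blendWidth));
}

struct AccPixel { double r, g, b, w; };        // weighted color sums and total weight
struct Rgba { unsigned char r, g, b, a; };

// Add one source pixel into the accumulator with the given weight.
void accumulatePixel(AccPixel& acc, double r, double g, double b, double weight)
{
    acc.r += weight * r;
    acc.g += weight * g;
    acc.b += weight * b;
    acc.w += weight;
}

// Divide by the total weight and force the alpha channel to opaque.
Rgba normalizePixel(const AccPixel& acc)
{
    Rgba out = { 0, 0, 0, 255 };
    if (acc.w > 0.0) {
        out.r = static_cast<unsigned char>(acc.r / acc.w + 0.5);
        out.g = static_cast<unsigned char>(acc.g / acc.w + 0.5);
        out.b = static_cast<unsigned char>(acc.b / acc.w + 0.5);
    }
    return out;
}
```

Where two images overlap, the image whose pixel lies farther from its own edge contributes more, which hides the seam.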

[TODO] Crop the resulting image to make the left and right edges seam perfectly (BlendImages). The horizontal extent can be computed in the previous blending routine since the first image occurs at both the left and right end of the stitched sequence (draw the "cut" line halfway through this image). Use a linear warp to the mosaic to remove any vertical "drift" between the first and last image. This warp, of the form y' = y + ax, should transform the y coordinates of the mosaic such that the first image has the same y-coordinate on both the left and right end. Calculate the value of 'a' needed to perform this transformation.
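
The value of a follows directly from the requirement. If the first image sits at (x0, y0) at the left end and its repeated copy sits at (x1, y1) at the right end, then y0 + a x0 = y1 + a x1 gives a = (y0 - y1) / (x1 - x0). A small sketch, with hypothetical function names that are not part of the skeleton:

```cpp
#include <cassert>

// Slope a of the shear y' = y + a*x that brings the first image and its
// repeated copy at the other end of the mosaic to the same y-coordinate.
double driftSlope(double xFirst, double yFirst, double xLast, double yLast)
{
    assert(xLast != xFirst);
    return (yFirst - yLast) / (xLast - xFirst);
}

// Apply the shear to a single mosaic coordinate.
double correctY(double x, double y, double a)
{
    return y + a * x;
}
```

For instance, if the copy at x = 2048 has drifted 32 pixels downward relative to the original at x = 0, then a = -1/64 and both ends land on the same row after the warp.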

Creating the Panorama

Use the above program you wrote to warp/align/stitch images into the resulting panorama.

To remove the radial distortion and warp the image input1.tga into spherical coordinates with focal length = 595, radial distortion coefficients k1=-0.15 and k2=0.001 (Panorama is the name of the program):

Panorama sphrWarp input1.tga warp1.tga 595 -0.15 0.001

Then, use the feature detecting and matching component to compute the features in the warped images. To align two feature sets warp1.f and warp2.f, with a set of matches in match1to2.txt, using 200 iterations of RANSAC with an outlier threshold distance of 4 pixels:

Panorama alignPair warp1.f warp2.f match1to2.txt 200 4

Note that you can also use SIFT features to do the alignment, which can be useful if the feature detection and matching code from Project 2 is not working sufficiently well. To do so, add the word sift to the end of the command, as in:

Panorama alignPair warp1.key warp2.key match1to2.txt 200 4 sift

Sample SIFT features and matches have been provided to you with the small test sequence (Yosemite) above. To extract SIFT features from a new set of images, you can download David Lowe's SIFT binary here (Windows and Linux binaries available). Note that you will need to convert images to .pgm format before running SIFT on them.

Alternatively, use alignPairHomography:

Panorama alignPairHomography input1.f input2.f match1to2.txt 200 4

This does not require spherically warping the input images first, as a homography can explain the transformation between two images from rotated viewpoints. This mode can also work with SIFT features in the same way as alignPair.

Run the previous step for all adjacent pairs of images and save the output into a separate file pairlist.txt which may look like this:

    warp1.tga warp2.tga 1 0 213.49 0 1 -5.12 0 0 1
    warp2.tga warp3.tga 1 0 208.19 0 1 2.82 0 0 1
    ......
    warp9.tga warp1.tga 1 0 194.76 0 1 -3.88 0 0 1
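
Judging from the sample above, each line appears to hold two file names followed by nine numbers that look like a row-major 3x3 transform, with the translation in the third column. That layout is an inference from the sample, not a documented format, so treat the following reader as a sketch. The PairEntry type and parsePairLine function are hypothetical and not part of the skeleton.

```cpp
#include <array>
#include <sstream>
#include <string>

struct PairEntry {
    std::string first, second;      // the two image file names
    std::array<double, 9> M;        // assumed row-major 3x3 transform
};

// Parse one line of a pairlist file. Returns false if the line does not
// contain two names followed by nine numbers.
bool parsePairLine(const std::string& line, PairEntry& out)
{
    std::istringstream in(line);
    if (!(in >> out.first >> out.second))
        return false;
    for (double& v : out.M) {
        if (!(in >> v))
            return false;
    }
    return true;
}
```

Under the assumed layout, M[2] and M[5] would be the x and y translations between the two images, which is what you would sum along the sequence to get absolute displacements.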

Then stitch the images into the final panorama pano.tga:

Panorama blendPairs pairlist.txt pano.tga blendWidth

You may also refer to the file stitch2.txt provided along with the skeleton code for the appropriate command line syntax. This command-line interface allows you to debug each stage of the program independently.

Convert your resulting image to a JPEG (Photoshop and other tools in the Sieg lab can help you with this) and paste it on a Web page along with code to run the interactive viewer. Click here for instructions on how to do this. Click here for a web based application to convert your image to a JPEG.

Debugging Guidelines

You can use the test results included in the images/ folder to check whether your program is running correctly. Comparing your output to that of the sample solution is also a good way of debugging your program.

Testing the warping routines:

In the images/ folder in the skeleton code, a few example warped images are provided for test purposes. The camera parameters used for these examples can be found in the sample command file stitch2.txt. See if your program produces the same output.
You may also test with different input images and/or camera parameter values by comparing the results with those of the sample solution.

Testing the alignment routines:

A few example alignment results are provided in the files pairlist2.txt and pairlist4.txt. The corresponding shell commands can be found in stitch2.txt and stitch4.txt.
To test alignPair only, try passing in an image that has been cropped with two different rectangles (and maybe rotated by a tiny amount, say 2 degrees).
We have also provided sample SIFT key files and matches for the test images, for testing the alignment routines.

Testing the blending routines:

An example panorama is included in the images/ folder. Compare the resulting panorama with this image.
You may also test with other panoramas by running the sample solution on different inputs.

Extra Credit

Here is a list of suggestions for extending the program for extra credit. You are encouraged to come up with your own extensions. We're always interested in seeing new, unanticipated ways to use this program!

Although the feature-based aligner gives sub-pixel motion estimation (because of least squares), the motion vectors are rounded to integers when blending the images into the mosaic in BlendImages.cpp. Try to blend images with sub-pixel localization.
Sometimes there are exposure differences between images, which result in brightness fluctuations in the final mosaic. Try to get rid of this artifact.
Try shooting a sequence with some objects moving. What did you do to remove "ghosted" versions of the objects?
Try a sequence in which the same person appears multiple times, as in this example.
Implement a better blending technique, e.g., pyramid blending, Poisson image blending, or graph cuts.

What to Turn In

First, your source code and executable should be zipped up into an archive called 'code.zip', and uploaded to CMS. In addition, turn in a web page describing your approach and results. In particular:

Panorama Mosaic Stitching

This portion of the web page should contain the following:

At least two panoramas: (1) the test sequence (test1 above), (2) one from a hand-held sequence. Each panorama should be shown as (1) a low-res inlined image on the web page, (2) a link that you can click on to show the full-resolution .jpg file, AND (3) embedded in a viewer as described above.
At least one subset of a 360 panorama where you use homographies to align the input images without first spherically warping. Try and make the panorama as wide as you can to see where this approach breaks down. Include a JPG of your homography-aligned panorama on your webpage.
A short description of what worked well and what did not. If you tried several variants or did something non-standard, please describe this as well.
Describe any extra credit.

The webpage (along with all images in JPEG format) should be uploaded to CMS in a zip file called 'webpage.zip'. If you are unfamiliar with HTML you can use any web-page editor such as FrontPage, Word, or Visual Studio 7.0 to make your web-page.

Panorama Links

Panoramas.dk: weekly archive of full-screen, high-quality panoramas worldwide
Super high resolution panoramas at GigaPan
VR Seattle: Seattle & Washington panorama
Matt Brown's Autostitch page.

Acknowledgments

The instructor is extremely thankful to Prof. Steve Seitz for allowing us to use this project which was developed in his Computer Vision class.

Last modified on October 18, 2013
