CS4670/5670: Computer Vision, Fall 2013
Project 2:  Feature Detection and Matching



In this project, you will write code to detect discriminating features in an image and find the best matching features in other images.  Your features should be reasonably invariant to translation, rotation, and illumination, and you'll evaluate their performance on a suite of benchmark images.

In Project 3, you will apply your features to automatically stitch images into a panorama. 

To help you visualize the results and debug your program, we provide a user interface that displays detected features and best matches in other images.  We also provide sample feature files that were generated using SIFT, a current best of breed technique in the vision community, for comparison.


The project has three parts:  feature detection, description, and matching.

1. Feature detection

In this step, you will identify points of interest in the image using the Harris corner detection method.  The steps are as follows (see the lecture slides/readings for more details)  For each point in the image, consider a window of pixels around that point.  Compute the Harris matrix H for that point, defined as

where the summation is over all pixels p in the window.  The weights should be chosen to be circularly symmetric (for rotation invariance).  A common choice is to use a 3x3 or 5x5 Gaussian mask. Note that these weights were not discussed in the lecture slides, but you should use them for your computation.

Note that H is a 2x2 matrix.  To find interest points, first compute the corner strength function

Once you've computed c for every point in the image, choose points where c is above a threshold.  You also want c to be a local maximum in at least a 3x3 neighborhood (we found that 5x5 window works better). In addition to computing the feature locations, you'll need to compute a canonical orientation for each feature, and then store this orientation (in radians) in each feature element.

2. Feature description

Now that you've identified points of interest, the next step is to come up with a descriptor for the feature centered at each interest point.  This descriptor will be the representation you'll use to compare features in different images to see if they match.

You will implement two descriptors for this project. For starters, you will implement a simple descriptor, a 5x5 square window without orientation. This should be very easy to implement and should work well when the images you're comparing are related by a translation. Second, you'll implement a simplified version of the MOPS descriptor. You'll compute an 8x8 oriented patch sub-sampled from a 41x41 pixel region around the feature. You should also normalize the patch to have zero mean and unit variance.

3. Feature matching

Now that you've detected and described your features, the next step is to write code to match them, i.e., given a feature in one image, find the best matching feature in one or more other images.  This part of the feature detection and matching component is mainly designed to help you test out your feature descriptor.  You will implement a more sophisticated feature matching mechanism in the second component when you do the actual image alignment for the panorama.

The simplest approach is the following:  write a procedure that compares two features and outputs a distance between them.  For example, you could simply sum the absolute value of differences between the descriptor elements. You could then use this distance to compute the best match between a feature in one image and the set of features in another image by finding the one with the smallest distance. Two possible distances are:

  1. Use a threshold on the match score. This is called the SSD distance, and is implemented for you as match type 1, in the function ssdMatchFeatures.
  2. Compute (score of the best feature match)/(score of the second best feature match). This is called the "ratio test"; you must implement this distance.


Now you're ready to go!  Using the UI and skeleton code that we provide, you can load in a set of images, view the detected features, and visualize the feature matches that your algorithm computes.

We are providing a set of benchmark images to be used to test the performance of your algorithm as a function of different types of controlled variation (i.e., rotation, scale, illumination, perspective, blurring).  For each of these images, we know the correct transformation and can therefore measure the accuracy of each of your feature matches.  This is done using a routine that we supply in the skeleton code.

You should also go out and take some photos of your own to see how well your approach works on more interesting data sets.  For example, you could take images of a few different objects (e.g., books, offices, buildings, etc.) and see if it can "recognize" new images.

Skeleton Code

Available through git (this should help make distributing any updates easier). With git installed, you can download the code by typing (using the command-line interface to git)
>> git clone http://www.cs.cornell.edu/courses/cs4670/2013fa/projects/p2/skeleton.git
This will create the directory skeleton. To get updates to the code you can then simply run
>> git pull
This will fetch any updates and merge them into your local copy. If we modify a file you have already modified git will not overwrite your changes. Instead, it will mark the file as having a conflict and then ask you to resolve how to integrate the changes from these two sources. Here's a quick guide on how to resolve these conflicts.

For those that are already using git to work in groups, you can still share code with your partner by having multiple masters to your local repository (one being this original repository and the other some remote service like github where you host the code you are working on); here's a reference with more information.

This skeleton code should compile under Windows (Visual Studio) or Linux/Mac (with the provided Makefile).
  • Download some image sets for plotting ROC curves: graf, Yosemite.
    Included with these images are some SIFT feature files and image database files.

  • Download some image sets for benchmark: graf, leuven, bikes, wall
    Included with these images are some SIFT feature files and image database files.

  • Download the solution executable for Windows, Linux, or Mac.
    Note that you might need to use "chmod" first to make it executable on Linux or Mac.

    Note: This project will use the included ImageLib library for reading, writing, accessing, and manipulating images.

    After compiling and linking the skeleton code, you will have an executable Features This can be run in several ways:

    To Do

    We have given you a number of classes and methods to help get you started. The code you need to write will be for your feature detection methods, your feature descriptor methods and your feature matching methods, all in features.cpp. The feature computation methods are called from a function computeFeatures, and matching is called from matchFeatures in features.cpp; these will call the to the methods you will fill in.

    We have provided a function dummyComputeFeatures that shows how to create basic code for detecting features, and integrating them into the rest of the system.

    The function ComputeHarrisFeatures is one of the main ones you will complete, along with the helper functions computeHarrisValues and computeLocalMaxima. These implement Harris corner detection. You'll need to implement two feature descriptors, ComputeSimpleDescriptors and ComputeMOPSDescriptors. These functions take the location and orientation information already stored in a set of features (e.g., Harris corners), and compute descriptors for these feature points, then store these descriptors in the data field of each feature. Finally, you'll implement a function for matching features. The function ssdMatchFeatures is provided for reference, and implements a feature matcher using the SSD distance; this function demonstrates how a matching function should be implemented. You will implement the function ratioMatchFeatures, which matches features using the ratio test.

    After you've finished the computeFeatures part of this project, you may use UI to perform individual queries to see if things are working right. First you need to load a query image, and then its corresponding feature file with .f extension. Then, you'll need to load another file's database file, which is simply a .db file containing one line 'imagename featurefilename \n'. Note that the newline symbol is crucial. Then, select a subset of features individually (left-click) or in a group (right-click drag) and perform query, you should be able to see another image with matched features showing up to the right of query image.

    You will also need to generate plots of the ROC curves and report the areas under the ROC curves (AUC) for your feature detecting and matching code (using the 'roc' option of Features.exe), and for SIFT. For both the Yosemite test images (Yosemite1.jpg and Yosemite2.jpg), and the graf test images (img1.ppm and img2.ppm), create a plot with 6 curves, two using the simple window descriptor and simplified MOPS descriptor with the SSD distance, two using these two types of descriptors with the ratio test distance, and the other two using SIFT (with both the SSD and ratio test distances; these curves are provided to you in the zip files for Yosemite and graf provided above).

    We have provided scripts for creating these plots using the 'gnuplot' tool. 'gnuplot' is available as an Ubuntu or MacPorts package  or you can download a copy of Gnuplot to your own machine. To generate a plot with gnuplot (using a gnuplot script 'script.txt', simply run 'gnuplot script.txt', and gnuplot will output an image containing the plot. The two scripts we provide are:

    plot.roc.txt: plots the ROC curves for the SSD distance and the ratio test distance. These assume the two roc datafiles are called 'roc1.txt' (for the SSD distance), and 'roc2.txt' (for the ratio test distance). You will need to edit this script if your files are named differently. This script also assumes 'roc1.sift.txt' and 'roc2.sift.txt' are in the current directory (these files are provided in the zip files above). This script generates an image named 'plot.roc.png'. Again, to generate a plot with this script, simply enter 'gnuplot.exe plot.roc.txt'.

    plot.threshold.txt: plots the threshold on the x-axis and 'TP rate - FP rate' on the x-axis. The maximum of this function represents a point where the true positive rate is large relative to the false positive rate, and could be a good threshold to pick for the computeMatches step. This script generates an image named 'plot.threshold.png.'

    Finally, you will need to report the average AUC for your feature detecting and matching code (using the 'benchmark' option of Features.exe) on three benchmark sets: leuven, bikes and wall.

    What to Turn In

    First, your source code and executable should be zipped up into an archive called 'code.zip', and uploaded to CMS. In addition, turn in a web page describing your approach and results (in an archive called 'webpage.zip').  In particular:

    We will tabulate the best performing features and present them to the class.

    The web-page (.html file) and all associated files (e.g., images, in JPG format) should be placed in a zip archive called 'webpage.zip' and uploaded to CMS. If you are unfamiliar with HTML you can use any web-page editor such as FrontPage, Word, or Visual Studio 7.0 to make your web-page.

    Extra Credit

    Here is a list of suggestions for extending the program for extra credit. You are encouraged to come up with your own extensions as well!


    The instructor is extremely thankful to Prof. Steve Seitz for allowing us to use this project which was developed in his Computer Vision class.


    Last modified on September 19, 2013