In this project, you will write code to detect discriminating features in an image and find the best matching features in other images. Your features should be reasonably invariant to translation, rotation, and illumination. You will evaluate their performance on a suite of benchmark images. Watch this video to see the solution in action. You can also look at the slides presented in class.
In Project 3, you will apply your features to automatically stitch images into a panorama.
To help you visualize the results and debug your program, we provide a user interface that displays detected features and best matches in other images. We also provide an example ORB feature detector, a current best-of-breed technique in the vision community, for comparison.
Finally, to make programming easier, this assignment will be in Python, with the help of NumPy and SciPy. Python+NumPy+SciPy is a very powerful scientific computing environment, and makes computer vision tasks much easier. A crash-course on Python and NumPy can be found here.
The project has three parts: feature detection, feature description, and feature matching.
In this step, you will identify points of interest in the image using the Harris corner detection method. The steps are as follows (see the lecture slides/readings for more details). For each point in the image, consider a window of pixels around that point. Compute the Harris matrix H for (the window around) that point, defined as

H = \sum_{p} w_p \begin{bmatrix} I_x^2(p) & I_x(p) I_y(p) \\ I_x(p) I_y(p) & I_y^2(p) \end{bmatrix}

where the summation is over all pixels p in the window, I_x(p) is the x derivative of the image at point p, and the notation is similar for the y derivative. You should use the Sobel operator to compute the x, y derivatives. The weights w_p should be chosen to be circularly symmetric (for rotation invariance). You should use a 5x5 Gaussian mask with 0.5 sigma.
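As a minimal sketch (not the required implementation), the Harris matrix entries at every pixel can be computed with the SciPy routines suggested later on this page; blurring the derivative products with the 0.5-sigma Gaussian plays the role of the circularly symmetric weighted sum over each window:

```python
import numpy as np
from scipy import ndimage

def harris_matrix_entries(image, sigma=0.5):
    """Gaussian-weighted Harris matrix entries at every pixel of a
    float grayscale image. H(p) = [[A, B], [B, C]] at each pixel p."""
    # x and y derivatives via the Sobel operator
    Ix = ndimage.sobel(image, axis=1)  # derivative along x (columns)
    Iy = ndimage.sobel(image, axis=0)  # derivative along y (rows)

    # Gaussian-weighted sums of the derivative products; with sigma = 0.5
    # and SciPy's default truncation this corresponds to a 5x5 mask
    A = ndimage.gaussian_filter(Ix * Ix, sigma)
    B = ndimage.gaussian_filter(Ix * Iy, sigma)
    C = ndimage.gaussian_filter(Iy * Iy, sigma)
    return A, B, C
```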
Note that H is a 2x2 matrix. To find interest points, first compute the corner strength function

c(H) = \det(H) - 0.1 \, [\operatorname{trace}(H)]^2

Once you've computed c for every point in the image, choose points where c is above a threshold. You also want c to be a local maximum in a 7x7 neighborhood. In addition to computing the feature locations, you'll need to compute a canonical orientation for each feature, and then store this orientation (in degrees) in each feature element. To compute the canonical orientation at each pixel, you should compute the gradient (using the Sobel operator again) of the blurred image (using a Gaussian kernel with a 7x7 window and 0.75 sigma) and use the angle of the gradient as the orientation.
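Continuing the sketch above (and reusing harris_matrix_entries from it), interest points can be selected with a thresholded corner strength and a 7x7 maximum filter; the threshold value below is an arbitrary placeholder, not a prescribed constant:

```python
import numpy as np
from scipy import ndimage

def detect_interest_points(image, threshold=1e-3):
    # Harris matrix entries from the previous sketch
    A, B, C = harris_matrix_entries(image)

    # Corner strength c(H) = det(H) - 0.1 * trace(H)^2
    c = (A * C - B * B) - 0.1 * (A + C) ** 2

    # Keep points above the threshold that are also 7x7 local maxima
    local_max = (c == ndimage.maximum_filter(c, size=7))
    ys, xs = np.nonzero((c > threshold) & local_max)

    # Canonical orientation: angle (in degrees) of the gradient of a
    # blurred copy of the image (sigma = 0.75, roughly a 7x7 kernel)
    blurred = ndimage.gaussian_filter(image, sigma=0.75)
    gx = ndimage.sobel(blurred, axis=1)
    gy = ndimage.sobel(blurred, axis=0)
    angles = np.degrees(np.arctan2(gy[ys, xs], gx[ys, xs]))
    return xs, ys, c[ys, xs], angles
```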
Now that you've identified points of interest, the next step is to come up with a descriptor for the feature centered at each interest point. This descriptor will be the representation you'll use to compare features in different images to see if they match.
You will implement two descriptors for this project. For starters, you will implement a simple descriptor: a 5x5 square window without orientation. This should be very easy to implement and should work well when the images you're comparing are related by a translation. Second, you'll implement a simplified version of the MOPS descriptor. You'll compute an 8x8 oriented patch sub-sampled from a 40x40 pixel region around the feature. You have to come up with a transformation matrix that maps the rotated 40x40 window around the feature to the 8x8 patch, with a canonical orientation where 0 degrees corresponds to the (1, 0) direction vector, i.e., the vector pointing to the right. You should also normalize the patch to have zero mean and unit variance.
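One possible way to build that transformation, sketched below, is to compose translate-rotate-scale-translate matrices in homogeneous coordinates and hand the top two rows to cv2.warpAffine. The skeleton's transformations.get_trans_mx, get_rot_mx, and get_scale_mx helpers can build the same matrices; plain NumPy is used here only to keep the example self-contained, and the function name is illustrative:

```python
import cv2
import numpy as np

def mops_descriptor(image, x, y, angle_degrees):
    """Sketch: 8x8 MOPS patch sampled from a 40x40 window at (x, y)."""
    a = np.radians(angle_degrees)

    # Translate the feature point to the origin
    T1 = np.array([[1, 0, -x], [0, 1, -y], [0, 0, 1]], dtype=np.float32)
    # Rotate by -angle so the feature's orientation maps to (1, 0)
    R = np.array([[np.cos(a),  np.sin(a), 0],
                  [-np.sin(a), np.cos(a), 0],
                  [0, 0, 1]], dtype=np.float32)
    # Scale the 40x40 window down to 8x8 (factor 8/40 = 0.2)
    S = np.array([[0.2, 0, 0], [0, 0.2, 0], [0, 0, 1]], dtype=np.float32)
    # Re-center on the 8x8 patch
    T2 = np.array([[1, 0, 4], [0, 1, 4], [0, 0, 1]], dtype=np.float32)

    # Rightmost matrix is applied first; warpAffine takes the top 2 rows
    M = (T2 @ S @ R @ T1)[:2]
    patch = cv2.warpAffine(image.astype(np.float32), M, (8, 8))

    # Normalize to zero mean and unit variance; return a zero vector
    # for near-constant patches to avoid dividing by ~0
    patch -= patch.mean()
    std = patch.std()
    return (patch / std).flatten() if std > 1e-5 else np.zeros(64)
```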
Now that you've detected and described your features, the next step is to
write code to match them, i.e., given a feature in one image, find the best
matching feature in one or more other images. This part of the feature
detection and matching component is mainly designed to help you test out your
feature descriptor. You will implement a more sophisticated feature
matching mechanism in the second component when you do the actual image
alignment for the panorama.
The simplest approach is the following: write a procedure that
compares two features and outputs a distance between them. For
example, you could simply sum the absolute value of differences between the
descriptor elements. You could then use this distance to compute the best match
between a feature in one image and the set of features in another image by
finding the one with the smallest distance. Two possible distances are the sum of squared differences (SSD) between the two descriptors, and the ratio test: the SSD distance to the best match divided by the SSD distance to the second-best match (a distinctive match has a small ratio).
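For instance, a minimal matching sketch using scipy.spatial.distance.cdist (from the list of suggested functions below) might compute both the SSD match and its ratio-test score; the function and variable names here are illustrative, not from the skeleton:

```python
import numpy as np
from scipy.spatial.distance import cdist

def match_features(desc1, desc2):
    """desc1, desc2: N1 x D and N2 x D arrays of feature descriptors."""
    # Pairwise SSD between every descriptor in image 1 and image 2
    dists = cdist(desc1, desc2, 'sqeuclidean')

    matches = []
    for i, row in enumerate(dists):
        # Indices of the two smallest distances (unordered)
        first, second = np.argpartition(row, 1)[:2]
        if row[first] > row[second]:
            first, second = second, first
        # Ratio test: distinctive matches have ratios well below 1
        ratio = row[first] / (row[second] + 1e-8)
        matches.append((i, first, ratio))
    return matches
```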
Now you're ready to go! Using the UI and skeleton code that we provide, you can load in a set of images, view the detected features, and visualize the feature matches that your algorithm computes.
By running featuresUI.py, you will see a GUI where you have the following choices:
You may use the UI to perform individual queries to see if things are working right. First, load a query image and check, under the Keypoint Detection tab, that the detected keypoints are reasonable in number and orientation. Then you can check whether your feature descriptors and matching algorithms work using the Feature Matching tab.
We are providing a set of benchmark images to be used to test the performance of your algorithm as a function of different types of controlled variation (i.e., rotation, scale, illumination, perspective, blurring). For each of these images, we know the correct transformation and can therefore measure the accuracy of each of your feature matches. This is done using a routine that we supply in the skeleton code.
You should also go out and take some photos of your own to see how well your approach works on more interesting data sets. For example, you could take images of a few different objects (e.g., books, offices, buildings) and see whether your features can "recognize" the same objects in new images.
For those already using git to work in groups: you can still share code with your partner by adding multiple remotes to your local repository (one being this original repository and the other a service like GitHub where you host the code you are working on); here's a reference with more information.
This skeleton code should run on the provided VM without any further installation. We provide an installation script if you want to run it on a different machine. Download some image sets for benchmarking: datasets. We have given you a number of classes and methods to help get you started. We strongly advise you to read the documentation above each function, which will help you understand the inputs and outputs of the functions you have to implement. The code you need to write will be for your feature detection methods, your feature descriptor methods, and your feature matching methods, all in features.py. The feature computation and matching methods are called from the GUI functions in featuresUI.py.
Here is a summary of potentially useful functions (you do not have to use any of these):
scipy.ndimage.sobel
scipy.ndimage.gaussian_filter
scipy.ndimage.filters.convolve
scipy.ndimage.filters.maximum_filter
scipy.spatial.distance.cdist
cv2.warpAffine
np.max, np.min, np.std, np.mean, np.argmin, np.argpartition
np.degrees, np.radians, np.arctan2
transformations.get_rot_mx (in transformations.py), transformations.get_trans_mx, transformations.get_scale_mx

First, your source code and executable should be zipped up into an archive called 'code.zip' and uploaded to CMS. In addition, turn in a web page describing your approach and results (in an archive called 'webpage.zip'). In particular:
We will tabulate the best performing features and present them to the class.
The web page (.html file) and all associated files (e.g., images, in JPG format) should be placed in a zip archive called 'webpage.zip' and uploaded to CMS. If you are unfamiliar with HTML, you can use any web-page editor such as FrontPage, Word, or Visual Studio 7.0 to make your web page.
Here is a list of suggestions for extending the program for extra credit. You are encouraged to come up with your own extensions as well!
- Design and implement your own feature descriptor.
- Implement adaptive non-maximum suppression (see the MOPS paper).
- Make your feature detector scale invariant.
- Implement a method that outperforms the above ratio test for deciding if a feature is a valid match.
- Use a fast search algorithm to speed up the matching process. You can use code from the web or write your own (with extra credit proportional to effort). Some possibilities, in rough order of difficulty: k-d trees (code available here), wavelet indexing (see the MOPS paper), locality-sensitive hashing.
Last modified on February 21, 2015