Object Detection

CS4670/5670: Computer Vision, Fall 2012
Project 5:  Object Detection



The goal of this project is to implement a simple, effective method for detecting pedestrians in an image. You will be building on the technique of Dalal and Triggs (PDF) from 2005. This technique has three main components:
  1. A feature descriptor. We first need a way to describe an image region with a high-dimensional descriptor. For this project, you will be implementing two descriptors: tiny images and histogram of oriented gradients (HOG) features.
  2. A learning method. Next, we need a way to learn to classify an image region (described using one of the features above) as a pedestrian or not. For this, we will be using support vector machines (SVMs) and a large training dataset of image regions containing pedestrians (positive examples) or not containing pedestrians (negative examples).
  3. A sliding window detector. Using our classifier, we can tell if an image region looks like a pedestrian or not. The final step is to run this classifier as a sliding window detector on an input image in order to detect all instances of pedestrians in that image.
Using our skeleton code as a starting point, you'll be implementing parts of all three of these components, and evaluating your methods by creating precision-recall (PR) curves.
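
To make the idea of a gradient-based descriptor more concrete, here is a minimal sketch of the core step behind a HOG-style feature: a histogram of gradient orientations, weighted by gradient magnitude, for a single grayscale cell. This is an illustration only, not the skeleton code's interface. The function name, the choice of 9 unsigned orientation bins, and the use of central differences are all assumptions, and the sketch omits bin interpolation and block normalization.

```python
import math

def cell_orientation_histogram(cell, num_bins=9):
    """Histogram of unsigned gradient orientations for one grayscale cell.

    `cell` is a list of rows of floats. Hypothetical helper for illustration:
    no bin interpolation and no block normalization.
    """
    height, width = len(cell), len(cell[0])
    hist = [0.0] * num_bins
    bin_width = math.pi / num_bins  # unsigned orientations span [0, pi)
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            # Central differences approximate the image gradient.
            gx = cell[y][x + 1] - cell[y][x - 1]
            gy = cell[y + 1][x] - cell[y - 1][x]
            magnitude = math.hypot(gx, gy)
            if magnitude == 0.0:
                continue
            # Fold the angle into [0, pi) so opposite directions share a bin.
            angle = math.atan2(gy, gx) % math.pi
            bin_index = min(int(angle / bin_width), num_bins - 1)
            hist[bin_index] += magnitude
    return hist

# A vertical step edge: intensity changes only along x.
cell = [[0.0, 0.0, 1.0, 1.0] for _ in range(4)]
hist = cell_orientation_histogram(cell)
```

For the step edge above, all gradient energy falls into the first bin, since every interior gradient points along the x axis. A full descriptor would compute one such histogram per cell over a grid and concatenate them.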




Generating project files with CMake

This project uses cmake to generate compilation files from a set of project description files named CMakeLists.txt. If you are unfamiliar with cmake, you can find out more about it in this wiki. cmake searches for dependencies and can automatically generate build instructions in the form of Makefiles, Visual Studio project files, Xcode project files, etc. (run cmake -h to see a full list of project formats). The basic procedure for generating these files is to first create a directory where the compilation files will go
>> cd path/with/source
>> mkdir build
>> cd build
and then run cmake inside the build directory. The simplest form is
>> cmake .. # Assuming here you are inside the previously created build directory
The command will search for dependencies and generate a Makefile. If there are no errors, you can build the project with
>> make
If you are getting compilation errors related to linking or to headers that were not found, it might be useful to run
>> VERBOSE=1 make
This will print the full command for each build step (normally only the name of the file currently being processed is printed). cmake can also generate build instructions in debug and release modes; you can get it to do this as follows:
>> cmake -DCMAKE_BUILD_TYPE=Debug ..
>> cmake -DCMAKE_BUILD_TYPE=Release ..


The following suggestions assume that you are using the cmake GUI (cmake-gui) to generate a Visual Studio project. In our experience, cmake will likely fail the first time you run it because it will not be able to find the include and lib directories for libjpeg. If you don't already have the libjpeg library, it can be obtained from GnuWin32 (install the complete package). Once you have installed libjpeg for Windows, you can tell CMake where the include and library files are by clicking on JPEG_INCLUDE_DIR and JPEG_LIBRARY and specifying the correct paths. If you used the GnuWin32 installer they should be
C:\Program Files\GnuWin32\include
C:\Program Files\GnuWin32\lib\jpeg.lib
UPDATE: we now recommend using libjpeg-turbo under Windows, rather than the GnuWin32 implementation of libjpeg, as the latter seems to be out of date and buggy with new versions of Visual Studio. libjpeg-turbo is available from SourceForge here. It extracts to a custom path, so you will need to update your JPEG_INCLUDE_DIR and JPEG_LIBRARY cmake variables to point to the proper include directory and .lib file where you extracted the code. In our case, we used jpeg-static.lib to avoid the hassle of dealing with an extra DLL.

Once these paths are corrected, click on Configure and then Generate to create the Visual Studio files. You still might get compilation errors related to libjpeg header files not being found. To fix this, select the subprojects jpegrw, objectdetect, and image, right-click on them, and select "Properties". Under Configuration Properties -> C/C++, set the search path in "Additional Include Directories" and click Apply.

Using the software

This project has no GUI; all parts of the project can be run on the command line by executing the objectdetect binary with one of several modes as the first argument (including FEATVIZ, TRAIN, PRED, PREDSL, and SVMVIZ). The first TODO item you will implement is feature extraction, either TinyImage or HOG. You can test your code by running the following command
>> objectdetect FEATVIZ hog test.jpg test_hog.jpg
This will extract a HOG feature (you can also try tinyimg) for the image test.jpg and generate a graphical representation that is saved to test_hog.jpg. If you do this with the solution executable you should get
[Figure: HOG feature visualization produced by the solution executable]
Once your feature extraction code is running correctly, you will train a linear SVM to classify images as containing pedestrians or not. This is done with the following command
>> objectdetect TRAIN pedestrian_train.dataset hog hog.svm
This will load the set of images specified in the dataset file pedestrian_train.dataset, extract a HOG feature for each one of them, and then train the SVM classifier. The .dataset file contains a list of filenames and the class of each image. A +1 before the filename indicates a file that contains a pedestrian, while -1 indicates that there are no pedestrians. Finally, the program will save the trained model into the file hog.svm.
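
As a rough illustration of the .dataset format described above, the sketch below parses such a file into (label, filename) pairs. The source only states that a +1 or -1 precedes each filename, so the exact layout (one example per line, label and filename separated by whitespace) is an assumption, and `parse_dataset` and the sample filenames are hypothetical.

```python
def parse_dataset(text):
    """Parse lines of the ASSUMED form '<label> <filename>' into pairs."""
    examples = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        label_text, filename = line.split(None, 1)
        label = int(label_text)  # int() accepts both '+1' and '-1'
        if label not in (1, -1):
            raise ValueError(f"unexpected label: {label_text!r}")
        examples.append((label, filename))
    return examples

# Hypothetical file contents; real filenames will differ.
sample = """+1 pos/person_001.jpg
-1 neg/street_042.jpg
"""
examples = parse_dataset(sample)
num_pos = sum(1 for label, _ in examples if label == 1)
```

Counting the positives and negatives this way is a quick sanity check that a dataset file was read as intended before starting a long training run.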

You can get an intuition for what the SVM model is doing by visualizing the set of weights it found. To do this you can run the command

>> objectdetect SVMVIZ hog.svm svmhog.jpg
This will generate an image like the following:
[Figure: visualization of the learned SVM weights]
Here the left side, in red, shows a visualization of negative weights; these are edge orientations that should not be present in an image region containing a pedestrian. For instance, observe the horizontal edges in the region of the legs. On the right, in green, are the positive weights showing edge orientations that should be present in images of pedestrians.
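
The split into negative and positive weights can be sketched as follows. A linear SVM scores a feature vector x as w . x + b, so each weight either pushes the score up (positive) or down (negative) when its feature is present. The helper names here are hypothetical, the weight values are made up, and the sketch assumes the usual convention that a positive score corresponds to the +1 (pedestrian) class.

```python
def split_weights(weights):
    """Separate a weight vector into its positive and negative parts."""
    positive = [max(w, 0.0) for w in weights]
    negative = [max(-w, 0.0) for w in weights]  # stored as magnitudes
    return positive, negative

def linear_score(weights, bias, feature):
    """Linear decision value w . x + b (assumed: positive means pedestrian)."""
    return sum(w * x for w, x in zip(weights, feature)) + bias

# Made-up weights for illustration, not values from a trained model.
weights = [0.5, -0.25, 0.0, 1.0]
positive, negative = split_weights(weights)
score = linear_score(weights, -0.5, [1.0, 2.0, 3.0, 0.5])
```

Rendering the `positive` and `negative` vectors with the same routine used for feature visualization would give two pictures analogous to the green and red halves described above.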

Once you have trained an SVM classifier, you will test it to measure how well it performs. You will do this by classifying a set of images that were not present in the set of images used for training; this measures how well the model you trained generalizes to other images of the same class. We provide a second .dataset file with a separate set of images to use for testing. To test your classifier, run the command:

>> objectdetect PRED pedestrian_test.dataset hog.svm hog.pr hog.preds
This will print out the average precision and generate two files: hog.pr contains the precision-recall curve, and hog.preds contains the classifier output for each image in the same format as the .dataset file. You can inspect this second file to find out which class each example was assigned to. To visualize the precision-recall curves we provide a MATLAB script, plot_pr.m, that can plot an arbitrary number of .pr files together (if you don't have MATLAB you can also try using this script with Octave, a freely available alternative that is mostly code compatible; you can also try using Gnuplot, the tool you used in Project 2). To generate the plots you can call the script like
MATLAB>> plot_pr('PR curve', 'tinyimg.pr', 'TinyImg', 'hog.pr', 'HOG', 'output', 'pr.eps')
The first argument is the plot title; this is followed by a list of pairs, each containing a .pr file and the curve name (which will show up in the plot legend). Finally, you can optionally specify an output image with the 'output' option followed by the output filename. An example of the precision-recall curve for the solution code:
[Figure: precision-recall curve for the solution code]
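
For reference, here is a minimal sketch of how a precision-recall curve and an average precision value can be computed from classifier scores. It is one common formulation (sum of precision times the change in recall while sweeping the threshold from high to low) and is not necessarily the exact computation the project code performs. The function names are hypothetical, ties between scores are broken arbitrarily, and at least one positive example is assumed.

```python
def precision_recall_curve(labels, scores):
    """Return (precision, recall) points from sweeping a score threshold.

    `labels` are +1/-1 ground-truth classes; `scores` are classifier outputs.
    Assumes at least one positive label.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    total_positives = sum(1 for label in labels if label == 1)
    points = []
    true_positives = 0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            true_positives += 1
        # Everything ranked so far is predicted positive at this threshold.
        points.append((true_positives / rank, true_positives / total_positives))
    return points

def average_precision(points):
    """Sum of precision times the increase in recall along the curve."""
    ap = 0.0
    previous_recall = 0.0
    for precision, recall in points:
        ap += precision * (recall - previous_recall)
        previous_recall = recall
    return ap

# Made-up labels and scores for illustration.
labels = [1, -1, 1, -1]
scores = [0.9, 0.8, 0.4, 0.1]
points = precision_recall_curve(labels, scores)
ap = average_precision(points)
```

In this toy example the two positives are ranked first and third, so precision is 1 at recall 0.5 and 2/3 at recall 1, giving an average precision of 5/6.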

Sliding window detector

So far we have trained and tested the classifier on cropped images. A more realistic use is to run the classifier on an uncropped image, evaluating for every possible location (and potentially scale) whether there is an instance of the object of interest or not. You can do this by running the command
>> objectdetect PREDSL test.jpg hog.svm test_score.jpg
This will run the classifier with the model in hog.svm at every position (but only at a single scale) of the input image test.jpg and save a heat map of the classifier output to test_score.jpg. Here is an input/output pair from the solution code with the HOG feature:
[Figure: input image and the corresponding heat map of classifier scores]
In the heat map you can see bright white spots on three of the people in the scene.
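
The single-scale sliding window procedure can be sketched as below. To stay self-contained, the sketch uses the raw window pixels as the feature, which is a stand-in for a real descriptor such as HOG, and steps one pixel at a time. The function name, the toy image, and the weights are all hypothetical and do not reflect the skeleton code's interface.

```python
def sliding_window_scores(image, weights, bias, win_h, win_w):
    """Score every window position of a 2D grayscale image.

    The feature is the window's pixels flattened row by row (a stand-in
    for a real descriptor), scored with a linear classifier w . x + b.
    """
    height, width = len(image), len(image[0])
    score_map = []
    for top in range(height - win_h + 1):
        row_scores = []
        for left in range(width - win_w + 1):
            feature = [image[top + dy][left + dx]
                       for dy in range(win_h) for dx in range(win_w)]
            score = sum(w * x for w, x in zip(weights, feature)) + bias
            row_scores.append(score)
        score_map.append(row_scores)
    return score_map

# Toy image with one bright 2x2 blob; made-up weights favor bright windows.
image = [
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 1.0, 0.0],
    [0.0, 1.0, 1.0, 0.0],
]
weights = [1.0, 1.0, 1.0, 1.0]
score_map = sliding_window_scores(image, weights, -2.0, 2, 2)
# Highest score and its (top, left) window position.
best = max((score, (top, left))
           for top, row in enumerate(score_map)
           for left, score in enumerate(row))
```

The resulting `score_map` plays the role of the heat map: its largest value sits at the window that exactly covers the blob. Handling multiple scales would amount to repeating this over resized copies of the image.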



In addition to the code, you'll need to turn in a zipfile with your trained detectors, along with a webpage, as the artifact. Your zipfile should contain the following items:

Extra credit

Here are some ideas of things you can implement for extra credit (some of these are described in the Dalal and Triggs paper):

Last modified on November 24, 2012