Feature Extraction and Tracking

Feature extraction and tracking is a critical part of the overall algorithm: our ability to recover the 3D coordinates of the tracked object depends on correct and reliable tracking of the features.

Feature extraction, or identifying good features to track, is a relatively young area of research, and many research groups are working on it. We experimented with a few of the known feature extractors, notably the Kanade-Lucas-Tomasi (KLT) feature extractor, and also devised our own algorithm. Because of a few functional problems with our algorithm (described in detail below) we decided not to use it in our application, but we are very optimistic about its capabilities and plan to incorporate it later.

Feature Extractor

Finding Intensity Difference

Our approach to feature extraction is based on computing the difference in intensity between each pixel in an image and its surrounding pixels. The image is shifted against itself by a shift radius R1 (a good value for this radius is an integer multiple of the search-window radius). The difference in intensity between each pixel in the shifted image and the corresponding pixel in the original image is found. The sum of the squares of these differences, the SID (Sum of Intensity Differences), is computed over a window of radius R2, an integer multiple of the SSD-window radius (we chose R2 this way because SSD is what we use to track the points).

[Figure: sid.jpg]

As the image moves across the shift window, from -R1 to +R1, we accumulate all the sums of intensity differences into an array. Each entry of the array holds the Sum of the SIDs (SSID) for the corresponding pixel in the image.
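The computation above can be sketched as follows. This is a minimal illustration, not our actual implementation; the function names (`ssid_map`, `window_sum`) and the edge-padding choice are our own assumptions.

```python
import numpy as np

def window_sum(a, r):
    # Sum of `a` over a (2r+1)x(2r+1) window around each pixel,
    # computed with an integral image; zeros assumed outside the image.
    h, w = a.shape
    k = 2 * r + 1
    ap = np.pad(a, r, mode="constant")
    s = np.zeros((h + k, w + k))
    s[1:, 1:] = ap.cumsum(0).cumsum(1)
    return s[k:, k:] - s[:-k, k:] - s[k:, :-k] + s[:-k, :-k]

def ssid_map(image, shift_radius, window_radius):
    # For every shift (du, dv) in [-R1, +R1]^2, square the intensity
    # difference between the image and its shifted copy, sum it over the
    # R2 window around each pixel (the SID), and accumulate the SIDs
    # over all shifts into one SSID value per pixel.
    img = image.astype(np.float64)
    h, w = img.shape
    pad = shift_radius
    padded = np.pad(img, pad, mode="edge")  # edge padding is an assumption
    ssid = np.zeros((h, w))
    for du in range(-shift_radius, shift_radius + 1):
        for dv in range(-shift_radius, shift_radius + 1):
            if du == 0 and dv == 0:
                continue
            shifted = padded[pad + du:pad + du + h, pad + dv:pad + dv + w]
            ssid += window_sum((img - shifted) ** 2, window_radius)
    return ssid
```

On a constant image every shift produces zero difference, so the SSID is zero everywhere; high SSID values mark pixels whose neighbourhood changes strongly under small shifts.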

Selecting Good Points

Now comes the tricky part: selecting the points. A straightforward, intuitive way is to choose the pixels with the highest SSID values (as many as we need). As we found, this approach does not work. Although it returns the strongest points in the image, points with high intensity variation in their surroundings usually occur in bands, so the returned points are very likely to be close to each other or grouped together. When trying to extract a 3D model of an object, this is exactly what we do not want: we want the points to be distributed evenly across the object.

One approach we tried to get around this problem was to keep a threshold radius, so that no point within the window circumscribed by that radius around an already-selected point is returned. This works, but it introduces a factor that varies from image to image and from object to object (the object we are trying to track), and that depends largely on the size of the image and the object. We therefore introduce an undesirable, manually controlled variable.
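A greedy sketch of this threshold-radius selection, assuming an SSID array as input; the function name and the particular tie-breaking (pure descending SSID order) are illustrative assumptions:

```python
import numpy as np

def select_features(ssid, n_points, min_dist):
    # Walk the pixels in order of descending SSID and accept a candidate
    # only if it lies at least min_dist away from every point already
    # chosen. min_dist is the manually tuned threshold radius discussed
    # above, which must be re-tuned per image/object.
    h, w = ssid.shape
    order = np.argsort(ssid, axis=None)[::-1]
    chosen = []
    for flat in order:
        y, x = divmod(int(flat), w)
        if all((y - cy) ** 2 + (x - cx) ** 2 >= min_dist ** 2
               for cy, cx in chosen):
            chosen.append((y, x))
            if len(chosen) == n_points:
                break
    return chosen
```

With a band of strong responses, the second-strongest pixel in the band is rejected for being too close, and the selection jumps to the next strong pixel elsewhere in the image.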

Another approach is to keep a difference threshold: compare the difference between the SSIDs of two points against a threshold value, and select a point only if this difference exceeds the threshold with respect to every previously selected point. This takes care of banding, but it also rejects points that may be far apart yet happen to have similar intensity variations.


Feature Tracker

Our feature tracker is primarily based on the Sum of Squared Differences (SSD) algorithm, with some subtle modifications.

Tracking Features using Sum of Squared Differences (SSD)

The second frame F2 (the frame in which the features are searched for) is shifted from -R1 to +R1 over the previous frame F1, where R1 is the search-window radius, chosen according to the maximum motion expected between any two consecutive frames. For each point, the squared differences in intensity between corresponding pixels within the SSD window of radius R2 are summed to give the SSD, which is computed for every tracked feature point at every shift. Since the number of feature points we track across images is much smaller than the total number of pixels, we only need to compute the SSD within the search window around each feature point.
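The per-feature search can be sketched as below. This is a simplified illustration rather than our exact code: the function name is assumed, and it presumes the feature lies far enough from the image border that all windows fit.

```python
import numpy as np

def track_feature(f1, f2, point, search_radius, ssd_radius):
    # Exhaustively try every shift (du, dv) in the search window and
    # return the shift with the minimum SSD between the feature's patch
    # in F1 and the shifted patch in F2, along with that SSD value.
    y, x = point
    r, R = ssd_radius, search_radius
    patch = f1[y - r:y + r + 1, x - r:x + r + 1].astype(np.float64)
    best_shift, best_ssd = None, np.inf
    for du in range(-R, R + 1):
        for dv in range(-R, R + 1):
            cand = f2[y + du - r:y + du + r + 1,
                      x + dv - r:x + dv + r + 1].astype(np.float64)
            ssd = np.sum((patch - cand) ** 2)
            if ssd < best_ssd:
                best_shift, best_ssd = (du, dv), ssd
    return best_shift, best_ssd
```

The returned (du, dv) are the v and u displacement components; the feature's new position is (y + du, x + dv).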

[Figure: search.jpg — Search Region]

The u and v components of the shift are those that yield the minimum SSD. The feature tracker is also able to detect lost points, i.e., points that could not be tracked.

[Figure: tracking.jpg — Computing SSD]

Identifying Lost Features

We experimented with various ways of deciding whether a feature has been lost and found it to be a non-trivial problem. It made us appreciate the human visual system: how easily it tracks features and identifies lost ones, and how difficult those same qualities are to emulate in an algorithm.

One way to decide whether a feature has been lost is to compare the detected motion against a threshold: if the motion exceeds the threshold, we assume the feature has been lost. We found that this method worked in most cases, but it is not always reliable, because a feature can be lost even when its motion stays below the threshold.

Another method we tried was to compare the present SSD value with the one computed for the feature in the previous frame. If the present value varied from the previous one by more than a certain threshold, the feature was assumed to be lost. This technique also fails in many cases: features are often not lost but appear distorted because of variations in lighting, transformations due to camera motion, unwanted noise, etc., so their SSD value varies unpredictably. The threshold therefore needs to be adjusted for each pair of frames.
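The two heuristics can be combined into one test, sketched below. Both threshold parameters (`motion_threshold`, `ssd_ratio`) are scene-dependent tuning knobs, and the ratio form of the SSD check is our own assumption; the text above notes that neither criterion alone is reliable.

```python
def feature_lost(prev_ssd, curr_ssd, shift, motion_threshold, ssd_ratio):
    # Declare a feature lost if its best-match motion exceeds the
    # expected maximum displacement between frames, or if its minimum
    # SSD grew by more than a factor of ssd_ratio since the last frame.
    du, dv = shift
    if du * du + dv * dv > motion_threshold ** 2:
        return True
    if prev_ssd > 0 and curr_ssd > ssd_ratio * prev_ssd:
        return True
    return False
```

For example, with motion_threshold = 3 and ssd_ratio = 2.0, a feature whose SSD jumps from 10 to 50, or one that appears to move 5 pixels, is flagged as lost.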

