The goal of this project is to enable effective remote monitoring of built-up areas with substantial pedestrian and vehicular traffic, by providing fast, reliable systems for semi-automated detection and classification of events in video data. The primary task that we plan to address is the detection and classification of vehicles, for purposes of alerting a human operator to potential threats posed by these vehicles. The proposed systems will run on standard computing platforms such as multi-processor Pentium Pro or UltraSPARC workstations. The systems will make use of new image understanding (IU) techniques for event detection and classification based on shape, motion, color, and spatial configuration (groups of objects moving together). The central focus of the proposed project is on the development of systems for end-to-end tasks, and not solely on the IU techniques themselves. For example, in the first year we propose to develop a system for detecting and counting vehicles that belong to particular classes or that follow a particular route. Such a system could be used to keep track of the number of trucks entering or leaving a given site, or to warn if a truck remains in a given area too long. The choice of end-to-end tasks such as vehicle counting is motivated in part by the fact that such tasks permit independent system testing, by comparing system output with results from human observers.
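As an illustration of the kind of end-to-end output described above, the following is a minimal sketch of per-class counting and a dwell-time warning. It assumes that an upstream detector and tracker (not shown) already produce observations of the form (track_id, vehicle_class, time_s, in_zone); that tuple layout, the function name summarize_tracks, and the max_dwell_s threshold are hypothetical and are not part of the proposed system. Dwell time is simplified to the span between the first and last in-zone observation of a track, so a vehicle that leaves and re-enters is not handled.

```python
from collections import defaultdict


def summarize_tracks(observations, max_dwell_s):
    """Count tracked vehicles per class and flag tracks that stay too long.

    observations: iterable of (track_id, vehicle_class, time_s, in_zone)
    tuples (a hypothetical tracker output format).
    Returns (counts_by_class, sorted list of track ids exceeding max_dwell_s).
    """
    first_seen = {}
    last_seen = {}
    classes = {}
    for track_id, vehicle_class, time_s, in_zone in observations:
        if not in_zone:
            continue  # ignore observations outside the monitored area
        classes[track_id] = vehicle_class
        first_seen[track_id] = min(time_s, first_seen.get(track_id, time_s))
        last_seen[track_id] = max(time_s, last_seen.get(track_id, time_s))

    # Each track contributes one vehicle to the count for its class.
    counts = defaultdict(int)
    for vehicle_class in classes.values():
        counts[vehicle_class] += 1

    # A track raises an alert when its in-zone time span exceeds the limit.
    alerts = sorted(
        t for t in classes if last_seen[t] - first_seen[t] > max_dwell_s
    )
    return dict(counts), alerts


observations = [
    (1, "truck", 0.0, True),
    (1, "truck", 90.0, True),
    (2, "car", 10.0, True),
    (2, "car", 20.0, True),
    (3, "truck", 5.0, False),
]
counts, alerts = summarize_tracks(observations, max_dwell_s=60.0)
print(counts, alerts)
```

Counts of this form are the kind of output that can be compared directly with tallies from human observers.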
There are three components to our approach: (i) developing robust algorithms using non-parametric statistical techniques, (ii) exploiting task-specific contextual constraints to achieve high accuracy at video rates, and (iii) using algorithmic tools to produce methods that run quickly on standard multi-processor workstations. Reliable systems, particularly for compressed video, require algorithms that are stable and predictable under varying image conditions. We have had success developing algorithms based on non-parametric statistical measures (e.g., Hausdorff-based image matching and rank/census transform-based motion). These measures are efficiently computable, provide stable results and do not require explicit models of outliers. This makes them well suited to unanticipated image distortions. We plan to develop new non-parametric segmentation and classification methods that combine motion with spatial and color cues.
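To make concrete how a rank-based measure can tolerate outliers without an explicit outlier model, the following is a minimal sketch of a directed partial Hausdorff distance between two 2-D point sets (for example, edge points of a model and of an image). The function name, the fraction parameter, and the brute-force nearest-neighbor search are illustrative assumptions and not the project's implementation; a practical system would typically use a distance transform rather than an exhaustive search.

```python
import math


def directed_partial_hausdorff(model, image, fraction=1.0):
    """Rank-based directed Hausdorff distance from model to image points.

    For each model point, compute the distance to its nearest image point,
    sort those distances, and return the one at the given rank fraction.
    fraction=1.0 is the classical max-based directed distance; a smaller
    fraction ignores the worst-matching model points.
    """
    if not model or not image:
        raise ValueError("both point sets must be non-empty")
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    nearest = sorted(min(math.dist(m, p) for p in image) for m in model)
    k = max(1, math.ceil(fraction * len(nearest)))
    return nearest[k - 1]


# Three model points match exactly; the fourth has no nearby image point.
model = [(0, 0), (1, 0), (2, 0), (10, 0)]
image = [(0, 0), (1, 0), (2, 0)]

print(directed_partial_hausdorff(model, image))        # 8.0
print(directed_partial_hausdorff(model, image, 0.75))  # 0.0
```

With fraction=1.0 the single unmatched point dominates the result, whereas ranking at 0.75 reports how well the best three quarters of the model match, which is the sense in which no explicit outlier model is needed.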
Contextual information provides strong sources of constraint for achieving more accurate results and reducing processing requirements. However, an overly strong reliance on context can result in systems that are too specialized. We propose to investigate the tradeoff between system generality and performance, as more contextual information is added. For instance, the locations of roadways and traffic rules constrain vehicle configurations, enabling faster and more accurate system performance as long as the constraints are correct. We plan to develop techniques for determining when such contextual constraints are violated, enabling a system to relax the constraints for more accurate but slower performance in such cases.
Project kickoff (a brief presentation from the DARPA PI meeting in Williamsburg, VA, November 6-8, 1996, is available).