Yan Wang, Wei-Lun Chao, Divyansh Garg, Bharath Hariharan, Mark Campbell, and Kilian Q. Weinberger will present their findings at the Conference on Computer Vision and Pattern Recognition (CVPR) in Long Beach, California, this June.
In their paper, “Pseudo-LiDAR from Visual Depth Estimation: Bridging the Gap in 3D Object Detection for Autonomous Driving,” Wang et al. find that stereo cameras have the potential to deliver efficient 3D object detection in self-driving cars without relying on LiDAR sensors. As the team reports, “the use of LiDARs is somewhat controversial because they are very expensive and significantly worsen a vehicle’s drag coefficient (an effect that is particularly hard on electric cars). Consider that Tesla has for years shipped cars with cameras—but without LiDARs—in the expectation that research on stereo-based 3D object detection would catch up and close the accuracy gap.” Wang et al. may have taken a significant step toward fulfilling this hope.
The authors contend that the large jump in accuracy is obtained, quite literally, by a change in perspective. Wang et al. realized that one reason neural networks have a hard time identifying the exact locations of 3D objects is that they are usually made to observe scenes from a human driver’s point of view. This first-person perspective is particularly challenging for the way neural networks identify objects in a scene, namely through repeated applications of filters. Neural networks, however, function differently from human perception: they are not restricted to any particular viewpoint and can just as easily observe a given scene from a bird’s-eye view. As Wang et al. predicted, this simple transformation increased accuracy dramatically, not just for their own algorithm but for all methods that they tested.
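To make the change of viewpoint concrete, here is a minimal sketch of how a per-pixel depth estimate could be lifted into 3D points and then viewed from above. It is an illustration under stated assumptions, not the authors' implementation: it assumes a simple pinhole camera, and the function names, the intrinsics (fx, fy, cx, cy), and the toy depth map are all hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]  # (x right, y down, z forward), in metres


def depth_to_points(depth: Sequence[Sequence[float]],
                    fx: float, fy: float, cx: float, cy: float) -> List[Point]:
    """Back-project a depth map into 3D camera-frame points.

    Assumed pinhole model: x = (u - cx) * z / fx, y = (v - cy) * z / fy.
    Pixels with non-positive depth are treated as invalid and skipped.
    """
    points: List[Point] = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z <= 0:
                continue
            points.append(((u - cx) * z / fx, (v - cy) * z / fy, z))
    return points


def birds_eye_grid(points: Sequence[Point],
                   x_range: Tuple[float, float],
                   z_range: Tuple[float, float],
                   cell: float) -> List[List[int]]:
    """Count points per ground-plane cell (rows = forward z, cols = lateral x)."""
    cols = int((x_range[1] - x_range[0]) / cell)
    rows = int((z_range[1] - z_range[0]) / cell)
    grid = [[0] * cols for _ in range(rows)]
    for x, _y, z in points:
        if x_range[0] <= x < x_range[1] and z_range[0] <= z < z_range[1]:
            # Clamp to guard against floating-point rounding at the far edge.
            c = min(cols - 1, int((x - x_range[0]) / cell))
            r = min(rows - 1, int((z - z_range[0]) / cell))
            grid[r][c] += 1
    return grid


# Toy 2x3 depth map (hypothetical values): a surface 5 m ahead across the
# top row, and a single valid pixel 10 m ahead in the bottom row.
depth = [[5.0, 5.0, 5.0],
         [0.0, 10.0, 0.0]]
points = depth_to_points(depth, fx=1.0, fy=1.0, cx=1.0, cy=0.5)
grid = birds_eye_grid(points, x_range=(-10.0, 10.0), z_range=(0.0, 20.0), cell=5.0)
for row in grid:
    print(row)
```

In the top-down grid, each object occupies a region whose size does not shrink with distance, whereas in the first-person image distant objects cover only a few pixels. That difference is one plausible reading of why the change in perspective helps filter-based networks.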
An accompanying video showcases 3D object detection based solely on stereo camera data.
The Cornell research team includes [@cornell.edu]:
Yan Wang [yw763], Computer Science
Wei-Lun Chao [wc635], Computer Science
Divyansh Garg [dg595], Computer Science
Bharath Hariharan [bh497], Computer Science
Mark Campbell [mc288], Cornell Research, College of Engineering
Kilian Q. Weinberger [kilian], Computer Science