Ashutosh Saxena

Stanford University

Robotic Grasping and Depth Perception:

Learning 3D Models from a Single Image

Perceiving the 3D shape of the environment is a fundamental ability for a robot. We present an algorithm that converts standard digital pictures into 3D models.


This is a challenging problem, since an image is formed by projecting the 3D scene onto two dimensions, which loses the depth information. We take a supervised learning approach to this problem and use a Markov Random Field (MRF) to model the scene depth as a function of the image features. We show that, even on unstructured scenes from a wide variety of environments, our algorithm is frequently able to recover accurate 3D models.


We then apply our methods to robotics applications: (a) obstacle avoidance for autonomously driving a small electric car, and (b) robot manipulation, where we develop vision-based learning algorithms for grasping novel objects. This enables our robot to perform tasks such as opening new doors, clearing cluttered tables, and unloading items from a dishwasher.


Ashutosh Saxena is a PhD candidate in the AI Lab at Stanford University. His research interests include machine learning, robotics, and perception. He received his undergraduate degree from the Indian Institute of Technology (IIT) Kanpur in 2004 and a Master's in Electrical Engineering from Stanford University in 2006. He has previously worked at Bose Corporation, CSIRO (Australia), and Microsoft.


During his PhD, Ashutosh developed Make3D, an algorithm that converts a single photograph into a 3D model. Tens of thousands of users have used this technology to convert their pictures to 3D. He has also developed algorithms that enable robots to perform household chores such as unloading items from a dishwasher. His work has received substantial attention in the popular press, including the front page of the New York Times, BBC, ABC, New Scientist, and Wired Magazine. He has won best paper awards at 3DRR and IEEE ACE. He was also a recipient of the National Talent Scholar award in India.



B17 Upson Hall

Thursday, March 12, 2009

Refreshments at 3:45pm in the Upson 4th Floor Atrium

Computer Science


Spring 2009