I develop machine learning models and algorithms that enable systems to interact with people and their environments in the unstructured real world. This usually means combining computer vision, robotics and natural language processing techniques with a good dose of machine learning. I enjoy building multi-modal models that combine the power of deep learning with principled, known-to-work methods from robotics and vision.
Some buzzwords involved in my research: Reinforcement Learning, Imitation Learning, Semi-supervised learning, Language Grounding, 3D Vision, Meaning Representations, Transfer Learning, Sim2Real, Mapping, Planning and I guess Deep Learning.
Mapping Navigation Instructions to Continuous Control Actions with Position-Visitation Prediction (CoRL 2018) Valts Blukis, Dipendra Misra, Ross A. Knepper and Yoav Artzi
TLDR; We present a two-stage approach to following natural language navigation instructions on a realistic simulated quadcopter by mapping first-person images to actions. In the first stage, planning, we learn perception, language understanding, and language grounding by predicting where the quadcopter should fly. In the second stage, execution, we learn control using imitation learning. The model outperforms prior methods and uses the differentiable mapping system from GSMN (two projects below). [PDF] [Bibtex] [YouTube Demo]
Mapping Instructions to Actions in 3D Environments with Visual Goal Prediction (EMNLP 2018) Dipendra Misra, Andrew Bennett, Valts Blukis, Max Shatkhin, Eyvind Niklasson and Yoav Artzi
TLDR; We propose a single-model approach to instruction following that decomposes the problem into predicting the goal location from visual observations and taking actions to reach it. We propose a new model for visual goal prediction and introduce two large-scale instruction-following datasets: one involving navigation in an open 3D space, the other involving navigation and simple manipulation in a 3D house. [PDF] [arXiv] [Bibtex]
Following High-Level Navigation Instructions on a Simulated Quadcopter with Imitation Learning (RSS 2018) Valts Blukis, Nataly Brukhim, Andrew Bennett, Ross A. Knepper and Yoav Artzi
TLDR; We present the Grounded Semantic Mapping Network (GSMN) that embeds a differentiable mapping system within a neural network model. It accumulates a learned, internal map of the environment and uses this map to solve a navigation instruction following task by mapping directly from first-person camera images to velocity commands. The mapper works by projecting learned image features from the first-person view to the map reference frame, allowing our model to outperform traditional end-to-end neural architectures. [PDF] [arXiv] [Poster] [Slides] [Bibtex]
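As a rough geometric illustration of projecting first-person image features onto a map (not the GSMN implementation, which is learned and differentiable), the sketch below casts a ray through each pixel of a pinhole camera and averages a scalar feature into ground-plane grid cells. The function names, the flat-ground assumption, the camera parameters and the scalar features are all hypothetical choices made for this example.

```python
import math

def pixel_to_ground(u, v, fx, fy, cx, cy, cam_height, pitch_down):
    """Intersect the viewing ray of pixel (u, v) with the ground plane z = 0.

    Assumes a pinhole camera at height cam_height, pitched down by pitch_down
    radians. Returns (x_forward, y_left) relative to the point directly below
    the camera, or None if the ray never reaches the ground.
    """
    dx = (u - cx) / fx  # normalised image coordinate, positive to the right
    dy = (v - cy) / fy  # normalised image coordinate, positive downwards
    # Ray direction in a world frame with x forward, y left, z up.
    ray_x = math.cos(pitch_down) - dy * math.sin(pitch_down)
    ray_y = -dx
    ray_z = -math.sin(pitch_down) - dy * math.cos(pitch_down)
    if ray_z >= 0.0:
        return None  # ray points at or above the horizon
    t = cam_height / -ray_z
    return (t * ray_x, t * ray_y)

def splat_features(pixel_features, cell_size, **camera):
    """Average per-pixel scalar features into ground-plane grid cells."""
    sums, counts = {}, {}
    for (u, v), feat in pixel_features.items():
        hit = pixel_to_ground(u, v, **camera)
        if hit is None:
            continue
        cell = (math.floor(hit[0] / cell_size), math.floor(hit[1] / cell_size))
        sums[cell] = sums.get(cell, 0.0) + feat
        counts[cell] = counts.get(cell, 0) + 1
    return {cell: sums[cell] / counts[cell] for cell in sums}

# Hypothetical camera: 100x100 image, 1 m above the ground, pitched down 45 degrees.
camera = dict(fx=100.0, fy=100.0, cx=50.0, cy=50.0,
              cam_height=1.0, pitch_down=math.radians(45))
grid_map = splat_features({(50, 50): 2.0, (50, 51): 4.0}, cell_size=0.4, **camera)
```

In a learned system the per-pixel values would be feature vectors from a neural network and the accumulation would be kept differentiable; this sketch only shows the change of reference frame.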
Socially Competent Navigation Planning by Deep Learning of Multi-Agent Path Topologies (IROS 2017) Christoforos I. Mavrogiannis, Valts Blukis and Ross A. Knepper
TLDR; If we can anticipate how pedestrians around us will act, we can efficiently navigate around them. We behave legibly by minimizing the uncertainty (entropy) of the distribution over future pedestrian trajectory topologies, thus avoiding confusing behavior. Topologies are defined using braid theory and predicted with a seq2seq model. [PDF] [Bibtex]
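To make the entropy criterion concrete, here is a minimal sketch that picks, among a few candidate robot actions, the one whose predicted distribution over trajectory topologies has the lowest Shannon entropy. The action names and probabilities are made up for illustration, and the paper's planner and seq2seq predictor are not reproduced here.

```python
import math

def entropy(probs):
    """Shannon entropy (in nats) of a discrete probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def most_legible_action(candidates):
    """Return the action whose predicted topology distribution is least uncertain."""
    return min(candidates, key=lambda name: entropy(candidates[name]))

# Hypothetical predicted distributions over three topology classes,
# one distribution per candidate robot action.
candidates = {
    "pass_left": [0.90, 0.05, 0.05],
    "pass_right": [0.60, 0.30, 0.10],
    "go_straight": [0.34, 0.33, 0.33],
}
best = most_legible_action(candidates)
```

A sharply peaked distribution means nearby pedestrians can easily tell how the encounter will unfold, which is the sense of "legible" used above.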
Waiterbot (1st Runner-up, Tech Factor Challenge 2014-2015)
Waiterbot is a hand-built prototype for a fully featured restaurant robot. We built it in a team of 4 enthusiasts (Azfer Karim, Jagapriyan, Mohamed Musadhiq and me) for the 2014-2015 Tech Factor Challenge competition in Singapore. It can navigate from table to table, pick up and deliver orders and even serve water. It uses a custom-built gripper to pick up cups, bowls and plates (although it struggles with crockery).
Battlefield Extraction Robot (Grand Prize, Tech Factor Challenge 2013-2014)
ABER (Autonomous Battlefield Extraction Robot) is a rescue robot that can automatically find and pick up a casualty in a war or disaster scenario where only limited remote operation is possible. It's built on ROS and uses a unique conveyor-belt pickup system that inflicts minimal damage in case of bone fractures. It received the DSTA Gold Medal for best final-year project at NTU's School of Electrical and Electronic Engineering.