Addr: Microsoft Research,
641 Avenue of the Americas,
New York, New York, 10011
dkm@cs.cornell.edu

Dipendra Misra

PhD Candidate,
Department of Computer Science,
Cornell University
I am on the job market. For details please contact me.

I am interning at Microsoft Research, New York, in Fall 2018, working with John Langford and Akshay Krishnamurthy on reinforcement learning.

Papers accepted at ICML 2018, EMNLP 2018 and CoRL 2018.

I am co-organizing the 3rd Workshop on Representation Learning for NLP (RepL4NLP) at ACL 2018. Consider submitting.

Research and Publication

I am interested in developing models and learning methods with an emphasis on applications in natural language understanding, such as robots that can follow instructions, answer questions, or hold a conversation. I am currently active in three research areas:

  • Robot Instruction Following: Robots that can follow instructions can be of great assistance. I am interested in developing models that can be trained with minimal effort and generalize to new environments and instructions. See Misra et al. EMNLP 2018b, EMNLP 2017, ACL 2015a, IJRR 2015b, RSS 2014.

  • Semantic Parsing: I am interested in developing semantic parsers using weak supervision, with applications to tasks like question answering. See Misra et al. EMNLP 2018a and Misra and Artzi EMNLP 2016.

  • Reinforcement Learning: The problems above are often best framed in a reinforcement learning framework. I am interested in developing reinforcement learning algorithms that simultaneously achieve good exploration, sample efficiency, and generalization, with theoretical guarantees. See Misra et al. EMNLP 2017 and Asadi, Misra, Littman ICML 2018.

I have served, or am currently serving, as a program committee member for natural language processing (EMNLP, ACL, EACL), machine learning (ICML, NIPS, AISTATS), and robotics conferences and journals (JAIR, ICRA, HRI, ISRR, RSS).


Preprint

Towards a Simple Approach to Multi-step Model-based Reinforcement Learning (arXiv 2018)
Kavosh Asadi, Evan Carter, Dipendra Misra, Michael Littman
Abstract: We propose a simple approach that learns a set of models to predict the k-th state given the start state and a sequence of k actions. Our approach does not suffer from the cascading errors that plague model-based reinforcement learning. We demonstrate an advantage over single-step models on Atari games.
[Preliminary version accepted at NIPS Deep RL workshop]
[Paper]
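A minimal sketch of the multi-step idea, with hypothetical names (this is not the paper's code): rather than composing a learned one-step model k times, which compounds errors, keep a separate model per horizon k that maps the start state and the full action sequence directly to the k-th next state. Here the "models" are simple lookup tables trained on toy deterministic dynamics.

```python
def true_step(s, a):
    # Toy deterministic dynamics on a bounded 1D chain (stand-in for an environment).
    return max(0, min(10, s + a))

def rollout(s, actions):
    # Ground-truth k-step outcome, obtained by composing one-step dynamics.
    for a in actions:
        s = true_step(s, a)
    return s

class MultiStepModel:
    """Tabular k-step models: models[k] maps (start state, k actions) -> k-th state."""

    def __init__(self, max_k):
        self.models = {k: {} for k in range(1, max_k + 1)}

    def train(self, trajectories):
        # Each trajectory is (states, actions) with len(states) == len(actions) + 1.
        for states, actions in trajectories:
            for i in range(len(actions)):
                for k in self.models:
                    if i + k <= len(actions):
                        key = (states[i], tuple(actions[i:i + k]))
                        self.models[k][key] = states[i + k]

    def predict(self, s0, actions):
        # One direct prediction at horizon len(actions); no composition of
        # one-step predictions, hence no cascading of one-step errors.
        return self.models[len(actions)][(s0, tuple(actions))]
```

With learned (imperfect) one-step models, composing predictions lets errors accumulate with k; the per-horizon models above predict each horizon in a single shot.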

Touchdown: Natural Language Navigation and Spatial Reasoning in Visual Street Environments (arXiv 2018)
Howard Chen, Alane Suhr, Dipendra Misra, Noah Snavely, Yoav Artzi
[Preliminary version accepted at NIPS VIGIL workshop]
[Paper][Code]

Early Fusion for Goal Directed Robotic Vision (arXiv 2018)
Aaron Walsman, Yonatan Bisk, Saadia Gabriel, Dipendra Misra, Yoav Artzi, Yejin Choi, Dieter Fox
[Paper]


Conference

Mapping Navigation Instructions to Continuous Control Actions with Position Visitation Prediction (CoRL 2018)
Valts Blukis, Dipendra Misra, Ross A. Knepper, and Yoav Artzi
Abstract: We propose a single-model approach for instruction following with continuous control and dynamics. Our model outperforms several baselines on a recently released dataset using the AirSim simulator.
[Paper] [Code] [Demo Video]

Policy Shaping and Generalized Update Equations for Semantic Parsing from Denotations (EMNLP 2018)
Dipendra Misra, Ming-Wei Chang, Xiaodong He and Wen-tau Yih
Abstract: We propose a framework for handling spurious programs and generalizing several learning algorithms for semantic parsing from denotations (SpFD). We propose policy shaping to bias the search away from spurious programs and introduce generalized updates that capture several learning algorithms and their novel combinations. Our framework improves over the state of the art by 5% on an SpFD dataset.
[Paper][Code]

Mapping Instructions to Actions in 3D Environments with Visual Goal Prediction (EMNLP 2018)
Dipendra Misra, Andrew Bennett, Valts Blukis, Eyvind Niklasson, Max Shatkhin, and Yoav Artzi
Abstract: We propose a single-model approach for instruction following that decouples the problem into predicting the goal location from visual observations and taking actions to accomplish it. We propose a new model for visual goal prediction and introduce two large-scale instruction-following datasets: navigation in an open 3D space, and navigation with simple manipulation in a 3D house.
[Paper] [Code, Data and Simulators]

Lipschitz Continuity in Model-based Reinforcement Learning (ICML 2018)
Kavosh Asadi*, Dipendra Misra*, Michael L. Littman (* equal contribution)
Abstract: We provide novel bounds on cascading errors in model-based RL for Lipschitz continuous transition models. We further show the advantage of using the Wasserstein metric for measuring distance between transition models, and provide theoretical and experimental support for controlling the Lipschitz constants of deep neural network models.
[Paper][Code]
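An illustrative sketch of the metric in question (not the paper's code): for distributions over next states on an evenly spaced 1D grid, the 1-Wasserstein distance can be computed from the cumulative distribution functions. Unlike total variation, it is sensitive to how far probability mass moves, which is what makes it natural for comparing transition models.

```python
def wasserstein_1d(p, q, spacing=1.0):
    """1-Wasserstein distance between two discrete distributions p and q
    supported on the same evenly spaced 1D grid of states.

    In 1D, W1 equals the area between the two CDFs, so a single pass
    accumulating |CDF_p - CDF_q| per grid cell suffices.
    """
    assert len(p) == len(q)
    cdf_p = cdf_q = 0.0
    dist = 0.0
    for pi, qi in zip(p, q):
        cdf_p += pi
        cdf_q += qi
        dist += abs(cdf_p - cdf_q) * spacing
    return dist
```

In a toy comparison, moving all probability mass one cell away from the true next-state distribution gives W1 = 1, while moving it two cells gives W1 = 2, even though total variation distance is identical in both cases.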

Mapping Instructions and Visual Observations to Actions with Reinforcement Learning (EMNLP 2017)
Dipendra Misra, John Langford and Yoav Artzi
Abstract: We present an approach for mapping natural language instructions and visual observations to agent actions using reinforcement learning.
[Paper] [Code] [Arxiv Preprint]

Neural Shift-Reduce CCG Semantic Parsing (EMNLP 2016)
Dipendra Misra and Yoav Artzi
Abstract: We present the first published shift-reduce parser for CCG semantic parsing. Our novel learning algorithm supports learning without annotated parse trees, with type constraints and complex categories.
[Paper] [Supplementary] [Code]

Environment-driven lexicon induction for high-level instructions (ACL 2015)
Dipendra K. Misra, Kejia Tao, Percy Liang, Ashutosh Saxena
Abstract: Learning new concepts and the meanings of new verbs at test time, using the environment as an implicit signal.
[Paper] [Supplementary] [Code] [Data] [Simulator] [Bibtex]

Robo Brain: Large-Scale Knowledge Engine for Robots (ISRR 2015)
Ashutosh Saxena, Ashesh Jain, Ozan Sener, Aditya Jami, Dipendra K. Misra, Hema S Koppula
Abstract: We introduce a knowledge engine, which learns and shares knowledge representations, for robots to carry out a variety of tasks.
[Paper] [Website] [Bibtex]

Tell Me Dave: Context-Sensitive Grounding of Natural Language to Manipulation Instructions (RSS 2014)
Dipendra K. Misra, Jaeyong Sung, Kevin K. Lee, Ashutosh Saxena
Abstract: Grounding language to actions, demonstrated on a PR2 robot. We learn from users playing with simulations.
[Paper] [Website] [Simulator] [Bibtex]

Journal

Tell Me Dave: Context-Sensitive Grounding of Natural Language to Manipulation Instructions (IJRR 2015)
Dipendra K. Misra, Jaeyong Sung, Kevin K. Lee, Ashutosh Saxena
[Paper] [Website] [Bibtex]



Workshop

The Third Workshop on Representation Learning for NLP
(RepL4NLP at ACL 2018)

Isabelle Augenstein, Kris Cao, He He, Felix Hill, Spandana Gella, Jamie Kiros, Hongyuan Mei and Dipendra Misra
[Workshop Proceedings]

Equivalence Between Wasserstein and Value-Aware Model-based Reinforcement Learning (PGMRL at ICML 2018)
Kavosh Asadi, Evan Carter, Dipendra Misra and Michael L. Littman
Abstract: We show that the recently proposed value-aware model-based reinforcement learning (Farahmand et al. 2017) can be viewed as minimizing the Wasserstein loss when the value function is Lipschitz continuous.
[ArXiv Preprint]

CHALET: Cornell House Agent Learning Environment (arXiv 2018 report)
Claudia Yan, Dipendra Misra, Andrew Bennett, Aaron Walsman, Yonatan Bisk and Yoav Artzi
Abstract: We introduce CHALET, a 3D house simulator with support for navigation and manipulation tasks, for training and evaluating autonomous agents.
[Paper] [Website] [Bibtex]

Reinforcement Learning for Mapping Instructions to Actions with Reward Learning (NCHRC, AAAI Fall Symposium 2017)
Dipendra Misra and Yoav Artzi
Abstract: We learn a reward function directly from images for mapping natural language instructions to actions.
[Paper] [Code]

Bio and Employment

Posts

  • Mathematical Analysis of Policy Gradient Methods [Post]

  • Tutorial on Markov Decision Process Theory and Reinforcement Learning. Slides: [Part 1] [Part 2] [Post]

Misc

I received my bachelor's degree in computer science from the Indian Institute of Technology, Kanpur, where my undergraduate thesis on learning to solve IQ questions was advised by Amitabha Mukerjee and Sumit Gulwani. My studies have been supported by the OPJEMS Merit Scholarship (2011-12 and 2012-13), a Cornell University Fellowship (2013), and an Amazon AWS Research Grant (2016). I try to spend some time playing the piano, writing compositions, and reading news.