Grounded Language Understanding with Realistic Agents
Abstract: Natural language understanding in grounded interactive scenarios is tightly coupled with the actions the system generates and its observations of the environment. The system actions, or its interface, define the output space, while sensory observations ground instruction meaning. How we define the output space and the type of environment the agent observes determine the complexity of the problem and the type of reasoning required. While mapping instructions to actions has been studied extensively, the majority of work has focused on simple discrete actions and was developed in lab environments. Outside of the lab and with real robotic agents, new questions of scalability arise: how can we use demonstrations to learn to bridge the gap between the high-level concepts of language and low-level robot controls? How do we design models that continuously observe and control? And how can we reason about complex real-life observations? In this talk I will present our recent work studying grounded language understanding in realistic scenarios. First, I will describe our approach to learning to map instructions and observations to continuous control in a realistic quadcopter drone. Second, I will briefly present our recent study of instructional and spatial language with real-life observations using Google StreetView. Both parts of the talk use new publicly available evaluation benchmarks.
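To make the instruction-to-control setup concrete, the sketch below shows one way an instruction-conditioned policy can map a natural-language instruction and a visual observation to continuous control commands. This is a minimal illustration only, assuming a simple LSTM text encoder, a small CNN image encoder, and a velocity-style action vector; the class name `InstructionPolicy`, all dimensions, and the output semantics are hypothetical and are not the speaker's actual model.

```python
# Minimal illustrative sketch (PyTorch) of an instruction-conditioned policy.
# All module names, dimensions, and the velocity-command output are assumptions
# made for illustration; they do not reproduce the approach presented in the talk.
import torch
import torch.nn as nn

class InstructionPolicy(nn.Module):
    def __init__(self, vocab_size=1000, embed_dim=64, hidden_dim=128, n_actions=4):
        super().__init__()
        # Encode the instruction token sequence with an embedding + LSTM.
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        # Encode the current RGB observation with a small CNN.
        self.cnn = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=5, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        # Fuse language and vision, then predict a continuous control vector
        # (e.g. forward velocity and yaw rate for a quadcopter).
        self.head = nn.Sequential(
            nn.Linear(hidden_dim + 32, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, n_actions),
        )

    def forward(self, instruction_tokens, observation):
        _, (h, _) = self.lstm(self.embed(instruction_tokens))
        text_feat = h[-1]                   # (batch, hidden_dim)
        image_feat = self.cnn(observation)  # (batch, 32)
        return self.head(torch.cat([text_feat, image_feat], dim=-1))

# At each timestep the agent re-reads its latest observation and emits a new
# continuous command, keeping observation and control tightly coupled.
policy = InstructionPolicy()
tokens = torch.randint(0, 1000, (1, 12))   # toy tokenized instruction
frame = torch.rand(1, 3, 64, 64)           # toy camera frame
controls = policy(tokens, frame)            # (1, 4) continuous action vector
```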