Artificial Intelligence Seminar

Spring 2014
Friday 12:00-1:15
Gates Hall 122

The AI seminar will meet weekly for lectures by graduate students, faculty, and researchers emphasizing work-in-progress and recent results in AI research. Lunch will be served starting at noon, with the talks running between 12:15 and 1:15. The new format is designed to allow AI chit-chat before the talks begin. Also, we're trying to make some of the presentations less formal so that students and faculty will feel comfortable using the seminar to give presentations about work in progress or practice talks for conferences.


February 7th

Speaker: Hema Koppula (A-exam)

Time: 10:00am

Location: Gates 114

Title: Understanding People from RGBD Data for Assistive Devices

Abstract: With the advances in 3D sensing technology and its wide availability, we are able to collect enormous amounts of data about people in various modalities such as 2D images, 3D depth, and sound. New algorithms that can extract meaningful information from this data would enable new applications that significantly improve aspects of daily living. Examples include robotic assistants, health-care monitoring systems, self-driving cars, gaming, and mobile devices.

In this talk, I will present my work on understanding human environments and activities from RGBD data. I will describe our learning algorithms, which capture the rich spatio-temporal context from the environment, and discuss how our models capture human intentions and the functional representation of objects. This enables assistive robots to anticipate what humans are going to do in the near future and respond with appropriate actions. Finally, I will elaborate on directions for ongoing and future work, including how the robot can learn to perform activities and do anticipatory planning for collaborating with humans in complex day-to-day tasks.

Thesis committee: Ashutosh Saxena (chair), Robert Kleinberg, Bart
Selman, Thorsten Joachims, Jitendra Malik.

“The AI-Seminar is sponsored by Yahoo!”

February 14th

Speakers: Jacob Bien, Cornell University

Host: Thorsten Joachims

Title: Bet on hierarchy for a moment: Building structural sparsity into covariance estimation using convex regularization.

Abstract: In many modern datasets, the number of features rivals or even exceeds the number of observations measured.  This so-called "high-dimensional setting" poses serious challenges to our ability to learn from data and demands a rethinking of many classical statistical approaches that were designed in a world in which sample sizes were large and measured variables were few.  The lasso (Tibshirani, 1996) has become a standard tool in the statistical machine learner's toolkit, curbing many of the difficulties of high-dimensional data by encouraging sparsity in the solution through the use of a convex penalty.

In certain situations, we may have special knowledge of the structure of the solution, and this information can be leveraged to build problem-specific convex regularizers that make better use of the small sample sizes available.  After a general introduction to the lasso and hierarchical group lasso, we will consider the problem of high-dimensional covariance estimation, and demonstrate how the hierarchical group lasso can be used to produce an estimator that has attractive properties, both in theory and practice.
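To make the lasso idea above concrete, here is a minimal numpy sketch of L1-regularized regression solved by proximal gradient descent (ISTA). This is a generic illustration of how a convex L1 penalty induces sparsity in the high-dimensional (p > n) setting, not the speaker's hierarchical covariance estimator; the data and penalty level are invented for the example.

```python
import numpy as np

def soft_threshold(x, t):
    # Proximal operator of the L1 penalty: shrink each coordinate toward zero.
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def lasso_ista(X, y, lam, n_iter=2000):
    # Minimize 0.5*||y - Xb||^2 + lam*||b||_1 by proximal gradient (ISTA).
    L = np.linalg.norm(X, 2) ** 2           # Lipschitz constant of the gradient
    b = np.zeros(X.shape[1])
    for _ in range(n_iter):
        grad = X.T @ (X @ b - y)
        b = soft_threshold(b - grad / L, lam / L)
    return b

# High-dimensional toy data: more features (p=50) than observations (n=20).
rng = np.random.default_rng(0)
X = rng.standard_normal((20, 50))
true_b = np.zeros(50)
true_b[:3] = [2.0, -1.5, 1.0]               # only three truly active features
y = X @ true_b + 0.1 * rng.standard_normal(20)

b_hat = lasso_ista(X, y, lam=1.0)
print("nonzero coefficients:", int(np.count_nonzero(b_hat)))
```

Despite having fewer observations than features, the penalty drives most coefficients exactly to zero, which is the behavior the abstract refers to.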

“The AI-Seminar is sponsored by Yahoo!”

February 21st

Speakers: Anca Dragan, PhD student, CMU

Host: Ashutosh Saxena

Bio: Anca Dragan is a PhD candidate at Carnegie Mellon's Robotics Institute, and a member of the Personal Robotics Lab. She was born in Romania and received her B.Sc. in Computer Science from Jacobs University Bremen in 2009. Her research lies at the intersection of robotics, machine learning, and human-computer interaction: she is interested in enabling robots to seamlessly work with and around people. Anca is an Intel PhD Fellow for 2013, a Google Anita Borg Scholar for 2012, and serves as General Chair in the Quality of Life Technology Center's student council. 

Title: Robot Motion for Seamless Human-Robot Collaboration

Abstract: In this talk, I will be summarizing our work on enabling robots to produce motion that is suitable for human-robot collaboration and co-existence. Most motion in robotics is purely functional: industrial robots move to package parts, vacuuming robots move to suck dust, and personal robots move to clean up a dirty table. This type of motion is ideal when the robot is performing a task in isolation. Collaboration, however, does not happen in isolation. In collaboration, the robot's motion has an observer, watching and interpreting the motion. In this work, we move beyond functional motion, and introduce the notion of an observer into motion planning, so that robots can generate motion that is mindful of how it will be interpreted by a human collaborator. We formalize predictability and legibility as properties of motion that naturally arise from the inferences that the observer makes, drawing on action interpretation theory in psychology. We propose models for these inferences based on the principle of rational action, and use a combination of constrained trajectory optimization and machine learning techniques to enable robots to plan motion for collaboration.

“The AI-Seminar is sponsored by Yahoo!”

March 7th

Speakers: Rajesh Rao

Host: David Field

Bio: Rajesh Rao is known for his work on human-computer interfaces, social robotics (robots that learn through mimicry), and for decoding the Indus script (see his TED Talk).

Title: The Bayesian Brain: Towards a Unifying View of Perception, Action, and Rewards

Abstract: A major impediment to understanding brain function at the systems level is the lack of overarching computational theories describing how the brain combines sensory information with prior knowledge and rewards to enact behaviors. In this talk, I will discuss a Bayesian model of perception and action that could serve as a candidate for such a theory. Bayesian inference is used to compute a posterior probability distribution over task variables, and the entire distribution is used to select actions that optimize costs and rewards. The model provides normative explanations for neural and behavioral data from a set of well-known sensory decision making tasks, and suggests specific computational roles for the neocortex and the basal ganglia. 
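As a toy illustration of the perception-action loop described above (not the speaker's model; the states, observation model, and rewards here are invented for the example), one can compute a posterior over hidden world states with Bayes' rule and then select the action that maximizes expected reward under the entire distribution:

```python
import numpy as np

# Two hidden world states and two actions; the numbers are invented.
prior = np.array([0.5, 0.5])          # P(state)
likelihood = np.array([[0.8, 0.2],    # P(obs | state), rows indexed by obs
                       [0.2, 0.8]])
reward = np.array([[1.0, -1.0],       # reward[action, state]
                   [-1.0, 1.0]])

def posterior(obs):
    # Bayes' rule: P(state | obs) is proportional to P(obs | state) * P(state).
    unnorm = likelihood[obs] * prior
    return unnorm / unnorm.sum()

def best_action(obs):
    # Use the entire posterior distribution (not a point estimate) to pick
    # the action with the highest expected reward.
    return int(np.argmax(reward @ posterior(obs)))

print(best_action(0), best_action(1))
```

The key point the abstract makes is the second step: actions are selected from the full posterior, not from a single most-likely state.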

Time/Location: 3:30 p.m., Uris Hall 202 (refreshments at 3:15)

“The AI-Seminar is sponsored by Yahoo!”

March 14th

Speakers: Tom Howard, MIT

Host: Hadas Kress-Gazit

Bio: Thomas Howard is a Research Scientist in the Computer Science and Artificial Intelligence Laboratory at the Massachusetts Institute of Technology.  Dr. Howard’s research centers on robot intelligence in complex, unstructured environments with a specific focus on motion planning and natural language understanding.  Previously, he was a Research Technologist II at the Jet Propulsion Laboratory and a Lecturer in Mechanical Engineering at the California Institute of Technology.  He earned his Ph.D. in Robotics from Carnegie Mellon University and his B.S. degrees in Mechanical Engineering and Electrical and Computer Engineering from the University of Rochester.

Title: A Probabilistic Model for Inferring Planning Constraints from Natural Language Instructions

Abstract: Contemporary examples of autonomous robots exhibit enough intelligence to drive cars in human environments, manipulate objects on assembly lines, and explore distant planets. Graphical or physical interfaces to such platforms are, however, often engineered for specific behaviors in particular domains and are unable to scale to more general tasks and environments. Recent advancements in speech-based interfaces show that non-expert operators can exploit the diversity of language to effectively communicate intent to cyber-physical systems in human-robot teams.

In this talk I will present an efficient technique for natural language understanding of robot instructions that uses probabilistic models to formulate problems that can be solved by contemporary planning algorithms to find admissible trajectories. Inferring the constraints that define the planning problem, rather than its solutions, eliminates the need to approximate the continuum of actions at inference time and mitigates the influence of environmental conditions that may be present in the training examples. Central to this approach is the Distributed Correspondence Graph (DCG) model, which efficiently evaluates activation, inversion, or exclusion of constraints in a constraint set by assuming conditional independence across constituents of both the parse tree and the groundings. The talk will conclude with a discussion of applications, limitations, and extensions of the DCG model in the context of natural language understanding of robot instructions.

“The AI-Seminar is sponsored by Yahoo!”

March 21st

Speakers: Karthik Raman, Cornell University

Title: "By the User, For the User, With the Learning System": Learning From User Interactions

Abstract: Online information systems like search engines and recommender systems have used machine learning algorithms to learn and adapt quickly so as to increase their efficacy. However, to improve the fidelity, robustness, and cost-effectiveness of these complex systems, we need to look beyond conventional learning techniques, which rely on expert-labeled data. In this talk, I will present principled learning algorithms that learn continuously from and with users in an interactive manner. I will demonstrate that these algorithms perform well in end-to-end evaluation studies with live users, while also admitting theoretical guarantees. I will first describe how these algorithms can be used to overcome noise and biases present in user feedback. Second, I will show how we can learn the dependencies across different items (e.g., documents of a ranking) by explicitly modeling the joint utility of a set of items as a submodular set function. Third, I will describe how we can learn to reconcile the conflicting preferences of a diverse user population to obtain socially optimal solutions.
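The submodular-utility idea can be illustrated with a small sketch. Here the joint utility of a result set is topic coverage (a standard monotone submodular function), and a greedy ranker picks documents by marginal gain; the documents and topics are invented, and this is a generic illustration rather than the speaker's model.

```python
# Each document covers a set of topics; the joint utility of a result set is
# the number of distinct topics covered -- a submodular function, since a
# document's marginal gain shrinks once its topics are already covered.
docs = {
    "d1": {"sports", "news"},
    "d2": {"sports"},
    "d3": {"music", "news"},
    "d4": {"music", "movies"},
}

def coverage(selected):
    covered = set()
    for d in selected:
        covered |= docs[d]
    return len(covered)

def greedy_rank(k):
    # Greedy maximization: pick the document with the largest marginal gain.
    # For monotone submodular utilities this gives a (1 - 1/e) guarantee.
    ranking = []
    for _ in range(k):
        best = max((d for d in docs if d not in ranking),
                   key=lambda d: coverage(ranking + [d]) - coverage(ranking))
        ranking.append(best)
    return ranking

print(greedy_rank(3))
```

Note how the greedy ranking skips the redundant "d2" even though it would be a fine document in isolation: its topic is already covered.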

“The AI-Seminar is sponsored by Yahoo!”

March 28th






“The AI-Seminar is sponsored by Yahoo!”

April 4th



April 11th



April 18th

Speakers: Matt Walter, MIT

Host: Ashutosh Saxena

Bio: Matthew Walter is a research scientist in the Computer Science and Artificial Intelligence Laboratory at the Massachusetts Institute of Technology. His research focuses on probabilistic approaches to perception and natural language understanding that make it possible for robots to work effectively alongside humans. Matthew received his
Ph.D. in Mechanical Engineering from the Massachusetts Institute of Technology and the Woods Hole Oceanographic Institution.

Title: Learning Semantic Maps from Natural Language Descriptions

Abstract: Whether they are providing personalized care, assisting people with cognitive or physical impairments, or carrying out household chores, robots have the potential to improve our quality of life in revolutionary ways. In order to realize this potential, we must develop robots that people can efficiently command and naturally
interact with. This interaction demands that robots be able to reason over models of their environments as rich as those of their human
partners. However, today's robots understand their environment through representations that are either limited to low-level metric properties
or that require domain experts to hard-code higher-level semantic knowledge.

In this talk, I will describe an algorithm that I developed to enable robots to efficiently learn shared cognitive models of their surroundings from a user's natural language descriptions. The novelty lies in inferring spatial and semantic knowledge from these descriptions and fusing this information with the metric measurements
from the robot's sensor streams. The method maintains a joint distribution over a hybrid metric, topological, and semantic
representation of the environment, which provides a common framework in which to integrate these disparate sources of information. I will
demonstrate that the algorithm allows people to share meaningful, human-centric properties of their environment simply by speaking to
the robot. I will conclude by describing ongoing efforts in human-robot dialog and planning that build upon this semantic mapping algorithm to enable a voice-commandable wheelchair and other robots to follow free-form spoken instructions.

“The AI-Seminar is sponsored by Yahoo!”

April 23rd

(Wednesday @ 1:30pm)

Speakers: Rich Caruana, Microsoft

Host: Thorsten Joachims

Title: Do Deep Nets Really Need to Be Deep?

Abstract: Currently, deep neural networks are the state of the art on problems such as speech recognition and computer vision. In this work we show that shallow feed-forward networks can learn the complex functions previously learned by deep nets and achieve accuracies previously only achievable with deep models. Moreover, the shallow neural nets can learn these deep functions using a total number of parameters similar to the original deep model. We evaluate our method on the TIMIT phoneme recognition task and are able to train shallow fully-connected nets that perform similarly to complex, well-engineered, deep convolutional architectures. Our success in training shallow neural nets to mimic deeper models suggests that there probably exist better algorithms for training shallow feed-forward nets than those currently available.
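The mimic-training idea can be sketched in a few lines: a shallow "student" is trained by regression on a teacher's real-valued logits rather than on hard labels. In this illustrative sketch the "teacher" is a stand-in nonlinear function, not an actual deep net, and the layer size, learning rate, and iteration count are arbitrary choices for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in "teacher": a fixed nonlinear function producing real-valued logits
# (a real application would use the outputs of a trained deep net instead).
def teacher_logits(X):
    return np.sin(3 * X[:, :1]) + 0.5 * X[:, 1:2] ** 2

X = rng.uniform(-1, 1, size=(256, 2))
T = teacher_logits(X)

# One-hidden-layer "student" trained by plain gradient descent to regress on
# the teacher's logits (matching logits, not hard labels, is the key idea).
H = 32
W1 = rng.standard_normal((2, H)) * 0.5
b1 = np.zeros(H)
W2 = rng.standard_normal((H, 1)) * 0.5
b2 = np.zeros(1)

lr = 0.05
for _ in range(2000):
    Z = np.tanh(X @ W1 + b1)              # hidden layer
    P = Z @ W2 + b2                       # student logits
    err = P - T                           # gradient of 0.5*||P - T||^2 wrt P
    gW2 = Z.T @ err / len(X)
    gb2 = err.mean(axis=0)
    dZ = (err @ W2.T) * (1 - Z ** 2)      # backprop through tanh
    gW1 = X.T @ dZ / len(X)
    gb1 = dZ.mean(axis=0)
    W1 -= lr * gW1; b1 -= lr * gb1
    W2 -= lr * gW2; b2 -= lr * gb2

mse = float(((np.tanh(X @ W1 + b1) @ W2 + b2 - T) ** 2).mean())
print("mimic MSE:", round(mse, 4))
```

Regressing on logits gives the student a much richer training signal than one-hot labels, which is one intuition for why shallow mimics can match deeper teachers.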

April 25th

Speakers: Silvio Savarese, Stanford

Host: Ashutosh Saxena

Title: Perceiving the 3D world from Images and Videos

Abstract: When we look at an environment such as a coffee shop, we don't just recognize the objects in isolation, but rather perceive a rich scenery of the 3D space, its objects, and all the relations among them. This allows us to effortlessly navigate through the environment, or to interact with and manipulate objects in the scene with amazing precision. A major line of work from my group in recent years has been to design intelligent visual models that understand the 3D world by integrating 2D and 3D cues, inspired by what humans do. In this talk I will introduce a novel paradigm whereby objects and 3D space are modeled in a joint fashion to achieve a coherent and rich interpretation of the environment. I will start by giving an overview of our research for detecting objects and determining their geometric properties such as 3D location, pose, or shape. Then, I will demonstrate that these detection methods play a critical role in modeling the interplay between objects and space, which, in turn, enables simultaneous semantic reasoning and 3D scene reconstruction. I will conclude this talk by demonstrating that our novel paradigm for scene understanding is potentially transformative in application areas such as autonomous or assisted navigation, robotics, augmented reality, automatic 3D modeling of urban environments, and surveillance.

“The AI-Seminar is sponsored by Yahoo!”

May 2nd

Speakers: Dipendra Misra & Karthik Raman, Cornell University

Dipendra's Title: Tell Me Dave: Context-Sensitive Grounding of Natural Language to Mobile Manipulation Instructions

Abstract: In this talk, I will present our work on grounding natural language instructions to a sequence of mobile manipulation tasks for a given environment. Given a new environment, even a simple task such as boiling water would be performed quite differently depending on the presence, location, and state of the objects. Moreover, the language may not always fully specify all the steps required for the completion of the task.

Our work proposes a model that takes into account the variations in natural language, and ambiguities in grounding them to robotic instructions with appropriate environment context and task constraints. Our model also handles incomplete or noisy NL instructions and can even ground verbs unseen in the training corpus. Our model significantly outperforms the state-of-the-art.

Keywords: Grounding language to a sequence of instructions, CRF, handling noise in language.


Karthik's Title: Methods for Ordinal Peer Grading

Abstract: MOOCs have the potential to revolutionize higher education with their wide outreach and accessibility, but they require instructors to come up with scalable alternatives to traditional student evaluation. Peer grading -- having students assess each other -- is a promising approach to tackling the problem of evaluation at scale, since the number of "graders" naturally scales with the number of students. However, students are not trained in grading, which means that one cannot expect the same level of grading skills as in traditional settings. Drawing on broad evidence that ordinal feedback is easier to provide and more reliable than cardinal feedback, it is therefore desirable to allow peer graders to make ordinal statements (e.g., "project X is better than project Y") and not require them to make cardinal statements (e.g., "project X is a B-"). In this talk, I will discuss how we can automatically infer student grades from ordinal peer feedback, as opposed to existing methods that require cardinal peer feedback. By formulating the ordinal peer grading problem as a type of rank aggregation problem, we can utilize different probabilistic models under which to estimate student grades and grader reliability. I will discuss the applicability of these methods using peer grading data collected from a real class -- with instructor and TA grades as a baseline -- and demonstrate the efficacy of ordinal feedback techniques in comparison to existing cardinal peer grading methods and traditional evaluation techniques.
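One standard way to aggregate such ordinal statements into scores is the Bradley-Terry model, shown here as a generic sketch (not necessarily one of the models used in the talk), fit with the classic Zermelo/MM fixed-point iteration on made-up comparison data:

```python
import numpy as np

# Ordinal judgments as (winner, loser) pairs, e.g. ("A", "B") means a grader
# said "project A is better than project B". The data here is made up.
comparisons = [("A", "B"), ("A", "B"), ("B", "A"), ("A", "C"),
               ("B", "C"), ("B", "C"), ("C", "B")]

items = sorted({i for pair in comparisons for i in pair})
idx = {it: k for k, it in enumerate(items)}
n = len(items)
wins = np.zeros((n, n))                    # wins[i, j] = times i beat j
for w, l in comparisons:
    wins[idx[w], idx[l]] += 1

# Bradley-Terry model, P(i beats j) = s_i / (s_i + s_j), fit by the classic
# Zermelo/MM fixed-point iteration for the maximum-likelihood scores.
s = np.ones(n)
for _ in range(200):
    new_s = np.zeros(n)
    for i in range(n):
        denom = sum((wins[i, j] + wins[j, i]) / (s[i] + s[j])
                    for j in range(n) if j != i and wins[i, j] + wins[j, i] > 0)
        new_s[i] = wins[i].sum() / denom
    s = new_s / new_s.sum()                # fix the arbitrary scale

ranking = [items[k] for k in np.argsort(-s)]
print("inferred order:", ranking)
```

The fitted scores give a total order over projects (and, with extensions, grader reliabilities) from purely ordinal feedback.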

“The AI-Seminar is sponsored by Yahoo!”

May 23rd

Speakers: Professor Dr. Katharina Morik, TU Dortmund University

Host: Thorsten Joachims

Title: Data Analytics for Sustainability

Abstract: Sustainability has many facets, and researchers from many disciplines are working on them. Knowledge discovery in particular has always considered sustainability an important topic (e.g., the special issue on data mining for sustainability in the Data Mining and Knowledge Discovery journal, March 2012).

  • Environmental tasks include risk analysis concerning floods, earthquakes, fires, and other disasters as well as the ability to react to them in order to guarantee resilience. The climate is certainly of influence and the debate on climate change received quite some attention.
  • Energy efficiency demands energy-aware algorithms, operating systems, green computing. System operations are to be adapted to a predicted user behavior such that the required processing is optimized with respect to minimal energy consumption.
  • Engineering tasks in manufacturing, assembly, material processing, and waste removal or recycling offer opportunities to save resources to a large degree. Adding the prediction precision of learning algorithms to the general knowledge of the engineers allows for surprisingly large savings.

Global reports on the millennium goals and open government data regarding sustainability are publicly available. For the investigation of influence factors, however, data analytics is necessary. Big data challenges analysis methods to create compact data summaries. Moreover, predicting future states is necessary in order to plan accordingly.

In this talk, two case studies will be presented. Disaster management in case of a flood combines diverse sensor data streams for better traffic administration; a novel spatio-temporal random field approach is used for smart routing based on traffic predictions.
The other case study, from engineering, saves energy in steel production through multivariate prediction of the processing end-point with a regression support vector machine.

“The AI-Seminar is sponsored by Yahoo!”

July 21st

Speaker: Katja Hofmann, Microsoft

Host: Thorsten Joachims

Title: Fast and Reliable Online Learning to Rank for Information Retrieval

Abstract: Online learning to rank for information retrieval (IR) holds promise for the development of “self-learning search engines” that can automatically adjust to their users. With the large amounts of data (e.g., clicks) that can be collected in web search settings, such techniques could enable highly scalable ranking optimization. However, feedback obtained from user interactions is noisy, and developing approaches that can learn from this feedback quickly and reliably is a major challenge.

In this talk I will present my recent work, which addresses the challenges posed by learning from natural user interactions. First, I will detail a new method, called Probabilistic Interleave, for inferring user preferences from users’ clicks on search results. I will show that this method allows unbiased and fine-grained ranker comparison using noisy click data, and that it is the first such method that allows the effective reuse of historical data (i.e., data collected for previous comparisons) to infer information about new rankers. Second, I will show that Probabilistic Interleave enables new online learning to rank approaches that can reuse historical interaction data to speed up learning by several orders of magnitude, especially under high levels of noise in user feedback. I will conclude with an outlook on research directions in online learning to rank for IR that are opened up by our results.
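For intuition about interleaved comparisons, here is a sketch of team-draft interleaving, a simpler classic baseline than the Probabilistic Interleave method presented in the talk; the rankings and the click are made up for the example.

```python
import random

def team_draft_interleave(ranking_a, ranking_b, rng):
    # Team-draft interleaving: in each round a coin flip decides which ranker
    # drafts first, and each ranker adds its best not-yet-shown document.
    all_docs = set(ranking_a) | set(ranking_b)
    interleaved, team = [], {}
    while len(interleaved) < len(all_docs):
        order = ["a", "b"] if rng.random() < 0.5 else ["b", "a"]
        for side in order:
            ranking = ranking_a if side == "a" else ranking_b
            doc = next((d for d in ranking if d not in team), None)
            if doc is not None:
                interleaved.append(doc)
                team[doc] = side
    return interleaved, team

def credit(team, clicked_docs):
    # Each click credits the ranker whose draft contributed that document.
    a = sum(1 for d in clicked_docs if team.get(d) == "a")
    b = sum(1 for d in clicked_docs if team.get(d) == "b")
    return a, b

rng = random.Random(0)
shown, team = team_draft_interleave(["d1", "d2", "d3"], ["d3", "d1", "d4"], rng)
print(shown, credit(team, ["d3"]))
```

Aggregating such per-query credits over many impressions yields a preference between the two rankers; the probabilistic variant discussed in the talk refines this to allow unbiased comparison and reuse of historical click data.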

July 25th

Speaker: Aditya Jami

Host: Ashutosh Saxena

Bio: Aditya Jami focuses on problems associated with building platforms for applying machine learning algorithms to big datasets. He received his Master's from Stanford University under the supervision of Prof. Hector Garcia-Molina. He is the recipient of Yahoo's 'You Rock' award in 2010. He is currently serving as CTO of Zoodig, which specializes in mining large public social datasets to produce insightful behavioral data on hundreds of millions of users. Prior to this, he served as a cloud architect at Yahoo and Netflix.
His previous notable work includes building a real-time data platform at Yahoo that collects and processes approximately a trillion web events (10 TB) daily. This technology powers Yahoo's instant content recommendations based on users' real-time behavior. At Netflix, he developed an innovative resiliency method (Chaos Monkey) that ensures fault tolerance in large-scale service-oriented architectures; it received many accolades in the cloud computing community. Press mentions include TechCrunch, Slashdot, and NextWeb.

Title: The Robo Brain Project:  Scaling the Learning Models

Abstract: To build future robotic applications such as self-driving cars, industrial robots, and household devices and robots, we need to learn from multi-modal data such as images, videos, 3D point clouds, video game logs, and natural language. The Cornell Robot Learning Lab is building models for performing a variety of tasks including 3D scene labeling, human activity detection and anticipation, grasping, path planning, robot language, etc. However, much of this work is not accessible across domains because the data and learning models are not integrated into one single knowledge graph.
The RoboBrain project is solving this problem. In this talk I will address the challenges related to its data modeling, effective linking of concepts at scale, and building a real-time querying layer on top of it.






See also the AI graduate study brochure.

Please contact any of the faculty below if you'd like to give a talk this semester. We especially encourage graduate students to sign up!

Sponsored by

CS7790, Spring '14
Claire Cardie
Carla Gomes
Joe Halpern
Dan Huttenlocher
Thorsten Joachims
Lillian Lee
Ashutosh Saxena
Bart Selman
Ramin Zabih

Back to CS course websites