Artificial Intelligence Seminar

Fall 2011
Friday 12:00-1:15
Upson 5130

The AI seminar will meet weekly for lectures by graduate students, faculty, and researchers emphasizing work-in-progress and recent results in AI research. Lunch will be served starting at noon, with the talks running between 12:15 and 1:15. The new format is designed to allow AI chit-chat before the talks begin. Also, we're trying to make some of the presentations less formal so that students and faculty will feel comfortable using the seminar to give presentations about work in progress or practice talks for conferences.
September 2nd

Speaker: Jeff Clune, Cornell University

Host: Ashutosh Saxena

Bio: Jeff Clune is a Postdoctoral Fellow in Hod Lipson's lab at Cornell University, funded by a Postdoctoral Research Fellowship in Biology from the National Science Foundation. He studies generative encodings, which enhance evolutionary algorithms by augmenting them with concepts from developmental biology. These concepts enable the assembly of complex neural networks and other forms from compact genomes. One of Jeff's current projects is a website for designing and printing 3D objects with such encodings. Jeff also develops evolutionary algorithms to investigate open questions in evolutionary biology, and has published work on the evolution of altruism, phenotypic plasticity, and evolvability. Jeff is the co-chair of the Generative and Developmental Systems track at GECCO, the Genetic and Evolutionary Computation Conference (2010-2011). He has a Ph.D. in computer science from Michigan State University, a master's degree in philosophy from Michigan State University, and a bachelor's degree in philosophy from the University of Michigan. Articles about his research have appeared in news publications such as The New Scientist, The Daily Telegraph, Slashdot, MIT's Technology Review, and U.S. News & World Report.

Title : Evolving complex, regular neural networks with generative encodings inspired by biological development

Abstract : I will describe my work with an algorithm that evolves artificial neural networks (ANNs) using concepts from developmental biology, a key innovation in the quest to evolve artificially intelligent robots that rival their natural counterparts. I will show that this algorithm, called HyperNEAT, can produce ANNs that exhibit desirable properties of biological brains, such as symmetries and repeated neural motifs. Moreover, these structurally organized ANNs can exploit the regularity of problems, and increasingly outcompete direct encoding controls as problem-regularity increases. I will also introduce an improvement over HyperNEAT that enables the production of slightly irregular patterns when necessary. Finally, I will discuss the importance of making robot learning algorithms geometrically aware, which improves performance and allows experts to inject domain knowledge. 

“The AI-Seminar is sponsored by Yahoo!”

September 9th

Speaker: Erez Lieberman Aiden, Google

Host: Paul Ginsparg

Bio: Erez Lieberman Aiden is a fellow at the Harvard Society of Fellows and Visiting Faculty at Google. His work integrates mathematical and physical theory with the invention of new technologies.

He recently invented a method for three-dimensional genome sequencing; he subsequently led the team that, in 2009, reported the first three-dimensional map of the human genome. Together with collaborator Jean-Baptiste Michel, he developed culturomics, a quantitative approach to the study of history and culture that relies on computational analysis of a significant fraction of the historical record. This work led to the creation of the Google Ngram Viewer, a supplemental website that was visited over a million times in the 24 hours after its launch.
Erez's research has won numerous awards, including the 2010 Hertz Thesis Prize; recognition for one of the top 20 "Biotech Breakthroughs that will Change Medicine" by Popular Mechanics; the Lemelson-MIT prize for the best student inventor at MIT; the American Physical Society's Award for the Best Doctoral Dissertation in Biological Physics; and membership in Technology Review's 2009 TR35, recognizing the top 35 innovators under 35. His last three papers have all appeared on the cover of Nature and Science. His work has also been featured on the front page of the New York Times, the Boston Globe, and the Wall Street Journal.

Title : Quantitative Analysis of Culture Using Millions of Digitized Books

Abstract : We constructed a corpus of digitized texts containing about 4% of all books ever printed. Analysis of this corpus enables us to investigate cultural trends quantitatively. I survey the vast terrain of "culturomics", focusing on linguistic and cultural phenomena that were reflected in the English language between 1800 and 2000. I show how this approach can provide insights about fields as diverse as lexicography, the evolution of grammar, collective memory, the adoption of technology, the pursuit of fame, censorship, and historical epidemiology. "Culturomics" extends the boundaries of rigorous quantitative inquiry to a wide array of new phenomena spanning the social sciences and the humanities.
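The core quantity behind an n-gram trend line is simple: the count of a word in a given year divided by the total number of words printed that year. A minimal sketch of that computation (illustrative only, not the Ngram Viewer's actual pipeline; the `ngram_frequencies` helper and toy corpus are invented for this example):

```python
from collections import Counter

def ngram_frequencies(corpus_by_year, word):
    """Relative frequency of `word` per year: count(word, year) / total(year).

    corpus_by_year maps a year to the (toy) text printed that year.
    """
    freqs = {}
    for year, text in corpus_by_year.items():
        counts = Counter(text.lower().split())
        total = sum(counts.values())
        freqs[year] = counts[word] / total if total else 0.0
    return freqs
```

Plotting these per-year ratios for a word yields the kind of trend line culturomics analyzes.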

“The AI-Seminar is sponsored by Yahoo!”

September 16th

Speaker: Ping Li, Cornell University

Host: Thorsten Joachims

Title: Fitting Logistic Regression and SVM on 200 GB data with One Billion Dimensions

Abstract : This talk will present our recent work on probabilistic hashing algorithms for very large-scale learning. We will focus on two datasets: a small dataset of 24GB (in 16 million dimensions) and a larger dataset of 200 GB (in 1 billion dimensions). Using a single desktop, fitting (regularized) SVM takes about 3 seconds on the small dataset and about 30 seconds on the larger dataset. Interestingly, our technique is purely based on a statistical/probabilistic method named b-bit minwise hashing (Li and Konig, Communications of the ACM, Research Highlight in August 2011). These days, the random projection method has been extremely popular for dimension reduction and compressed sensing. However, we demonstrate that b-bit minwise hashing can be substantially more accurate than random projection.
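The idea behind b-bit minwise hashing is to store only the lowest b bits of each minwise hash value and recover set resemblance (Jaccard similarity) from the collision rate of those truncated values. A rough sketch under simplifying assumptions (universal hash functions stand in for true random permutations; the function names are illustrative, not from the paper):

```python
import random

def minhash_signature(feature_set, num_hashes, b=2, prime=2147483647, seed=0):
    """b-bit minwise hashing signature for a set of integer features.

    For each of num_hashes hash functions h(x) = (a*x + c) mod prime,
    keep only the lowest b bits of the minimum hash value over the set.
    """
    rng = random.Random(seed)
    mask = (1 << b) - 1
    sig = []
    for _ in range(num_hashes):
        a = rng.randrange(1, prime)
        c = rng.randrange(0, prime)
        m = min((a * x + c) % prime for x in feature_set)
        sig.append(m & mask)  # store only b bits per hash
    return sig

def estimate_resemblance(sig1, sig2, b=2):
    """Invert the b-bit collision rate to estimate Jaccard similarity.

    Truncated values collide with probability roughly C + (1 - C) / 2^b,
    where C is the true resemblance, so we solve for C.
    """
    match = sum(s1 == s2 for s1, s2 in zip(sig1, sig2)) / len(sig1)
    return max(0.0, (match - 2 ** -b) / (1 - 2 ** -b))
```

The point of the b-bit trick is storage: each hash contributes only b bits, so the same memory budget buys many more hashes than storing full 32-bit or 64-bit minima, which is what makes billion-dimension learning feasible.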

“The AI-Seminar is sponsored by Yahoo!”

September 23rd

Speaker: Abhishek Anand & Hema Koppula, Cornell University

Host: Ashutosh Saxena

Title : Semantic Labeling of 3D Point Clouds for Indoor Scenes

Abstract : Inexpensive RGB-D cameras that give an RGB image together with depth data have become widely available (e.g., the Kinect). We use this data to build 3D point clouds of a full scene. In this work, we address the task of labeling objects in the 3D point cloud of a complete indoor scene such as an office. We propose a graphical model that captures various features and contextual relations, including local visual appearance and shape cues, object co-occurrence relationships, and geometric relationships. With a large number of object classes and relations, the model's parsimony becomes important, and we address that by using multiple types of edge potentials. The model admits efficient approximate inference, and we train it using a maximum-margin learning approach. In our experiments over a total of 52 3D scenes of homes and offices (composed from about 550 views, with 2495 segments labeled with 27 object classes), we achieve a performance of 84.06% in labeling 17 object classes for offices, and 73.38% in labeling 17 object classes for home scenes. Finally, we applied these algorithms successfully on a mobile robot for the task of finding multiple objects in cluttered rooms.

“The AI-Seminar is sponsored by Yahoo!”

September 30th

Speaker: Flavio Chierichetti

Host: Jon Kleinberg

Title: Reconstructing Patterns of Information Diffusion from Incomplete Observations

Abstract : Motivated by the spread of on-line information in general and on-line petitions in particular, recent research has raised the following combinatorial estimation problem. There is a tree $T$ that we cannot observe directly (representing the structure along which the information has spread), and certain nodes randomly decide to make their copy of the information public. In the case of a petition, the list of names on each public copy of the petition also reveals a path leading back to the root of the tree. What can we conclude about the properties of the tree we observe from these revealed paths, and can we use the structure of the observed tree to estimate the size of the full unobserved tree $T$? Here we provide the first algorithm for this size estimation task, together with provable guarantees on its performance. We also establish structural properties of the observed tree, providing the first rigorous explanation for some of the unusual structural signatures present in the spread of real chain-letter petitions on the Internet.

Joint work with Jon Kleinberg and David Liben-Nowell
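In this model, the observed tree is just the union of the root-paths of the nodes that chose to go public. A toy sketch of that observation process (illustrative only; the size-estimation algorithm itself is the talk's contribution and is not reproduced here):

```python
import random

def observed_tree(parent, reveal_prob, rng):
    """Simulate the observation process: union of root-paths of revealed nodes.

    parent: dict mapping each node to its parent (the root maps to None).
    Each node independently makes its copy public with probability
    reveal_prob; a public copy exposes the entire path back to the root.
    Returns the set of nodes in the observed tree.
    """
    seen = set()
    for node in parent:
        if rng.random() < reveal_prob:
            # Walk up to the root, stopping early on already-seen nodes.
            while node is not None and node not in seen:
                seen.add(node)
                node = parent[node]
    return seen
```

Comparing `len(observed_tree(...))` with `len(parent)` over simulated trees is the kind of gap the talk's estimator aims to bridge.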

“The AI-Seminar is sponsored by Yahoo!”

October 7th





“The AI-Seminar is sponsored by Yahoo!”

October 14th

Speaker: Karthik Raman

Host: Thorsten Joachims


Title :  "Structured Learning of Two-Level Dynamic Rankings"

Abstract :  For ambiguous queries, conventional search systems are bound by two conflicting goals: On the one hand, they should diversify and strive to present results for as many query intents as possible. On the other hand, they should provide depth for all intents by displaying more than a single result. However, both diversity and depth cannot be achieved simultaneously in the current "static" systems. Hence we propose a new "dynamic" ranking approach. Dynamic ranking models allow users to personalize the ranking via interaction, thus overcoming the constraints of presenting a one-size-fits-all static ranking. In particular, we propose a new two-level dynamic ranking model for presenting search results to the user. In this model, a user's interactions with the first-level ranking are used to infer this user's intent, so that second-level rankings can be inserted to provide more results relevant for this intent. Representing the utility of a ranking as a submodular function allows us to provide an algorithm to efficiently compute dynamic rankings with provable approximation guarantees for a large family of performance measures. We also propose the first principled algorithm for learning dynamic ranking functions from training data. In addition to the theoretical results, we provide empirical evidence demonstrating the gains in retrieval quality that our method achieves over conventional approaches.
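The key algorithmic point is that a monotone submodular utility admits greedy construction with a (1 - 1/e) approximation guarantee. A toy sketch of a greedy diversified ranking under a simple max-coverage-style utility (illustrative only; not the paper's actual two-level model or feature representation):

```python
def greedy_ranking(docs, intents, relevance, k):
    """Greedily build a length-k ranking maximizing a submodular utility.

    Utility of a selection = sum over intents of the best relevance any
    chosen doc has for that intent (monotone submodular), so the greedy
    ranking is within a (1 - 1/e) factor of the optimal selection.
    relevance: dict doc -> dict intent -> score.
    """
    def utility(selection):
        return sum(max((relevance[d][i] for d in selection), default=0.0)
                   for i in intents)

    chosen = []
    for _ in range(k):
        remaining = [d for d in docs if d not in chosen]
        if not remaining:
            break
        # Pick the doc with the largest marginal gain in utility.
        chosen.append(max(remaining, key=lambda d: utility(chosen + [d])))
    return chosen
```

Note how the first pick favors a broadly relevant document, while later picks fill in intents not yet covered, which is the diversity-vs-depth trade-off the abstract describes.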

“The AI-Seminar is sponsored by Yahoo!”

October 21st

Speaker: Dr. Devi Parikh

Host: Amir Sadovnik

Bio: Devi Parikh is a Research Assistant Professor at TTI-Chicago, an academic computer science institute affiliated with the University of Chicago. She received her M.S. and Ph.D. degrees from the Electrical and Computer Engineering department at Carnegie Mellon University in 2007 and 2009, respectively. She was advised by Tsuhan Chen (now at Cornell). She was a recipient of the National Science Foundation Graduate Research Fellowship. She received her B.S. in Electrical and Computer Engineering from Rowan University in 2005. Her research interests include computer vision and AI in general. Recently, she has been especially involved in leveraging human-machine collaborations for building smarter machines.

Title : Human-Debugging of Machine Visual Recognition

Abstract : The problem of visual recognition is central to the goal of automatic image understanding. While a wide range of efforts has been made in the computer vision community addressing different aspects of various recognition problems, machine performance remains unsatisfactory. Fortunately, we have access to a working system whose performance we wish to replicate - the human visual recognition system! It only seems natural to leverage it toward the goal of reliable machine visual recognition.

In this talk, I will give an overview of our recently-introduced "human-debugging" paradigm. It involves replacing various components of a machine vision pipeline with human subjects, and examining the resultant effect on recognition performance. Meaningful comparisons identify the aspects of machine vision approaches that require future research efforts. I will present several of our efforts within this framework that address image classification (CVPR'10, ICCV'11), object recognition (CVPR'08, PAMI'11, ICCV'11) and person detection (CVPR'11). Besides computer vision, human-debugging is also broadly applicable to other areas in AI such as speech recognition and machine translation.

For image classification, I will describe our work on evaluating the relative importance of image representation, learning algorithms and amounts of training data. We found image representation to be the most important factor. We further evaluated the relative importance of local and global information in images, and found that further advancements in modeling global information in images is crucial. For object recognition, we studied the roles of appearance and contextual information for machine and human recognition. Inspired by our findings, we proposed a novel contextual cue that exploits unlabeled regions in images, which are often ignored by existing contextual models. Our proposed cue significantly boosts performance of a slew of existing object detectors. Finally, for person detection we analyzed a state-of-art parts-based person detection model and found part-detection to be the weakest link.

“The AI-Seminar is sponsored by Yahoo!”

October 28th







“The AI-Seminar is sponsored by Yahoo!”

November 4th

Speaker: Pannaga Shivaswamy, Cornell University

Host: Thorsten Joachims

Title : Online Learning with Preference Feedback

Abstract : We propose a new online learning model for learning with preference feedback. The model is especially suited for applications like web search and recommender systems, where preference data is readily available from implicit user feedback (e.g. clicks). In particular, at each time step a potentially structured object (e.g. a ranking) is presented to the user in response to a context (e.g. a query), providing him or her with some unobserved amount of utility. As feedback, the algorithm receives an improved object that would have provided higher utility. Our model, in which the ordering of two arms is revealed (the one we presented and the one we receive as feedback), sits between the expert and bandit settings. We propose learning algorithms with provable regret bounds for this online learning setting and demonstrate their effectiveness on a web-search and a movie-recommendation application. The new learning model also applies to many other interactive learning problems and admits several interesting extensions.
(Joint work with Thorsten Joachims.)
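A minimal simulation of this feedback model, under the common assumption of linear utility u(y) = w*·φ(y): present the best object under the current weights, receive a preferred object from the user, and update by the feature difference, perceptron-style. This is a hedged sketch of the setting, not necessarily the algorithm analyzed in the talk:

```python
import numpy as np

def present(w, candidates):
    """Present the candidate with the highest estimated utility w . phi."""
    return max(candidates, key=lambda phi: float(np.dot(w, phi)))

def preference_perceptron(rounds, dim, seed=0):
    """Simulate online learning with preference feedback.

    Each round, the learner presents its best guess under current weights
    w; a simulated user with hidden utility vector u_star returns the
    candidate they actually prefer; the learner moves w toward the
    preferred object's features. Returns (w, u_star, average regret).
    """
    rng = np.random.default_rng(seed)
    u_star = rng.normal(size=dim)      # hidden true utility (simulation only)
    w = np.zeros(dim)
    total_regret = 0.0
    for _ in range(rounds):
        candidates = [rng.normal(size=dim) for _ in range(5)]
        y = present(w, candidates)                                       # learner's choice
        y_bar = max(candidates, key=lambda p: float(np.dot(u_star, p)))  # user's preference
        total_regret += float(np.dot(u_star, y_bar - y))                 # always >= 0
        if not np.array_equal(y, y_bar):
            w = w + (y_bar - y)        # perceptron-style preference update
    return w, u_star, total_regret / rounds
```

Because the update direction always has nonnegative inner product with the true utility, w drifts toward u_star, which is the intuition behind the regret bounds the abstract mentions.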

“The AI-Seminar is sponsored by Yahoo!”

November 11th



Speaker: Vasumathi Raman, Cornell University

Host: Hadas Kress-Gazit

Title : Challenges in the Synthesis of High-Level Robot Behaviours

Abstract : A key challenge in robotics is the generation of controllers for autonomous, high-level robot behaviors comprising a non-trivial sequence of actions. Recently, Linear Temporal Logic synthesis has emerged as a powerful tool for automatically generating autonomous robot hybrid controllers that guarantee desired behaviors expressed by a class of temporal logic specifications. However, there are still several challenges to be met when using LTL synthesis for robot control. When there is no controller that fulfills a given specification, the standard approaches do not provide the user with a source of failure, making the troubleshooting of specifications an unstructured and time-consuming process. Furthermore, additional care is required when creating correct-by-construction controllers for robots with actions of varying execution durations, to ensure safety of continuous execution. 

I will present some results on automating the analysis of unsynthesizable specifications in order to identify sources of failure. I will also describe an approach for generating hybrid controllers for actions of two execution speeds with guaranteed safety of continuous execution, and discuss its extension to the general case. Both the above approaches are implemented within the LTLMoP toolkit for robot mission planning.

Joint work with Hadas Kress-Gazit and Cameron Finucane

“The AI-Seminar is sponsored by Yahoo!”

November 18th

Speaker: Daniel Sheldon, Oregon State University

Host: Carla Gomes

Time/Place: ***2:00pm, Upson 5130***

Title: Collective Graphical Models

Abstract: There are many settings in which we wish to fit a model of the behavior of individuals but where our data consist only of aggregate information (counts or low-dimensional contingency tables). In this talk, I will introduce Collective Graphical Models (CGMs), a framework for modeling and probabilistic inference that operates directly on the sufficient statistics of the individual model. CGMs are motivated by the goal of modeling bird migration where we want to fit models for the migratory behavior of individual birds, but we observe only population-level surveys conducted over time. I will focus primarily on this special case, where the main concepts of CGMs can be understood intuitively in terms of network flows. I will show how to derive a highly efficient Gibbs sampling algorithm for inference in CGMs and present experiments that demonstrate its effectiveness. In particular, I will give empirical evidence that the running time to solve an important inference task using our method does not depend on the population size; prior to this work, the only existing algorithm for the same task took time exponential in the population size.
Joint work with Tom Dietterich, to appear at NIPS 2011. 
Bio: Daniel Sheldon is a postdoctoral fellow in the School of EECS at Oregon State University, where he holds an NSF fellowship in Bioinformatics. His primary research interests are machine learning and probabilistic modeling applied to large-scale problems in ecology and computational sustainability. Other research interests include web search and reputation systems, optimization, statistics, and networks. He completed his Ph.D. in computer science at Cornell University in 2009. Prior to that, he received an A.B. in mathematics from Dartmouth College in 1999, and worked at Akamai Technologies and then DataPower Technology between 1999 and 2004.

“The AI-Seminar is sponsored by Yahoo! and the Institute for Computational Sustainability”


November 25th

Speaker: NO SEMINAR - Thanksgiving Break





“The AI-Seminar is sponsored by Yahoo!”

December 2nd

Speaker: Alon Keinan, Cornell University

Host: Joe Halpern

Bio: Alon Keinan is the Robert N. Noyce Assistant Professor in Life Science and Technology at Cornell University. He received his Ph.D. with distinction in Computer Science from Tel Aviv University in 2005. He continued to a postdoctoral position in the Department of Genetics at Harvard Medical School and in the Broad Institute of MIT and Harvard, where he transitioned into the fields of human population genomics and medical genetics. His research program focuses on elucidating the history of modern human populations and on developing computational and statistical methods for searching for genes important in human biology. His research interests include population genomics, statistical genetics, computational modeling, molecular evolution, and evolutionary dynamics. He was the recipient of an Alfred P. Sloan Research Fellowship, a Rothschild Postdoctoral Fellowship, and a Wolf award for Ph.D. students. Articles about Alon’s recent research have appeared in Science, Discover Magazine, New Scientist, and The Scientist.

Title : Evolutionary history of modern humans: A genomic perspective

Abstract : One quantity of crucial importance for understanding evolutionary history is the allele frequency spectrum. I will explore its mathematical properties and will report data mined from the HapMap Project that overcome ascertainment biases and make it possible to obtain accurate estimates. Analysis of these data shows that the ancestors of East Asians and North Europeans shared the same population bottleneck dispersing out of Africa but that both also experienced a more recent bottleneck. Contrasting chromosome X with the rest of the genome further shows that around the time of the dispersal out of Africa chromosome X experienced a more extreme reduction in population size, which points to a sex-biased demographic event during that epoch of human genetic history.
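For readers unfamiliar with the allele frequency spectrum: it is a histogram over polymorphic sites of how many sampled chromosomes carry the derived allele. A toy computation (illustrative only; real analyses must also handle the ascertainment biases the abstract mentions):

```python
from collections import Counter

def allele_frequency_spectrum(genotypes):
    """Tally the allele frequency spectrum from a 0/1 genotype matrix.

    genotypes: list of sites; each site is a list of 0/1 alleles across
    the sampled chromosomes (1 = derived allele). Returns a dict mapping
    derived-allele count k to the number of sites with that count,
    excluding monomorphic sites (k == 0), which carry no information.
    """
    spectrum = Counter(sum(site) for site in genotypes)
    spectrum.pop(0, None)  # drop sites where no sample carries the derived allele
    return dict(spectrum)
```

Demographic events such as bottlenecks leave characteristic distortions in this histogram, which is what makes it informative about population history.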

“The AI-Seminar is sponsored by Yahoo!”

December 6th

Speaker: Rajeev Rastogi, Vice President of Yahoo

Host: Ashutosh Saxena

Time/Place: ***3:00pm, TUESDAY, 5130 Upson***

Bio: Rajeev Rastogi is the Vice President of Yahoo! Labs Bangalore, where he directs basic and applied research in the areas of web search, advertising, and cloud computing. Previously, Rajeev was at Bell Labs, where he was a Bell Labs Fellow and the founding Director of the Bell Labs Research Center in Bangalore. Rajeev is active in the fields of databases, data mining, and networking, and has served on the program committees of several conferences in these areas. He currently serves on the editorial board of the CACM, and has previously been an Associate Editor for IEEE Transactions on Knowledge and Data Engineering. He has published over 125 papers and filed over 50 patents. Rajeev received his B.Tech degree from IIT Bombay, and a Ph.D. in Computer Science from the University of Texas at Austin.

Title : Building knowledge bases from the web

Abstract: The web is a vast repository of human knowledge. Extracting structured data from web pages can enable applications like comparison shopping, and lead to improved ranking and rendering of search results. In this talk, I will describe two efforts at Yahoo! Labs to extract records from pages at web scale. The first is a wrapper induction system that handles end-to-end extraction tasks, from clustering web pages to learning XPath extraction rules to relearning rules when sites change. The system has been deployed in production within Yahoo! to extract more than 200 million records from ~200 web sites. The second effort exploits content redundancy on the web to automatically extract records without human supervision. Starting with a seed database, we determine values in the pages of each new site that match attribute values in the seed records. We devise a new notion of similarity for matching templatized attribute content, and an Apriori-style algorithm that exploits templatized page structure to prune spurious attribute matches.

“The AI-Seminar is sponsored by Yahoo!”

December 9th






“The AI-Seminar is sponsored by Yahoo!”



See also the AI graduate study brochure.

Please contact any of the faculty below if you'd like to give a talk this semester. We especially encourage graduate students to sign up! 

Sponsored by

CS7790, Fall '11
Claire Cardie
Carla Gomes
Joe Halpern
Dan Huttenlocher
Thorsten Joachims
Lillian Lee
Ashutosh Saxena
Bart Selman
Ramin Zabih

Back to CS course websites