Artificial Intelligence Seminar

Fall 2016
Friday 12:00-1:15
Gates Hall 122


The AI seminar will meet weekly for lectures by graduate students, faculty, and researchers emphasizing work-in-progress and recent results in AI research. Lunch will be served starting at noon, with the talks running between 12:15 and 1:15. The new format is designed to allow AI chit-chat before the talks begin. Also, we're trying to make some of the presentations less formal so that students and faculty will feel comfortable using the seminar to give presentations about work in progress or practice talks for conferences.

If you or others would like to be added to or removed from this announcement list, please contact Jessie White at


September 9th

Speaker: Chris Callison-Burch, UPenn

Host: Lillian Lee

Title: Large-scale paraphrasing for natural language generation

Abstract: I will present my method for learning paraphrases - pairs of English expressions with equivalent meaning - from bilingual parallel corpora, which are more commonly used to train statistical machine translation systems. My method equates pairs of English phrases like <thrown into jail, imprisoned> when they share an aligned foreign phrase like festgenommen. Because bitexts are large and because a phrase can be aligned to many different foreign phrases (including phrases in multiple foreign languages), the method extracts a diverse set of paraphrases. For thrown into jail, we not only learn imprisoned, but also arrested, detained, incarcerated, jailed, locked up, taken into custody, and thrown into prison, along with a set of incorrect/noisy paraphrases. I'll show a number of methods for filtering out the poor paraphrases, by defining a paraphrase probability calculated from translation model probabilities, and by re-ranking the candidate paraphrases using monolingual distributional similarity measures.
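As a rough illustration of the pivoting idea described above, the sketch below scores a candidate paraphrase by summing over shared foreign phrases: p(e2 | e1) = sum over f of p(e2 | f) * p(f | e1). This is an assumed, simplified form of a paraphrase probability built from translation probabilities. The function name `paraphrase_probs`, the second pivot phrase, and all numbers in the toy tables are hypothetical and are not taken from the talk.

```python
from collections import defaultdict


def paraphrase_probs(p_f_given_e, p_e_given_f, phrase):
    """Score candidate paraphrases of `phrase` by pivoting through foreign phrases.

    p_f_given_e[e][f] and p_e_given_f[f][e] are toy translation tables
    (hypothetical inputs, invented for this sketch).
    """
    scores = defaultdict(float)
    for f, p_f in p_f_given_e.get(phrase, {}).items():
        for e2, p_e2 in p_e_given_f.get(f, {}).items():
            if e2 != phrase:  # skip the phrase itself
                scores[e2] += p_e2 * p_f
    return dict(scores)


# Toy numbers, invented for illustration only.
p_f_given_e = {"thrown into jail": {"festgenommen": 0.6, "inhaftiert": 0.4}}
p_e_given_f = {
    "festgenommen": {"arrested": 0.5, "imprisoned": 0.3, "thrown into jail": 0.2},
    "inhaftiert": {"imprisoned": 0.7, "thrown into jail": 0.3},
}

scores = paraphrase_probs(p_f_given_e, p_e_given_f, "thrown into jail")
# Candidates sorted from most to least probable under the toy tables.
ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

In this toy setup a candidate that is reachable through several pivot phrases accumulates probability from each of them, which is why pivoting over many foreign phrases and languages yields a diverse candidate set that then needs filtering.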

In addition to lexical and phrasal paraphrases, I'll show how the bilingual pivoting method can be extended to learn meaning-preserving syntactic transformations like the English possessive rule or dative shift. I'll describe a way of using synchronous context-free grammars (SCFGs) to represent these rules. This formalism allows us to re-use much of the machinery from statistical machine translation to perform sentential paraphrasing. We can adapt our "paraphrase grammars" to do monolingual text-to-text generation tasks like sentence compression or simplification.

I'll also briefly sketch future directions for adding a semantics to the paraphrases, which my lab is undertaking for the DARPA DEFT program.

Bio: I am the Aravind K Joshi term assistant professor in the Computer and Information Science Department at the University of Pennsylvania. Before joining Penn, I was a research faculty member at the Center for Language and Speech Processing at Johns Hopkins University for 6 years. I was the Chair of the Executive Board of the North American chapter of the Association for Computational Linguistics (NAACL) from 2011 to 2013. I served as the Program Co-Chair for EMNLP 2015, and will be the General Chair of ACL 2017. I have served on the editorial boards of the journals Transactions of the ACL (TACL) and Computational Linguistics. I have more than 80 publications, which have been cited nearly 10,000 times. I am a Sloan Research Fellow, and I have received faculty research awards from Google, Microsoft and Facebook in addition to funding from DARPA and the NSF.

“The AI-Seminar is sponsored by Yahoo!”

September 16th

Speaker: Pantelis Pipergias Analytis, Cornell

Host: Thorsten Joachims

Title: "Human behavior in contextual multi-armed bandit problems"

Abstract: In real-life decision environments people learn from their direct experience with alternative courses of action. Yet, they can accelerate their learning by using functional knowledge about the features characterizing the alternatives. We designed a series of novel contextual multi-armed bandit experiments where decision makers chose repeatedly between multiple alternatives characterized by two informative features. We compared human behavior in these contextual tasks with behavior in classic multi-armed bandit tasks without feature information. Behavioral analysis showed that participants in the contextual bandit tasks use the feature information to direct their exploration of promising alternatives. Ex post, we tested participants’ acquired functional knowledge in one-shot multi-feature choice trilemmas. We find that inter-individual differences are large and that the participants who learn the function better, as judged by performance in the one-shot choice task, perform much better overall. Finally, we employed the Gaussian process framework as well as classical reinforcement learning models to describe how people acquired functional knowledge and allocated their choices. The Gaussian process models accounted for the behavior of a sizable subset of participants, yet the behavior of most is still described better by classical reinforcement learning models.

This is joint work with Hrvoje Stojic (UPF), Maarten Speekenbrink (UCL) and Peter Dayan (UCL). 

September 23rd

Speaker: Kristin Petersen

Host: Ross Knepper

Title: Designing Robot Collectives

Abstract: In robot collectives, interactions between large numbers of individually simple robots lead to complex global behaviors. A great source of inspiration is social insects such as ants and bees, where thousands of individuals coordinate to handle advanced tasks like food supply and nest construction in a remarkably scalable and error-tolerant manner. Likewise, robot swarms have the ability to address tasks beyond the reach of single robots, and promise more efficient parallel operation and greater robustness due to redundancy. Key challenges involve both control and physical implementation. In this seminar I will discuss an approach to such systems relying on embodied intelligent robots designed as an integral part of their environment, where passive mechanical features replace the need for complicated sensors and control.

The majority of my talk will focus on a team of robots for autonomous construction of user-specified three-dimensional structures developed during my thesis. Additionally, I will give a brief overview of my research on the Namibian mound-building termites that inspired the robots. Finally, I will talk about my recent research thrust, enabling stand-alone centimeter-scale soft robots to eventually be used in swarm robotics as well. My work advances the aim of collective robotic systems that achieve human-specified goals, using biologically-inspired principles for robustness and scalability.

September 30th

Speaker: Yejin Choi, University of Washington

Host: Claire Cardie

Title: Procedural Language and Knowledge

Abstract: Various types of how-to knowledge are encoded in natural language instructions: from setting up a tent, to preparing a dish for dinner, and to executing biology lab experiments. These types of instructions are based on procedural language, which poses unique challenges. For example, verbal arguments are commonly elided when they can be inferred from context, e.g., "bake for 30 minutes", not specifying bake what and where. Entities frequently merge and split, e.g., "vinegar" and "oil" merging into "dressing", creating challenges to reference resolution. And disambiguation often requires world knowledge, e.g., the implicit location argument of "stir frying" is on "stove". In this talk, I will present our recent approaches to interpreting and composing cooking recipes that aim to address these challenges.

In the first part of the talk, I will present an unsupervised approach to interpreting recipes as action graphs, which define what actions should be performed on which objects and in what order. Our work demonstrates that it is possible to recover action graphs without having access to gold labels, virtual environments or simulations. The key insight is to rely on the redundancy across different variations of similar instructions that provides the learning bias to infer various types of background knowledge, such as the typical sequence of actions applied to an ingredient, or how a combination of ingredients (e.g., "flour", "milk", "eggs") becomes a new entity (e.g., "wet mixture").

In the second part of the talk, I will present an approach to composing new recipes given a target dish name and a set of ingredients. The key challenge is to maintain global coherence while generating a goal-oriented text. We propose a Neural Checklist Model that attains global coherence by storing and updating a checklist of the agenda (e.g., an ingredient list) with paired attention mechanisms for tracking what has already been mentioned and what has yet to be introduced. This model also achieves strong performance on dialogue system response generation. I will conclude the talk by discussing the challenges in modeling procedural language and acquiring the necessary background knowledge, pointing to avenues for future research.

Bio: Yejin Choi is an assistant professor in the Computer Science & Engineering Department at the University of Washington. Her recent research focuses on language grounding, integrating language and vision, and modeling nonliteral meaning in text. She was among the IEEE’s AI Top 10 to Watch in 2015 and a co-recipient of the Marr Prize at ICCV 2013. Her work on detecting deceptive reviews, predicting literary success, and learning to interpret connotation has been featured by numerous media outlets including NBC News for New York, NPR Radio, New York Times, and Bloomberg Business Week. She received her Ph.D. in Computer Science at Cornell University.

October 7th


Speaker: Chien-Ju Ho, Cornell University

Host: Arpita Ghosh

Title: Eliciting and aggregating data from the crowd

Abstract: Crowdsourcing has gained popularity in machine learning applications as a tool for inexpensively collecting data. A common practice is to assume crowd-contributed data is i.i.d. drawn from some distribution and apply machine learning algorithms to aggregate the data. However, crowd workers are humans and might have constraints or additional abilities in providing data. In this talk, I'll discuss two of my projects that consider human factors in eliciting and aggregating data from the crowd.

In the first part of the talk, I will discuss the problem of how to purchase data held by strategic agents for machine learning tasks. Agents are only willing to share their data if we offer prices higher than their costs, which are unknown to us in advance. We study a model in which agents cannot fabricate data, but may lie about their cost of furnishing their data. The challenge is to use past data to actively price future data in order to obtain learning guarantees, even when agents’ costs can depend arbitrarily on the data itself. We show how to convert a large class of no-regret algorithms into online posted-price and learning mechanisms. We provide theoretical guarantees for our approach.

In the second part of the talk, I'll discuss a setting focused on collecting labels for classification tasks. Agents can not only provide noisy labels but also specify how confident they are in their labels. We propose a Bayesian framework to model the elicitation and aggregation process. Furthermore, we explore how to aggregate data collected from arbitrary interfaces (i.e., partitions of agent confidences) and how to design optimal interfaces for eliciting data for aggregation.

This talk covers joint work with Jacob Abernethy, Yiling Chen, Rafael Frongillo, and Bo Waggoner.

October 14th

Speaker: Dylan Foster, Cornell

Host: Ross Knepper

Title: Adaptive Online Learning

Abstract: There is a chasm between theory and practice in machine learning. Theoretically derived algorithms are often empirically outperformed by heuristics engineered for specific problems. One reason for this chasm is that theoretically derived algorithms tend to be pessimistic and follow from worst-case analysis. In the online learning community there has been a plethora of work on "adaptive" learning algorithms that obtain the best of both worlds: 1) robustness against worst-case instances and 2) improved performance on easy instances, comparable to that of tailor-made algorithms. However, there has been little in the way of general design principles or analysis techniques for such algorithms, and furthermore we do not know the fundamental limits on what can be gained by adaptation.

In this work we attempt to develop a unifying theory for adaptive online learning, which in a sense is a first step toward developing a VC- or PAC-style theory for adaptive online learning. We propose a general framework for studying adaptive regret bounds, which are the standard measure of performance in online learning. Given an adaptive data- or model-dependent regret bound we ask, "Does there exist some algorithm achieving this bound?" We show that modifications to recently introduced sequential complexity measures can be used to answer this question by providing sufficient conditions under which adaptive rates can be achieved. In particular, each adaptive rate induces a set of so-called offset complexity measures, and obtaining small upper bounds on these quantities is sufficient to demonstrate achievability in an information-theoretic sense.

Our framework recovers and improves a wide variety of adaptive regret bounds including quantile bounds, second order data-dependent bounds, and small loss bounds. In addition we derive a new type of adaptive bound for online linear optimization based on the spectral norm, as well as a new online PAC-Bayes theorem that holds for countably infinite sets. Our analysis reveals a common thread tying together a large body of results on adaptive online learning: For many settings the extent to which we can adapt is essentially determined by the tail behavior of a familiar random process.

Joint work with Alexander Rakhlin and Karthik Sridharan.

October 21st

Speaker: Laurens van der Maaten, Facebook AI Research

Host: Kilian Weinberger

Title: Learning to Solve Vision without Annotating Millions of Images

Abstract: The state-of-the-art performance on many computer-vision problems is currently held by systems trained on visual features that are produced by convolutional networks. These convolutional networks are trained on large supervised datasets that contain millions of manually annotated objects. Further improvements of these visual features will likely require even larger manually labeled data sets, which severely limits the pace at which progress can be made. In this talk, I will present work that explores the potential of leveraging massive, weakly-labeled image collections for learning good visual features. I present experiments in which we train convolutional networks on a dataset of 100 million Flickr photos and captions without any full supervision, and show that these networks produce features that perform well in a range of vision problems. Moreover, the experiments show that as a by-product of the learning, the networks appropriately capture word similarity and they can learn correspondences between words in different languages.

This talk presents joint work with Armand Joulin, Allan Jabri, and Nicolas Vasilache.

October 28th

Speaker: Tom Ristenpart, CU Tech

Host: Ross Knepper

Title: Stealing Machine Learning Models and Using Them to Violate Privacy

Abstract: Machine learning models are increasingly being trained on sensitive data and deployed widely. This means that attackers can obtain access to the models directly or indirectly via APIs that allow querying feature vectors to receive back predictions based on the model. I will present our recent work exploring confidentiality and privacy threats in this context.

First, I will discuss our recent work showing how to steal machine learning models from prediction APIs. We show algorithms for extracting replicas of target linear models, neural networks, and decision trees from production machine learning cloud services such as Amazon Prediction API and BigML. I'll then explore one specific privacy threat that arises from access to models, what we call model inversion. In the model inversion attacks we develop, an adversary can exploit access to a model and some partial information about a person to improve their ability to guess sensitive information about that person, such as their genotype, an image of their face, and private lifestyle behaviors.

This talk will cover joint work with Matthew Fredrikson, Ari Juels, Eric Lantz, Somesh Jha, Simon Lin, David Page, Michael K. Reiter, Florian Tramer, and Fan Zhang.

November 4th

Speaker: Tor Lattimore, Indiana University Bloomington

Host: Thorsten Joachims

Title: The End of Optimism? An Asymptotically Optimal Algorithm for Stochastic Linear Bandits

Abstract: Multi-armed bandits model a wide range of sequential decision problems involving uncertainty. The setup is enormously popular, presumably due to a combination of its simplicity and a variety of applications such as A/B testing, ad placement and recommender systems. I will talk about a surprising result showing that algorithms based on "optimism in the face of uncertainty" and "posterior sampling" can be arbitrarily far from optimal in the fundamental case that the rewards have a linear structure. This is a disturbing result because algorithms based on these ideas are widely used in practice and few alternative tools have been explored. Besides the negative results I will discuss some candidates for new algorithms and tell you about a myriad of related open problems.
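For readers unfamiliar with "optimism in the face of uncertainty", the sketch below shows the principle in its simplest form: UCB1 on a small set of independent Bernoulli arms. It is not the linear-bandit setting of the talk and it is not the speaker's algorithm. The arm means, horizon, seed, and the function name `ucb1` are assumptions chosen purely for illustration.

```python
import math
import random


def ucb1(arm_means, horizon, seed=0):
    """Run UCB1 on Bernoulli arms and return how often each arm was pulled.

    arm_means are the true success probabilities (hypothetical values,
    hidden from the learner and used only to simulate rewards).
    """
    rng = random.Random(seed)
    k = len(arm_means)
    counts = [0] * k   # number of pulls per arm
    sums = [0.0] * k   # total reward per arm
    for t in range(1, horizon + 1):
        if t <= k:
            arm = t - 1  # pull every arm once to initialize the estimates
        else:
            # Optimistic index: empirical mean plus a confidence bonus
            # that shrinks as an arm is pulled more often.
            arm = max(
                range(k),
                key=lambda i: sums[i] / counts[i]
                + math.sqrt(2.0 * math.log(t) / counts[i]),
            )
        reward = 1.0 if rng.random() < arm_means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
    return counts


pulls = ucb1([0.1, 0.3, 0.9], horizon=2000)
```

Each round the learner acts as if every arm were as good as its upper confidence bound allows, so under-explored arms keep getting tried until the data rule them out. The abstract's claim concerns what happens to this style of algorithm when rewards share a linear structure, which this toy example does not model.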

November 11th

Speaker: Moontae Lee, Cornell

Host: David Mimno

Title: Beyond Exchangeability: The Chinese Voting Process

Abstract: Many online communities present user-contributed responses, such as reviews of products and answers to questions. User-provided helpfulness votes can highlight the most useful responses, but voting is a social process that can gain momentum based on the popularity of responses and the polarity of existing votes. We propose the Chinese Voting Process (CVP) which models the evolution of helpfulness votes as a self-reinforcing process dependent on position and presentation biases. We evaluate this model on Amazon product reviews and more than 80 StackExchange forums, measuring the intrinsic quality of individual responses and behavioral coefficients of different communities.

November 18th


November 25th


December 2nd NO SEMINAR

See also the AI graduate study brochure.

Please contact any of the faculty below if you'd like to give a talk this semester. We especially encourage graduate students to sign up!

CS7790, Fall '15
