Date

Title/Speaker/Abstract/Host

August 29 
Ariel Procaccia, Hebrew University of Jerusalem
Host: Joe Halpern
Approximating Lewis Carroll's Voting Rule
Charles Dodgson (better known by his pen name, Lewis Carroll) suggested an appealing voting system in 1876. Unfortunately, at the time Dodgson did not take into account such futuristic considerations as computational complexity, and, as it turned out more than a century later, computing the Dodgson score of a candidate is NP-hard. I will present two very different approximation algorithms, as well as inapproximability results, for the Dodgson score and a related problem, the Young score. I will also try to position this work in the context of a wider agenda: studying the desirability of approximation algorithms as voting rules. Care will be taken to present the necessary concepts from social choice theory using as many PowerPoint animations as humanly possible. This talk is based on joint work with Ioannis Caragiannis, Jason Covey, Michal Feldman, Chris Homan, Christos Kaklamanis, Nikos Karanikolas, and Jeff Rosenschein.
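For background on the rule itself (standard terminology from the social choice literature, not a detail of the talk): a candidate is a Condorcet winner if it beats every rival in a pairwise majority comparison, and Dodgson's rule scores a candidate by the minimum number of adjacent swaps in the voters' rankings needed to make it a Condorcet winner. A minimal sketch of the pairwise check, with a hypothetical three-voter profile:

```python
def is_condorcet_winner(candidate, profile):
    """Return True if `candidate` beats every other candidate in a
    pairwise majority comparison. `profile` is a list of rankings,
    each ordering the candidates from most to least preferred."""
    rivals = set(profile[0]) - {candidate}
    for rival in rivals:
        # Voters who prefer `candidate` to `rival`.
        wins = sum(1 for ranking in profile
                   if ranking.index(candidate) < ranking.index(rival))
        if 2 * wins <= len(profile):  # no strict majority over this rival
            return False
    return True

# Hypothetical profile: 3 voters ranking candidates a, b, c.
profile = [["a", "b", "c"], ["a", "c", "b"], ["b", "a", "c"]]
```

The Dodgson score of a candidate is then the minimum number of adjacent swaps, summed over the voters' rankings, needed to make this check succeed; computing that minimum is the step that is NP-hard.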
September 5 
Joe Halpern, Cornell University
Defaults and Normality in Causal Structures
The Halpern-Pearl (HP) definition of causality has been shown to deal well with many examples that have caused problems for other approaches.
However, there is one class of problems with which it does not seem to deal well. I propose a general approach for dealing with these problems, motivated by the following well-known observation made by Kahneman and Miller in the psychology literature: "an event is more likely to be undone by altering exceptional than routine aspects of the causal chain that led to it."
I show how this intuition can be captured by combining a theory of normality (represented using AI techniques for defaults) with the HP definition. Doing so allows us to deal with all the problems that have been pointed out with the HP definition in a natural way.
This is actually a dry run for a talk I'll be giving in two weeks at the KR conference. No background in causality will be assumed; I'll review the HP definition from scratch.

September 12 
Yogi Sharma, Cornell University
Regret Bounds for Sleeping Experts and Bandits
Imagine the problem (or the joy) of a new student who has just come to Ithaca and is trying to find the best restaurant in Collegetown. His strategy is to try different restaurants to find the best one. By the time he graduates (if that ever happens :)), he compares the happiness he got from eating at various places with the happiness he would have gotten from eating at the best restaurant (in hindsight) all the time (oh well, forget about variety for now). The difference is termed "regret" in online learning, and the goal is to minimize regret. This is the standard Multi-Armed Bandit (MAB) problem: the algorithm chooses one strategy at a time and compares its total payoff with that of the best strategy in hindsight. The problem the student faces, however, is slightly different. Not all restaurants are open all the time, so the student can only choose among the restaurants that are open when he goes to dinner. How well does his happiness compare with the best offline strategy (in hindsight)? What is the best offline strategy? It is not a single restaurant, since that one restaurant might not be open on some days. In this talk, I will discuss online decision problems where the set of actions available to the decision algorithm varies over time.
With a few notable exceptions, such problems remain largely unaddressed in the literature, despite their applicability to a large number of practical problems. Departing from previous work on this "Sleeping Experts" and "Sleeping Multi-Armed Bandit" problem, we compare the algorithm's payoff against the payoff obtained by the *best ordering* of the actions, which is a natural benchmark for this type of problem. In this talk, I will consider both the full-information (best expert) and partial-information (multi-armed bandit) settings, and both stochastic and adaptive adversaries. For all settings we present algorithms that achieve almost information-theoretically optimal regret bounds (up to a constant or a sub-logarithmic factor) with respect to the best-ordering benchmark.
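To make the best-ordering benchmark concrete, here is a toy sketch (an illustration of the benchmark, not the algorithms from the paper): fix a ranking of all actions and, in each round, play the highest-ranked action that is awake; regret is then the gap to the best such ranking in hindsight. The instance below is hypothetical.

```python
from itertools import permutations

def ordering_payoff(order, rounds):
    """Payoff of a fixed ranking: each round, play the highest-ranked
    action among those currently awake (available)."""
    total = 0.0
    for awake, payoffs in rounds:
        for action in order:
            if action in awake:
                total += payoffs[action]
                break
    return total

def best_ordering_payoff(actions, rounds):
    """Benchmark payoff: best fixed ordering in hindsight (brute force,
    so only feasible for tiny instances)."""
    return max(ordering_payoff(order, rounds)
               for order in permutations(actions))

# Hypothetical instance: 3 restaurants over 3 evenings; each round lists
# which restaurants are open and the payoff each would have yielded.
rounds = [({"a", "b"}, {"a": 1.0, "b": 0.4, "c": 0.0}),
          ({"b", "c"}, {"a": 0.0, "b": 0.4, "c": 0.9}),
          ({"a", "c"}, {"a": 1.0, "b": 0.0, "c": 0.9})]
algorithm_payoff = 2.2  # whatever some online policy happened to earn
regret = best_ordering_payoff(["a", "b", "c"], rounds) - algorithm_payoff
```

Note that the benchmark ordering here (a first, then c) beats any single restaurant played whenever open, which is why the best ordering, rather than the best single action, is the natural comparator.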
This is joint work with Bobby Kleinberg and Alex Niculescu-Mizil; it appeared at the Conference on Learning Theory (COLT 2008).
Wednesday,
September 17 
James Allen, University of Rochester
Host: Carla Gomes
NOTE: AI Seminar combined with Information Science Colloquium this week.
Wednesday, September 17, 4:00-5:00pm, 301 College Ave, Seminar Room 
September 26 
Arnon Lotem, Tel Aviv University
Host: Joe Halpern
Risk Taking, Learning, and Strategy Choice: Experiments with Humans, Bees, and Sparrows
Part of my talk will be based on our recent attempt to explain conflicting effects of certainty on risk-taking behavior in humans and other animals (Shafir et al. 2008, Nature 453: 917-920). This study shows that by manipulating perceptual accuracy in experience-based tasks, both the certainty effect and the reversed certainty effect can emerge, suggesting that humans and animals match choices to the perceived best option in memory representation. I will then discuss the evolution of such decision rules in general, and also in relation to recent experimental and theoretical work on "learning to choose among strategies" in an evolutionary game (the producer-scrounger game of social foraging). In such a game, learned strategy choice was found to be better than innate choice not because it allows the animal to learn the strategies played by others (it actually fails in this respect), but because it allows strategy choice to be fitted to the animal's own strengths, when these are phenotypically variable. I will end with some findings on the evolution of complex versus simple learning rules and some thoughts about how the form of natural selection may affect the evolution of risky learning rules.
October 3 
Lukas Kroc, Cornell University
Message Passing Techniques for Reasoning about Clusters
Message passing algorithms such as Belief Propagation (BP) are widely used for probabilistic inference in graphical models. BP has been successfully applied to computational problems in rather diverse fields, such as computer vision, information theory, and statistical mechanics. Recently, the application of BP has also been explored for solving hard instances of constraint satisfaction problems, for example Boolean Satisfiability (SAT) and Graph Coloring (COL). Direct application of BP, unfortunately, runs into problems with convergence and accuracy, due to the spatial distribution of solutions in the search space, in particular when solutions are organized in a large number of "clusters" consisting of "similar" solutions. To overcome these difficulties, a new technique, called Survey Propagation (SP), has been introduced in the physics community. SP reasons about solution clusters of the underlying problem rather than about individual solutions, and can solve hard random instances of combinatorial problems with millions of variables.
This surprising observation initiated an intensive research effort to broaden the applicability of similar algorithms.
In this talk, I will introduce a framework for reasoning about solution clusters that we have developed over the past year. Within this framework, the original SP algorithm for SAT can be derived as an instance of BP, and new algorithms can be developed for approximately counting solution clusters in other CSPs. At the same time, the framework allows one to prove certain interesting properties of clusters that give further insights into the geometry of solution spaces. Part of this work will be presented at the NIPS'08 conference.
This is joint work with Ashish Sabharwal and Bart Selman.

October 10 
*No AI Seminar

October 17 
Ainur Yessenalina, Cornell University
An Empirical Evaluation of Supervised Learning in High Dimensions
In this talk I will present the main results of an empirical evaluation of supervised learning methods that extends previous work by Alex Niculescu-Mizil and Rich Caruana. We consider eleven high-dimensional problems and ten learning algorithms. We evaluate performance on three metrics (accuracy, AUC, and squared loss) and study the effect of increasing dimensionality on the performance of the learning algorithms. Our findings are consistent with the previous study for problems of relatively low dimension, but suggest that as dimensionality increases, the relative performance of the learning algorithms changes.
To our surprise, the method that performs consistently well across all dimensions is random forests, followed by neural nets, boosted trees, and SVMs.
Another effect that we had not anticipated is that calibration appears to be more important when learning from high-dimensional data than when learning from low-dimensional data. Even learning methods that are well calibrated in low dimensions, such as neural nets and bagged trees, benefited from calibration when learning from high-dimensional data. Effects that we had anticipated, such as the decrease in performance of memory-based methods like KNN with increasing dimensionality, are clearly evident in our results.
This work was presented at the International Conference on Machine Learning (ICML 2008).
This is joint work with Rich Caruana and Nikos Karampatziakis.

October 24 
Ashutosh Saxena, Stanford University
Host: Dan Huttenlocher
Robotic Grasping and Depth Perception: Learning 3D Models from a Single Image
We present an algorithm to convert standard digital pictures into 3D models.
This is a challenging problem, since an image is formed by a projection of the 3D scene onto two dimensions, thus losing the depth information. We take a supervised learning approach to this problem, and use a Markov Random Field (MRF) to model the scene depth as a function of the image features. We show that, even on unstructured scenes of a large variety of environments, our algorithm is frequently able to recover fairly accurate 3D models.
To convert your own image of an outdoor scene, landscape, etc. to a 3D model, please visit:
http://make3d.stanford.edu
We then extend our model to incorporate triangulation (stereo) cues, and use it to produce large-scale 3D models from a few images.
We also apply our methods to robotics applications: (a) obstacle avoidance for autonomously driving a small electric car, and
(b) robot manipulation, where we develop vision-based learning algorithms for grasping novel objects. This enables our robot to perform tasks such as opening new doors, clearing cluttered tables, and unloading items from a dishwasher.
This is joint work with Prof. Andrew Y. Ng.

October 31 
Jure Leskovec, Cornell University
Community Structure in Large Networks: Natural Cluster Sizes and the Absence of Large Well-Defined Clusters
A large body of work has been devoted to defining and identifying
clusters or communities in social and information networks. We employ approximation algorithms for the graph partitioning problem to
characterize, as a function of size, the statistical and structural properties of network clusters. In particular, we define the network community profile plot, which characterizes the "best" possible community, according to the conductance measure, over a wide range of size scales. We study over 100 large real-world social and information networks.
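For reference, the conductance of a vertex set S is the number of edges crossing from S to the rest of the graph, divided by the smaller of the two edge volumes (sums of degrees); lower conductance means a more community-like set. A minimal sketch on a toy graph (the graph is illustrative, not one of the datasets studied):

```python
def conductance(S, edges):
    """Conductance of vertex set S in an undirected graph given as an
    edge list: cut(S, complement) / min(vol(S), vol(complement))."""
    S = set(S)
    cut = sum(1 for u, v in edges if (u in S) != (v in S))
    vol_S = sum((u in S) + (v in S) for u, v in edges)  # degree mass in S
    vol_rest = 2 * len(edges) - vol_S
    return cut / min(vol_S, vol_rest)

# Toy graph: two triangles joined by a single bridge edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
```

Here the triangle {0, 1, 2} has conductance 1/7 (one cut edge against a volume of 7): a tight cluster, barely connected to the rest of the graph, of the kind described in the abstract.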
We explore several questions related to identifying communities in large social and information networks, and we come to several surprising
conclusions. In particular, we observe tight clusters that are barely
connected to the rest of the network at size scales of around 100 nodes;
and communities of larger size scales gradually "blend into" the
expander-like core of the network and thus become less "community-like."
This behavior is not explained, even at a qualitative level, by any of
the commonly used network generation models. Moreover, it is exactly the opposite of what one would expect based on intuition from expander graphs, low-dimensional or manifold-like graphs, and from the small social networks that have served as testbeds of community detection algorithms. We have found that a generative graph model, in which new edges are added via an iterative "forest fire" burning process, is able to produce
graphs exhibiting a network community profile plot similar to what we observe in our network datasets.
This is joint work with Kevin Lang, Anirban Dasgupta, and Michael Mahoney.
http://arxiv.org/abs/0810.1355

November 7 
Pedro Domingos, University of Washington
Host: Bart Selman
Markov Logic: An Interface Layer for AI
Most subfields of computer science have an interface layer via which applications communicate with the infrastructure, and this is key to their success (e.g., the Internet in networking, the relational model in databases, etc.). So far this interface layer has been missing in AI. First-order logic and graphical models each have some of the necessary features, but a viable interface layer requires combining both. Markov logic is a powerful new language that accomplishes this by attaching weights to first-order formulas and treating them as templates for features of Markov random fields. Most statistical models in wide use are special cases of Markov logic, and first-order logic is its infinite-weight limit. I will survey Markov logic representation, inference, learning, and applications.
Inference algorithms combine ideas from satisfiability testing, resolution, Markov chain Monte Carlo and belief propagation. Learning algorithms involve statistical weight learning and inductive logic programming. Markov logic has been successfully applied to a wide range of problems, including information extraction, social network modeling, robot mapping, computational biology, and others.
It is the basis of the open-source Alchemy system.
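To illustrate the semantics at a tiny scale (a toy sketch, not the Alchemy system): a Markov logic network assigns each possible world a probability proportional to exp(sum of w_i * n_i), where n_i counts the true groundings of formula i. Below, a single hypothetical ground formula, Smokes(A) => Cancer(A), with weight w:

```python
import itertools
import math

def mln_distribution(w):
    """Distribution over the four truth assignments to the ground atoms
    (Smokes(A), Cancer(A)) under one weighted formula
    Smokes(A) => Cancer(A): P(world) is proportional to exp(w * n),
    where n = 1 if the formula holds in that world, else 0."""
    worlds = list(itertools.product([False, True], repeat=2))
    scores = {}
    for smokes, cancer in worlds:
        n = 1 if (not smokes) or cancer else 0
        scores[(smokes, cancer)] = math.exp(w * n)
    z = sum(scores.values())  # partition function
    return {world: s / z for world, s in scores.items()}

dist = mln_distribution(w=2.0)
# The single violating world, (Smokes=True, Cancer=False), gets the
# least probability mass; as w grows, its probability tends to 0,
# recovering the hard logical constraint (the infinite-weight limit).
```

A larger weight thus expresses a stronger, but still defeasible, version of the formula, which is what lets Markov logic subsume both soft statistical models and hard first-order constraints.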
This is joint work with Stanley Kok, Daniel Lowd, Hoifung Poon, Matt Richardson, Parag Singla, Marc Sumner, and Jue Wang.

November 14 
John T. Hale, Cornell University
Human-like Parsing
How do human readers comprehend sentences? This talk presents a cognitive model of sentence understanding based on word-to-word connections. By relating the concepts "lengthy cognitive process" and "surprising automaton transition," the derived predictions help explain eye-fixation durations in ways that linguistically impoverished models cannot. This highlights the role of sentence structure in comprehension and points to the slippage between ideal and human-like parsing.
Joint work with Marisa F. Boston, Reinhold Kliegl, Umesh Patil and
Shravan Vasishth (Cornell University, USA
and Potsdam University, Germany)
Link to lab: http://conf.ling.cornell.edu/compling
Host: Lillian Lee 
November 21 
*No AI Seminar - ACSU Luncheon 
November 28 
*No AI Seminar - Thanksgiving 
December 5 
David Pennock, Yahoo!
Combinatorial Betting
Most betting markets, from Las Vegas to Wall Street, operate similarly: each allowable bet is explicitly listed and tracked; each bet's outcome space is low-dimensional; and each bet type is managed independently. In this talk, I will survey our attempts to design combinatorial betting mechanisms that support doubly exponentially many allowable bets and propagate information appropriately across logically related bets. Thus, our mechanisms have the potential to both collect more information and process that information more fully than standard mechanisms. The greater expressiveness comes at a computational cost for the auctioneer, who faces a difficult matching problem to connect up willing traders. In general, the auctioneer must consider complex multilateral matches, and full expressiveness renders the matching problem intractable. However, in some cases we have discovered reasonable restrictions on expressiveness that admit polynomial-time matching algorithms.
Host: Joe Halpern 