Artificial Intelligence Seminar

Spring 2005
Friday 12:00-1:15
Upson 5130

Sponsored by the Intelligent Information Systems Institute (IISI),
Computing and Information Science, Cornell

The AI seminar will meet weekly for lectures by graduate students, faculty, and researchers emphasizing work-in-progress and recent results in AI research. Lunch will be served starting at noon, with the talks running between 12:15 and 1:15. The new format is designed to allow AI chit-chat before the talks begin. Also, we're trying to make some of the presentations less formal so that students and faculty will feel comfortable using the seminar to give presentations about work in progress or practice talks for conferences.

Date
Title/Speaker/Abstract/Host

February 11 "Authorization Policies" by Vicki Weissman
An authorization policy states the conditions under which an action is permitted or forbidden. For example, in the world of James Bond, British Intelligence (BI) has the policy `any agent whose number begins with `00' is permitted to assassinate people as he sees fit' (i.e., has a license to kill). Now suppose that the British government has the policy `British subjects are not permitted to assassinate each other'. Should we conclude that James Bond, who is Agent 007, is not a British subject? If he is, may he assassinate others or not? To answer this question, as well as many others that are more practical in nature, Joe Halpern and I have discovered a fragment of first-order logic that we call Lithium. In this talk, I define the Lithium language and show how we can determine if an action is permitted or forbidden given (1) the policies of each policymaker (e.g., BI or the British government) , (2) merging rules (e.g., `BI's policies supercedes the government's' or `an action is permitted only if it's permitted by every policymaker'), and (3) a set of environment facts (e.g., `James Bond is Agent 007') all written in Lithium.

February 18 "General Principles of Constraint Programming" by Jean-Charles Regin
Constraint programming (CP) is a general and powerful method for solving some combinatorial problems. This method has been successfully used to solve a large range of real-life applications (rostering, time-tabling, car manufacturing, scheduling etc...), which can be quite different. CP is on the borderline of AI and OR. CP inherits its language aspect (that is the easy way to define a problem), its flexibility (that is the fact that problems can be easily modified) and its ability to benefit from the domain specific knowledge from AI. Operations Research brings to CP a lot of nice and powerful algorithms which give it the computational speed needed to solve real world problems.
In this talk I will introduce the general principles of this method. I will demonstrate its originality and its flexibility. I will then show its strength and its weakness. Notably, I will detail how some OR algorithms are used and why no restriction is made on the type of constraints that can be considered. I will also compare CP to several other approaches (like MIP, Greedy Algorithm, SAT, Local Search, etc.) and explain how all these methods are often combined thanks to CP in order to solve real world applications. Some computational results will be given.

February 25 cancelled

March 4 Prospective Ph.D. student visit day --- no AI Seminar

March 11 "Evidence with Uncertain Likelihoods" by Riccardo Pucella
Many scenarios in AI can be modeled by having agents make decisions based on observations that help them decide between competing hypotheses. Intuitively, the observations may provide more evidence for some of the hypotheses than some others, which can help guide decisions. Evidence can be quantified in a number of ways, all essentially equivalent: they are based on comparing the likelihood of the observation under each hypothesis (i.e., assuming hypothesis h holds, what is the probability of observing ob?) The goal here is to understand what can be done when hypotheses do not determine the likelihood functions. I compare a number of approaches, attempting to justify them on the basis of their reasonableness for decision-making.

March 18
(cancelled) "Scaling, Renormalization and Universality in Combinatorial Games: the Geometry of Chomp" by Adam Landsberg (visiting Associate Professor in Cornell ORIE)
Combinatorial games, which include chess, Go, checkers, Chomp, dots-and-boxes, and Nim, have both captivated and challenged mathematicians, computer scientists, and players alike. Analysis of these two-player games has generally relied upon a few beautiful analytical results or on numerical algorithms that combine heuristics with look-ahead approaches (alpha-beta pruning). Using Chomp as a prototype, we report on a new geometrical approach which unveils unexpected parallels between combinatorial games and key ideas from physics and dynamical systems, most notably notions of scaling, renormalization, universality, and chaotic attractors. Our central finding is that underlying the game is a probabilistic geometric structure that encodes essential information about the game, and that this structure exhibits a type of scale invariance: Loosely speaking, the geometry of small winning positions and large winning positions are the same after rescaling. (This general finding also holds for at least some other combinatorial games, as we explicitly demonstrate with a variant of Nim.) This geometric insight not only provides (probabilistic) answers to some open questions about Chomp, but suggests a natural pathway toward a new class of algorithms for more general combinatorial games, and hints at deeper links between such games and nonlinear science.
Joint Work with Eric Friedman, ORIE.

March 25 Spring Break

April 1 "Sampling, Counting, and Probabilistic Inference" by Wei Wei
We introduce ApproxCount, an algorithm that approximates the number of satisfying assignments or models of a formula in propositional logic. Many AI tasks, such as probabilistic inference and knowledge compilation, are computationally equivalent to model counting. It has been shown that model counting in even the most restrictive formalisms, such as Horn logic, monotone CNF and 2CNF, is intractable in the worst-case. Moreover, even approximate model counting remains a worst-case intractable problem. So far, most practical model counting algorithms are based on backtrack style algorithms such as the DPLL procedure. These algorithms typically yield exact counts but are limited to relatively small formulas. Our ApproxCount algorithm is based on SampleSat, an algorithm that samples quite effectively from the solution space of a propositional logic formula. We provide experimental results for formulas from a variety of domains. The algorithm produces good estimates for formulas much larger than those that can be handled by existing algorithms.
Joint work with Bart Selman.

April 8 "Scaling, Renormalization and Universality in Combinatorial Games: the Geometry of Chomp" by Adam Landsberg (visiting Associate Professor in Cornell ORIE)
Combinatorial games, which include chess, Go, checkers, Chomp, dots-and-boxes, and Nim, have both captivated and challenged mathematicians, computer scientists, and players alike. Analysis of these two-player games has generally relied upon a few beautiful analytical results or on numerical algorithms that combine heuristics with look-ahead approaches (alpha-beta pruning). Using Chomp as a prototype, we report on a new geometrical approach which unveils unexpected parallels between combinatorial games and key ideas from physics and dynamical systems, most notably notions of scaling, renormalization, universality, and chaotic attractors. Our central finding is that underlying the game is a probabilistic geometric structure that encodes essential information about the game, and that this structure exhibits a type of scale invariance: Loosely speaking, the geometry of small winning positions and large winning positions are the same after rescaling. (This general finding also holds for at least some other combinatorial games, as we explicitly demonstrate with a variant of Nim.) This geometric insight not only provides (probabilistic) answers to some open questions about Chomp, but suggests a natural pathway toward a new class of algorithms for more general combinatorial games, and hints at deeper links between such games and nonlinear science.
Joint Work with Eric Friedman, ORIE

April 15 "Query Chains: Learning to Rank from Implicit Feedback" by Filip Radlinski
Observing that users searching the web often perform a sequence, or chain, of queries with a similar information need, we propose a new approach for using clickthrough data to learn ranking functions for web search. Using query chains, we generate new types of preference judgments from search engine logs, thus taking advantage of user intelligence in reformulating queries. To validate our method we will present a comparison of the generated preference judgments from a controlled user study with explicit relevance judgments. We also implemented a real-world search engine to test our approach, using a modified ranking SVM to learn an improved ranking function. We will show that our results demonstrate a significant improvement in the ranking given by the search engine.
Joint work with Thorsten Joachims.

April 22 "Relaxations and Filtering in Combinatorial Optimization" by Meinolf Sellmann
For about 10 years now a very active research community has tried to combine the complementary strengths of Constraint Programming and Operations Research. CP is great in detecting and avoiding local inconsistencies (by means of 'filtering'), whereas OR has a lot to offer when it comes to achieving a holistic view on a problem (by means of 'relaxations'). In this talk I will describe an idea towards the integeration of relaxations and filtering: a decomposition method called Lagrangean relaxation can be used to exploit local structure for filtering while maintaining a global view on the problem.

April 29 "Error Bounds for Correlation Clustering" by Thorsten Joachims
While we have gained a detailed learning theoretical understanding of supervised learning over the last decades, our understanding of unsupervised clustering is still rather limited. For example, how much data is necessary so that a clustering algorithm outputs a reliable clustering? How does the amount of data depend on the distribution of the data? Is the particular clustering produced by some algorithm significant?
This talk addresses these questions for a particular graph-based clustering algorithm, namely correlation clustering [Bansal et al., 2002]. In particular, we give bounds on the error with which correlation clustering recovers the correct partition in a planted partition model [McSherry, 2001]. Using these bounds, we analyze how the accuracy of correlation clustering scales with the number of clusters and the sparsity of the graph. We also propose a statistical test that analyzes the significance of the clustering found by correlation clustering.
Joint work with John Hopcroft.

May 6 "Two Models for Semi-supervised Learning" by Tong Zhang
Semi-supervised learning, i.e. statistical machine learning methods that use both labeled and unlabeled data, have been actively investigated in recent years. In this talk, I will discuss two models for semi-supervised learning.
In the first part, I will present a statistical analysis of recently popularized graph (aka, manifold) based semi-supervised learning methods, which answers a number of previously unresolved questions. A critical component of our analysis is an equivalence of graph-based semi-supervised learning and supervised kernel learning, which we establish first. As an easy but important consequence, we prove the convergence of graph methods when the number of unlabeled data increases. We then show that graph based semi-supervised learning can be viewed as a special case of a class of unsupervised kernel design methods. By using a generalization analysis, we can show that good unsupervised kernel design will be helpful and we will present the optimal kernel. We will then use this result to show why graph-based methods are often effective. Moreover, under the framework of unsupervised kernel design, I can discuss the limitations of graph methods, alternatives, and the relationship of several different methods. Consequences of this analysis will be illustrated with empirical experiments.
In the second part of talk, I will present a new framework for semi-supervised learning that is highly effective. The idea is to learn predictive structures from many auxiliary problems that are created from the unlabeled data (and are related to the target problem), and then transfer the learned structure to the supervised target problem. I will present a general framework for structural learning with theoretical analysis. Under this framework, a bi-linear formulation, which can be solved by an iterative SVD procedure, will be proposed. I will then explain how to apply this idea to semi-supervised learning, and show that the approach is provably effective under certain models. Some empirical results will be given in the end.
Joint work with Rie Ando.

See also the AI graduate study brochure.

Please contact any of the faculty below if you'd like to give a talk this semester. We especially encourage graduate students to sign up!

CS772, Spring '05
Claire Cardie
Rich Caruana
Carla Gomes
Joe Halpern
Dan Huttenlocher
Thorsten Joachims
Lillian Lee
Bart Selman
Golan Yona
Ramin Zabih

Back to CS course websites

Date	Title/Speaker/Abstract/Host
February 11	*"Authorization Policies"* by Vicki Weissman An authorization policy states the conditions under which an action is permitted or forbidden. For example, in the world of James Bond, British Intelligence (BI) has the policy `any agent whose number begins with `00' is permitted to assassinate people as he sees fit' (i.e., has a license to kill). Now suppose that the British government has the policy `British subjects are not permitted to assassinate each other'. Should we conclude that James Bond, who is Agent 007, is not a British subject? If he is, may he assassinate others or not? To answer this question, as well as many others that are more practical in nature, Joe Halpern and I have discovered a fragment of first-order logic that we call Lithium. In this talk, I define the Lithium language and show how we can determine if an action is permitted or forbidden given (1) the policies of each policymaker (e.g., BI or the British government) , (2) merging rules (e.g., `BI's policies supercedes the government's' or `an action is permitted only if it's permitted by every policymaker'), and (3) a set of environment facts (e.g., `James Bond is Agent 007') all written in Lithium.
February 18	*"General Principles of Constraint Programming"* by Jean-Charles Regin Constraint programming (CP) is a general and powerful method for solving some combinatorial problems. This method has been successfully used to solve a large range of real-life applications (rostering, time-tabling, car manufacturing, scheduling etc...), which can be quite different. CP is on the borderline of AI and OR. CP inherits its language aspect (that is the easy way to define a problem), its flexibility (that is the fact that problems can be easily modified) and its ability to benefit from the domain specific knowledge from AI. Operations Research brings to CP a lot of nice and powerful algorithms which give it the computational speed needed to solve real world problems. In this talk I will introduce the general principles of this method. I will demonstrate its originality and its flexibility. I will then show its strength and its weakness. Notably, I will detail how some OR algorithms are used and why no restriction is made on the type of constraints that can be considered. I will also compare CP to several other approaches (like MIP, Greedy Algorithm, SAT, Local Search, etc.) and explain how all these methods are often combined thanks to CP in order to solve real world applications. Some computational results will be given.
February 25	cancelled
March 4	Prospective Ph.D. student visit day --- no AI Seminar
March 11	*"Evidence with Uncertain Likelihoods"* by Riccardo Pucella Many scenarios in AI can be modeled by having agents make decisions based on observations that help them decide between competing hypotheses. Intuitively, the observations may provide more evidence for some of the hypotheses than some others, which can help guide decisions. Evidence can be quantified in a number of ways, all essentially equivalent: they are based on comparing the likelihood of the observation under each hypothesis (i.e., assuming hypothesis h holds, what is the probability of observing ob?) The goal here is to understand what can be done when hypotheses do not determine the likelihood functions. I compare a number of approaches, attempting to justify them on the basis of their reasonableness for decision-making.
March 18 (cancelled)	*"Scaling, Renormalization and Universality in Combinatorial Games: the Geometry of Chomp"* by Adam Landsberg (visiting Associate Professor in Cornell ORIE) Combinatorial games, which include chess, Go, checkers, Chomp, dots-and-boxes, and Nim, have both captivated and challenged mathematicians, computer scientists, and players alike. Analysis of these two-player games has generally relied upon a few beautiful analytical results or on numerical algorithms that combine heuristics with look-ahead approaches (alpha-beta pruning). Using Chomp as a prototype, we report on a new geometrical approach which unveils unexpected parallels between combinatorial games and key ideas from physics and dynamical systems, most notably notions of scaling, renormalization, universality, and chaotic attractors. Our central finding is that underlying the game is a probabilistic geometric structure that encodes essential information about the game, and that this structure exhibits a type of scale invariance: Loosely speaking, the geometry of small winning positions and large winning positions are the same after rescaling. (This general finding also holds for at least some other combinatorial games, as we explicitly demonstrate with a variant of Nim.) This geometric insight not only provides (probabilistic) answers to some open questions about Chomp, but suggests a natural pathway toward a new class of algorithms for more general combinatorial games, and hints at deeper links between such games and nonlinear science. Joint Work with Eric Friedman, ORIE.
March 25	Spring Break
April 1	"Sampling, Counting, and Probabilistic Inference" by Wei Wei We introduce ApproxCount, an algorithm that approximates the number of satisfying assignments or models of a formula in propositional logic. Many AI tasks, such as probabilistic inference and knowledge compilation, are computationally equivalent to model counting. It has been shown that model counting in even the most restrictive formalisms, such as Horn logic, monotone CNF and 2CNF, is intractable in the worst-case. Moreover, even approximate model counting remains a worst-case intractable problem. So far, most practical model counting algorithms are based on backtrack style algorithms such as the DPLL procedure. These algorithms typically yield exact counts but are limited to relatively small formulas. Our ApproxCount algorithm is based on SampleSat, an algorithm that samples quite effectively from the solution space of a propositional logic formula. We provide experimental results for formulas from a variety of domains. The algorithm produces good estimates for formulas much larger than those that can be handled by existing algorithms. Joint work with Bart Selman.
April 8	*"Scaling, Renormalization and Universality in Combinatorial Games: the Geometry of Chomp"* by Adam Landsberg (visiting Associate Professor in Cornell ORIE) Combinatorial games, which include chess, Go, checkers, Chomp, dots-and-boxes, and Nim, have both captivated and challenged mathematicians, computer scientists, and players alike. Analysis of these two-player games has generally relied upon a few beautiful analytical results or on numerical algorithms that combine heuristics with look-ahead approaches (alpha-beta pruning). Using Chomp as a prototype, we report on a new geometrical approach which unveils unexpected parallels between combinatorial games and key ideas from physics and dynamical systems, most notably notions of scaling, renormalization, universality, and chaotic attractors. Our central finding is that underlying the game is a probabilistic geometric structure that encodes essential information about the game, and that this structure exhibits a type of scale invariance: Loosely speaking, the geometry of small winning positions and large winning positions are the same after rescaling. (This general finding also holds for at least some other combinatorial games, as we explicitly demonstrate with a variant of Nim.) This geometric insight not only provides (probabilistic) answers to some open questions about Chomp, but suggests a natural pathway toward a new class of algorithms for more general combinatorial games, and hints at deeper links between such games and nonlinear science. Joint Work with Eric Friedman, ORIE
April 15	"Query Chains: Learning to Rank from Implicit Feedback" by Filip Radlinski Observing that users searching the web often perform a sequence, or chain, of queries with a similar information need, we propose a new approach for using clickthrough data to learn ranking functions for web search. Using query chains, we generate new types of preference judgments from search engine logs, thus taking advantage of user intelligence in reformulating queries. To validate our method we will present a comparison of the generated preference judgments from a controlled user study with explicit relevance judgments. We also implemented a real-world search engine to test our approach, using a modified ranking SVM to learn an improved ranking function. We will show that our results demonstrate a significant improvement in the ranking given by the search engine. Joint work with Thorsten Joachims.
April 22	"Relaxations and Filtering in Combinatorial Optimization" by Meinolf Sellmann For about 10 years now a very active research community has tried to combine the complementary strengths of Constraint Programming and Operations Research. CP is great in detecting and avoiding local inconsistencies (by means of 'filtering'), whereas OR has a lot to offer when it comes to achieving a holistic view on a problem (by means of 'relaxations'). In this talk I will describe an idea towards the integeration of relaxations and filtering: a decomposition method called Lagrangean relaxation can be used to exploit local structure for filtering while maintaining a global view on the problem.
April 29	"Error Bounds for Correlation Clustering" by Thorsten Joachims While we have gained a detailed learning theoretical understanding of supervised learning over the last decades, our understanding of unsupervised clustering is still rather limited. For example, how much data is necessary so that a clustering algorithm outputs a reliable clustering? How does the amount of data depend on the distribution of the data? Is the particular clustering produced by some algorithm significant? This talk addresses these questions for a particular graph-based clustering algorithm, namely correlation clustering [Bansal et al., 2002]. In particular, we give bounds on the error with which correlation clustering recovers the correct partition in a planted partition model [McSherry, 2001]. Using these bounds, we analyze how the accuracy of correlation clustering scales with the number of clusters and the sparsity of the graph. We also propose a statistical test that analyzes the significance of the clustering found by correlation clustering. Joint work with John Hopcroft.
May 6	"Two Models for Semi-supervised Learning" by Tong Zhang Semi-supervised learning, i.e. statistical machine learning methods that use both labeled and unlabeled data, have been actively investigated in recent years. In this talk, I will discuss two models for semi-supervised learning. In the first part, I will present a statistical analysis of recently popularized graph (aka, manifold) based semi-supervised learning methods, which answers a number of previously unresolved questions. A critical component of our analysis is an equivalence of graph-based semi-supervised learning and supervised kernel learning, which we establish first. As an easy but important consequence, we prove the convergence of graph methods when the number of unlabeled data increases. We then show that graph based semi-supervised learning can be viewed as a special case of a class of unsupervised kernel design methods. By using a generalization analysis, we can show that good unsupervised kernel design will be helpful and we will present the optimal kernel. We will then use this result to show why graph-based methods are often effective. Moreover, under the framework of unsupervised kernel design, I can discuss the limitations of graph methods, alternatives, and the relationship of several different methods. Consequences of this analysis will be illustrated with empirical experiments. In the second part of talk, I will present a new framework for semi-supervised learning that is highly effective. The idea is to learn predictive structures from many auxiliary problems that are created from the unlabeled data (and are related to the target problem), and then transfer the learned structure to the supervised target problem. I will present a general framework for structural learning with theoretical analysis. Under this framework, a bi-linear formulation, which can be solved by an iterative SVD procedure, will be proposed. I will then explain how to apply this idea to semi-supervised learning, and show that the approach is provably effective under certain models. Some empirical results will be given in the end. Joint work with Rie Ando.

Artificial Intelligence Seminar

Spring 2005 Friday 12:00-1:15 Upson 5130

Sponsored by the Intelligent Information Systems Institute (IISI), Computing and Information Science, Cornell

Spring 2005
Friday 12:00-1:15
Upson 5130

Sponsored by the Intelligent Information Systems Institute (IISI),
Computing and Information Science, Cornell