Artificial Intelligence Seminar

Spring 2006
Friday 12:00-1:15
Upson 5130

Sponsored by the Intelligent Information Systems Institute (IISI),
Computing and Information Science, Cornell

The AI seminar will meet weekly for lectures by graduate students, faculty, and researchers emphasizing work-in-progress and recent results in AI research. Lunch will be served starting at noon, with the talks running between 12:15 and 1:15. The new format is designed to allow AI chit-chat before the talks begin. Also, we're trying to make some of the presentations less formal so that students and faculty will feel comfortable using the seminar to give presentations about work in progress or practice talks for conferences.

Date
Title/Speaker/Abstract/Host

January 27 Representations and Algorithms for Monitoring Dynamic Systems
Avi Pfeffer, Havard University, EECS
Continually monitoring the state of a dynamic system is an important problem for artificial intelligence. Dynamic Bayesian networks (DBNs) provide for compact representation of probabilistic dynamic models. However the monitoring task is extremely difficult even for well-factored DBNs. Therefore approximate monitoring algorithms are needed. One family of approximate monitoring algorithms is based on the idea of factoring the joint distribution over the state of the system into a product of distributions over factors consisting of subsets of variables. Factoring relies on the notion of weak interaction between subsystems. We identify a new notion of weak interaction called separability, and show that it leads to the property that, in order to compute the factor distributions at one point in time, only the factored distributions at the previous time point are needed. We also define an approximate form of separability. We show that separability and approximate separability lead to very good approximations for the monitoring task.

Unfortunately, sometimes the factoring approach is computationally infeasible. An alternative approach to approximate monitoring is particle filtering (PF), in which the joint distribution over the state of the system is approximated by a set of samples, or particles. In high dimensional spaces, the variance of PF is high and too many particles are required to provide good performance. We improve the performance of PF by introducing factoring, maintaining particles over factors instead of the global state space. This has the effect of reducing the variance of PF and so reducing its error. Maintaining factored particles also allows us to improve PF by looking ahead to future evidence before deciding which particles to propagate, thus leading to much better accuracy.

February 3 No AI Seminar (CS Field Meeting)

February 10 Life-Sized Learning
Leslie Pack Kaelbling, Massachusetts Institute of Technology, CSAIL
In the last 10 years, the combination of techniques from machine learning and operations research has allowed major advances in learning and planning for uncertain environments. Reasonably large problems can be solved using current techniques. But what if we want to scale up to the uncertain learning and planning problem that you face every day? It is many orders of magnitude larger than the biggest problem we can solve currently.

In this talk, I'll describe three early pieces of work that try to begin to address working in truly huge environments. The first is a method for learning probabilistic rules to describe naive physics models of the interactions between objects. The second is an uncertain planning algorithm that uses the rules learned by the first method to construct contingency plans that consider enough cases to perform robustly, but are much smaller than complete policies. The last piece is preliminary work on combining multiple abstraction methods dynamically, in order to allow an agent to have a working model of the environment that changes focus depending on the current situation.

February 17 Factored planning: How, When, and When Not
Carmel Domshlak, Technion, Industrial Engineering and Management.
Automated domain factoring in classical planning, and planning methods that utilize problem factorization, have long been of interest to planning researchers. Recent work in this area yielded new theoretical insight and algorithms, but left many questions open: How to decompose a domain into factors? How to work with these factors? And whether and when decomposition-based methods are useful?

In this talk I'll present our recent theoretical analysis that answers many of these questions: it proposes a novel approach to factored planning; proves its theoretical superiority over previous methods; provides insight into how to factor domains; and uses its novel complexity results to analyze when factored planning is likely to perform well, and when not. In particular, our analysis establishes the key role played by the domain's causal graph in the complexity analysis of algorithms for classical planning.

Joint work with Ronen Brafman, Ben-Gurion Univ. (currently in Stanford.)

February 24

March 3 No AI Seminar (Ph.D. Student Visit Day)

March 10

March 17 Mining Citizen Science Data to Predict Abundance of Wild Bird Species
Mirek Riedewald, Daria Sorokina, Art Munson

The Cornell Laboratory of Ornithology's mission is to interpret and conserve the earth's biological diversity through research, education, and citizen science focused on birds. Over the years, the Lab has accumulated one of the largest and longest-running collections of environmental data sets in existence. These data sets are growing rapidly, thanks to thousands of bird enthusiasts who are participating in different bird observation projects and are voluntarily reporting bird sightings. The data sets are not only large, but also have many attributes, contain many missing values, and potentially are very noisy. The ecologists are interested in identifying which features have the strongest effect on the distribution and abundance of bird species as well describing the forms of these relationships. Statistical and human-expert centered techniques that have been traditionally used by the community do not scale to the size and dimensionality of the accumulated data. We show how data mining can be successfully applied, enabling the ecologists to discover unanticipated relationships. We compare a variety of methods for measuring attribute importance with respect to the probability of a bird being observed at a feeder and discuss the biological relevance of the results.
Joint work of Rich Caruana, Mohamed Elhawary, Art Munson, Mirek Riedewald, Daria Sorokina from Computer Science; Daniel Fink, Wesley M. Hochachka, Steve Kelling from the Lab of Ornithology.

March 24 Spring Break

March 31 No AI Seminar (ACSU Lunch)

April 7 Treebank Feature Constraint Grammars / Estimation of Valences by a Modified Inside-Outside Procedure
Mats Rooth, Cornell University, Linguistics
Treebank-aligned feature constraint grammars are grammars whose context free backbone is aligned with the treebank. I'll describe a development methodology for these grammars based on parsing technology. The result is a treebank database augmented with linguistic features, a constraint grammar which licenses that tree set, and PCFGs which are compiled by incorporating features. The second part of the talk discusses an experiment in progress on estimating parameters for treebank PCFGs with incorporated verbal valence features. Lexical parameters are estimated by combining frequencies from the feature-augmented treebank with frequencies computed by the inside-outside algorithm in an independent (and potentially much larger) sample. This is realized in an iterative procedure which interleaves a frequency transformation into iterative inside-outside estimation of PCFGs.

April 14 Weakly Supervised Learning of Part-Based Spatial Models for Visual Object Recognition
David Crandall
In this talk we present a new method of learning part-based models for visual object recognition, from training data that only provides information about class membership (and not object location or configuration). This method learns both a model of local part appearance and a model of the spatial relations between those parts. In contrast, other work using such a weakly supervised learning paradigm has not considered the problem of simultaneously learning appearance and spatial models. Some of these methods use a ``bag'' model where only part appearance is considered whereas other methods learn spatial models but only given the output of a particular feature detector. Previous techniques for learning both part appearance and spatial relations have instead used a highly supervised learning process that provides substantial information about object part location. We show that our weakly supervised technique produces better results than these previous highly supervised methods. Moreover, we investigate the degree to which both richer spatial models and richer appearance models are helpful in improving recognition performance. Our results show that while both spatial and appearance information can be useful, the effect on performance depends substantially on the particular object class and on the difficulty of the test dataset.

This is a pratice talk for a presentation at the European Conference on Computer Vision (ECCV) in May.

This is joint work with Dan Huttenlocher.

April 21 Inductive Transfer for Bayesian Network Structure Learning (A-Exam Presentation)
Alexandru Niculescu-Mizil
We consider the problem of learning Bayes Net structures for related tasks. We present an algorithm for learning Bayes Net structures that takes advantage of the similarity between tasks by biasing learning toward similar structures for each task. Heuristic search is used to find a high scoring set of structures (one for each task), where the score for a set of structures is computed in a principled way. Experiments on synthetic problems generated from the ALARM and INSURANCE networks show that learning the structures for related tasks using the proposed method yields better results than learning the structures independently.

Joint work with Rich Caruana.

April 28 No AI Seminar (NESCAI Conference)

May 5 Clustering and Prediction in Text and Databases
Lyle Ungar, University of Pennsylvania, Computing and Information Science.
This talk describes two new ways of combining clustering and prediction to discover latent structure and build predictive models for text and database mining. A new clustering method, CPCM, learns to optimally weight different features in a distance metric by finding "predictable" clusters (e.g. of people, products or documents). Streamwise feature selection dynamically decides which features to test for inclusion in a predictive model, allowing search over billions of potential features such as words, products, people, and their clusters and interactions. Streamwise feature search is particularly important in searching over relational data, where generating features is more expensive than testing them.

Joint work with Dean Foster, Abhishek Gupta, Sasha Popescul, and Jing Zhou.

May 12 Information Theoretic Analysis of Biological Data
Noam Slonim, Princeton University
In recent years, researchers have been facing a rapid increase in the available biological data. These data come in a variety of forms - complete genome sequences, mRNA transcriptional profiles, protein-protein interactions, and so forth. Automatic data analysis methods are often the only route for extracting meaningful insights into these data. Existing techniques, however, typically employ nontrivial assumptions. These assumptions might be explicit, as in assuming a specific model which reflects one's prior beliefs about the data; or implicit, as in arbitrarily specifying a correlation or a ``similarity'' measure which lies at the core of any further analysis. While it is clear that such assumptions should be avoided, the conventional wisdom is that in practice they are actually unavoidable. In this talk I will describe an information theoretic framework that allows to extract biologically important insights without any prior assumptions about the nature of the data for a wide variety of problems. I will briefly discuss several recent applications of this approach, and will present in more detail results for systematic genotype-phenotype association in bacteria and archaea.

See also the AI graduate study brochure.

Please contact any of the faculty below if you'd like to give a talk this semester. We especially encourage graduate students to sign up!

CS772, Spring '06
Claire Cardie
Rich Caruana
Carla Gomes
Joe Halpern
Dan Huttenlocher
Thorsten Joachims
Lillian Lee
Bart Selman
Ramin Zabih

Back to CS course websites

Date	Title/Speaker/Abstract/Host
January 27	Representations and Algorithms for Monitoring Dynamic Systems Avi Pfeffer, Havard University, EECS Continually monitoring the state of a dynamic system is an important problem for artificial intelligence. Dynamic Bayesian networks (DBNs) provide for compact representation of probabilistic dynamic models. However the monitoring task is extremely difficult even for well-factored DBNs. Therefore approximate monitoring algorithms are needed. One family of approximate monitoring algorithms is based on the idea of factoring the joint distribution over the state of the system into a product of distributions over factors consisting of subsets of variables. Factoring relies on the notion of weak interaction between subsystems. We identify a new notion of weak interaction called separability, and show that it leads to the property that, in order to compute the factor distributions at one point in time, only the factored distributions at the previous time point are needed. We also define an approximate form of separability. We show that separability and approximate separability lead to very good approximations for the monitoring task. Unfortunately, sometimes the factoring approach is computationally infeasible. An alternative approach to approximate monitoring is particle filtering (PF), in which the joint distribution over the state of the system is approximated by a set of samples, or particles. In high dimensional spaces, the variance of PF is high and too many particles are required to provide good performance. We improve the performance of PF by introducing factoring, maintaining particles over factors instead of the global state space. This has the effect of reducing the variance of PF and so reducing its error. Maintaining factored particles also allows us to improve PF by looking ahead to future evidence before deciding which particles to propagate, thus leading to much better accuracy.
February 3	No AI Seminar (CS Field Meeting)
February 10	Life-Sized Learning Leslie Pack Kaelbling, Massachusetts Institute of Technology, CSAIL In the last 10 years, the combination of techniques from machine learning and operations research has allowed major advances in learning and planning for uncertain environments. Reasonably large problems can be solved using current techniques. But what if we want to scale up to the uncertain learning and planning problem that you face every day? It is many orders of magnitude larger than the biggest problem we can solve currently. In this talk, I'll describe three early pieces of work that try to begin to address working in truly huge environments. The first is a method for learning probabilistic rules to describe naive physics models of the interactions between objects. The second is an uncertain planning algorithm that uses the rules learned by the first method to construct contingency plans that consider enough cases to perform robustly, but are much smaller than complete policies. The last piece is preliminary work on combining multiple abstraction methods dynamically, in order to allow an agent to have a working model of the environment that changes focus depending on the current situation.
February 17	Factored planning: How, When, and When Not Carmel Domshlak, Technion, Industrial Engineering and Management. Automated domain factoring in classical planning, and planning methods that utilize problem factorization, have long been of interest to planning researchers. Recent work in this area yielded new theoretical insight and algorithms, but left many questions open: How to decompose a domain into factors? How to work with these factors? And whether and when decomposition-based methods are useful? In this talk I'll present our recent theoretical analysis that answers many of these questions: it proposes a novel approach to factored planning; proves its theoretical superiority over previous methods; provides insight into how to factor domains; and uses its novel complexity results to analyze when factored planning is likely to perform well, and when not. In particular, our analysis establishes the key role played by the domain's causal graph in the complexity analysis of algorithms for classical planning. Joint work with Ronen Brafman, Ben-Gurion Univ. (currently in Stanford.)
February 24
March 3	No AI Seminar (Ph.D. Student Visit Day)
March 10
March 17	Mining Citizen Science Data to Predict Abundance of Wild Bird Species Mirek Riedewald, Daria Sorokina, Art Munson The Cornell Laboratory of Ornithology's mission is to interpret and conserve the earth's biological diversity through research, education, and citizen science focused on birds. Over the years, the Lab has accumulated one of the largest and longest-running collections of environmental data sets in existence. These data sets are growing rapidly, thanks to thousands of bird enthusiasts who are participating in different bird observation projects and are voluntarily reporting bird sightings. The data sets are not only large, but also have many attributes, contain many missing values, and potentially are very noisy. The ecologists are interested in identifying which features have the strongest effect on the distribution and abundance of bird species as well describing the forms of these relationships. Statistical and human-expert centered techniques that have been traditionally used by the community do not scale to the size and dimensionality of the accumulated data. We show how data mining can be successfully applied, enabling the ecologists to discover unanticipated relationships. We compare a variety of methods for measuring attribute importance with respect to the probability of a bird being observed at a feeder and discuss the biological relevance of the results. Joint work of Rich Caruana, Mohamed Elhawary, Art Munson, Mirek Riedewald, Daria Sorokina from Computer Science; Daniel Fink, Wesley M. Hochachka, Steve Kelling from the Lab of Ornithology.
March 24	Spring Break
March 31	No AI Seminar (ACSU Lunch)
April 7	Treebank Feature Constraint Grammars / Estimation of Valences by a Modified Inside-Outside Procedure Mats Rooth, Cornell University, Linguistics Treebank-aligned feature constraint grammars are grammars whose context free backbone is aligned with the treebank. I'll describe a development methodology for these grammars based on parsing technology. The result is a treebank database augmented with linguistic features, a constraint grammar which licenses that tree set, and PCFGs which are compiled by incorporating features. The second part of the talk discusses an experiment in progress on estimating parameters for treebank PCFGs with incorporated verbal valence features. Lexical parameters are estimated by combining frequencies from the feature-augmented treebank with frequencies computed by the inside-outside algorithm in an independent (and potentially much larger) sample. This is realized in an iterative procedure which interleaves a frequency transformation into iterative inside-outside estimation of PCFGs.
April 14	Weakly Supervised Learning of Part-Based Spatial Models for Visual Object Recognition David Crandall In this talk we present a new method of learning part-based models for visual object recognition, from training data that only provides information about class membership (and not object location or configuration). This method learns both a model of local part appearance and a model of the spatial relations between those parts. In contrast, other work using such a weakly supervised learning paradigm has not considered the problem of simultaneously learning appearance and spatial models. Some of these methods use a ``bag'' model where only part appearance is considered whereas other methods learn spatial models but only given the output of a particular feature detector. Previous techniques for learning both part appearance and spatial relations have instead used a highly supervised learning process that provides substantial information about object part location. We show that our weakly supervised technique produces better results than these previous highly supervised methods. Moreover, we investigate the degree to which both richer spatial models and richer appearance models are helpful in improving recognition performance. Our results show that while both spatial and appearance information can be useful, the effect on performance depends substantially on the particular object class and on the difficulty of the test dataset. This is a pratice talk for a presentation at the European Conference on Computer Vision (ECCV) in May. This is joint work with Dan Huttenlocher.
April 21	Inductive Transfer for Bayesian Network Structure Learning (A-Exam Presentation) Alexandru Niculescu-Mizil We consider the problem of learning Bayes Net structures for related tasks. We present an algorithm for learning Bayes Net structures that takes advantage of the similarity between tasks by biasing learning toward similar structures for each task. Heuristic search is used to find a high scoring set of structures (one for each task), where the score for a set of structures is computed in a principled way. Experiments on synthetic problems generated from the ALARM and INSURANCE networks show that learning the structures for related tasks using the proposed method yields better results than learning the structures independently. Joint work with Rich Caruana.
April 28	No AI Seminar (NESCAI Conference)
May 5	Clustering and Prediction in Text and Databases Lyle Ungar, University of Pennsylvania, Computing and Information Science. This talk describes two new ways of combining clustering and prediction to discover latent structure and build predictive models for text and database mining. A new clustering method, CPCM, learns to optimally weight different features in a distance metric by finding "predictable" clusters (e.g. of people, products or documents). Streamwise feature selection dynamically decides which features to test for inclusion in a predictive model, allowing search over billions of potential features such as words, products, people, and their clusters and interactions. Streamwise feature search is particularly important in searching over relational data, where generating features is more expensive than testing them. Joint work with Dean Foster, Abhishek Gupta, Sasha Popescul, and Jing Zhou.
May 12	Information Theoretic Analysis of Biological Data Noam Slonim, Princeton University In recent years, researchers have been facing a rapid increase in the available biological data. These data come in a variety of forms - complete genome sequences, mRNA transcriptional profiles, protein-protein interactions, and so forth. Automatic data analysis methods are often the only route for extracting meaningful insights into these data. Existing techniques, however, typically employ nontrivial assumptions. These assumptions might be explicit, as in assuming a specific model which reflects one's prior beliefs about the data; or implicit, as in arbitrarily specifying a correlation or a ``similarity'' measure which lies at the core of any further analysis. While it is clear that such assumptions should be avoided, the conventional wisdom is that in practice they are actually unavoidable. In this talk I will describe an information theoretic framework that allows to extract biologically important insights without any prior assumptions about the nature of the data for a wide variety of problems. I will briefly discuss several recent applications of this approach, and will present in more detail results for systematic genotype-phenotype association in bacteria and archaea.

Artificial Intelligence Seminar

Spring 2006 Friday 12:00-1:15 Upson 5130

Sponsored by the Intelligent Information Systems Institute (IISI), Computing and Information Science, Cornell

Spring 2006
Friday 12:00-1:15
Upson 5130

Sponsored by the Intelligent Information Systems Institute (IISI),
Computing and Information Science, Cornell