The AI seminar will meet weekly for lectures by
graduate students, faculty, and researchers emphasizing work-in-progress and
recent results in AI research. Lunch will be served starting at noon, with the
talks running between 12:15 and 1:15. The new format is designed to allow AI
chit-chat before the talks begin. Also, we're trying to make some of the
presentations less formal so that students and faculty will feel comfortable
using the seminar to give presentations about work in progress or practice talks
for conferences.
| Date |
Title/Speaker/Abstract/Host |
| January 27 |
Representations and Algorithms for Monitoring Dynamic Systems Avi Pfeffer, Havard University, EECS Continually monitoring the
state of a dynamic system is an important problem for artificial
intelligence. Dynamic Bayesian networks (DBNs) provide for compact
representation of probabilistic dynamic models. However the monitoring task
is extremely difficult even for well-factored DBNs. Therefore approximate
monitoring algorithms are needed. One family of approximate monitoring
algorithms is based on the idea of factoring the joint distribution over the
state of the system into a product of distributions over factors consisting
of subsets of variables. Factoring relies on the notion of weak interaction
between subsystems. We identify a new notion of weak interaction called
separability, and show that it leads to the property that, in order to
compute the factor distributions at one point in time, only the factored
distributions at the previous time point are needed. We also define an
approximate form of separability. We show that separability and approximate
separability lead to very good approximations for the monitoring task. |
| February 3 | No AI Seminar (CS Field Meeting) |
| February 10 |
Life-Sized Learning Leslie Pack Kaelbling, Massachusetts Institute of Technology, CSAIL
In the last 10 years, the combination of techniques from machine learning
and operations research has allowed major advances in learning and planning
for uncertain environments. Reasonably large problems can be solved using
current techniques. But what if we want to scale up to the uncertain
learning and planning problem that you face every day? It is many orders of
magnitude larger than the biggest problem we can solve currently. |
| February 17 |
Factored planning: How, When, and When Not Carmel Domshlak, Technion, Industrial Engineering and Management.
Automated domain factoring in classical planning, and planning methods that
utilize problem factorization, have long been of interest to planning
researchers. Recent work in this area yielded new theoretical insight and
algorithms, but left many questions open: How to decompose a domain into
factors? How to work with these factors? And whether and when
decomposition-based methods are useful? |
| February 24 | |
| March 3 | No AI Seminar (Ph.D. Student Visit Day) |
| March 10 | |
| March 17 |
Mining Citizen Science Data to Predict Abundance of Wild Bird Species Mirek Riedewald, Daria Sorokina, Art Munson The Cornell Laboratory of Ornithology's mission is to interpret and conserve the earth's biological diversity through research, education, and citizen science focused on birds. Over the years, the Lab has accumulated one of the largest and longest-running collections of environmental data sets in existence. These data sets are growing rapidly, thanks to thousands of bird enthusiasts who are participating in different bird observation projects and are voluntarily reporting bird sightings. The data sets are not only large, but also have many attributes, contain many missing values, and potentially are very noisy. The ecologists are interested in identifying which features have the strongest effect on the distribution and abundance of bird species as well describing the forms of these relationships. Statistical and human-expert centered techniques that have been traditionally used by the community do not scale to the size and dimensionality of the accumulated data. We show how data mining can be successfully applied, enabling the ecologists to discover unanticipated relationships. We compare a variety of methods for measuring attribute importance with respect to the probability of a bird being observed at a feeder and discuss the biological relevance of the results. Joint work of Rich Caruana, Mohamed Elhawary, Art Munson, Mirek Riedewald, Daria Sorokina from Computer Science; Daniel Fink, Wesley M. Hochachka, Steve Kelling from the Lab of Ornithology. |
| March 24 | Spring Break |
| March 31 | No AI Seminar (ACSU Lunch) |
| April 7 | Treebank Feature Constraint Grammars /
Estimation of Valences by a Modified Inside-Outside Procedure Mats Rooth, Cornell University, Linguistics Treebank-aligned feature constraint grammars are grammars whose context free backbone is aligned with the treebank. I'll describe a development methodology for these grammars based on parsing technology. The result is a treebank database augmented with linguistic features, a constraint grammar which licenses that tree set, and PCFGs which are compiled by incorporating features. The second part of the talk discusses an experiment in progress on estimating parameters for treebank PCFGs with incorporated verbal valence features. Lexical parameters are estimated by combining frequencies from the feature-augmented treebank with frequencies computed by the inside-outside algorithm in an independent (and potentially much larger) sample. This is realized in an iterative procedure which interleaves a frequency transformation into iterative inside-outside estimation of PCFGs. |
| April 14 |
Weakly Supervised Learning of Part-Based Spatial Models for Visual
Object Recognition David Crandall In this talk we present a new method of learning part-based models for
visual object recognition, from training data that only provides
information about class membership (and not object location or
configuration). This method learns both a model of local part appearance
and a model of the spatial relations between those parts. In contrast,
other work using such a weakly supervised learning paradigm has not
considered the problem of simultaneously learning appearance and spatial
models. Some of these methods use a ``bag'' model where only part
appearance is considered whereas other methods learn spatial models but
only given the output of a particular feature detector. Previous
techniques for learning both part appearance and spatial relations have
instead used a highly supervised learning process that provides
substantial information about object part location. We show that our
weakly supervised technique produces better results than these previous
highly supervised methods. Moreover, we investigate the degree to which
both richer spatial models and richer appearance models are helpful in
improving recognition performance. Our results show that while both
spatial and appearance information can be useful, the effect on
performance depends substantially on the particular object class and on
the difficulty of the test dataset. |
| April 21 |
Inductive Transfer for Bayesian Network Structure Learning (A-Exam
Presentation) Alexandru Niculescu-Mizil We consider the problem of learning Bayes Net structures for related tasks. We present an algorithm for learning Bayes Net structures that takes advantage of the similarity between tasks by biasing learning toward similar structures for each task. Heuristic search is used to find a high scoring set of structures (one for each task), where the score for a set of structures is computed in a principled way. Experiments on synthetic problems generated from the ALARM and INSURANCE networks show that learning the structures for related tasks using the proposed method yields better results than learning the structures independently. Joint work with Rich Caruana. |
| April 28 | No AI Seminar (NESCAI Conference) |
| May 5 |
Clustering and Prediction in Text and Databases Lyle Ungar, University of Pennsylvania, Computing and Information Science. This talk describes two new ways of combining clustering and
prediction to discover latent structure and build predictive models for text
and database mining. A new clustering method, CPCM, learns to optimally
weight different features in a distance metric by finding "predictable"
clusters (e.g. of people, products or documents). Streamwise feature
selection dynamically decides which features to test for inclusion in a
predictive model, allowing search over billions of potential features such
as words, products, people, and their clusters and interactions. Streamwise
feature search is particularly important in searching over relational data,
where generating features is more expensive than testing them. |
| May 12 |
Information Theoretic Analysis of Biological Data Noam Slonim, Princeton University In recent years, researchers have been facing a rapid increase in the available biological data. These data come in a variety of forms - complete genome sequences, mRNA transcriptional profiles, protein-protein interactions, and so forth. Automatic data analysis methods are often the only route for extracting meaningful insights into these data. Existing techniques, however, typically employ nontrivial assumptions. These assumptions might be explicit, as in assuming a specific model which reflects one's prior beliefs about the data; or implicit, as in arbitrarily specifying a correlation or a ``similarity'' measure which lies at the core of any further analysis. While it is clear that such assumptions should be avoided, the conventional wisdom is that in practice they are actually unavoidable. In this talk I will describe an information theoretic framework that allows to extract biologically important insights without any prior assumptions about the nature of the data for a wide variety of problems. I will briefly discuss several recent applications of this approach, and will present in more detail results for systematic genotype-phenotype association in bacteria and archaea. |
See also the AI graduate study brochure.
Please contact any of the faculty below if you'd like to give a talk this semester. We especially encourage graduate students to sign up!
Back to CS course websites