CS 772: Artificial Intelligence Seminar

Fall 2002
Friday 12:00-1:15
Upson 5130

Lunch will be served starting at noon. Talks begin at 12:15 (to accommodate classes that end at 12:05) and end by 1:15 (to accommodate 1:25 classes). The new format is designed to allow time for AI chit-chat before the talks begin. We're also trying to make some of the presentations less formal, so that students and faculty feel comfortable using the seminar to give presentations about work in progress or practice talks for conferences.

******************************************************************
The AI seminar is sponsored by the Intelligent Information Systems
Institute (IISI), Computing and Information Science, Cornell.
******************************************************************

Schedule

 

Date    Speaker/Title/Abstract/Host

September 6 ***no class***
September 13 ***no class - CP 2002***

September 20

Alex Niculescu-Mizil
Convolutional Networks for Video-Sequence Classification

Convolutional networks are a neural network architecture designed for inputs whose components have spatial relations among them, such as the pixels of an image. Among other advantages, this architecture provides some shift invariance. This talk will describe the work I did at NECI over the summer. The task was to classify video sequences using convolutional networks. We captured four-second sequences from TV of baseball, golf, auto racing, soccer, football, basketball, and bicycle racing. At this point the network is trained to classify still images; to classify a video sequence, we take a vote over the predictions for its individual frames. I will briefly describe the process of getting, labeling, and cleaning the data, and I will also give a short overview of convolutional networks. This work is far from complete.

Joint work with Yann LeCun, Leon Bottou, and Steve Lawrence.
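
Here's a rough sketch, in Python, of the frame-voting step the abstract describes (hypothetical code, not the actual NECI system; only the list of sports comes from the abstract):

    import numpy as np

    SPORTS = ["baseball", "golf", "auto racing", "soccer",
              "football", "basketball", "bicycle racing"]

    def classify_sequence(frame_scores):
        """frame_scores: (num_frames, num_classes) array of per-frame
        network outputs, one row per frame of the video sequence."""
        per_frame_votes = frame_scores.argmax(axis=1)   # winning class per frame
        counts = np.bincount(per_frame_votes, minlength=len(SPORTS))
        return SPORTS[counts.argmax()]                  # most-voted class wins

    # Example: 100 frames of a hypothetical four-second clip (25 fps)
    rng = np.random.default_rng(0)
    print(classify_sequence(rng.random((100, len(SPORTS)))))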

Host: Rich

September 27 Jean-Charles Regin, ILOG

Host: Carla

October 4 Dan Huttenlocher

Pictorial Structures for Object Recognition

In this talk I will present a statistical framework for modeling the
appearance of objects in images. The basic idea is to represent an
object by a collection of parts arranged in a deformable
configuration. The appearance of each part is modeled separately, and
the deformable configuration is represented by spring-like connections
between pairs of parts. These models allow for qualitative
descriptions of objects, and are suitable for recognizing generic
classes of objects such as people, animals or human faces. While
similar representations were proposed over 30 years ago, practical
recognition methods using such representations have proven difficult
to develop. One of the main contributions of our work is the
introduction of efficient algorithms both for recognition and for
learning object models from examples.

This is joint work with Pedro Felzenszwalb of MIT.
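
A minimal sketch of the kind of cost such a model assigns to one candidate configuration (hypothetical Python; the part names, dummy appearance costs, and quadratic spring are illustrative assumptions, not the authors' implementation):

    import numpy as np

    def configuration_cost(locations, appearance_cost, edges, ideal_offset, k=1.0):
        """Score one placement of all parts: appearance cost per part plus
        a quadratic 'spring' penalty per connected pair of parts."""
        cost = sum(appearance_cost[p](*locations[p]) for p in locations)
        for i, j in edges:
            dx, dy = np.subtract(locations[j], locations[i])
            ix, iy = ideal_offset[(i, j)]
            cost += k * ((dx - ix) ** 2 + (dy - iy) ** 2)  # spring deformation
        return cost

    # Two hypothetical parts: a "head" that should sit 6 pixels above a "torso"
    app = {"head": lambda x, y: 0.0, "torso": lambda x, y: 0.0}  # dummy costs
    locs = {"head": (5, 2), "torso": (5, 8)}
    print(configuration_cost(locs, app, [("head", "torso")],
                             {("head", "torso"): (0, 6)}))       # 0.0: ideal fit

The sketch only scores a single placement; the efficient algorithms the abstract mentions are about minimizing this kind of cost over all placements of all parts at once.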

Host: Dan (and Rich)

October 11 ***no class***
October 18 no AI seminar scheduled:
Josh Tenenbaum of MIT at the Psychology colloquium, 3:30pm, 202 Uris Hall.

Bayesian Models of Human Learning and Reasoning

How can people learn the meaning of a new word from just a few examples? What makes a set of examples more or less representative of a concept? What makes two examples of a category seem more or less similar? Why are some generalizations apparently based on all-or-none rules while others appear to be based on gradients of similarity? I will describe an approach to explaining these aspects of everyday induction in terms of rational statistical inference.

In our Bayesian models, learning and reasoning are explained in terms of probability computations over a hypothesis space of possible concepts, word meanings, or generalizations. The structure of the learner's hypothesis spaces reflects their domain-specific prior knowledge, while the nature of the probability computations depends on domain-general statistical principles. The hypotheses can be thought of as either potential rules for abstraction or potential features for similarity, with the shape of the learner's posterior probability distribution determining whether generalization appears more rule-based or similarity-based.

Bayesian models thus offer an alternative to classical accounts of learning and reasoning that rest on a single route to knowledge -- domain-general statistics or domain-specific constraints -- or a single representational paradigm -- abstract rules or exemplar similarity. This talk will illustrate the Bayesian approach to modeling learning and reasoning on a range of behavioral case studies, and contrast its explanations with those of more traditional process models.
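
A toy sketch of the kind of computation described above (hypothetical Python in the style of a number-concept example; the hypotheses, uniform prior, and size-principle likelihood are illustrative assumptions):

    from fractions import Fraction

    hypotheses = {                        # concept name -> extension (a set)
        "even numbers":    set(range(2, 101, 2)),
        "multiples of 10": set(range(10, 101, 10)),
        "powers of 2":     {2, 4, 8, 16, 32, 64},
    }
    prior = {h: Fraction(1, len(hypotheses)) for h in hypotheses}

    def posterior(examples):
        """P(h | examples) with likelihood (1/|h|)^n for each consistent h:
        the 'size principle' favors the smallest consistent hypothesis."""
        scores = {}
        for h, ext in hypotheses.items():
            consistent = all(x in ext for x in examples)
            likelihood = Fraction(1, len(ext)) ** len(examples) if consistent else 0
            scores[h] = prior[h] * likelihood
        z = sum(scores.values())
        return {h: s / z for h, s in scores.items()}

    print(posterior([16]))        # mass spread out: generalization looks graded
    print(posterior([16, 8, 2]))  # sharpens toward "powers of 2": looks rule-like

With one example the posterior mass stays spread over several hypotheses and generalization looks graded; after a few examples consistent with a small hypothesis, the posterior concentrates and generalization looks rule-like.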

October 25  Bo Pang

Extracting Paraphrases from Multiple Translations

Paraphrases are different verbalizations of the same "meaning," and the ability to identify or generate paraphrases can be very helpful for NLP tasks such as multi-document summarization, question answering, and machine translation. In this talk, I'll present a method that automatically extracts paraphrase information from multiple English translations (of the same Chinese source), based on the parse trees of those parallel English sentences. I'll also present some initial results, followed by a discussion of evaluation issues and potential applications.

This talk is based on summer work at ISI with Kevin Knight and Daniel Marcu.
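
A toy, word-level sketch of the general idea (the method in the talk works over parse trees; this difflib stand-in and the example sentences are just illustrative):

    import difflib

    def paraphrase_candidates(trans_a, trans_b):
        """Pair up the spans where two translations of the same source
        sentence differ; the shared words around them act as the anchor."""
        a, b = trans_a.split(), trans_b.split()
        for op, i1, i2, j1, j2 in difflib.SequenceMatcher(a=a, b=b).get_opcodes():
            if op == "replace":            # differing spans in the same slot
                yield " ".join(a[i1:i2]), " ".join(b[j1:j2])

    pair = ("the last cold war remnant was dismantled",
            "the final relic of the cold war was torn down")
    print(list(paraphrase_candidates(*pair)))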

Host: Lillian

November  1 Yannis Vetsikas

Bidding Strategies and Auction Mechanisms for Rational Agents with Combinatorial Preferences

Host: Bart

November  8 Eric Breck

Multi-Perspective Question Answering

Tomorrow's question answering systems will need to process information about beliefs, opinions, and evaluations -- the perspective of an agent. Answers to many simple factual questions are affected by the perspective of the information source.

This summer, I participated in an exploratory project investigating multiple perspectives in question answering. The project was conducted as a summer workshop funded by the Northeast Regional Research Center (NRRC) of the Advanced Research and Development Activity (ARDA). In this talk, I will motivate this research, describe an annotation framework for this sort of information, and offer some initial machine learning results.

Joint work with: Janyce Wiebe, Chris Buckley, Claire Cardie, Paul Davis, Bruce Fraser, Diane Litman, David Pierce, Ellen Riloff, Theresa Wilson

Host: Claire

November 15 Matthew Schultz

Learning a Distance Metric from Relative Comparisons

In this talk I will present a preliminary method for learning a distance metric from relative comparisons (e.g., "X is closer to Y than to Z"). Our method learns a weighted Euclidean distance function by transforming these comparisons into constraints for an SVM. We will also talk about a variety of potential applications, from text clustering and digital libraries to consumer segmentation and collaborative filtering. Finally, we will report some early empirical results obtained by applying our method to a few data sets. This is joint work with Thorsten Joachims, and is a work in progress.
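
A simplified sketch of the constraint construction (hypothetical Python; the talk casts these constraints as an SVM problem, while this stand-in just minimizes the corresponding hinge loss by projected gradient descent):

    import numpy as np

    def learn_weights(triples, dim, lr=0.01, reg=0.01, epochs=200):
        """Learn w >= 0 for d_w(x, y)^2 = sum_k w_k (x_k - y_k)^2 from
        triples (x, y, z) meaning "x is closer to y than to z", i.e. the
        constraint d_w(x, z)^2 - d_w(x, y)^2 >= 1, which is linear in w."""
        w = np.ones(dim)
        for _ in range(epochs):
            for x, y, z in triples:
                near, far = (x - y) ** 2, (x - z) ** 2  # per-feature squared diffs
                grad = reg * w                          # regularize toward small w
                if w @ far - w @ near < 1.0:            # hinge: constraint violated
                    grad += near - far
                w = np.maximum(w - lr * grad, 0.0)      # keep weights nonnegative
        return w

    rng = np.random.default_rng(1)
    X = rng.normal(size=(30, 4))
    # Hypothetical comparisons constructed so that only feature 0 matters:
    triples = [(a, b, c) for a in X for b in X for c in X
               if abs(a[0] - b[0]) + 0.5 < abs(a[0] - c[0])][:200]
    print(learn_weights(triples, dim=4))   # weight on feature 0 should dominate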

Host: Thorsten

November 22 ***no AI seminar because of ACSU student-faculty lunch***
November 29 ***Thanksgiving Break***
December 6 Jordan Erenrich

An Empirical Determination of Multitask Learning's Best Performance

In many machine learning problems, performance is increased by
learning several related tasks in parallel. For example, one might
increase the accuracy of an Ithaca rain prediction model by
simultaneously training it to forecast humidity and cloud
cover. However, while empirical studies indicate the relative value of
such additional tasks on real-world problems, it is unclear if current
algorithms optimally exploit the information contained in the related
tasks. This uncertainty stems from the difficulty in assessing how
real tasks are related.

In this talk, we construct a "best case" class of synthetic multitask
learning problems with artificial neural networks, where the
relationship between tasks is well defined and the learning
procedure's hypothesis space includes the correct generating model.
We characterize the performance of backpropagation algorithms as a
function of the number of related tasks and measure the maximum MTL benefit in this best case.  Our work concludes with a simulation of an infinite number of related tasks and measurements of the shortcomings of backpropagation on our MTL domain.
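
A toy sketch of a "best case" setup of this flavor (hypothetical Python; the dimensions, generating model, and training details are illustrative assumptions, not the experiments from the talk):

    import numpy as np

    rng = np.random.default_rng(0)
    D, H, T, N = 8, 4, 5, 500          # inputs, hidden units, tasks, examples
    X = rng.normal(size=(N, D))
    V = rng.normal(size=(D, H))        # shared generating features
    A = rng.normal(size=(H, T))        # per-task mixing of those features
    Y = np.tanh(X @ V) @ A + 0.1 * rng.normal(size=(N, T))  # all T related tasks

    W1 = 0.1 * rng.normal(size=(D, H))  # shared hidden layer
    W2 = 0.1 * rng.normal(size=(H, T))  # one linear output head per task
    lr = 0.01
    for step in range(2000):
        Hid = np.tanh(X @ W1)                               # forward pass
        err = Hid @ W2 - Y                                  # residuals, all tasks
        gW2 = Hid.T @ err / N                               # backprop: output layer
        gW1 = X.T @ ((err @ W2.T) * (1 - Hid ** 2)) / N     # backprop through tanh
        W1 -= lr * gW1
        W2 -= lr * gW2

    print("task-0 MSE:", np.mean((np.tanh(X @ W1) @ W2 - Y)[:, 0] ** 2))

Because the hypothesis space here includes the generating model and the task relationship is fully controlled, the benefit of training the extra output heads can be measured directly against the single-task version.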

Host: Rich

See also the AI graduate study brochure.

Please contact any of the faculty below if you'd like to give a talk this semester. We especially encourage graduate students to sign up! 


CS772, Fall '02
Claire Cardie
Rich Caruana
Joe Halpern
Thorsten Joachims
Lillian Lee
Bart Selman
Golan Yona

