Artificial Intelligence Seminar

Fall 2005
Friday 12:00-1:15
Upson 5130

Sponsored by the Intelligent Information Systems Institute (IISI),
Computing and Information Science, Cornell

The AI seminar meets weekly for lectures by graduate students, faculty, and researchers emphasizing work-in-progress and recent results in AI research. Lunch will be served starting at noon, with the talks running between 12:15 and 1:15. The new format is designed to allow AI chit-chat before the talks begin. Also, we're trying to make some of the presentations less formal so that students and faculty will feel comfortable using the seminar to give presentations about work in progress or practice talks for conferences.

Date
Title/Speaker/Abstract/Host

September 9 A Support Vector Method for Multivariate Performance Measures
Thorsten Joachims
Depending on the application, measuring the success of a learning algorithm requires application-specific performance measures. In text classification, for example, F1-Score and Precision/Recall Breakeven Point are used to evaluate classifier performance, while error rate is not suitable due to a large imbalance between positive and negative examples. However, most learning methods optimize error rate, not the application-specific performance measure, which is likely to produce suboptimal results. How can we learn rules that optimize measures other than error rate?

This talk presents a Support Vector Method for optimizing multivariate non-linear performance measures like the F1-score. Taking a multivariate prediction approach, we give an algorithm with which such multivariate SVMs can be trained in polynomial time for large classes of potentially non-linear performance measures, in particular ROC-Area and all measures that can be computed from the contingency table. The conventional classification SVM arises as a special case of our method.
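To make the contrast between error rate and F1 in the abstract above concrete, here is a minimal Python sketch. It is not the speaker's SVM method; the function names and the toy labels are illustrative assumptions. It computes both measures from the contingency table of an imbalanced data set in which a classifier labels every example negative.

```python
def contingency_table(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def error_rate(tp, fp, fn, tn):
    """Fraction of examples that are misclassified."""
    return (fp + fn) / (tp + fp + fn + tn)


def f1_score(tp, fp, fn, tn):
    """Harmonic mean of precision and recall; defined as 0.0 when there are no positives at all."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical imbalanced data: 5 positive and 95 negative examples.
y_true = [1] * 5 + [0] * 95
# A trivial classifier that predicts "negative" for everything.
y_pred = [0] * 100

table = contingency_table(y_true, y_pred)
print("error rate:", error_rate(*table))
print("F1:", f1_score(*table))
```

On this toy data the trivial classifier has a low error rate but an F1 of zero, which is the kind of mismatch that motivates optimizing the application's own measure.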

September 16 PageRank without Hyperlinks: Structural Re-ranking Using Links Induced by Language Models
Oren Kurland
Inspired by the PageRank and HITS (hubs and authorities) algorithms for Web search, we present a structural re-ranking approach to ad hoc information retrieval: we reorder the documents in an initially retrieved set by exploring asymmetric relationships between them. Specifically, we consider generation links, which indicate that the language model induced from one document assigns high probability to the text of another. We present a number of re-ranking criteria based on measures of centrality in the graphs formed by generation links, and show that integrating centrality into standard language-model-based retrieval is quite effective at improving precision at top ranks.

This is joint work with Lillian Lee.
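As a rough illustration of centrality-based re-ranking as described in the abstract above, the following Python sketch runs a PageRank-style power iteration over a small weighted directed graph. It is only a sketch under stated assumptions: the link weights are invented, the damping factor and iteration count are arbitrary choices, and the function name is hypothetical; the talk's actual re-ranking criteria may differ.

```python
def pagerank(weights, damping=0.85, iterations=100):
    """PageRank-style centrality by power iteration.

    weights[i][j] is an assumed non-negative strength of a link from
    document i to document j.
    """
    n = len(weights)
    scores = [1.0 / n] * n
    for _ in range(iterations):
        # Every document receives a uniform share of the "teleport" mass.
        new_scores = [(1.0 - damping) / n] * n
        for i in range(n):
            out_weight = sum(weights[i])
            for j in range(n):
                if out_weight == 0:
                    # Document with no outgoing links: spread its score uniformly.
                    share = 1.0 / n
                else:
                    share = weights[i][j] / out_weight
                new_scores[j] += damping * scores[i] * share
        scores = new_scores
    return scores


# Hypothetical link weights among three initially retrieved documents.
links = [
    [0.0, 1.0, 1.0],  # document 0 links to documents 1 and 2
    [0.0, 0.0, 1.0],  # document 1 links to document 2
    [0.0, 0.0, 0.0],  # document 2 has no outgoing links
]

scores = pagerank(links)
# Re-rank the documents by decreasing centrality.
ranking = sorted(range(len(scores)), key=lambda d: scores[d], reverse=True)
print(ranking)
```

In a retrieval setting, such centrality scores would be combined with the initial retrieval scores rather than used alone.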

September 23 Multi-Perspective Question Answering Using the OpQA Corpus
Ves Stoyanov
In comparison to fact-based question answering (QA), researchers understand far less about the properties of questions and answers in multi-perspective question answering (MPQA). We first present the OpQA corpus of opinion questions and answers. Using the corpus, we compare and contrast the properties of fact and opinion questions and answers. Based on the disparate characteristics of opinion vs. fact answers, we argue that traditional fact-based QA approaches may have difficulty in an MPQA setting without modification. As an initial step towards the development of MPQA systems, we investigate the use of machine learning and rule-based subjectivity and opinion source filters and show that they can be used to guide MPQA systems.

This will be a practice talk for a presentation that I will be giving at the HLT-EMNLP conference in October.

September 30 Identifying Sources of Opinions with Conditional Random Fields and Extraction Patterns
Yejin Choi
In recent years, there has been a great deal of interest in methods for sentiment classification and opinion analysis (e.g., detecting polarity and strength). We pursue another aspect of opinion analysis: automatically identifying direct and indirect sources of opinions, emotions, and sentiments. Identifying sources of opinions is critical for opinion-oriented question-answering systems (e.g., systems that answer questions of the form "How does [X] feel about [Y]?") and opinion-oriented summarization systems, where the system needs to distinguish the opinions of one source from those of another. We view this problem as an information extraction task and tackle it using sequence tagging and pattern matching techniques simultaneously. In particular, we use Conditional Random Fields [J. Lafferty et al., 2001] and a variation of AutoSlog [E. Riloff, 1996]. By combining two seemingly very different approaches, and further applying feature induction, our resulting system identifies opinion sources with 81.2% precision and 60.6% recall using an overlap measure.

This will be a practice talk for a presentation that I will be giving at the HLT-EMNLP conference in October.

Joint work with Claire Cardie.

October 7 Beyond Trees: Common-Factor Model for 2D Human Pose Recovery
Xiangyang Lan
Tree-structured models have been widely used for determining the pose of a human body, from either 2D or 3D data. While such models can effectively represent the kinematic constraints of the skeletal structure, they do not capture additional constraints such as coordination of the limbs. Tree-structured models thus miss an important source of information about human body pose, as limb coordination is necessary for balance while standing, walking, or running, as well as being evident in other activities such as dancing and throwing. In this paper we consider the use of undirected graphical models that augment a tree structure with latent variables in order to account for coordination between limbs. We refer to these as common-factor models, since they are constructed by using factor analysis to identify additional correlations in limb position that are not accounted for by the kinematic tree structure. These common-factor models have an underlying tree structure, and thus a variant of the standard Viterbi algorithm for a tree can be applied for efficient estimation. We present some experimental results contrasting common-factor models with tree models, and quantify the improvement in pose estimation for 2D image data.

This is joint work with Professor Dan Huttenlocher.

October 14 Optimizing to Arbitrary NLP Metrics using Ensemble Selection
Art Munson
While there have been many successful applications of machine learning methods to tasks in NLP, learning algorithms are not typically designed to optimize NLP performance metrics. This work evaluates an ensemble selection framework designed to optimize arbitrary metrics and automate the process of algorithm selection and parameter tuning. We report the results of experiments that instantiate the framework for three NLP tasks, using six learning algorithms, a wide variety of parameterizations, and 15 performance metrics. Based on our results, we make recommendations for subsequent machine learning-based research for natural language learning.

Joint work with Claire Cardie and Rich Caruana. This work was presented as a poster last week at HLT/EMNLP 2005.

October 21 MRF's for MRI's: Bayesian Reconstruction of MR Images via Graph Cuts
Ramin Zabih
Markov Random Fields (MRF's) are a very effective way to impose spatial smoothness in computer vision. I will describe an application of MRF's to a non-traditional but important problem in medical imaging: the reconstruction of MR images from raw Fourier data. This can be formulated as a linear inverse problem, where the goal is to find a spatially smooth solution while permitting discontinuities. Although it is easy to apply MRF's for MR reconstruction, the resulting energy minimization problem poses some interesting challenges. It lies outside of the class of energy functions that can be straightforwardly minimized with graph cuts. I will show how graph cuts can nonetheless be adapted to solve this problem, and demonstrate some preliminary results that are extremely promising.

October 28 Seeing Stars: Exploiting Class Relationships for Sentiment Categorization with Respect to Rating Scales
Bo Pang
We address the rating-inference problem, wherein rather than simply deciding whether a review is "thumbs up" or "thumbs down", as in previous sentiment analysis work, one must determine an author's evaluation with respect to a multi-point scale (e.g., one to five "stars"). This task represents an interesting twist on standard multi-class text categorization because there are several different degrees of similarity between class labels; for example, "three stars" is intuitively closer to "four stars" than to "one star".

We first evaluate human performance at the task. Then, we apply a meta-algorithm, based on a metric labeling formulation of the problem, that alters a given n-ary classifier's output in an explicit attempt to ensure that similar items receive similar labels. We show that the meta-algorithm can provide significant improvements over both multi-class and regression versions of SVMs when we employ a novel similarity measure appropriate to the problem.

Joint work with Lillian Lee.

November 4 Convex Hidden Markov Models
Dale Schuurmans, Department of Computing Science, University of Alberta
In this talk, I will discuss a new unsupervised algorithm for training hidden Markov models that is convex and avoids the use of EM. The idea is to formulate an unsupervised version of maximum margin Markov networks (M3Ns) that can be trained via semidefinite programming. This extends our recent work on unsupervised support vector machines. The result is a discriminative training criterion for hidden Markov models that remains unsupervised and does not create local minima. Experimental results show that the convex discriminative procedure can produce better conditional models than conventional Baum-Welch (EM) training.

Joint work with Linli Xu. (We also acknowledge the generous assistance of Li Cheng and Tao Wang.)

November 11 Query Expansion Using Random Walk Models
Kevyn Collins-Thompson, School of Computer Science, Carnegie Mellon University
Query expansion is a widely used information retrieval technique that usually improves search performance on average, but can also significantly hurt performance for specific queries. A desirable goal is therefore to investigate more robust expansion algorithms that can reduce these worst-case scenarios without significantly hurting overall precision.

We describe one method that uses a Markov chain framework to combine multiple sources of knowledge on term associations. The stationary distribution of the model is used to obtain probability estimates that a potential expansion term reflects multiple aspects of the original query. We use this model to re-weight query expansion terms and find that robustness is improved with little negative effect on mean average precision. Some statistically significant differences in accuracy were also observed depending on the weighting of evidence in the random walk. For example, using co-occurrence data later in the walk was generally better than using it early, suggesting further improvements in effectiveness may be possible by learning walk behaviors.

November 18 No AI Seminar (ACSU Lunch)

November 25 Thanksgiving break

December 2 Global Inference in Learning for Natural Language Processing
Dan Roth, Department of Computer Science, University of Illinois at Urbana-Champaign
Natural language decisions often involve assigning values to sets of variables where complex and expressive dependencies can influence, or even dictate, what assignments are possible. Dependencies may range from simple statistical correlations to those that are constrained by deeper structural, relational and semantic properties of the text.

I will describe research on a framework that combines learning and inference for this problem of inferring structured and constrained output. The inference process of assigning globally optimal values to mutually dependent variables is formalized as an optimization problem and is solved as an integer linear programming (ILP) problem. Several key issues will be discussed, including the incorporation of both statistical and declarative constraints and training paradigms.

The work will be described in the context of the Semantic Role Labeling tasks, inferring a shallow semantic analysis of sentences at the level of "who did what to whom, how, when and why".

See also the AI graduate study brochure.

Please contact any of the faculty below if you'd like to give a talk this semester. We especially encourage graduate students to sign up!

CS772, Fall '05
Claire Cardie
Rich Caruana
Carla Gomes
Joe Halpern
Dan Huttenlocher
Thorsten Joachims
Lillian Lee
Bart Selman
Golan Yona
Ramin Zabih


