Department of Computer Science Colloquium
Thursday February 14th, 2002 at 4:15pm 
Upson Hall B17

New Ensemble and Kernel Methods for Machine Learning


John Lafferty
School of Computer Science
, Carnegie Mellon University

                                                                           http://www-2.cs.cmu.edu/~lafferty/

Many data analysis problems have a natural discrete structure that can be expressed in terms of data on a graph or as rankings of items. Examples include the analysis of biological molecules, natural language parses, the ranked list of pages output by a Web search engine, and the symptoms a physician enters on the chart of a patient.  Yet many approaches to machine learning and statistical analysis ignore this structure, simply representing data in some way as scores or as vectors in Euclidean space. We present two new methods that directly take advantage of the underlying structure. The first is an approach to building kernels on graphs that can be viewed as discrete analogues of Gaussian kernels in Euclidean space. The second is an ensemble method that combines predictors by viewing them as generating rankings of items, leading to statistical models on the permutation group. Both methods show empirical promise and interesting theoretical properties. Joint work with Imre Kondor and Guy Lebanon.