Department of Computer Science Colloquium

Tuesday, March 26th, 2002 at 4:15pm

Upson Hall B17

**Algorithms for Probabilistic Inference and Learning on Graphs**

Division of Computer Science and Department of Statistics

University of California, Berkeley

I will give a talk in three parts.

In the first part, I describe a hierarchical, latent variable graphical model for collections of discrete data. The model posits that each "document" in the collection is generated as a mixture of "topics", where the continuous-valued mixture proportions are distributed as a latent Dirichlet variable. Inference and learning are carried out efficiently via variational algorithms. I present empirical results on applications to text modeling, collaborative filtering, text classification, and text/image modeling.
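The generative process described above can be sketched in a few lines of Python. This is only an illustration of the "document as a mixture of topics" idea under stated assumptions: the function names, the two toy topics, and the Dirichlet parameters are all hypothetical, and the sketch covers sampling only, not the variational inference and learning algorithms presented in the talk.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw from Dirichlet(alpha) by normalizing independent Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def generate_document(alpha, topics, length, rng):
    """Generate one document as a mixture of topics.

    alpha  : Dirichlet parameters, one per topic (illustrative values)
    topics : list of per-topic word distributions (dict: word -> probability)
    length : number of words to generate
    """
    # Latent, continuous-valued mixture proportions for this document.
    theta = sample_dirichlet(alpha, rng)
    words = []
    for _ in range(length):
        # Pick a topic for this word according to the mixture proportions.
        k = rng.choices(range(len(topics)), weights=theta)[0]
        # Pick a word from the chosen topic's distribution.
        vocab = list(topics[k])
        probs = [topics[k][w] for w in vocab]
        words.append(rng.choices(vocab, weights=probs)[0])
    return theta, words


# Toy example: two hypothetical topics over a four-word vocabulary.
rng = random.Random(0)
topics = [
    {"gene": 0.5, "cell": 0.5},
    {"court": 0.5, "law": 0.5},
]
theta, words = generate_document([0.5, 0.5], topics, 8, rng)
```

Each call draws fresh mixture proportions, so different documents blend the same topics in different amounts.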

In the second part of the talk, I present a class of algorithms for Independent Component Analysis (ICA) which use contrast functions based on canonical correlations in a reproducing kernel Hilbert space. On the one hand, these contrast functions are related to mutual information and have desirable mathematical properties as measures of statistical dependence. On the other hand, building on recent developments in kernel methods, these criteria and their derivatives can be computed efficiently. Minimizing these criteria leads to flexible and robust algorithms for ICA. I illustrate with simulations involving a wide variety of source distributions, showing that our algorithms outperform presently known algorithms.

The final part of the talk will focus on a new spectral clustering algorithm. The algorithm can be analyzed using tools from matrix perturbation theory, yielding conditions under which it can be expected to perform well. Experimental results are presented on a number of challenging clustering problems.

[Joint work with Francis Bach, David Blei, Andrew Ng, and Yair Weiss.]