Many applications in data mining and machine
learning require the analysis of large two-dimensional matrices. Depending
on the application, these matrices can have very different characteristics:
in text mining, co-occurrence matrices are large, sparse, and nonnegative,
while DNA microarray analysis yields smaller, dense matrices with positive
as well as negative entries. In data analysis, it is often desirable (a) to
find "low-parameter" matrix approximations, and (b) to "co-cluster" such
matrices, i.e., simultaneously cluster rows as well as columns.
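To make the connection between co-clustering and low-parameter approximation concrete, here is a minimal illustrative sketch (not the talk's algorithm): given fixed row and column cluster labels, an m x n matrix is summarized by the mean of each (row-cluster, column-cluster) block, so the reconstruction needs only k*l parameters instead of m*n entries. The function name and interface are hypothetical.

```python
# Hypothetical sketch: reconstruct a matrix from a co-clustering by
# replacing each entry with its (row-cluster, column-cluster) block mean.
# With k row clusters and l column clusters, the approximation is
# described by only k*l block means instead of m*n entries.

def block_approximation(Z, row_labels, col_labels, k, l):
    """Approximate Z (list of lists) by the mean of each co-cluster block."""
    m, n = len(Z), len(Z[0])
    sums = [[0.0] * l for _ in range(k)]
    counts = [[0] * l for _ in range(k)]
    for i in range(m):
        for j in range(n):
            sums[row_labels[i]][col_labels[j]] += Z[i][j]
            counts[row_labels[i]][col_labels[j]] += 1
    means = [[sums[r][c] / counts[r][c] for c in range(l)]
             for r in range(k)]
    return [[means[row_labels[i]][col_labels[j]] for j in range(n)]
            for i in range(m)]

# A matrix with exact 2x2 block structure is reconstructed perfectly:
Z = [[1.0, 1.0, 5.0, 5.0],
     [1.0, 1.0, 5.0, 5.0],
     [3.0, 3.0, 7.0, 7.0]]
Zhat = block_approximation(Z, [0, 0, 1], [0, 0, 1, 1], 2, 2)
```

A full co-clustering algorithm would alternate between updating the labels and recomputing these block statistics; this sketch shows only the approximation step for fixed labels.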
In this talk, I will discuss a framework that inextricably links a certain
class of matrix approximations with co-clustering. The approximation error
can be measured using a nontrivial class of distortion measures, called
Bregman divergences, that have connections to the exponential family of
probability distributions. Our algorithms allow us to handle a wide variety
of matrices and distortion measures within a unifying framework, and are
able to efficiently construct good co-clusterings and the corresponding
low-parameter matrix approximations. I will conclude by presenting
experimental results on text and microarray data. Extensions to tensors will
be briefly discussed.
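As background on the distortion measures mentioned above: a Bregman divergence is defined from a strictly convex function phi as d_phi(x, y) = phi(x) - phi(y) - phi'(y)(x - y). The sketch below (assumed helper names, not from the talk) shows two standard instances and their exponential-family counterparts: phi(x) = x^2 yields squared Euclidean distance (Gaussian), and phi(x) = x log x yields the generalized I-divergence (Poisson).

```python
import math

# Scalar Bregman divergence for a convex phi with derivative dphi:
#   d_phi(x, y) = phi(x) - phi(y) - phi'(y) * (x - y)
def bregman(phi, dphi, x, y):
    return phi(x) - phi(y) - dphi(y) * (x - y)

# phi(x) = x^2 gives the squared Euclidean distance (x - y)^2,
# the divergence associated with the Gaussian distribution.
def squared_loss(x, y):
    return bregman(lambda t: t * t, lambda t: 2 * t, x, y)

# phi(x) = x log x gives the generalized I-divergence
#   x log(x / y) - x + y,
# the divergence associated with the Poisson distribution.
def i_divergence(x, y):
    return bregman(lambda t: t * math.log(t),
                   lambda t: math.log(t) + 1.0, x, y)
```

For matrices, the total distortion is the (possibly weighted) sum of these scalar divergences over all entries, which is what a Bregman co-clustering algorithm would minimize.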
