CS 789 THEORY SEMINAR [home]
Speaker: Mark Sandler
Affiliation: Cornell University
Date: May 3, 2004
Title: Using Mixture
Models for Collaborative Filtering.
Abstract:
Given information about different user purchases on the e-commerce website, is it possible to give individual advice for every user?
How much information do we need for that? What would be the guaranteed quality of recommendation?
We develop recommendation algorithms with provable performance guarantees in a framework of probabilistic mixture model for proposed by Hoffman and Puzicha. We identify certain novel parameters of mixture models that are closely connected with the best achievable performance of a recommendation algorithm; we show that for any system in which these parameters are bounded, it is possible to give recommendations whose quality converges to optimal as the amount of data grows.
All our bounds depend on a special measure of independence that can be viewed as an $L_1$-analogue of the smallest singular value of a matrix.
Using this, we introduce a technique based on generalized pseudoinverse matrices and for handling sets of high-dimensional vectors.
We also show that standard approaches based on $L_2$ spectral methods are not strong enough to yield comparable results, thereby suggesting some inherent limitations of spectral analysis.
This is joint work with Jon Kleinberg