Deterministic pivoting algorithms for constrained ranking and clustering problems

 

 

Anke van Zuylen

Cornell University

Monday  April 30, 2007
4:00 PM, 5130 Upson Hall

 

Abstract:

We consider the problem of ranking or clustering a set of elements, based on input information for each pair of elements. The objective is to find a solution that minimizes the deviation from the input information. For example, we may want to cluster webpages based on similarity scores, where for each pair of pages we have a score between 0 and 1, and we want to find a clustering that minimizes the sum of the similarity scores of pages in different clusters plus the sum of (one minus the similarity score) for pages in the same cluster. Another example arises in meta-search engines for Web search, where we want to get robust rankings that are not sensitive to the various shortcomings and biases of individual search engines by combining the rankings of the individual search engines.

Randomized approximation algorithms for unconstrained versions of these problems were given by Ailon, Charikar, and Newman, and by Ailon and Charikar. We give deterministic approximation algorithms for these problems, thereby answering an open question of Ailon et al.

Our algorithms follow the paradigm of Ailon et al. of choosing a particular vertex as a pivot and partitioning the graph according to the pivot; unlike their algorithms, we do not choose the pivot randomly.

Additionally, our algorithms extend to constrained versions of these problems in which we have apriori constraints on the output. For instance for the ranking problem, we can have as input a partial order such that the output ordering must be a linear extension of this ordering.

This is joint work with Rajneesh Hegde, Kamal Jain and David P. Williamson.