CS 789 THEORY SEMINAR [home]


Speaker: Chaitanya Swamy
Affiliation: Computer Science, Cornell University
Date: Monday, November 24, 2003
Title: Correlation Clustering: Maximizing Agreements via Semidefinite Programming

Abstract:

We consider the Correlation Clustering problem introduced by Bansal, Blum and Chawla (FOCS '02). We are given a graph G=(V,E) whose edges are labeled + or -. An edge (u,v) represents a comparison between the objects (e.g., documents) represented by nodes u and v; a + label indicates that the objects are similar and a - label indicates that they are different. The goal is to partition the vertices into clusters so that the + edges lie within clusters and the - edges lie between clusters. We consider the problem of maximizing Agreements which is defined as the total number of + edges that lie within clusters and - edges that go across clusters. We give a 0.766-approximation algorithm for this problem. The algorithm works even if we have a multigraph with weights on the edges, and the goal is to maximize the total weight of the + edges within clusters and the - edges between clusters. This answers some open problems posed by Bansal et al. (FOCS '02). Previously the only results known were a trivial 0.5-approximation for the weighted case, and a PTAS for the case when |E| is Omega(|V|^2).

Our techniques are different than those used by Bansal et al. Our algorithm is based on rounding the solution to a semidefinite relaxation of the problem by extending the Goemans-Williamson hyperplane rounding technique. In this talk I will describe a 0.75-approximation algorithm for the problem that is simple and easy to analyze. As a by product of this algorithm we also get an approximation algorithm for the k-Clustering variant of the problem where we limit the number of clusters to be at most k.