Advanced Algorithms

Computer Science 683
Spring 2001

Instructor

Jon Kleinberg

5134 Upson.

255-3600.

Overview

CS 683 is a new course designed to serve as an advanced follow-up to CS 681. The goal is to cover recent results and current problems, and illustrate a number of open directions of research in algorithms. The course pre-requisite is CS 681 or equivalent background in algorithms and discrete mathematics.

The course focuses on an inter-related collection of topics centered around randomization, graph decomposition, and methods from high-dimensional geometry. It concentrates on both fundamental techniques and their applications. Techniques include linear programming duality, metric embeddings, random walks, random sampling, spectral partitioning, and spectral clustering. Applications include graph partitioning and its role in approximation algorithms; heuristics for routing; approximate sampling and counting; and high-dimensional clustering and indexing.

Course Outline

(1) Brief introduction to linear programming and duality.

Linear programming is a powerful algorithmic tool, but we'll start by focusing on an aspect of linear programming that overlaps only partially with algorithmic concerns: the duality theorem. Duality is a powerful proof technique; it originates from the question, "How do we prove that a system of linear inequalities has no solution?"

Lecture 1.

Lecture 2.

Lecture 3.

(2) Graph partitioning via multicommodity flow.

Graph partitioning is a central, if only loosely defined, problem in algorithms: in applications ranging from VLSI design to physical simulation to data mining, one often needs to divide a complicated network into pieces of roughly equal size, connected by relatively few edges. Perhaps due to its generality, graph partitioning lies at the heart of many of the issues we'll see in this course, and a strikingly large array of techniques have been applied to studying it. We start by considering approaches based on multicommodity flow.

Tom Leighton and Satish Rao. Multicommodity max-flow min-cut theorems and their use in designing approximation algorithms. Journal of the ACM Volume 46 , No. 6 (Nov. 1999) Pages 787 - 832

Lecture 4.

Lecture 5.

Lecture 6.

(3) Multicommodity flow and the sparsest cut problem.

We now go further into the structure of the multicommodity flow problem. There is a natural ``cut condition'' that clearly limits the maximum amount of flow we can route between multiple terminals in a graph; but how close are the bounds we derive from these cuts to the optimal flow value? In some special cases, elegant graph-theoretic arguments show that the cut condition exactly characterizes the maximum flow value; for the general case, the relationship among these quantities can be related to results on the geometric embedding of finite metric spaces.

H. Okamura and P.D. Seymour. Multicommodity Flows in Planar Graphs. Journal of Combinatorial Theory, Series B 31(1981), pp. 75-81.

N. Linial, E. London and Yu. Rabinovich The geometry of graphs and some of its algorithmic applications. Combinatorica (1995) 15, pp. 215-245

Y. Aumann and Y. Rabani. An O(log k) Approximate Min-Cut Max-Flow Theorem and Approximation Algorithm. SIAM J. Comput., 27(1):291--301, February 1998. (Alternate link.)

Lecture 7.

Lecture 8.

Lecture 9.

Lecture 10.

(4) Eigenvalues and expansion.

There is another powerful perspective from which we can approach the graph partitioning problem: by looking at the eigenvalues and eigenvectors of a graph's adjacency matrix. A strong separation among the eigenvalues can be shown to imply the lack of sparse cuts in a graph. The analysis of this phenomenon leads to new types of geometric embedding questions; in the special case of planar graphs, we can use an actual plane embedding of the underlying graph to study properties of its eigenvalues.

N. Alon. Eigenvalues and Expanders. Combinatorica 6(1986), pp. 83-96.

A. Sinclair and M. Jerrum. Approximate Counting, Uniform Generation, and Rapidly Mixing Markov Chains. Information and Computation 82(1989), pp. 93-133.

Daniel A. Spielman and Shang-Hua Teng. Spectral Partitioning Works: Planar graphs and finite element meshes. Proceedings of the 37th Annual IEEE Conference on Foundations of Computer Science, 1996. and UC Berkeley Technical Report number UCB CSD-96-898.

F.R.K. Chung. Spectral Graph Theory. AMS Press. 1997.

(5) Random walks, rapid mixing, and approximate counting.

Random walks are a topic of basic interest in combinatorics; they also show up, implicitly or explicitly, in a number of seemingly different problems: space-efficient computation, the dynamics of card-shuffling, the time-evolution of physical systems, and the design of heuristics for approximate counting. We will focus on the latter of these issues, showing how the ability to sample from a large underlying state space can lead to provably good approximation algorithms for enumeration problems. The analysis of these algorithms will expose a number of connections between random walks and eigenvector methods: the stationary distribution of a random walk corresponds to a principal eigenvector, and the rate at which the walk mixes is related to its non-principal eigenvalues.

L. Lovasz. Random Walks on Graphs: A Survey. Combinatorics: Paul Erdos is Eighty (vol. 2), 1996, pp. 353-398.

Mark Jerrum and Alistair Sinclair, The Markov chain Monte Carlo method: an approach to approximate counting and integration. in Approximation Algorithms for NP-hard Problems, D.S.Hochbaum ed., PWS Publishing, Boston, 1996.

Mark Jerrum and Alistair Sinclair. Approximating the Permanent. SIAM J. Computing 18(1989), pp. 1149-1178.

See also Mark Jerrum, Alistair Sinclair, Eric Vigoda A polynomial-time approximation algorithm for the permanent of a matrix with non-negative entries. ECCC Report TR00-079 (2000).

(6) Random sampling and the VC-dimension.

We return to random sampling methods, and try to understand the following phenomenon: when the underlying space being sampled has a certain kind of ``orderly'' structure, very small samples can provide extremely strong information. There is a general and powerful theory, the notion of VC-dimension, that explains this; after developing the concept of VC-dimension, we'll look at some of its applications in approximation algorithms and computational geometry. and efficient indexing.

D. Haussler and E. Welzl. Epsilon-nets and simplex range queries. Discrete and Computational Geometry, 2 (1987), 127--151.

H. Bronnimann and M.T. Goodrich. Almost Optimal Set Covers in Finite VC-Dimension. Preliminary version appeared in Proc. 10th ACM Symp. on Computational Geometry (SCG), 1994, 293-302.

J. Kleinberg. Detecting a Network Failure. Proc. 41st IEEE Symposium on Foundations of Computer Science, 2000.

Lecture 19.

Lecture 20.

Lecture 21.

(7) High-dimensional nearest neighbor algorithms for indexing and classification.

Representing data as points in a high-dimensional space, so as to use geometric methods for indexing, is an algorithmic technique with a wide array of uses. It is central to a number of areas such as information retrieval, pattern recognition, and statistical data analysis; many of the problems arising in these applications can involve several hundred or several thousand dimensions. We investigate here how techniques based on low-dimensional projections and the VC-dimension can lead to heuristics for the nearest-neighbor problem; we also discuss a probabilistic model for classification problems in which one can argue rigorously that classification based on nearest neighbors is a good idea.

J. Kleinberg. Two algorithms for nearest-neighbor search in high dimensions. Proc. 29th ACM Symposium on Theory of Computing, 1997.

Eyal Kushilevitz, Rafail Ostrovsky, Yuval Rabani. Efficient Search for Approximate Nearest Neighbor in High Dimensional Spaces. SIAM J. Comput. 30(2): 457-474 (2000)

P. Indyk, R. Motwani. Approximate Nearest Neighbors: Towards Removing the Curse of Dimensionality. Proc. 30th ACM Symposium on Theory of Computing, 1998.

T.M. Cover, P.E. Hart. Nearest neighbor pattern classification. IEEE Transactions on Information Theory, 13(1967), pp. 21--27.

Lecture 22.

Lecture 23.

Lecture 24.

(8) The singular value decomposition and high-dimensional clustering.

The relation between eigenvectors and clustering already appears in previous topics in the course; but when the matrices involved are derived from a collection of text documents, this leads to interesting heuristics for information retrieval. Similarly, when we construct matrices to represent a hypertext corpus, we obtain a family of ranking algorithm for pages on the World Wide Web.

Deerwester, S., Dumais, S. T., Landauer, T. K., Furnas, G. W. and Harshman, R. A. Indexing by latent semantic analysis. Journal of the Society for Information Science, 41(6), 391-407 (1990).

J. Kleinberg. Authoritative sources in a hyperlinked environment. Proc. 9th ACM-SIAM Symposium on Discrete Algorithms, 1998. Extended version in Journal of the ACM 46(1999). Also appears as IBM Research Report RJ 10076, May 1997.

S. Brin and L. Page. The Anatomy of a Large-Scale Hypertextual Web Search Engine. Proc. 7th International World Wide Web Conference, 1998.

C. H. Papadimitriou, P. Raghavan, H. Tamaki, S. Vempala. Latent semantic indexing: A probabilistic analysis. ACM Principles of Database Systems, 1998.

Y. Azar, A. Fiat, A. Karlin, F. McSherry, J. Saia. Spectral analysis of data. Proc. 33rd ACM Symposium on Theory of Computing, 2001.

P. Drineas, Ravi Kannan, Alan Frieze, Santosh Vempala and V. Vinay Clustering in large graphs and matrices. Proc. of the 10th ACM-SIAM Symposium on Discrete Algorithms, Baltimore, 1999.

Lecture 25.

Lecture 26.

Lecture 27.

Coursework

The work for the course consists of four problem sets, and the periodic writing of scribe notes.

LaTeX template for scribe notes.

Problem Set 1.

Problem Set 2.

Problem Set 3.