Computer Science 683
CS 683 is a new course designed to serve as an
advanced follow-up to CS 681.
The goal is to cover recent results
and current problems, and illustrate
a number of open directions of research in algorithms.
The course pre-requisite is CS 681 or equivalent
background in algorithms and discrete mathematics.
The course focuses
on an inter-related collection of topics
centered around randomization, graph decomposition,
and methods from high-dimensional geometry.
It concentrates on both fundamental techniques
and their applications.
Techniques include linear programming duality,
metric embeddings, random walks, random sampling,
spectral partitioning, and spectral clustering.
Applications include graph partitioning
and its role in approximation algorithms;
heuristics for routing;
approximate sampling and counting;
and high-dimensional clustering and indexing.
(1) Brief introduction to linear programming and duality.
Linear programming is a powerful algorithmic tool,
but we'll start by focusing on an aspect of
linear programming that overlaps only
partially with algorithmic concerns:
the duality theorem.
Duality is a powerful proof technique;
it originates from the question,
"How do we prove that a system of linear
inequalities has no solution?"
(2) Graph partitioning via multicommodity flow.
Graph partitioning is a central,
if only loosely defined, problem in algorithms:
in applications ranging from VLSI design to
physical simulation to data mining,
one often needs to divide a complicated network
into pieces of roughly equal size,
connected by relatively few edges.
Perhaps due to its generality, graph
partitioning lies at the heart of many of
the issues we'll see in this course, and
a strikingly large array of techniques
have been applied to studying it.
We start by considering approaches based on
(3) Multicommodity flow and the sparsest cut problem.
We now go further into the structure of the
multicommodity flow problem.
There is a natural ``cut condition'' that
clearly limits the maximum amount of flow
we can route between multiple terminals
in a graph; but how close are the bounds
we derive from these cuts to the optimal flow value?
In some special cases, elegant graph-theoretic arguments
show that the cut condition exactly characterizes
the maximum flow value;
for the general case, the relationship among
these quantities can be related to results
on the geometric embedding of finite metric spaces.
(4) Eigenvalues and expansion.
There is another powerful perspective from which
we can approach the graph partitioning problem:
by looking at the eigenvalues and eigenvectors
of a graph's adjacency matrix.
A strong separation among the eigenvalues
can be shown to imply the lack of sparse cuts in a graph.
The analysis of this phenomenon
leads to new types of geometric embedding questions;
in the special case of planar graphs,
we can use an actual plane embedding of the underlying graph
to study properties of its eigenvalues.
Eigenvalues and Expanders.
Combinatorica 6(1986), pp. 83-96.
A. Sinclair and M. Jerrum.
Approximate Counting, Uniform Generation, and Rapidly Mixing Markov Chains.
Information and Computation 82(1989), pp. 93-133.
Daniel A. Spielman and Shang-Hua Teng.
Spectral Partitioning Works: Planar graphs and finite element meshes.
Proceedings of the 37th Annual IEEE
Conference on Foundations of Computer Science, 1996.
and UC Berkeley Technical Report number UCB CSD-96-898.
Spectral Graph Theory.
AMS Press. 1997.
(5) Random walks, rapid mixing, and approximate counting.
Random walks are a topic of basic interest in combinatorics;
they also show up, implicitly or explicitly, in a number
of seemingly different problems:
space-efficient computation, the dynamics of card-shuffling,
the time-evolution of physical systems, and
the design of heuristics for approximate counting.
We will focus on the latter of these issues,
showing how the ability to sample from a large
underlying state space can lead to provably good approximation
algorithms for enumeration problems.
The analysis of these algorithms will expose a
number of connections between random walks and
eigenvector methods: the stationary distribution
of a random walk corresponds to a principal eigenvector,
and the rate at which the walk mixes is
related to its non-principal eigenvalues.
Random Walks on Graphs: A Survey.
Combinatorics: Paul Erdos is Eighty (vol. 2), 1996, pp. 353-398.
Mark Jerrum and Alistair Sinclair,
The Markov chain Monte Carlo method:
an approach to approximate counting and integration.
in Approximation Algorithms for NP-hard Problems,
D.S.Hochbaum ed., PWS Publishing, Boston, 1996.
Mark Jerrum and Alistair Sinclair.
Approximating the Permanent.
SIAM J. Computing 18(1989), pp. 1149-1178.
- See also
Mark Jerrum, Alistair Sinclair, Eric Vigoda
A polynomial-time approximation algorithm for the permanent of
a matrix with non-negative entries.
ECCC Report TR00-079 (2000).
(6) Random sampling and the VC-dimension.
We return to random sampling methods, and try to understand the
when the underlying space being sampled has
a certain kind of ``orderly'' structure,
very small samples can provide extremely strong information.
There is a general and powerful theory,
the notion of VC-dimension, that explains this;
after developing the concept of VC-dimension,
we'll look at some of its applications
in approximation algorithms and computational geometry.
and efficient indexing.
(7) High-dimensional nearest neighbor algorithms for
indexing and classification.
Representing data as points in a high-dimensional space, so as
to use geometric methods for indexing, is an algorithmic technique
with a wide array of uses. It is central to a number of areas
such as information retrieval, pattern recognition, and
statistical data analysis; many of the problems arising in these
applications can involve several hundred or several thousand dimensions.
We investigate here how techniques based on low-dimensional
projections and the VC-dimension can lead to
heuristics for the nearest-neighbor problem;
we also discuss a probabilistic model for classification
problems in which one can argue rigorously that
classification based on nearest neighbors is a good idea.
(8) The singular value decomposition and high-dimensional clustering.
The relation between eigenvectors and clustering
already appears in previous topics in the course;
but when the matrices involved are derived from
a collection of text documents, this leads
to interesting heuristics for information retrieval.
Similarly, when we construct matrices to represent
a hypertext corpus, we obtain a family of
ranking algorithm for pages on the World Wide Web.
Deerwester, S., Dumais, S. T., Landauer, T. K., Furnas, G. W.
and Harshman, R. A.
Indexing by latent semantic analysis.
Journal of the Society for Information Science, 41(6), 391-407 (1990).
Authoritative sources in a hyperlinked environment.
Proc. 9th ACM-SIAM Symposium on Discrete Algorithms, 1998.
Extended version in Journal of the ACM 46(1999).
Also appears as IBM Research Report RJ 10076, May 1997.
S. Brin and L. Page.
The Anatomy of a Large-Scale Hypertextual Web Search Engine.
Proc. 7th International World Wide Web Conference, 1998.
C. H. Papadimitriou, P. Raghavan, H. Tamaki, S. Vempala.
Latent semantic indexing: A probabilistic analysis.
ACM Principles of Database Systems, 1998.
Y. Azar, A. Fiat, A. Karlin, F. McSherry, J. Saia.
Spectral analysis of data.
Proc. 33rd ACM Symposium on Theory of Computing, 2001.
P. Drineas, Ravi Kannan, Alan Frieze, Santosh Vempala and V. Vinay
Clustering in large graphs and matrices.
Proc. of the 10th ACM-SIAM Symposium on Discrete Algorithms, Baltimore, 1999.
The work for the course consists of
four problem sets, and the periodic writing of scribe notes.