## Instructor

• Jon Kleinberg
5134 Upson.
255-3600.

## Overview

CS 683 is a new course designed to serve as an advanced follow-up to CS 681. The goal is to cover recent results and current problems, and illustrate a number of open directions of research in algorithms. The course pre-requisite is CS 681 or equivalent background in algorithms and discrete mathematics.

The course focuses on an inter-related collection of topics centered around randomization, graph decomposition, and methods from high-dimensional geometry. It concentrates on both fundamental techniques and their applications. Techniques include linear programming duality, metric embeddings, random walks, random sampling, spectral partitioning, and spectral clustering. Applications include graph partitioning and its role in approximation algorithms; heuristics for routing; approximate sampling and counting; and high-dimensional clustering and indexing.

## Course Outline

(1) Brief introduction to linear programming and duality.

Linear programming is a powerful algorithmic tool, but we'll start by focusing on an aspect of linear programming that overlaps only partially with algorithmic concerns: the duality theorem. Duality is a powerful proof technique; it originates from the question, "How do we prove that a system of linear inequalities has no solution?"

(2) Graph partitioning via multicommodity flow.

Graph partitioning is a central, if only loosely defined, problem in algorithms: in applications ranging from VLSI design to physical simulation to data mining, one often needs to divide a complicated network into pieces of roughly equal size, connected by relatively few edges. Perhaps due to its generality, graph partitioning lies at the heart of many of the issues we'll see in this course, and a strikingly large array of techniques have been applied to studying it. We start by considering approaches based on multicommodity flow.

(3) Multicommodity flow and the sparsest cut problem.

We now go further into the structure of the multicommodity flow problem. There is a natural ``cut condition'' that clearly limits the maximum amount of flow we can route between multiple terminals in a graph; but how close are the bounds we derive from these cuts to the optimal flow value? In some special cases, elegant graph-theoretic arguments show that the cut condition exactly characterizes the maximum flow value; for the general case, the relationship among these quantities can be related to results on the geometric embedding of finite metric spaces.

(4) Eigenvalues and expansion.

There is another powerful perspective from which we can approach the graph partitioning problem: by looking at the eigenvalues and eigenvectors of a graph's adjacency matrix. A strong separation among the eigenvalues can be shown to imply the lack of sparse cuts in a graph. The analysis of this phenomenon leads to new types of geometric embedding questions; in the special case of planar graphs, we can use an actual plane embedding of the underlying graph to study properties of its eigenvalues.

(5) Random walks, rapid mixing, and approximate counting.

Random walks are a topic of basic interest in combinatorics; they also show up, implicitly or explicitly, in a number of seemingly different problems: space-efficient computation, the dynamics of card-shuffling, the time-evolution of physical systems, and the design of heuristics for approximate counting. We will focus on the latter of these issues, showing how the ability to sample from a large underlying state space can lead to provably good approximation algorithms for enumeration problems. The analysis of these algorithms will expose a number of connections between random walks and eigenvector methods: the stationary distribution of a random walk corresponds to a principal eigenvector, and the rate at which the walk mixes is related to its non-principal eigenvalues.

(6) Random sampling and the VC-dimension.

We return to random sampling methods, and try to understand the following phenomenon: when the underlying space being sampled has a certain kind of ``orderly'' structure, very small samples can provide extremely strong information. There is a general and powerful theory, the notion of VC-dimension, that explains this; after developing the concept of VC-dimension, we'll look at some of its applications in approximation algorithms and computational geometry. and efficient indexing.

(7) High-dimensional nearest neighbor algorithms for indexing and classification.

Representing data as points in a high-dimensional space, so as to use geometric methods for indexing, is an algorithmic technique with a wide array of uses. It is central to a number of areas such as information retrieval, pattern recognition, and statistical data analysis; many of the problems arising in these applications can involve several hundred or several thousand dimensions. We investigate here how techniques based on low-dimensional projections and the VC-dimension can lead to heuristics for the nearest-neighbor problem; we also discuss a probabilistic model for classification problems in which one can argue rigorously that classification based on nearest neighbors is a good idea.

(8) The singular value decomposition and high-dimensional clustering.

The relation between eigenvectors and clustering already appears in previous topics in the course; but when the matrices involved are derived from a collection of text documents, this leads to interesting heuristics for information retrieval. Similarly, when we construct matrices to represent a hypertext corpus, we obtain a family of ranking algorithm for pages on the World Wide Web.

## Coursework

The work for the course consists of four problem sets, and the periodic writing of scribe notes.