BIB-VERSION:: CS-TR-v2.0
ID:: CORNELLCS//TR93-1356
ENTRY:: 1993-10-14
ORGANIZATION:: Cornell University, Computer Science Department
LANGUAGE:: English
TITLE:: Locality in Distributed Computing
AUTHOR:: Panconesi, Alessandro
DATE:: June 1993
PAGES:: 102
COPYRIGHT:: Alessandro Panconesi 1993 - All Rights Reserved
ABSTRACT::
The topic of this thesis is the issue of locality in distributed computing.

A first set of results in this thesis concerns the $\Delta$-vertex coloring 
problem. They are all based on the following result. Let $G$ be a graph such 
that $\Delta \geq 3, G$ is not a complete graph, and such that all of $G$ 
except one vertex $v$ is $\Delta$-colored. Then, it is possible to extend 
this $\Delta$-coloring to all of $G$ by recoloring a path originating from $v$ 
of length at most $O(\log_{\Delta}n)$.

This property allows us to develop several efficient algorithms for 
$\Delta$-coloring. In particular, but not exclusively, in the distributed and 
PRAM models of computation. It also implies a well-known result of Brooks as 
a corollary.

Another set of results concerns network decomposition - a basic notion in 
distributed graph algorithms. We improve the bounds for computing a network 
decomposition distributively and deterministically. Our algorithm computes an 
$(n^{\epsilon(n)},n^{\epsilon(n)})$-decomposition in $O(n^{\epsilon(n)})$ 
time, where $\epsilon(n) = O(1/ \sqrt{\log n})$.

We also show that the class of graphs $\cal G$ whose maximum degree is 
$O(n^{\delta(n)}$, where $\delta(n) = O(1/\log \log n)$, is complete for the 
task of computing a ($\log n, \log n)$-decomposition, in polylogarithmic in 
$n$ time. Completeness is to be intended in the following sense: if we have 
an algorithm $\cal A$ that computes a $(\log n, \log n)$-decomposition in 
polylogarithmic in $n$ time for graphs in $\cal G$, then we can compute a 
$(\log n, \log n)$-decomposition in polylogarithmic in $n$ time for all graphs.

A last set of results concerns the edge coloring problem. We give a randomized 
distributed algorithm to compute an edge coloring of a given network $G$, 
that uses at most $1.6\Delta + \log^{2+\delta} n$ colors, for any 
$\delta > 0$. The running time is $O(\log n)$ and the failure probability is 
at most $\epsilon$, for any fixed $\epsilon > 0$. The algorithm is quite 
simple but requires an interesting probabilistic analysis. At the core of the 
analysis is an extension of the Chernoff-Hoeffding bounds, which are 
fundamental tools used in estimating the tail probabilities of the sum of 
Bernoulli-like trials.
END:: CORNELLCS//TR93-1356
BODY::
Locality in Distributed Computing
Alessandro Panconesi
Ph.D Thesis
93-1356
June1993
Department of Computer Science
Cornell University
Ithaca NY 14853-7501
LOCALITY Ix DISTRIBUTED COMPUTING
A Dissertation
Presented to the Faculty of the Graduate School
of Cornell University
in Partial Fulfillment of the Requirements for the Degree of
Doctor of Philosophy
by
Alessandro Panconesi
August 1993
Qc Alessandro Panconesi 1993
ALL RIGHTS RESERVED
LOCALITY IN DISTRIBUTED COMPUTING
Alessandro Panconesi, Ph.D.
Cornell University 1993
The topic of this thesis is the issue of locality in distributed computing.
A first set of results in this thesis concerns the A--Hvertex coloring problem. They
are all based on the following result. Let 0 be a graph such that A > 3 0 is not
a complete graph, and such that all of 0 except one vertex v is A-colored. Then,
it is possible to extend this A--Hcoloring to all of 0 by recoloring a path originating
from v of length at most O(log? n).
This property allows us to develop several efficient algorithms for A--Hcoloring.
In particular, but not exclusively, in the distributed and PRAM models of com-
putation. It also implies a well-known result of Brooks as a corollary.
Another set of results concerns network decomposition--H a basic notion in dis-
tributed graph algorithms. We improve the bounds for computing a network
decomposition distributively and deterministically. Our algorithm computes an
(nE(fl), nt(fl))?decomposition in O(n?(?)) time, where ?(n) = O(1/?gn).
We also show that the class of graphs ? whose maximum degree is O(n6(?)),
where 6(n) = 0(1/log log n), is complete for the task of computing a (log n, log n)-
decomposition, in polylogarithmic in n time. Completeness is to be intended in
the following sense: if we have an algorithm A that computes a (log n, log n)-
decomposition in polylogarithmic in n time for graphs in g, then we can compute
a (log n, log n)-decomposition in polylogarithmic in n time for all graphs.
A last set of results concerns the edge coloring problem. We give a randomized,
distributed algorithm to compute an edge coloring of a given network G, that uses
at most 1.6A+log2+? n colors, for any ?>0. The running time is O(log n) and the
failure probability is at most ?, for any fixed ?> 0. The algorithm is quite simple
but requires an interesting probabilistic analysis. At the core of the analysis is an
extension of the Chernoff-Hoeffding bounds, which are fundamental tools used in
estimating the tail probabilities of the sum of Bernoulli-like trials.
Biographical Sketch
Alessandro Panconesi was born in Roma, Italy, on May 20, 1960. aG0de1?s Proof",
a delightful introductory booklet by Ernest Nagel & James R. Newman, had a great
impact on his life; fascinated by the beauty and the grand architectural sweep
of those magnificent ideas, he decided to study Mathematics at the University of
Rome La Sapienza, where he got his Laurea degree in the Fall of 1984. Mathematics
proved too esoteric and difficult and he thought that Theoretical Computer Science
was a good source of more tractable and yet meaningful mathematical problems.
He remained out of school for a couple of years before going back to the academic
world. Thanks to a fellowship of the Italian Ministry of Education he had the
wonderful opportunity to visit the renowned Computer Science Department of
Cornell University for a year.
He was quite impressed by the department and decided to go for a PhD. The
department was graceful enough to accept him in the program.
As for student... life in Ithaca, the following poem of Charles Baudelaire accu-
rately expresses his feelings.
iii
SPLEEN IV1
Quand le ciel bas et lourd pese comme un couvercie
Sur l'esprit 96missant en proie aux 10n95 ennuis,
Et que de l'horizon embrassant tout le cercie
11 nous verse un jour noir plus triste que les nuits;
Quand la terre est change'e en un chacot humide,
Ou' l'Espe'rance, comme une chauve-souris,
S'en va battant les murs de son aile timide,
Et se cognant la te?e a' des plafonds pourris;
Quand la pluie etalant ses immenses trainees
D `une vaste prison imite les barreaux,
Et qu `un peuple muet d `infames araignees
Vient tendre ses filets au fond de nos cerveaux,
Des cloches tout 4 coup sautent avec furie
1When the low, leaden sky presses down like a lid on the suffering spirit in endless travail,
and when from around the horizon's whole rim it pours on us dark daylight more somber than
night;
When the earth is turned into a dank dungeon cell wherein Hope, like a fluttering bat in the
gloom, in vain beats the walls with a timorous wing and bruises her head on ceiling's foul slime;
When rain, hanging out its immense sweeping trains, resembles a vast, murky prison's thick
bars, and a host of vile spiders soundlessly come to spin their damp webs in the depths of our
brains,
Then bells, of a sudden, explode in a rage and fling toward the heavens a dolorous cry, like
wandering spirits in search of a home, beginning an obstinate whimpering plaint.
-And in silent cortege, without music or drums, long black hearses dead-slowly defile in my
soul; Hope, vanquished, weeps; and fierce Anguish, despotic, upon my bowed skull firmly plants
her black flag.
iv
Et lancent vers le ciel un affreux hurlement,
Ainsi que des esprits errants et sans patrie
Qui se mettent a' geindre opinia?rement.
--HEt de longs corbillards, sans tambovrs ni musique,
Ddfilent lentement dans mon ame; l'Espoir,
Vaincu, pleure, et l'Angoisse atroce, despotique,
Sur mon crane incline' plante son drapeau noir.
In September 1989 he was awarded the title of Dottore di Ricerca by the Italian
Ministry of Education.
In August 1990 he was awarded a Master of Science by Cornell University.
On May 25,1993, he successfully defended his Ph.D. thesis at Cornell Univer-
sity.
v
To my Family: Mother, Father, and Sister Silvia.
To my friend Corrado,
and to my friend Julia,
whose smile makes the Sun jealous.
vi
Acknowledgements
My advisor is like an Eagle: steady and majestic, he soars the sky with elegance
covering great distances with minimum effort, and constantly patrolling the earth
with his sharp eyes.
His students are sometimes birds; one day they will fly with poise, able to
control their speed. Sometimes, as in my case, they are explorers, clumsy bipeds
bound to the ground. They will never fly. The Eagle is the companion of the
explorer and escorts him on a voyage through an unknown land.
The sight of the Eagle is comforting: his motion is at the same time precise
and elegant, his speed paired with remarkable acceleration, his vision allows him
to see from afar both the general patterns and the minute details. Sometimes the
Eagle sees that what seemed from the ground a promising path is in fact leading to
a featureless desert. Sometimes he sees that a few scattered bushes soon become
an inextricable, impenetrable forest. Some other times, thanks to his sharp sight,
he sees a trail comfortably lending to the other side of an otherwise awesome and
inaccessible mountain range. At times, he gives a beautiful gift to the explorer;
gradually, step by step, he brings him to the top of a small mountain from where
he can appreciate for a fleeting moment what, for the bird, is the common sight of
the earth seen from the sky.
vii
It is the explorer, however, who finally draws the detailed map of the voyage.
I wish to express my sincere thanks to my advisor, Professor David Shmoys.
Besides his wide mathematical knowledge, and a taste for mathematics which is at
the same time meaningful and elegant, I appreciated and benefitted from his rare
combination of scholarly integrity and academic pragmatism. I also thank him for
his remarkable generosity and for his silent but deep understanding of my needs,
especially those outside of academia. During my thesis research I was supported
by his NSF PYl award CCR-89-96272 with matching funds from UPS and Sun
Microsystems.
My sincere thanks also go to Professor Juris llartmanis for serving on my
committee. I came to Cornell attracted by his fame and I was not in the slightest
disappointed. Besides being an extremely kind and cheerful man, and a great and
enjoyable teacher, he is also an excellent example of what a scientist should, or at
least could, be. I always admired the aesthetic quality of his contributions: simple
and meaningful--Hthe trademark of good science.
Thanks to Professor Frank Keil, of the Department of Psychology, for serving
on my committee and guiding me through my minor studies.
Thanks to Eva Tardos for generously supporting me during my last semester
with funds from her Packard Foundation Fellowship.
Thanks to Sam Toueg for his kind help and encouragement.
My thanks and respect go to Gianfranco Bilardi who has been, and will con-
tinue to be, a great model of integrity and academic excellence. By talking to
him I started to appreciate the role and responsibility of the intellectual, and the
viii
difficulties that they entail. I gratefully acknowledge his guidance and help.
Perhaps the most important and certainly the most enjoyable part of my grad-
uate studies was my collaboration with other students. My heartfelt thanks go to
Aravind Srinivasan; the results in this thesis are joint work with him. Aravind
is a bird. It has always been a pleasure and a source of amazement to watch his
acrobatic flight. Soon he will master his technique and build strength for his wings,
to make his flight not only acrobatic but seamless.
My collaboration with Desh Ranjan was also important and enjoyable. My
only regret is that we did not have more opportunities to work together.
Other students in the department were important for my scientific experience
or were just nice and made my life as a graduate student more bearable: Devdatt
Dubhashi, James Allan, Jim Caidwell, Lorenzo Alvisi, Prasad Jayanti, Captain
Sofoklis Efremidis, and Suresh Chari.
The support staff of the Computer Science Department of Cornell University
is absolutely outstanding. They remind me of cheerful giants, moving heavy stum-
bling blocks with ease. Their genuine smiles were always refreshing and their
readiness to help students, truly incredible. My warm thanks go to all of them,
but especially to Jan Batzer and Carol Ayer.
My warm thanks go to Sridhar Vembu for his wonderful hospitality when I
was in Princeton, and to Sharon Rodgers of the Computer Science Department of
Princeton University for her great help, remarkable generosity, and very enjoyable
company during my semester at Princeton.
Finally, my respectful thanks go to my Family.
ix
Table of Contents
1 Locality in Distributed Computing
2
3
Improved Network Decomposition			14
2.1			Definitions			16
2.2			Improved Network Decomposition			. . 			. . . 			. 			19
2.3			Completeness . . 			27
The Local Nature of A--Hcolorings and Its Algorithmic Applications 32
3.1			Introduction			. . . .			32
3.2			Definitions 			. . . . 						. .			. 35
3.3			Distributed Brooks' Theorem . . .			. 36
3.4			Algoritlims for A-coloring 			. . . . . 			. .			. 52
3.4.1			The Randomized Reduction 			. . . 			.			52
3.4.2			Deterministic Distributed			A--Hcoloring. .			. 57
3.5			Further Applications of the Small			Neighborhood Search			59
3.5.1			A linear processor NC algorithm. . 			. .			. 59
3.5.2			A Sublinear Time Distributed			Algorithm . . 			. .			61
3.5.3			A Sublinear Time Las Vegas			Algorithm. .			. 62
3.6			Lower			Bounds for some Distributed			Coloring Problems			63
3.6.1			Deterministic Coloring of Paths. .			. 63
3.6.2			Optimal Edge Coloring of Bipartite			Graphs . .			64
3.6.3			Randomized Coloring . 			. . . . 			. .			. 67
4 Fast Randomized Edge Coloring Via an Extension of the Chernoff-
Hoeffding Bounds
4.1			Introduction . . . . . .
4.2			Notation . .			. .
4.3			The Chernoff-Hoeffding Bounds			Extension
4.3.1			0-1 Random Variables. .			.
4.3.2			The General Case
4.4			The Edge Coloring Algorithm . . 			. . . 			. .			.
4.4.1			Distributed Edge Coloring of Bipartite			Graphs			.
4.5			Probabilistic Analysis . . . . . 			. . . 			. .			.
70
70
72
73
74
78
80
82
84
x
4.5.1 Analysis of Top Vertices
4.5.2 Analysis of Bottom Vertices
Bibliography
xl
85
86
98
List of Figures
3.1
3.2
3.3
3.4
3.5
3.6
3.7
3.8
3.9
3.10
3.11
Steps and walks . .			.
A detour yields a T-path			.			. . .			. . . .
Situation when Vp E Q fl P
If Pj+2 is not simple we find a T-path . . .
If Pj+1 and Pj+2 fail we find a T-path
Qj and Q? cannot intersect .
New bichromatic paths are divergent from the old tree
Bichromatic paths cannot meet			. .			. .
Steps must be scheduled			. .			.
A 5-widget . .
A chain of widgets			. .			.
4.1			x(ei) = Q,			=			e? lost the lottery .
xli
37
40
42
43
45
47
50
51
53
65
66
87
Chapter 1
Locality in Distributed
Computing
The topic of this thesis is the issue of locality in distributed computing, which
is, roughly speaking, as follows. In a distributed network the bottleneck of any
computation is due to communication, ?.e., sending messages from one processor
to another. To minimize cost it is necessary that each processor not send messages
to processors that are far away in the network. Computation should take place
based on local information alone, i.e., a processor should communicate only with
nearby processors. The question is how efficiently a given function can be computed
given this locality constraint.
In order to formulate this question in a mathematically precise way we introduce
the following simple model of distributed computation, which has been extensively
used in the literature. A message--Hpassing distributed network is an undirected
graph & = (V; E) where vertices, or nodes, correspond to processors and edges
correspond to bi--Hdirectional communication links. Each node has its unique ID.
The network is synchronous, i.e., computation takes place in a sequence of rounds;
in each round, each node reads messages sent to it by its neighbors in the graph,
does any amount of local computation, and sends messages back to all of its neigh-
2
bors. The time complexity of a distributed algorithm, or protocol, is given by the
number of rounds needed to compute a given function.
In spite of the totally unrealistic assumption that a node is allowed to perform
any amount of internal computation in one round, this model captures some es-
sential aspects of a distributed computation. In particular, it captures nicely the
issue of locality. A more detailed discussion of the advantages and disadvantages
of this model is postponed to later in this chapter.
Notice that in this model the cost of sending a message from one vertex to
another is proportional to the shortest path between the two vertices, that is, to
the minimum number of edges of any path between them. There is no shared
memory and processors can communicate only by sending messages through the
network. Hence, if we want a protocol to run for t rounds, then each vertex
can communicate only with vertices at distance at most I from it. This is not
so in the P?AM model, where the shared memory allows any two processors to
communicate in one unit of time
For example, suppose that a network with n processors has diameter ?(n) and
that we want the time complexity of our protocol to be O(log n); then every vertex
can communicate with only a small neighborhood of vertices of O(log n) radius.
Yet the network is computing a global function of its data, initially spread across
the whole network.
In this thesis, we study the complexity of several basic graph theoretic prob-
lems in this model. The general problem is of the following type: a network G
is to compute some function of its own topology such as, for example, a maximal
independent set or a vertex coloring. In particular, we are interested in the vertex
coloring, edge coloring, maximal independent set (MIS), and network decomposi-
tion problems.
An independent set in a graph C (V? E) is a set of vertices I C V satisfying
the following independence property: if u and v are vertices in I then there is no
3
edge between them. An independent set is maximal if no vertex can be added
to it without violating the independence property. A vertex coloring of U is a
partition of V into independent sets, called colors. It is well known that coloring a
graph with the minimum possible number of colors is an NP-hard function [GJ79].
A good coloring is often expressed in terms of the maximum degree of G, i.e.,
the maximum number of edges incident on any node, which is denoted by A.
Colorings using A or (A + l) colors are considered good, especially in a parallel
or distributed setting. A matching in a network G is a set of edges such that any
two of them share no endpoints. An edge coloring is a partition of the edges of G
into a collection of matchings, called colors. Network decomposition is a particular
kind of graph decomposition that was introduced to study basic graph problems
in the distributed model of computation [AGLP89]. It will be defined later in this
chapter.
The reason for studying these problems is twofold. On one hand, they seem
to be useful in the context of synchronization and resource allocation problems.
For example, vertex and edge colorings are used to solve classical problems like
the Dining Philosophers Problem of Dijkstra [CS92,Lyn8l,SP88], and computing
a maximal independent set is an important primitive in many parallel algorithms
[KB87,KN88,Kar89,Kar86,KS87,KW85,Lub86,Nao9la]. More examples are given
in the introductory sections of forthcoming chapters. On the other hand, study-
ing these problems in the PR?AM model has proven extremely fruitful because
it initiated the development of theoretical tools of wide applicability--Hi.e., the
derandomization technique of conditional probabilities and the theory of small
probability spaces [AGHP92,BR9l ,Lub86,Lub88 ,MNN89,NN90].
A useful computational resource in distributed and parallel computing is ran-
domization. Randomization seems useful not only as a resource but also, perhaps
surprisingly, as a methodology.
The methodological aspects of probabilistic techniques applied to parallel com-
4
putation are illustrated by the derandomization technique of conditional probabil-
ities and the theory of small probability spaces. The method of conditional prob-
abilities, implicitly introduced by Erdo's & Selfridge and systematized by Spencer
[ES73,Spe87], is a powerful mathematical technique to give constructive proofs
of the existence of certain combinatorial objects. A probabilistic analysis estab-
lishes the existence of the desired object, followed by a deterministic process that
actually constructs it. In many instances this process can be carried out computa-
tionally in a cost-effective way, for example by means of a polynomial-time or an
NC algorithm [Rag9O ,BR9l ,Lub86,Lub88 ,MNN89].
The theory of small probability spaces is another computational offspring of the
probabilistic method [ASE92]. In the probabilistic method one shows the existence
of a combinatorial object by means of a probabilistic analysis. Typically, one shows
that the probability of existence of the object is non zero. It has been noted that in
some cases the probability space used in the analysis can be replaced by a smaller
one. The new space either maintains all the properties that are relevant for the
analysis or approximates the original one. The existence of the combinatorial
object is still ensured under the new conditions. If the new space is small enough,
e.g., of polynomial size, then it can be searched exhaustively in parallel, that is, all
possible random outcomes are tried in parallel. The randomized process is thereby
replaced with a deterministic one [KW85,Lub86,Lub88,AGHP92,NN9O].
In both cases, the final algorithm is deterministic. Randomization is not used
as a computational resource but as an artifact in the analysis. The tremendous
success of these techniques is underscored by the puzzling fact that the only de-
terministic algorithms known to date for certain problems are obtained through
derandomization techniques [KS87,Lub88]. For the maximal independent set prob-
lem such algorithms were the only available ones for a long time [GS,KW85,Lub86]
It is an important open problem, beyond the scope of this thesis, to determine
whether derandomization techniques are possible in a distributed environment.
5
As a resource, randomization is very powerful in dealing with symmetry--Hbreaking,
which is one of the essential problems encountered in the design of distributed and
parallel algorithms. Generally speaking, the problem is that of choosing one among
several indistinguishable alternatives during a parallel computation. To clarify
what symmetry--Hbreaking is let us take an extreme example. Consider n proces-
sors connected together via an anonymous ring, t.e., processors are connected in a
ring topology but have no ID's. Suppose also that the ring is to compute a max-
imal independent set S of itself. Straightforward symmetry arguments show that
no deterministic algorithm is possible; either all processors decide to be in S or
they all decide to be outside of 5.
But if we are allowed to use randomization then the problem becomes easy.
For example, each processor can choose a number in the range [1, n3] uniformly at
random and independently of other processors. Simple calculations show that with
probability at least (1 --H l/n4) all the ID's will be different. Then, for example, we
can run one of the existing deterministic algorithms based on the assumption that
processors have different ID's and compute the desired maximal independent set
[CV86,GPS89,Lub86].
Intuitively, the additional constraint imposed by locality makes the problem of
symmetry-breaking only worse. The central open problem concerning locality is
whether randomization helps. The current state of affairs is that there are very
efficient randomized algorithms for many basic combinatorial problems (usually
related to scheduling, synchronization, and resource allocation), whereas it is not
known if deterministic solutions with comparable running time exist. For exam-
ple, for the problems maximal independent set, maximal matching, (A + 1)--Hvertex
coloring, approximate edge coloring, and network decomposition there are (usually
very simple) randomized, distributed algorithms whose expected running time is
poly--Hlogarithmic in the size of the network [ABI86,It86,LS9l ,Lub86,Lub88,PS92b,
PS92aj. On the other hand, the best deterministic upper--Hbound for the same
6
problems in the distributed model is O(n?(?)), where ?(n) tends to zero as n goes
to infinity [AGLP89,PS92b]. This is better than any root of n but asymptotically
much worse than any poly-logarithmic function of n. Attempts thus far at find-
ing deterministic algorithms have failed but no logarithmic lower bound proof is
available either.
One could hope to develop symmetry-breaking techniques based on the fact
that nodes have different ID's. Cole & Vishkin developed a technique, known as
deterministic coin-tossing, which has several interesting and non-trivial applica-
tions [AGLP89,CV86,GPS89], but in general, randomization is the only way to
deal with symmetry--Hbreaking known so far.
Whether randomization helps in circumventing the locality bottleneck is one
instance of a general and fundamental problem in computer science: to what extent,
if at all, the use of randomization helps. The most general theoretical questions,
such as p BPP or p RNC, are still unanswered, but in more limited domains
some answers are available. In some cases randomized algorithms can be proven
to be better than any possible deterministic algorithm (e.g. oblivious routing in
parallel architectures [Lei92,Rag90]), in other cases randomized algorithms provide
a solution when a deterministic solution is not even possible (e.g. consensus in
asynchronous networks [B083,FLP85]).
In view of the limited power of the distributed model, finding an answer to this
problem with regard to locality, though a difficult task, does not appear impossible.
Some very limited partial answers are already available.
Linial has shown that if n processors are connected in a ring topology then
?(log*n) is a lower--Hbound for computing a maximal independent set and a 3--H
vertex coloring [Lin92]. Interestingly, this matches the upper--Hbound given by a
beautiful algorithm of Cole & Vishkin [CV86]. Naor has shown that in this case
randomization does not help [Nao9lb].
In some cases, it is quite easy to obtain ?(n) lower--Hbounds; for example, in
7
this thesis we show that if a network is a bipartite graph with n processors then
?(n) is a lower--Hbound for computing a A--Hedge coloring, whereas such a coloring
can be computed in NC [LPV8l]. We also show that in this case randomization
does not help [PS92b,PS92c].
Naor & Stockmeyer have investigated the question of what can be computed
in constant time, with and without randomization, and give a series of upper and
lower--Hbounds for some simple kinds of colorings [NS92]. Again, they were not able
to find examples of functions for which randomization helps. Randomization also
does not help for special topologies like rooted trees or constant degree graphs
[GPS89], and planar graphs [Pan].
In order to understand whether randomization helps with regard to locality, net-
work decomposition, a notion introduced by Awerbuch, Goldberg, Luby & Plotkin
[AGLP89], plays a central role. Given a graph G = (V, E), a partition of V into a
collection of clusters is a set C = ?CiJ such that V = UCi and CiflCj = ? if i # j.
The cluster graph Cc induced by C is the graph with vertex set V(Cc) = C and
edge set
E(Cc) = t(Ci,Cj) : i,j distinct, ?u ? Ci,?v ? Cj (u,v) E El.
A (d(n), c(n))--Hnetwork decomposition of C is a set of clusters C with a vertex
coloring of Cc with O(c(n)) colors, such that every C[Ci], the subgraph induced
by vertices in cluster Ci, is connected and has diameter O(d(n)). The requirement
that C[Ci] is connected can be relaxed, see for example [LS91].
A decomposition with O(c(n)) = O(d(n)) = O(log n) is called near-optimal,
since Linial & Saks have exhibited families of graphs for which c(n) + d(n) >
?(j0l?1??g?) and c(n)d(n) > ?(logn) for any (d(n),c(n))--Hdecomposition [LS9l]. A
priori, it is not clear that a near-optimal decomposition should exist at all. Using
a linear time algorithm of Awerbuch & Peleg, Linial & Saks observe that a near-
optimal decomposition exists for any graph C, and give an O(log2 n) expected time
randomized distributed algorithm for this [AP92,LS91].
8
The importance of network decomposition ties in the fact that given a (good)
decomposition it is possible to compute other structures efficiently, e.g. matchings,
independent sets, vertex and edge colorings. More precisely, given a (d(n), c(n))--H
network decomposition we can compute other graph structures, say a maximal
independent set, in O(d(n)c(n)) time. The generic algorithm for such problems,
given a cluster decomposition, will iterate through the color classes, clusters of
color 1 being processed first in parallel, clusters of color 2 being processed next,
and so on. Inside each cluster the trivial algorithm can be used: the leader of
the cluster, say the processor with highest ID, collects complete information on
the topology of the cluster, internally computes a solution, and sends the answer
back to all vertices in the cluster. (This solution is mainly of theoretical interest
because it might involve large message size and significant internal computation.
In some cases, however, it can be useful.) The bounds on the cluster diameter and
the number of colors used, yield the bound on the time complexity of this generic
algorithm.
Hence, given a near-optimal decomposition, other graph structures like max-
imal independent sets, maximal matchings, and (A + 1 )--Hvertex colorings can be
computed deterministically in poly-logarithmic time. In other words, near-optimal
decomposition is complete for a reasonably large class of problems. This class
includes maximal independent set, A--Hvertex coloring, maximal matching, mini-
mal edge cover, and many others. So, if a near-optimal decomposition can be
computed in polylogarithmic time deterministically, then randomization does not
"help" in computing all these various graph structures. For this reason, it is a
major open problem in this area to determine whether a near-optimal decompo-
sition can be computed deterministically in the distributed model of computation
in O(polylog(n)) time, i.e., in polylogarithmic in n time.
Let us return to the issue of locality. Some graph structures trivially have a
locality property. Consider for example the (At 1)-vertex coloring problem: given
9
a graph G = (V? E) we want to color the vertices of U using (A t 1) colors, in
such a way that adjacent vertices always have different colors. A, we recall, is the
maximum degree of the network, i.c., the maximum number of edges incident on
any node. In a sequential setting, this problem is trivial: pick one vertex v, color
it with a color not chosen by any of its neighbors, and repeat. We can decide
the color of a vertex u just by looking at its immediate neighbors, i.e., a local
search of radius 1. But other problems do not seem, a priori, to satisfy this locality
requirement. For example, if the network is to compute a A--Hvertex coloring then
we might face a situation where an uncolored vertex of degree A does not have a
free color readily available, since all A colors are being used by its A neighbors. In
order to assign a valid color we then might be forced to recolor a large part of the
network.
A first set of results in this thesis concerns the A--Hvertex coloring problem. They
are all based on the following result on the "local" nature of A--Hvertex colorings,
that we call the Small Radius Search Theorem. Let U be a graph such that A> 3
G is not a complete graph, and such that all of C except one vertex v is A--Hcolored.
Then, it is possible to extend this A--Hvertex coloring to all of G by recoloring a
path originating from v of length at most O(log? n) [PS92b,PS92c].
This non-trivial "locality" property of A--Hvertex colorings allows us to develop
several efficient algorithms for A--Hvertex coloring [PS92b,PS92c]. In particular, but
not exclusively, in the distributed and parallel (PRAM) models of computation.
It also implies the following well-known results of Brooks as a corollary [Bol79,
Bro4l]: a graph G can be A colored if and only if it is neither an odd cycle nor
a complete graph. A more detailed list of all the results stemming from the Small
Radius Search Theorem are given in the introduction to Chapter 3.
Another set of results concerns network decomposition, whose importance we
have already discussed. In this thesis, we give a deterministic, distributed algo-
rithm for computing a (nC(fl), nt(fl))?decomposition in O(n?(?)) time, where e(n) =
10
l/????. This is a slight improvement over the previously known best upper--H
bound, for which e(n) = ?gn/V\??? [AGLP89,PS92b].
We also show that the class of graphs g whose maximum degree is A =
where ?(n) = 0(1/log log n), is complete for the task of computing a near-optimal
decomposition in O(polylog(n)) time. What we mean is that if we have an algo-
rithm A that computes a near-optimal decomposition in O(polylog(n)) time for
graphs in g then we can compute a near-optimal decomposition in O(polylog(n))
time for all graphs. This is actually a corollary of a more general characterization,
see Chapter 2.
A last set of results concerns the edge coloring problem. We give a randomized,
distributed algorithm to compute an edge coloring of a given network U, that
uses at most 1.6A + log2+? n colors, for any ? > 0. The running time is O(log n)
and the failure probability is at most e, for any fixed e > 0. The algorithm is
quite simple but requires an interesting probabilistic analysis. At the core of the
analysis is an extension of the Chernoff-lloeffding bounds, which are fundamental
tools used in estimating the tail probabilities of the sum of Bernoulli-like trials
[Che52,Hoe63,Sri93,Rag9O]. Let X = ZiE[n] Xj be the sum of n independent 0-1
random variables, such that E[X] = ?. Roughly speaking, these bounds state that
the probability of X being (1 + ?)-away from? goes to zero exponentially fast, as
n goes to infinity. More precisely, the probability
? -1
e
Pr(X > (1 + S)?) --H (l+?(i+?
The hypothesis that the Xj's are independent is used crucially in the proof of these
bounds. Our extension concerns a certain case of dependence among the Xj's that
we call A-correlation. The intuition behind this extension is as follows. Suppose we
have a set of points 5, and a family of n subsets A = fAj : Aj C 5, i E [n]J. Each
point of 5 flips a coin independently of the other points. Let Xi be the indicator
random variable of the event "All points in Aj come up head". Let X = ZiE[n] Xi,
l1
and E[Xj = ?. Notice that Xj and Xj are independent if and only if Aj and Aj
are disjoint. Moreover, if Aj fl Aj # ?, the conditional probability
Pr(Xi = 1 Xj = 1) > Pr(Xi = 1)
which in turn implies that, for all nonempty I c
Pr AXi=l
jEl
llPr(Xi = 1)
It also implies that given that Xi,. . , Xj are all 1 then it is more likely that Xj+1
is also 1. The extent to which this is more likely depends on how the Aj's overlap.
If the overlap is very significant, large deviations of X from its mean are quite
likely. Suppose, on the other hand, that the Xj's are A-correlated, that is, there
exists A such that, for all nonempty I c
Pr j)1Xi =1 <A tE?iP?(Xi = 1)
In this case, given that ..... . , Xi are all 1 then the probability that Xj+1 is also
1 is not too different from the unconditional probability, and large deviations from
the mean are not likely. What we prove is that if the Xj's are A-correlated random
variables and X is the sum of the Xj's then
Pr(X > (1 +?i) <A F(?,?.
This extension of the Chernoff-lloeffding bounds has also other interesting ap-
plications, some of which are related to classical problems in combinatorics [Sri].
Let us return to the model of computation to analize the advantages and dis-
advantages of several simplifying assumptions. As we shall see, the model has the
advantage of simplicity without loosing much in accuracy.
In a distributed network the cost (in terms of time) of sending one message to a
neighbor is many orders of magnitude bigger than that of executing one elementary
12
operation internally in one node. This justifies the approximation of our model that
allows for any amount of internal computation to be executed in one round.
In the analysis of our algorithms we then have to make sure that only "reason-
able" amounts of internal computation are executed. This is usually the case for
the algorithms in this thesis, with the only possible exception, depending on the
definition of "reasonable", of some simulations involving network decomposition.
In general, it is quite easy to give an accurate estimate of the total running time
of our algoritlims, including internal computation, but such estimates are omitted
in the thesis.
Another unrealistic assumption that is made in the model is that every proces-
sor is allowed to send messages to all of its neighbors in unit time; it would seem
more reasonable to charge a cost proportional to the degree of the node. Again this
simplifying assumption can be justified on the grounds of hardware considerations.
Also, it allows us to avoid cumbersome details in the combinatorics. If needed, the
appropriate costs incurred by our algorithms can be computed quite easily.
Another relevant quantity that it is not explicitly taken into account is message
size. Given that we are only concerned with problems where a network G = (V? E)
is computing a function of its own topology, an obvious upper bound on the size
of any message is 0(1 v + EJ). In general, again with the exception of simula-
tions involving network decomposition, message size required by our algorithms is
constant or O(log n). As before, the precise estimate of such cost is omitted but
can be quite easily derived.
An important aspect of this model of distributed computation is that it can be
simulated in the PR?AM model by linearly many processors with a time overhead of
O(MAxMEssAGESIzE), where MAxMEssAGESIzE is an upper bound on the size of
any message. More precisely, if a network C = (V? E) computes a function f in time
O(t(n)), then the PRAM simulation takes O(t(n)MAxMEssAGES!zE) time, using
0(1 VI + I El) processors. So, if the message size is constant or poly-logarithmic,
13
then a very efficient distributed algorithm translates into a very efficient PRAM
algorithm. This is the case, for example, with the ?--Hvertex coloring algorithm in
Chapter 3.
We also made the assumption that each processor in the network has its own
unique ID. As we briefly discussed, symmetry considerations show that in an anony-
mous network there is practically nothing that can be computed deterministically.
One could also consider a more economical assignment of ID's, for example one
where neighboring vertices have different ID's. But this would not change things
in any essential way.
Finally, it should be noted that, as far as lower bounds are concerned, this
model is very general; any lower bounds derived in this model directly holds in any
model charging for other additional costs incurred during the computation.
Chapter 2
Improved Network
Decomposition
Two important and inter--Hrelated problems in the distributed model are to compute
a maximal independent set (MIS) and to compute a good vertex coloring of a net-
work &, say a (A + 1)-coloring (A is the maximum degree of any vertex in C). An
MIS defines a set of processors which can compute in parallel without interference,
and a coloring is a partition of V into independent sets, thus defining a schedule
for the processors to compute in parallel, without interfering with their neighbors.
There are very simple distributed randomized algorithms of Alon, Babai & Itai
and Luby that compute an MIS and a (A + 1)-coloring in O(log n) expected time
[AB186,Lub86,Lub88]. While these papers also show how to derandomize these al-
gorithms in NC, it is an outstanding open question whether the derandomization
can be carried out in the distributed model
In order to solve the MIS and related problems in the distributed model, Awer-
buch, Goldberg, Luby & Plotkin introduced the notion of network decomposition
(sometimes also called cluster decomposition) [AGLP89]. Let us first review the
definition. Given a network U = (V, E) and a partition of V into a set of clusters
C = fCi?, the cluster graph Cc induced by C is the graph with vertex set V(Cc)
14
15
and edge set
E(Gc) = f(Ci,Cj) : i $j, ?u ? Ci ?v ? Cj (u,v) e EJ.
A (d(n), c(n))--Hdecomposition of & is a partition of V into a set of clusters C such
that
o+ every G[Cij, the subgraph induced by vertices in Ci, is connected and of
O(d(n)) diameter, and
o+ the cluster graph is vertex colored with O(c(n)) colors.
Problems like MIS and (A + 1)--Hcoloring can be solved in O(d(n) c(n)) time, given
a (d(n), c(n))-decomposition of C. The generic algorithm for such problems, given
a cluster decomposition, will iterate through the color classes, clusters of color 1
being processed first in parallel, clusters of color 2 being processed next, and so on.
Inside each cluster the trivial algorithm can be used: the leader of the cluster (say
the processor with highest ID) collects complete information on the topology of the
cluster, internally computes a solution, and sends the answer back to all vertices in
the cluster1. The bounds on the cluster diameter and the number of colors used,
yield the bound on the time complexity of this generic algorithm.
Network decomposition has also other interesting applications, mainly, but not
exclusively, to distributed computing [Awe85,AGLP89,AP92,AP90,BC].
Linial & Saks showed how to compute a near-optimal decomposition (that
is, one with O(c(n)) = O(d(n)) = O(logn)) in O(logn) expected time by using
randomization, although they relax the requirement that the C[Ci]'s be connected
[LS9l].
In this chapter, we (slightly) improve the bounds of Awerbuch, Goldberg, Luby
& Plotkin for computing a network decomposition distributively and determinis-
`This solution is mainly of theoretical interest because it might involve large message size and
significant internal computation.
16
tically. We show how to compute a (n?(fl), nC(fl))?decomposition in O(n?(?)) time,
where e(n) = O(1/V\Thgn) (as opposed to their e(n) =
As a corollary of this improved network decomposition result we obtain im-
proved deterministic bounds for computing an MIS, (A + 1)--Hcoloring, and other
graph problems.
We also show that the class of graphs ? whose maximum degree is A =
where S(n) = 0(1/log log n), is complete for the task of computing a near-optimal
decomposition in O(polylog(n)) time. This is actually a corollary of a more general
characterization. Completeness is to be intended in the following sense: if we have
an algorithm A that computes an near-optimal decomposition in O(polylog(n))
time for graphs in g then we can compute an near-optimal decomposition in
O(polylog(n)) time for all graphs. This completeness result pinpoints precisely
what graphs are the most difficult to handle and we feel that this would help
future attempts at solving this problem.
2.1 Definitions
A message--Hpassing distributed network is an undirected graph U = (V? E) where
vertices, or nodes, correspond to processors and edges to bi--Hdirectional communi-
cation links. Each processor has its unique ID. The network is synchronous, ?.e.,
computation takes place in a sequence of rounds; in each round, each processor
reads messages sent to it by its neighbors in the graph, does any amount of local
computation, and sends messages back to all of its neighbors. The time complexity
of a distributed algorithm, or protocol, is given by the number of rounds needed to
compute a given function.
The absence of a shared memory imposes a locality constraint in the following
sense: if we waut a protocol to terminate within t rounds, every vertex can com-
municate with only the vertices which are at a distance of at most t from it. This
model is hence suited for studying the complexity of a problem when communica-
17
tion is the bottleneck. In this model we do not charge for local computation; in
one round each processor is allowed to compute any function of its current data. In
our algorithms however, each processor will perform very simple computations. In
particular, each step can be simulated in 0(A) by a single processor or in constant
time with A processors, where A denotes the maximum degree of any vertex in
the network.
Given a graph & = (V, E) and a set 5 C V, &[S] denotes the subgraph induced
by 5. By V(G) and E(&) we denote the set of vertices of & and the set of edges of
& respectively. The degree of a vertex v in a graph & is denoted by deg?(v). The
distance between two vertices u and v in a graph &, i.e., the length of a shortest
path connecting them in &, is denoted by dG(u, v).
Given a graph & = (V? E) and a set 5 C V, we denote by &[k, 5] the graph
whose vertex set is V(&[k, 5]) = 5 and whose edge set is
= ?(u,v) u,v ? 5,1 < dG(u,v) ? kJ.
We denote by [n] the set ?l,2,. . ,ni A vertex coloring of a graph & =
with n vertices is a mapping ? : V H [n] such that if (u, v) E E then y(u) # x(v).
A P--Hcoloring of & (i.e., a coloring that uses at most ? colors) is trivially an (OL, P)--H
decomposition of &, for any ? > 0. Given a graph & of maximum degree A, it is
possible to compute a A + 1 coloring of & in O(A log n) time in the distributed
model of computation [GPS89].
The following definition was introduced by Cole & Vishkin [CV86]: an (Q, P)-
rtthng set 5 with respect to & = (V? E) and P c V is a set of vertices such
that:
5 c p
o+ any two vertices of 5 are at distance at least a from each other;
o+ every vertex u E P is at distance at most p from some vertex of 5.
18
The notion of (a, P)--Hruling set was generalized by Awerbuch, Goldberg, Luby
& Plotkin [AGLP89J as follows: an (a, P)-rnling forest with respect to U = (V? E)
and P c V is a forest of rooted trees ? = ?TiJ, where each tree is a subgraph of
G, with the following properties:
o+ For all i, the root of Ti, called the leader of Tj and denoted by l(Ti), is in P,
o+ every vertex in P belongs to a unique tree,
o+ trees are vertex-disjoint,
o+ inter-root distance is at least a, and
o+ tree depth is at most ?.
Notice that trees of an (a, P)--Hruling forest can contain non-P vertices. In the
distributed model of computation, a (k, k log n)--Hruling set can be computed in
O(klog n) time deterministically [AGLP89]. Given an (a, P)--Hruling set, an (a, P)--H
ruling forest can be computed in 0(P) time deterministically, so that a (k, k log n)--H
ruling forest can be computed in O(k log n) time distributively [AGLP89i.
Suppose we have a graph U with a partition of V(G) into A and B where the de-
gree of each vertex in B is at most P--H 1, and are also given an (a, P)--Hdecomposition
of G[A] and a P--Hcoloring of G[B]. Then, we can compute an (a, P)--Hdecomposition
of G in 0(P) time. We call this the merging of the two decompositions (recall that
a P-coloring is an (1, P)-decomposition). The merging can be computed as follows.
Let C be the cluster set of the decomposition of G[A]; the cluster set of the new
decomposition is C u B. Colors of clusters in C remain the same, while colors of
vertices in Bare updated according to the following procedure: for c 1,2,. ..,P,
in parallel each vertex with color c chooses a color in [?] not chosen by any of its
neighbors. This procedure is correct because the set of vertices with color c is an
independent set, and neighboring vertices will choose different colors.
19
We now introduce a notation in the spirit of the O(.) notation used in Corn-
putational Geometry. We say that g(n) = O(f(n)) if there exists c> 0 such that
g(n) = O(f(n)c). This convention is introduced to simplify notation.
2.2 Improved Network Decomposition
In this section we present our improved network decomposition algorithm. From
now on, U = (V? E) will denote the original input graph with n vertices, while H
will denote a generic graph with ? vertices. Also, p = p(n) will denote a parameter
to be fixed later; the performance of the algorithm will depend on the choice of
p. The algorithm uses two mutually recursive procedures CD(H) to compute an
(n?(n), n?(?))--Hdecomposition of il with ?(n) = O(5TiThgn) and 1Z?(P, H), to
compute a (3,4)--Hruling forest with respect to P and H. The running time of
CD(.) is O(n?(?)). The intuition behind CD(H), as in [AGLP89], is the following:
"small" degree vertices (i.e., vertices of degree less than p) can be handled easily
by making them trivial clusters and by pcoloring the graph induced by them in
O(plog n) time. "lligh" degree vertices (i.e., vertices of degree at least p) can be
used to shrink the graph by collapsing their neighborhood into one super-vertex;
the resulting graph will shrink (in terms of the number of vertices) by a factor of
at least p. After the shrinking, a network decomposition of the collapsed graph
is computed with a recursive call to CD(.), and if the shrinking factor p is high
enough the recursion will terminate fast. Finally, the two decompositions, the one
of the collapsed graph and the trivial decomposition given by the p-coloring of the
small degree vertices, are merged together to give a decomposition of the whole
graph.
A symmetry-breaking problem arises when any two high degree vertices want to
collapse and their neighborhoods intersect. This problem is handled by computing
a (3,4)--Hruling forest with respect to the set P of high degree vertices and the
graph H. In a (3,4)--Hruling forest, each vertex in P belongs to a unique tree, so
20
that each tree can collapse onto its root without interference from other collapsing
trees. Since roots are "high" degree vertices (i.e., have degree at least p) and are
at distance at least 3 apart, the shrinking factor is at least p.
An important difference between our algorithm and that of [AGLP89] is that
we compute a (3,4)--Hruling forest with respect to the set of high degree vertices,
while they compute a (3, 3 log n)--Hruting forest. Computing a (3,4)--Hruling forest
gives much better performance for the following reason. Once the ruling forest
is computed, each tree is collapsed into a super-vertex: if each tree has O(log n)
diameter, as in the (3,3 log n)--Hruling forest, then by the end of the recursion we
will have super-vertices with O((6 log n)d) diameter, where d is the depth of the
recursion. On the other hand, with (3,4)--Hruling forests the final diameter will be
0(8d), which will greatly improve the performance of the simulation of a super-
vertex. Any (3, k)--Hruling forest, for k constant, will do but the smaller the k the
better; k = 4 is what we could achieve. The problem is that it is not known how
to compute a (3,4)--Hruling forest distributively in polylogarithmic time, whereas
this is possible for (3,3 log n)--Hruling forests. We solve this problem by a mutually
recursive call to CD(.). Roughly speaking, the problem is solved by considering
the graph induced by vertices at distance at most two from a vertex in P and by
computing a maximal independent set in this graph. The maximal independent
set is computed by partitioning P into roughly balanced sets, each of size O(/p),
and by making (mutually) recursive calls in parallel to CD(.) on graphs of smaller
size, namely O(( V(H) lip).
We now give CD(H).
PROCEDURE CD(H)
o+ INPUT: a graph H with  vertices.
o+ OUTPUT: an (?(t), n?(?))--Hnetwork decomposition of H.
1.			Let P =			deg?(v) > pi. Compute a (3,4)--Hruling forest with respect to
21
P and H with a call to 1t?(P, H). Let F = ?TjJ be the resulting forest, and
let H? be the graph induced by ?, i.e., V(H?) = I Tj E ?? and
E(H?)=?(Ti,Tj)I i#j,?uET?,veT5: (u,v)EE(H)).
Let 5 be the set of vertices not covered by the forest, i.e., 5 = fu u
UiV(Ti)J.
2. Compute a network decomposition of H? by a recursive call to CD(H?).
3. Compute a ?cAonng of H[S], the subgraph induced by 5, and merge it with
the network decomposition computed by CD(HF) (see Section 2.1).
First of all, observe that
IWH?I ? IV(H)I
p
because H? is formed by collapsing the trees of ? around their leaders, which are
vertices of degree at least p. Hence, the depth of the recursion is at most d = log? ?.
We now argue by induction that the diameter and the colors used by the network
decomposition are, respectively, 0(81o?p ?) and p. We first prove the claim for the
diameter. For the base case, observe that when the recurrence stops each tree has
diameter at most 8. For the inductive step, assume that the diameter of clusters
returned by CD(CF) is 8log? V(Uj) . Each vertex of G? is a super-vertex obtained
by collapsing a tree and has diameter at most 8. Hence, the diameter bound for
CD(H) is
8 8Iog? V(H?) < 8 8Iog? ?e --H 81og? 
To prove that p is the maximum number of colors used by CD(H) assume
inductively that CD(H?) uses at most p colors. By definition of 5 the graph H[S]
can be pcolored and the merging of H[5] and CD(HF) also uses p colors. To
24
and H[Qi]. The set I = UiIi is a (3,2)--Hruling set w.r.t. Q UiQi and Hand, by
construction of the Qi's, a (3,4)--Hruling set w.r.t. P and H.
The time complexity of lZF(P, H) is given by the recurrence
< O(log n) + O(p) + 2 Tcv
where O(log n) is the time necessary for Step 1, O(p) is the time necessary for
step 2 (there are p phases), and where 2 Tcv(2C/p) is the time needed for the
recursive call to CD(.), which is called on a square graph (each edge of H[2, Q?]
can be simulated in 2 steps), and O(p) is the time needed to compute each A.
We now determine the best choice for the parameter p. Our goal is to minimize
the running time of CD(.). By substituting T?F(?) into the equation of Tcv(i) we
get (in what follows let q = pi2 and assume p > log n)
ri'cv(?) ?
O(logn) +t)(p) +2Tcv 2?
+8 TcV(p) +O(p logn)
<			O(plogn)+10Thv
<			O(p logn) 101ogq
By computing the first and second derivatives of the function log f(n, p) with re-
spect to p (recall q = p/2), we can see that the minimum of f(n,p) is attained
when p = 20(V\Thgn) = 6(nC(fl)) with e(n) = 1/4X??n. For this choice of p we get
Tcv(n) --H t)(nE(n)). The following theorem summarizes the whole discussion.
Theorem 1 Given a graph G with n vertices, procedure CD(G) computes an
(nt(fl), nC(fl))?network decomposition of C in &(nc(n)) time, where e(n) = i/?gn.
We now present a simple scheme to construct a (log n, log n)--Hdecomposition,
given an algorithm to compute a (d(n), c(n))--Hdecomposition. This is inspired by
ideas from [AP92,BCj.
25
For this, we first need the sequential algorithm of Awerbuch & Peleg [AP92]
to compute a (log n, log n)--Hdecomposition, which works as follows on a graph C =
(V? E) with V? = n. Start with any vertex v. Now, either there exists an index i,
0 <i < flog2n? --H 1, such that
I?u: d?(u,v)<?i1I?>I?u: dc(u,v)=i+l1I,
or not. If there exists no such index, then C has O(log n) diameter and hence
a trivial (log n, log n)-decomposition. Otherwise, let ?v ? log n be the smallest
such index; let Cv = ?ttIdG(u, v) < ?v1 be the cluster "centered" at v and Bv
tuldo(u, v) = v + 1J its "border". Note, crucially, that
IBvI<?iCvI.			(2.1)
Remove the vertices CvUBv and the edges incident at them, and repeat this process
on the remaining graph. We then get a sequence of vertices v =vo,vl,V2,... with
corresponding clusters Cv0, ....... Each set Cvi now becomes a cluster and gets
color 1; note that since each ?v is O(log n), the diameter of each cluster Cvi is
O(logn). Also, no two clusters Cvi and Cvj, Vj ? Vj, have an edge going from one
of them to the other. Hence, these are valid clusters indeed.
Now by removing the set UjCvj from G and repeating this on the remaining
graph G[UiBvj] to assign color classes 2,3,...? we compute a network decomposi-
tion. The crucial property is that the number of vertices in the new graph C[UiBvj]
is at most half that of C, by (2.1); thus, the number of color classes is at most
[log2 n]. Hence, this yields a (log n, log n)--Hdecomposition.
The next theorem shows that, given a (c(n), d(n))-decomposition, the above
linear-time algorithm can be efficiently simulated in the distributed model of com-
putation.
Theorem 2 Sttppose we are given a distributed atgorithm A with running time
O(t(n)), to compute a (d(n),c(n))--Hdecomposition. Then, given any network C =
26
(V E) with n vertices, we can compute a (log n, log n)--Hdecomposition of C in
O(t(n') log n + c(n)d(n) log2 n) time distributively.
PROOF. For this, we rely heavily upon the above-seen sequential scheme; we
proceed as follows.
1. Use algorithm A to compute a (log n, log n)--Hdecomposition of G[2 log n, V].
2. For newcolor = 1,2,... ,log n do:
o+ For c = 1,2,... , O(c(n)) do: in parallel, each cluster C of color c com-
putes a maximal collection of sets Cv, for v E C, and assigns color
newcolor to them.
The crucial observation is that if u and v belong to two different clusters Ci and
C2 of color c, then Ct' and Cv generated in the body of the loop do not interfere
because, by step 1, dG(u, v) > 2 log n + 1 and the radius of C? and Cv is at most
logn.
Step 1 takes O(t(n) log n) time, since it takes O(log n) time to simulate each
edge of U[2 log n, V]. The simulation of the body of the nested loop in step 2 takes
O(d(n) log n) time. Thus, step 2 takes O(c(n)d(n) log2 n) time. 0
As mentioned in the introduction, several problems are reducible to network
decomposition.
Corollary 1 Given a network C with n vertices, the following functions can be
computed in 6(n?(?)) time in the distributed model of computation, with e(n) =
o+ maximal independent set,
o+ (2A --H 1)--Hedge coloring,
27
o+ maximal matching,
o+ A--Hvertcx coloring [PS92c],
o+ (log n, log n)--Hnetwork decomposition
The first three statements are a straightforward application of the general algo-
rithm discussed in the introduction. The proof of claim 4 can be found in Chapter 3
(Section 3.5.2) or in the references, and the last claim follows from Theorems 1 and
2.
2.3 Completeness
In the previous section we gave an algorithm for computing a network decom-
position of a graph in time 6(g(n)), where g(n) = 2? In this section we
characterize a class of graphs g that is complete for the task of computing a net-
work decomposition. Here, completeness is to be interpreted in the following sense:
if we have an algorithm for computing a decomposition for graphs in g in 0(1(n))
time, then we can compute a decomposition for all graphs in 0(1(n)) time.
More precisely, let h(n)
and let q(n) = g(n)h(fl).
be any non--Hdecreasing function, let p(n) = g(n)l/h(n),
Suppose we have an algorithm A that computes a
(p(n), p(n))--Hdecomposition of graphs with maximum degree A < q(n) in time
O(p(n)). Then, we can compute a (p(n),p(n))--Hdecomposition in time 0(p(n)) for
all graphs. This means that we need to concentrate our efforts on graphs with
maximum degree at most q(n); these are the difficult graphs to handle. For exam-
ple, in order to have a (polylog(n), polylog(n))-decomposition algorithm running
in 0(log n) time we just need to look at graphs of maximum degree less than
q(n) = ?O(l/loglogn), a quantity smaller than ne for any e.
We may further add that since it is possible to (A + 1 )--Hvertex color graphs in
time O(A log n) [GPS89], where A denotes the maximum degree of the graphs,
the class of graphs that is complete for decomposition is that of graphs with
28
E [?(p(n)), O(q(n))]. For example, for p(n) polylog(n) the value of h(n) is
4mogn/ log log n, and the range becomes [?(poiYJog(n)), O(n'(?))], where ?(n) =
1/loglogn.
These results tell us what the bottleneck is for the method used in [AGLP89?
and in this paper. The method's basic idea is to expand clusters as long as their
degree is high enough, and to color clusters when their degree is not high enough.
This approach is successful if the final diameter of the resulting clusters is not too
high, i.e., if the recursion terminates fast. For this to happen the expansion rate
of the clusters has to be high enough. But if the maximum degree is in the range
(?(po1yiog(n)), O(n8(?))), corresponding to p(n) = polylog(n), then the maximum
degree of clusters is too low and the diameter too high for the shrinking to terminate
in polylogarithmic time. This is an indication that a completely different approach
is needed to solve the network decomposition problem in O(polylog(n)) time.
We now turn to the task of showing the completeness result. The idea is to use
the supposed algorithm A as a subroutine in conjunction with a modified version of
the procedures CD(.) and ltF(.). As before, the two procedures will call each other
in a mutually recursive fashion. The initial input is a graph G with n vertices; the
values p(n) and q(n) remain constant in the following analysis.
As before, procedure CD(H) splits vertices into "high" and "low" degree ver-
tices. Here, high degree means at least q(n). But, instead of coloring the graph
induced by the low degree vertices as before, algorithm A is invoked. Another
difference is that procedure 1t?(P, H) returns a (3,6)--Hruling forest w.r.t. P and
H, instead of a (3,4)--Hruling forest. On input graph H, procedure CD(H) is as
follows:
PROCEDURE CD(H)
o+ INPUT: a graph H with ? vertices.
. OUTPUT: a (p(?),p(n))-network decomposition of H.
29
1. Let P = deg?(v) > q(n)?. Compute a (3,6)--Hruling forest with respect
to P and H with a call to 1??(P, H). Let ? = fTjl be the resulting forest,
and H? be the cluster graph induced by F. Let S be the set of vertices not
covered by the forest, i.e., S = ftt u ? UiV(Ti)J. (Comment: all this is the
same as before).
2 Compute a network decomposition of H? by a recursive call to cD(H?).
3. Compute a (p(n), p(n))--Hdecomposition of H[S], the subgraph induced by S,
by using algorithm A, and merge it with the network decomposition corn-
puted by CD(H?).
The time complexity of CD(H) is given by the following recurrence
Tcv(?) < T?j(?) + 12 Tcv +
O(p(n)) is the time needed to merge the decomposition computed by A on H[5]
and the one computed by CD(H?).
The modified version of 1Z?(P, H) is trickier to design. Intuitively, q(n) is
defined in such a way that if we partition P into groups of size O(/q(n)), and
call CD(.) in parallel on such groups, then the recursion will terminate in 6(p(n))
time. The problem is that the requirements of a (3, 6)--Hruling forest can be satisfied
locally in each group, but might be violated by vertices belonging to different
groups. This problem was eliminated in the old algorithm by a "filtering" process
of cycling through the groups. This is not possible now because there are q(n)
groups and we cannot afford to run a loop for that long. The problem is solved by
invoking algorithm A on a suitable interference graph whose maximum degree is
at most q(n).
PROCEDURE 1t?(P,H)
o+ INPUT: a graph H with ? vertices, and a set P c
30
o+ OUTPUT: a (3,6)--Hruling forest with respect to P and H
1. Partition P into disjoint sets Pi, i ? [q(n)], as before, by computing a
(3,3 log n)--Hruling forest and by assigning numbers 1, 2,... , q(n) cyclically
inside each tree.
2. Let Hi = H[4, Pi], z E [q(n)]. In parallel, for all i e [q(n)], compute an MIS
Ii of Hi by means of a call to CD(Hi). Let 1 = UiIi.
3. Let F = H[2, I]. (Comment: we will show that the maximum degree of F is
< q(n).) Invoke algorithm A to compute an MIS J of F. The set J is
a (3,6)--Hruling set w.r.t. P and H. From J a (3, 6)--Hruling forest is computed.
We first show correctness of the algorithm and then that it achieves the desired
time bounds. The set P is partitioned in step 1 with the same method employed
in the old algorithm; it follows that I Pi I < 2/q(n), ? E [q(n)]. The correctness
of step 2 follows from the general fact that given an (a, P)--Hdecomposition an MIS
can be computed in O(oP). For step 3, we have to show that A(F) ? q(n) and
that J is a MIS. Consider any vertex u ? 1? c I = V(F). First notice that u
cannot have a neighbor v E Ii because if u, v E Ii then, by definition of Hj and
Ii, dH(u,v) > 5 and hence (u,v) ? E(F). We will show that any u ? can have
at most one neighbor v ? Ij, for I? $ 1? since there are q(n) groups the claim on
the degree follows. Suppose for a contradiction that u has two neighbors Vj and
v2 in the same group Ij. From the definition of F it follows that dH(vl, v2) < 4,
an impossibility because then, by definition of Hj, (vi, v2) E E(Hj), and vi and v2
cannot both belong to Ii.
In order to show that J is a (3, 6)--Hruling forest w.r.t. P and H we need to show
that: i) any vertex u E P is at distance at most 6 from some v E J, and ii) for
any two vertices vi, V2 e J, dH (vi, v2) > 3. Let Pj be the group u belongs to; by
definition of I? and Hj, vertex u is at distance at most 4 from some vertex v e
If v also belongs to the final MIS J then we are done, otherwise v must be adjacent
31
to a vertex w E J in H[2, I], because J is an MIS of H. From the definition of F,
dH(v, w) < 2, which implies dH(u, w) <6. Finally, we show that any two vertices
in J are at distance at least 3 from each other. Let vi, V2 be any two vertices in
J; if they come from the same Ii then they are at distance at least 5 apart (by
definition of Hi), otherwise they are at distance at least 3 (by the definition of F
and J).
We now come to the complexity analysis. Step 1 is O(log n). Step 2 takes
4 Tcv(2C/q(n)) because we need 4 time units to simulate one edge of any Hi,
and each Hi has size at most 2?/q(n). The complexity of Step 3 is dominated
by the complexity of algorithm A, which is 6(p(n)). Hence, we get the following
recurrence
< 0(log n) + 4 T??			+
By substituting in the expression for Tcv(?) we get
Tcv(?) <
16 Tcv
log
(__l?g
0 p(n) exp
= 0 p(n)
= t) p(n) exp h(n)
We have thus shown the following theorem.
Theorem 3 Let h(n) be any non--Hdecreasing function and let p(n) = g(n)i/h(fl),
andq(n) = g(n)h(n) Suppose we have an algorithm A that computes a
decomposition of graphs with maximum degree A < q(n) in time O(p(n)). Then,
we can compute a (p(n),p(n))--Hdecomposition in time O(p(n)) for all graphs.
Chapter 3
The Local Nature of A--Hcolorings
and Its Algorithmic Applications
3.1 Introduction
In many situations, an independent set defines a set of processes that can compute
in parallel without mutual interference. A good vertex coloring (i.e., one that uses
few colors) partitions the vertices of a network into independent sets, and hence
defines a schedule for the processors to work in parallel. This makes vertex coloring
a useful primitive in parallel and distributed computing. Lynch and Choy & Singh
use vertex coloring to give distributed solutions to classical resource allocation
problems such as the Dining Philosophers of Dijkstra [CS92,Lyn81]
Given a graph G = (V; E) with V = n, A will denote its maximum degree,
i.e., the maximum number of neighbors of any vertex. A A--Hcoloring of a graph is
a vertex coloring that uses at most A colors. In this chapter we prove a surpris-
ing result about the "local" nature of A--Hcolorings, which has several interesting
consequences.
Theorem Let G be a connected graph such that A > 3, 0 is not a clique, and
32
33
all but one vertex v of G is A--Hcolored. Then, we can extend the A--Hcoloring to
the whole of G by recoloring a path originating from v, which is of length at most
O(log? n).
This theorem can be used to compute a A-coloring of G inductively by adding
vertices one by one and each time applying a "small radius search". Hence, this
result is a generalization of a well-known theorem of Brooks [Bro41] (see also
the discussion in Bollobas' [Bol79]), which states that every connected graph of
maximum degree A which is neither an odd cycle nor a complete graph, can be
colored with A colors. Brooks' proof does not appear to have this locality property.
The O(log? n) bound is tight up to a constant factor, in the sense that there exists
a family of graphs and partial A--Hcolorings of them, for which a search of radius
?(log? n) is required.
The small radius search can be carried out effectively in our distributed model
of computation and in NC, allowing us to derive several algorithmic results. The
intuition behind our algorithms is the following. Suppose a graph G is A-colored
except for a set of uncolored vertices P. If the vertices in P are sufficiently far
apart, we can extend the coloring to the whole of G by a simultaneous application
of the small radius search to all vertices of P. The problem is to construct a set
P with the desired property. We now give an overview of the various algorithmic
consequences of the small radius search that we establish in this chapter.
There is a simple and beautiful randomized distributed algorithm to compute
a (A + 1)-coloring in O(log n) expected time, due to Luby [Lub88J. Our "small
radius search" theorem leads to a reduction from A--Hcoloring to (A + 1)--Hcoloring.
This allows us to derive fast randomized distributed and NC algorithms for A-
coloring. We also prove that for any A > 2, the problem of A?dge coloring a
bipartite graph G needs ?(diam(G)) time distributively, even given an unlimited
amount of randomness (whereas this can be done in NC: see Lev, Pippenger &
34
Valiant [LPV81]). For paths and even cycles, edge coloring and vertex coloring
are equivalent and hence, we can state the following distributed version of Brooks'
theorem:
Theorem A connected graph G is A-colorable in expected polylogarithmic time in
the distributed model of computation if and only if G is neither a complete graph
nor a degree--H2 graph.
When A is bounded by a polylogarithmic function of n, we can implement the
reduction to (A + 1)--Hcoloring deterministically in polylogarithmic time, distribu-
tively.
Theorem A connected graph C is A--Hcolorable in O(A polylog(n)) time in the
distributed model of computation if and only if G is neither a complete graph nor
a degree--H2 graph.
By using ideas from [Lub88j, the randomized reduction can be implemented
and derandomized in NC with 0(1 v + I El) processors, yielding the first known
linear processor NC algorithm for A--Hcoloring. The existing NC algorithms for
A-coloring all seem to need superlinear processors (Hainal & Szemer?di [HS9o],
Karchmer & Naor (KN88], and Karloff [Kar89j).
Theorem A connected graph G can be A-colored in the PRAM model of compu-
tation with linearly many processors and in polylogarithmic time if and only is G
is neither a clique nor an odd cycle.
The reduction to (A+ 1)-coloring can be implemented sequentially with a depth
first search (DFS), which yields a linear time sequential algorithm for A-coloring.
35
The details of this construction do not appear in this thesis. In a sequential setting,
the small radius search yields a Las Vegas algorithm running in O(#n) time with
success probability of at least 1/2. The error probability can be made arbitrarily
small by repeated runs of the algorithm.
By making use of the notion of network decomposition we obtain one final
theorem.
Theorem A connected graph G is ?--Hcolorable in O(nE(fl)) time in the distributed
model of computation where e(n) = o(1/A???), if and only if it is neither a com-
plete 9raph nor a degree--H2 graph.
It is an important open problem whether a ?-coloring or a (?t 1)--Hcoloring can
be computed deterministically in polylogarithmic time in the distributed model of
computation.
3.2 Definitions
Given a graph G = (V, E) and a set 5 C V, G[Sj denotes the subgraph induced
?ys,andG?5denotesG[V?S].WhenS=(v1forsomevEV,wewflteG?v
instead of G --H
A vertex coloring will be denoted by x(?); if 5 is a set of vertices, then x(5) is
the set of colors used by the vertices of 5. The maximum degree of G is denoted
by A. When a vertex v is uncolored, we say that v is pebbled, and represent
this situation by letting x(v) =1. If 5 is a set of pebbled vertices and G --H 5
is legally A--Hcolored we say that C is partially A--Hcolored. We denote the set of
neighbors of a vertex v by N(v), and its degree by deg(v). If v is pebbled and
I x(N(v)) --H (IJ < A, then there is a spare color for v; if we color v with a spare
color, the pebble at v is said to be removed.
The following operations will be used often. Suppose that vertex v is pebbled
36
and Ix(N(v)) --H ?iJ = A; let u be any non-pebbled neighbor of v with color
a, say. A step is the following recoloring operation: x'(v) = a, x'(u) =1, and
= x(w), for all w E V --H fu, vJ. A very important property of the step
operation that will be used throughout the paper is that if P is the set of pebbled
vertices and a pebble makes a step from u to v, then this step operation transforms
a legal A--Hcoloring of G --H P into a legal A--Hcoloring of G --H --H ivJ) u (uJ). A
walk is a sequence of steps (see Figure 3.1). Clearly, a walk transforms a partial
A--Hcoloring into another partial A--Hcoloring.
The next definition introduces the class of graphs that we will consider for
A--Hcoloring:
Definition 1 A nice graph is a connected graph G which is not a complete graph,
with A >3.
3.3 Distributed Brooks' Theorem
In this section, we show that given a partially A--Hcolored nice graph G with one
pebbled vertex vo, we can extend the coloring to G by recoloring an "augmenting
path" of length O(log? n). Brooks' theorem follows as a corollary. For the sake of
clarity, we first give a weaker result, an O(#) bound, and then give the stronger
result.
We first establish our result in the easy case of when a vertex of degree less
than A is "near" the pebbled vertex. Let G = (V, E) be a graph of maximum
degree A; a vertex v E V is called a sanctuary if deg(v) <A.
Lemma 1 Let G be a nice graph with one pebbled vertex vo. Ifv C V is a sanctuary
at distance t from vo, then the pebble can be removed by walking it for at most e
steps.
PROOF. Let P = vo,vi,... , ve = v be any simple path between vo and v, and
consider the following procedure. Initially vo is pebbled; if there is a spare color
37
-?
pebbied vertex
Figure 3.1: Steps and walks
38
at vo we remove the pebble, otherwise we make a step to vi. Once vi is pebbled,
if there is a spare color at vi we remove the pebble, otherwise we make a step
to V2, and so on. This procedure is correct because each step maintains a partial
coloring. Eventually, unless a spare color is found along the way, we reach ve --H v
which has a spare color because its degree is less than A. E
Note that the search for a sanctuary within a distance of C can be easily im-
plemented in O(?) time both in the distributed model of computation and in the
PR?AM model using linearly many processors.
The rest of this section is devoted to establishing the existence of a short aug-
menting path in the case when there is no sanctuary near the pebbled vertex. A
graph with no sanctuary near the pebble must be "locally A--Hregular". The next
definition makes this notion precise.
Definition 2 Let C be a nice graph with one pebbled vertex vo G is A-regular
within radius  if there is no sanctuary at a distance of at most C from vo.
The following definition introduces the basic structure that allows us to extend
the coloring from G --H vo to &, when C is locally A-regular within a radius which
will be specified later.
Definition 3 Let G be a partially A--Hcolored nice graph with one pebbled vertex
vo. A T-path is a path P VO,Vi,... , v?, where Vp has two neighbors x and y such
that: (i) x(x) x(y), and (ii) x,y ?
Our aim is to prove that if there is no sanctuary within O(log? n) distance
from the pebbled vertex then there is a T-path of length O(log? n), and to show
how to find it. First, we show that a T-path allows us to extend the coloring to &.
Lemma 2 A partially A--Hcolored graph with one pebbled vertex vo and a T-path
can be A--Hcolored by walking the pebble along P.
39
PROOF. As in the proof of Lemma 1 we walk the pebble along P starting from
Vo. Eventually, unless we find a spare color along the way, we reach Vp, which has
two neighbors x and y with the same color, and whose colors are not changed by
the walk of the pebble.
Given two paths Pi vo,v1,... , vk and P2 vk,Vk+1,... , i'?, their concate-
nation is the path Pi ? P2 = vo,vi,. . .,Vk,Vk+i,. . .,i'?. The set of colors of the
vertices in a path P is denoted by x(P)- When I x(P) = 2 we call P bichromatic.
If P is bichromatic with colors c and ?, we say that P is an (Q, P)-path. In what
follows, the set of vertices of a path P will be denoted by the same letter P. The
next definition is crucial.
Definition 4 A stem is a simple path P = vo oP1. P2 such that: (i) vo is pebbled,
and (ii) P2 has at least foui vertices and is bichromatic.
The basic tool to prove our main theorem is a forthcoming lemma called the
Spawning Lemma. The lemma states that if G is A--Hregular within radius 3, then
given a stem P of length at most  we can either spawn off a bichromatic path p?
of length , or we can find a T-path of length at most 3. By repeated applications
of this spawning operation we can grow what roughly is a A-regular tree. If we
show that by growing the tree we always include only additional vertices then the
tree rapidly covers G and we can find a T-path of length O(log? n). Before the
proof of the Spawning Lemma we need a few more definitions and a lemma.
Definition 5 Let P = Vo,Vi,. . . ,Vp = Vo o+ P1 . P2 be a stem, where P2 =
Vj,Vj+1,... , VP. A P-detour is any simple path P' such that: (i) P1 has end--Hpoints
E P and at v5+k? fvi+i,...,vp?it, wherek>2 and(ii) P'flP=tVj,Vj+kl
If P' is a P-detour then Vo, .., Vj 0 P1 is a T-path (see Figure 3.2). In the sequel,
when there is no danger of confusion, we shall refer to a P-detour simply as a
detour.
v,
J
40
Q
Pebbled vertex
Figure 3.2: A detour yields a T-path
Let P = vo,..., Vp, and let P' be a path originating from some vertex v ?
If P n P' = ?v? we say that P and P' are divergent.
Definition 6 Let P = vo . P1 . P2 be a stem. An C-branch is any bichromatic path
originating from v E P which is simple, divergent from P and is of length .
The next lemma is used in the proof of the Spawning Lemma.
41
Lemma 3 LeIP = v0.P1.P2 be astem withP2 = Vi,...,Vp, andX(P?) = fe,PJ.
Let Q be a bichromatic simple path originating from x ? ?Vi+1, . . ,vp--HiJ. Then,
either Q is a Q -branch or there exists a T-path of length at most I PI + I Q I
PROOF. If there is a sanctuary in P u Q then we are done. Suppose not and
let, without loss of generality, x(Q) = f?,?J where ? ? p, and x(x) = o. Let x
and x+ be the two neighbors of x lying on P, with x being closer to vo than x+
Suppose that Q is not a Q I-branch and let w ? P A Q be the first point of P that
is met when walking along Q coming from x. If w ? ....... , vp?i1 then Q is a
P-detour and we are done. Otherwise, W = Vp and we distinguish two cases (refer
to Figure 3.3). Let A = vo,. . . ,x, and B = x,. .. ,Vp = W. Notice that B is an
(a, P)-path and recall Q is an (a, ?)-path. Then, it must be that x(w) = a.
o+ If N(w) A A = ? then either A. B or A. Q is a T-path of the desired length.
To see this, suppose that w has another neighbor u colored f3; then A. Q is
a T-path because u ? A u Q. Similarly, if w has another neighbor colored ?
then A.B is a T-path. If w has neither then both A.B and A.Q are T-paths
becauseIN(w)--H(BuQ)I=A--H2,andIx(N(w)--H(BuQ))I?A--H3.
. If N(w) A A # ? then a T-path of the desired length can be found as follows.
Again, we distinguish two cases. Let z E N(w) A A; if z = x then P' =
vo,..., x is a T-path, because x(x) = )c(w) = a. Otherwise, the path
= vo...., z, w.QR is a T-path, where QR is the path Q taken "backwards"
from w to x. In other words, there is a way of walking the pebble from vo to
x without touching x and x+. (Recall that x(x?) = x'(x+) =
E]
We can now state and prove the Spawning Lemma. From now on, let a, ? and
? be three distinct colors.
42
Figure 3.3: Situation when Vp ? Q fl P
Lemma 4 ?pawrnng lernrn4 Let 0 be a partially A-colored nice graph with one
pebbled vertex vo, and let 0 be A--Hregular within radius 3?, for any  > 5. Let S
v0.P1.P2 be astem of length at most? such thatP2 Vt,...,Vp, x(P2)
and x(vi) = a. Then, for any ? different from a and p, either there is an ?-branch
Pj+1 originating from Vi+1 such that x(Pi+i) = f?, ?J, or there is an C-branch Pj+2
originating from Vi$2 such that X(Pi+2) = fa? ?J, or there is a T-path of length at
most 3?.
PROOF. Let Pj+2 be the path obtained by following an (a, ?)-path Q2 starting
43
Vj+2
pebbled veflex
Figure 3.4: If Pj+2 is not simple we find a T-path
from Vit2 for  edges or till Q2 ends, whichever occurs earlier. We first show that
either Pj+2 is simple and divergent from the stem 5, or there is a T-path of length
at most 2?.
First, by Lemma 3 Pj+2 and P must be divergent, or there is a T-path of length
at most 2?.
Second, suppose that Pj+2 is not simple; then P' ..... . , Vi+2 ? Pj+2 contains
a T-path (see Figure 3.4).
If the length of Pj+2 is ?, then we are done. Otherwise, we show that either
44
there is an edge between Viti and the last vertex of Pj+2 or ..... . , Vi+2 ? ?i+2 is
a T-path. Let z and z be the next-to?last vertex and the last vertex of Pj+2,
respectively. First, notice that N(z) fl Pj+2 fz?, zJ because Pj+2 is simple.
Then, for vo,..., Vi+2 ? Pj+2 not to be a T-path there must be an edge between z
and vo,..., Vj+j. But if N(z) n ivo,. , vjj $ ? then there is a detour to Vi+2. The
only possibility then is that (vi+i, z) is an edge.
Let Pj+1 be the path obtained by following an (?, ?)-path Qi starting with the
edge (vi+i, z) and continuing for  edges or till Qi ends, whichever occurs earlier
Pj+1 has the same properties of Pit2 Namely, it is simple and divergent from the
stem S. By the same analysis it also follows that either ..... . , Vj+j 0 Pj+j is a
T-path or there is an edge between Vj and the last vertex of Pj+1. But now this
is also ruled out, or else we could reach Vi+2 with a detour starting from Vj that
uses Pj+1 and Pj+2 "backwards", resulting in a T-path of length at most 3 (see
Figure 3.5).			El
The above lemma is independent of the particular ? chosen as long as
fa, Pi, so that we can spawn off a total of A --H 2 new bichromatic paths, some
from Vi+i and some from Vi+2.
Corollary 2 Let U be a partially colored nice graph with one pebbled vertex, and
let 0 be A--Hregular within radius 3, for any C> 5. Then, given a stem S of length
at most , we can spawn off A --H 2 -branches from it, or else there is a T-path of
length at most 3.
Lemma 4 shows how to inductively generate new stems from old ones. The
next lemma shows how to find an initial stem; it is the basis of an induction proof
showing the existence of a short T-path.
Lemma 5 Let 0 be a partially A--Hcolored nice graph with one pebbled vertex Vo,
and let 0 be A--Hregular within radius 3, for any  > 5. Then 0 either has a stem
45
of length at least fottr or a T-path of length at most four.
vi
v
i+1?
v
i+2
P
i+2
pebbled vertex
Figure 3.5: If Pj+1 and i??t2 fail we find a T?path
PROOF. Since G is not a clique, vo has two neighbors x and y such that (x, y) ?
let x(x) = o? and x(u) = P Starting from vo we perform a walk along an (?, ?)-
path P according to the following procedure: let Vj be the current pebbled vertex;
if there is a free color at Vj then remove the pebble, otherwise make a step to a
vertex Viti not previously visited, such that x(vi+i) E fe, Pt. If no T-path is
found this procedure must perform at least four steps, because no sanctuary can
46
be found within 4 steps and the shortest (OL, P)-path between x and y must have
at least three edges.			0
By combining Lemmas 4 and 5 we can obtain Brooks' Theorem as a corollary.
Corollary 3 Every nice graph G can be A-colored.
PROOF. The proof is by induction on the number of vertices. The basis is trivial.
The induction step is to assume, for some vertex v, that G --H v is partially A--H
colored and that v is pebbled. If G is not A--Hregular then there exists a sanctuary
at distance at most n --H 1 from v and hence, by Lemma 1, we can extend the
coloring to v. Suppose then that G is A--Hregular. First we invoke Lemma 5 to get
an initial stem, then we invoke Lemma 4 by setting  n. Since branches of such
length cannot exist, we must find a T-path of length at most 3? and can remove
the pebble.			0
We now prove that if C? is a partially A-colored graph with one pebbled vertex
vo, then the pebble can be removed by a walk of length at most o(#). Let
3?. If there is a sanctuary at a distance of at most 3? from vo, then we are
done by Lemma 1; otherwise C is locally A-regular within radius 3? and we show
that a T-path of at most O(#) length must exist. Lemma 5 ensures that we can
find a first stem P of length four. Given P, with one application of Lemma 4,
we can can spawn off an tbranch P'. This gives a new stem S of length at most
I I I P = +4. Then, we subdivide P' into contiguous blocks of three edges each
(adjacent blocks share a vertex), and apply Lemma 4 in each block, thus generating
a sequence of bichromatic paths Qi,Q2,... , ?#n' each of length ?. Notice that if
any two distinct Qi and Qj intersect, then there is an S-detour of length at most
2, and hence a T-path of length at most 3 t 4 (see Figure 3.6). Any Qi which is
not divergent from the stem 5 yields a T-path of length at most 2 + 4.
47
Pebbled vertex
Figure 3.6: Qi and Qj cannot intersect
On the other hand, if all Qi's are divergent from 5, then there must be a pair of
-branches Qj and Qk that intersect each other, since we have spawned off at least
?/3 tbranches.
The basic idea of this proof is to generate a tree of diameter O(?) starting
from an initial stem and to spawn off -branches by repeatedly applying Lemma 4.
When a new tbranch Q? is spawned off, we either get a T-path if Q? intersects
the existing tree (this happens if Q? intersects other ?branches or is not divergent
from the stem) or we generate  brand new vertices. Clearly, after O(?) spawning
48
operations, no new vertices can be included, and the -branch must intersect the
existing tree.
A more intricate use of the same technique shows that we can generate a tree
of depth O(log? n) with the same properties, and hence show the existence of an
O(log? n) length T-path.
Theorem 4 Let G be a partially A--Hcolored nice graph with one pebbled vertex,
and let G be A--Hregular within radius 3?, where  7l0g???4 n + 11. Then, & has
a T-path of length at most 3?.
PROOF. We show how to generate a tree of depth O(log? n) which covers all the
vertices of &, unless a T-path of length O(log? n) is found. We want to generate a
sequence of trees tTk : k 0,1, 2,... ? all rooted at vo; Tk+1 is generated from Tk
by simultaneous applications of Corollary 2. The idea is that if P Vo,Vl,... , Vp
is a stem and if P' is a bichromatic path spawning off from Vj in P, then P11
VO,..., Vj? P' is a new stem. The new stem can be used to generate more stems,
and so on.
A first stem P of length 4 can be generated by Lemma 5. From P we generate
a bichromatic path P' = Wo,Wl... %V7 of length 7 where wo C P, and subdivide
it into two blocks B1 and B2, where B1 = w1,w2,w3,w4 and B2 = w4,w5,w6,w7.
Call B1 and B2 adjacent. Note that P' = wo, Wi ? B1 ? B2. The stem P with the
7-branch P' attached to it, forms the first tree T0. Each block Bi has two internal
points from which new paths can be spawned off. These points are W2 and W3
for B1 and W5 and W6 for B2. From Corollary 2, we can use each Bi to generate
A --H 2 new bichromatic paths of length 7. Each of these is subdivided again into
two blocks and each block is used to spawn off A --H2 new paths of length 7, and so
on. In this way, we generate a sequence of trees Tk rooted at vo. Tk+1 is obtained
from Tk by simultaneously spawning off the new length 7 paths from all the unused
blocks B in Tk. Assuming that the depth of Tk is at most ? = 7l0g???4 n + 11,
we will show that the depth of Tkt1 is at most C. We want to argue that Tk+1 is
49
a tree, that is, that the new paths do not intersect each other and do not intersect
the branches of the old tree Tk, or else we can find a T-path of length at most 3.
Let B be an unused block of Tk, x and y be its internal points with x(x)
a ? = x(u), and P be the stem B belongs to. Some of the new A --H 2 paths
might originate from x and some from y; an internal point which is the origin of a
new path is called a turning point. Let P' be one of the new paths, and let t be its
turning point; by Lemma 4, P' and the stem P are divergent, or there is a T-path
of length at most 3. Also, P' cannot intersect other branches of Tk or there is a
detour to t and hence a T-path of length at most  + 7 (see Figure 3.7).
Now, consider an (a, ?)-path Pa,? and an (a, ?)-path Pa,? originating from,
say, vertex x. If they meet, then they must meet at some vertex colored a, which
cannot be the last vertex of P?,? or Pa,?, since both of these last vertices are at
a distance of 7, an odd number, from x along their respective paths, and hence
will be colored ? and ? respectively. Hence, if P?,? and Pa,?j meet,they do so
at an internal vertex of each of these paths, and hence we find a T-path of length
at most + 7 (see Figure 3.8). An (a,?)?path originating at x and a (P,?)-path
originating at y cannot intersect because fa, ?? n fp, ?y ?. New paths from
different blocks cannot intersect, or else there is a path between two turning points
t1 and t2. In turn, this gives a detour between the least common ancestor of t1
and t2 (which is a vertex of Tk) and either t1 or t2. So, Tk+1 is indeed a tree. If we
collapse each pair of adjacent used blocks in Tk+1 into one vertex, we obtain a tree
where every collapsed vertex has degree at least 2A --H 4 but for the unused blocks
and the initial stem, and whose depth is reduced by at most a factor of 7; hence,
the depth of Tk+1 is at most . The process we have described has the property
that when Tk+1 is generated from Tk, either a T-path is found or the branches
that we spawn off only include new vertices. Clearly, after at most  iterations
(i.e., when we generate Te+i from T?) the new branches must intersect the old tree
because the whole graph is covered.
50
pebbled vertex
Figure 3.7: New bichromatic paths are divergent from the old tree
Note that for A > 3 7l0g???4n +11 O(log?n). Theorem 4 and Lemma 1
ensure that a short augmenting path can always be found. There is either a
sanctuary at a distance of at most 3 21 l0g???4 n + 33 from the pebble, or a T-
path of at most that length. It can be verified that both Lemma 1 and Theorem 4
can be implemented in O(log? n) time in the distributed model of computation,
and with linearly many processors in the PR?AM model.
51
pebbledvertex
Figure 3.8: Bicliromatic paths cannot meet
Finally, an ?(log? n) radius search is necessary in general, to remove a pebble.
Consider a rooted tree T in which every non--Hleaf node has degree A, and a partial
A--Hcoloring of T such that there is a pebble at the root, and for any non--Hleaf node
v, the colors of v's children are all distinct. The color of at least one leaf of T
must be changed to give the root a legal color, since at least one child of v must
be recolored, to recolor any non--Hleaf node v.
52
3.4 Algorithms for A-coloring
In this section, we show how the small radius search can be applied to the design
of efficient distributed and parallel algorithms for computing A-colorings. The
algorithms are based on a reduction from A--Hcoloring to (A + 1)--Hcoloring which can
be implemented distributively by an O(log3 n/ log A) expected time randomized
algorithm or by an O(A log3 n/ log A) time deterministic algorithm. These yield,
respectively, randomized and deterministic distributed algorithms for A--Hcoloring
with the same complexity bounds.
3.4.1 The Randomized Reduction
We now describe a randomized distributed algorithm for A--Hcoloring that runs in
O(log3 n/ log A) expected time. The idea is to first compute a (A + 1)--Hcoloring,
which can be thought of as a partial A--Hcoloring, and then to remove a color class.
An outline of the algorithm is as follows. Let G (V, E) be the network. First,
compute a (A + 1)--Hcoloring with colors 1, 2,... , A, i and pebble all vertices with
color I. Then, compute a (k, k log n)-ruling forest ? with respect to U and the set
P of pebbled vertices, where k clog n/ log A is more than twice the search radius
required by Theorem 4; this can be achieved by an appropriate choice of c. Recall
that the root of each tree in the forest is pebbled, and that each pebbled vertex
belongs to a unique tree in the forest. If we are able to remove all non-root pebbles,
then we can apply Theorem 4 in parallel on the roots, and by our choice of c, each
root will either find its own T-path or its own sanctuary, without interfering with
the other roots, and will remove the pebble.
The problem, then, is to remove all non-root pebbles. This can be achieved by
making use of a randomized process described below, which uses a slight modifi-
cation of an idea of Luby [Lub88]. The idea behind the reduction is to make all
pebbles walk to the root along the path specified by the tree; the pebbles are either
removed along the way if a spare color is found, or are eventually "absorbed" by
53
pebble motion
Figure 3.9: Steps must be scheduled
the root, which is itself a pebble. Each walk, however, is a recoloring operation
and we must ensure that in doing several of them in parallel, we always have le-
gal partial colorings of the graph. A symmetry-breaking problem arises when we
have adjacent pebbles; moving pebbles in parallel might result in an inconsistent
coloring (see Figure 3.9)
We now describe the randomized process which allows us to remove all non-root
pebbles. Each vertex in G has a list Av of available colors: Av ?1,..., A) --H
x(N(v)), for all v. We denote the current color of v by x(v) and the new color
after one step by x'(v). In parallel, each pebbled vertex v does the following:
Randomized Reduction.
If no neighbor of v is pebbled, then the pebble is removed if Av is
nonempty, and the pebble makes a step to v's parent if Av is empty. If
v has some pebbled neighbor, we say that v is asleep. With probability
1/2, v remains asleep and does nothing, i.e., V(v) = x(v) =1. With
probability 1/2 it wakes up, in which case v chooses a tentative color a
uniformly at random from A?; if a is also chosen by some neighbor of
v then x'(v) = x(v) =1, else x?'(v) = a and the pebble is removed.
First, we show that by executing the randomized reduction we never produce an
54
inconsistent partial coloring; second, we prove that the expected running time to
remove all non-root pebbles is polylogarithmic and in fact, that it is polylogarithmic
with high probability.
Lemma 6 Let G be a partially colored nice graph, and let P denote the set of
pebbled vertices. Let P' be the set ofpebbled vertices after one step of the randomized
reduction. Then, & --H P' is A--Hcolored legally.
PROOF. The claim follows from inspecting the randomized reduction. If a
pebble has no neighboring pebbles then it is removed if there is a spare color, or
it makes a step, if there is no spare color. In both cases the new partial coloring
is legal. For the case when there are neighboring pebbles let v denote the pebbled
vertex. A tentative color is assigned as the new color to v only if the same tentative
color was not chosen by any neighboring pebble. The correctness then follows from
the observation that non-pebbled neighbors can be pebbled but cannot change their
color.			E]
The next lemma shows that, with high probability, all non-root pebbles are
removed within O(log3 n/ log A) time: the failure probability is inverse polynomial.
With essentially the same proof it is possible to show that the running time is
O(f(n) log2 n/ log A) with probability at least 1 --H
Lemma 7 Let G be a partially colored nice graph with n vertices, P be the set of
pebbled vertices, andY be a (k, klogn)-rulingforest with respect to G and P, where
k is any positive integer. Then, if we run the randomized reduction, every non-root
pebble is removed within O(k log2 n) time with probability at least 1 --H 1/q(n), for
any polynomial q(').
PROOF. We want to set up the necessary machinery to invoke a theorem by
Karp that will give us the claim [Kar9li. First, we argue intuitively that if v E P
has some pebbled neighbor, then it is removed with probability at least 1/4; a
55
formal proof can be easily derived and can be found in [Lub88]. Let B = N(v) fl P
be the set of pebbled neighbors of v, and let W c B be the set of pebbled neighbors
of v in B that wake up. Since every pebbled vertex wakes up with probability 1/2,
the expected size of W is E[j WI] = I B 1/2. Every u ? W chooses a tentative color
uniformly at random from its list A?. In the worst possible scenario, all vertices of
W will choose a color which is also in Av. But since lAvi > IB = 2E[IwI], the
probability that v chooses a tentative color not chosen by any u E W is at least
1/2. The claim follows from the fact that v wakes up with probability 1/2.
We can summarize the algorithm by saying that when v is pebbled, the pebble
makes a step to the parent of v if there are no neighboring pebbles and there is
no spare color, it is removed if there are no neighboring pebbles and there is a
spare color, and it is removed with probability at least 1/4 if there are neighboring
pebbles. For the sake of the analysis, we modify the algorithm as follows: if v has
no neighboring pebbles and there is no spare color, the pebble makes a step with
probability p = 1/4, otherwise it is removed with probability p = 1/4. Given ?
pebbles, we want to study the random variable T(?), which denotes the time by
which every pebble has either made a step or been removed. Clearly, an upper
bound for T(.) with the new algorithm is also an upper bound for T(.) with the
old one.
Let h(?) be the random variable denoting the number of pebbles that after one
step are neither removed nor have made a step, then E[h(?] = (1 --H p). Then,
T(1) = 1 and T(?) satisfies the following recurrence
= 1 +
Let b = (1 --H p)?1 --H 4/3 and t(n) = U0g? nj + 1; u(n) is the minimal integer
solution to the recurrence
U(n) = 1 + U(E[h(n)]) = 1 + U((1 --H
Then by Theorem 3 of Karp [Kar91], we see that for any d > 1,
56
Pr(T(n) > u(n) + d) < pd?1
This probability is inverse polynomial when d = O(log n). Also note that this
upper bound on the probability applies to any configuration of the pebbles. Hence,
T(n)klog n is an upper bound on the time by which every pebble is removed,
because a pebble can walk at most k log n steps before being absorbed by the root
pebble; so the total time taken is O(k log2 n), with high probability. E]
We summarize the whole algorithm now.
o+ Compute a (A + 1)--Hcoloring of & with colors 1,2,... , A, I. This can be
done in O(log n) expected time distributively using a randomized algorithm
of Luby [Lub88].
o+ Pebble all the vertices with color I, and let P be the set of pebbled vertices.
Compute a (k, k log n)-ruling forest ? with respect to G and P, where k =
clog? n for a suitable constant c. This takes O(k log n) = O(log2 n/ log A)
time using an algorithm of Awerbuch, Goldberg, Luby & Plotkin [AGLP89J.
o+ Run the randomized reduction. At the end all non-root pebbles are removed.
This takes O(log3 n/ log A) time with high probability.
o+ Apply the small radius search on the roots in parallel. Each pebble will
either find its T-path or its sanctuary, and will be removed. This takes
O(log n/ log A) time.
The overall complexity is dominated by step 3. The correctness of the algo-
rithm follows from Lemma 6, which proves the correctness of step 3, and by the
correctness of the small radius search, which ensures that step 4 is correct. This
yields the following theorem.
57
Theorem 5 If G is a nice graph, then it can be A--Hcolored in the distribnted model
in O(log3 n/ log A) expected time.
3.4.2 Deterministic Distributed A--Hcoloring
In this section, we give a deterministic distributed A-coloring algorithm of complex-
ity O(A log3 n/ log A), so that when A is bounded by a polylogarithmic function
of n, the complexity is polylogarithmic.
In the previous algorithm, randomness was used in two places; to compute a
(A + 1)--Hcoloring and for the randomized reduction. The basic idea is that we can
substitute the randomized procedure of Luby by an O(A log n) time distributed
algorithm for (A + 1)-coloring, due to Goldberg, Plotkin & Shannon [GPS89].
To remove all non-root pebbles we use the fact that the graph induced by P, the
set of pebbled vertices at any given time, is itself (A + 1)--Hcolorable in O(A log n)
time. The coloring is used to schedule the motion of the pebbles, using the fact
that pebbles in a color class can safely take decisions simultaneously. (Recall that
a color class is an independent set.) We first give the algorithm to remove all non-
root pebbles--H the deterministic reduction, and then prove its correctness. As in the
previous section, prior to invoking the reduction we compute a (k, k log n)--Hruling
forest, where k = clog n/ log A for an appropriate value of c.
Deterministic Reduction.
Repeat k log n times (the maximum tree depth):
1. Let G[P] be the subgraph induced by the set of pebbles P. Compute a
(A + 1)-coloring of G[P] and let C1, .., CA+1 be the color classes.
2. Sequentially, for i = 1,2,... , A + 1 do the following: in parallel, each
non-root pebbled vertex v E Ci checks if x(N(v)) ? A. If so, a spare
color is chosen and the pebble is removed.
58
3. Let Q be the set of remaining non-root pebbles; in parallel, for each
pebbled vertex v C Q, if x(N(v)) I < A then color v with a spare color
and remove the pebble, else the pebble makes a step to v's parent.
In order to prove the correctness of this algorithm it is enough to show that each
of the k log n many iterations transforms a legal partial coloring into a new legal
partial coloring. Notice that the coloring of step (1) is used only to schedule the
operations of the pebbles and should not be confused with the partial coloring of C.
Step (2) gives a legal partial coloring because each color class Ci is an independent
set; if v E Ci none of its neighbors will change its color, and v can safely color itself
if a spare color is available. To prove the correctness of step (3) we first prove that
Q is an independent set. Suppose not, and let u and v be two adjacent pebbles in
Q. Without loss of generality suppose that u Ei Ci and v Ei Cj+k, where k > 0.
But since u had an uncolored neighbor when it was processed, namely v, it could
have colored itself then. Hence Q is an independent set. A pebbled vertex v in
step (3) either performs a step or colors itself; since Q is an independent set, for
all u Ei N(v), either u does not change its color or it becomes pebbled (i.e. some
pebble made a step to u). In either case the color assigned to v is legal.
We now argue that all non-root pebbles are removed by the end of the algorithm.
Consider any pebbled vertex v; for each of the k log n = O(log2 n/ log A) iterations,
either the pebble is removed or it makes a step towards the root, which decreases
the distance of the pebble from the root by one. Each iteration takes O(A log n)
steps (which is the complexity of Step 1), which gives a total of O(A log3 n/ log A)
time to remove all non-root pebbles. The whole algorithm is summarized as follows.
o+ Deterministically compute a (A + 1)--Hcoloring with colors 1,2,... , A, i in
O(A log n) time by using an algorithm of Goldberg, Plotkin & Shannon
[GPS89]. Pebble all vertices with color I.
o+ Compute a (k, k log n)-ruling forest with respect to C and the set of pebbles,
59
where k = O(log n/ log A). This takes O(k log n) time [AGLP89].
o+ Compute the deterministic reduction to remove all non-root pebbles. This
takes O(Ak log2 n) = O(A log3 n/ log A) time.
o+ Apply the small radius search to all pebbled roots in parallel. Every pebble
will either find its own T-path or its own sanctuary, and will be removed.
This takes O(log? n) time.
The complexity of this algorithm is dominated by that of the deterministic
reduction. Hence,
Theorem 6 If G is a nice graph, it can be A--Hcotored deterministicalty in the
distributed model of computation in O(A log3 n/ log A) time.
3.5 Further
Applications of the Small
Neighborhood Search
In this section, we discuss briefly some other consequences of the small radius
search. First, we show how the randomized algorithm of Section 3.4.1 can be
derandomized in NC with linearly many processors. Second, we discuss a deter-
ministic o(ne(n)) time algorithm for A--Hcoloring in the distributed model, where
e(n) = O(1/?gn).
3.5.1 A linear processor NC algorithm
The randomized algorithm for A-coloring can be implemented and derandomized
in the PRAM model using linearly many processors by making use of the standard
techniques discussed in [Lub88]. To our knowledge, this is the first linear processor
NC algorithm for A--Hcoloring; the existing algorithms seem to require superlinear
processors [HS9O,KN88,Kar89].
The distributed randomized algoritlim of section 3.4.1 has four steps. We now
describe how each of them can be implemented in the PRAM model.
60
Step 1 is the (A + 1)-coloring algorithm of Luby and can be derandomized with
linearly many processors [Lub88]
To implement Step 2, computing a ruling forest, and Step 4, performing the
small radius search, it is sufficient to simulate the message passing distributed
model in NC. This can be easily done by introducing a processor for each edge
and by noticing that the operations performed at each node only require 0(1) time.
Both these steps essentially require performing BF'S searches for O(log2 n/ log A)
and O(log n/ log A) depth respectively.
To implement the randomized reduction we consider the following modification
of Step 3. Let C be a partially A-colored graph with a set of pebbles P. We run
(the derandomized NC version of) Luby's (A + 1)-coloring algorithm on C, which
induces a (A + 1)-coloring of the pebbles with colors 1,... , A, I. Let C1 be the
pebbles that got color I; all pebbles in P --H C1 have a legal color and C1 is an
independent set and hence, all pebbles in C1 can make a step to the root.
Notice that here, unlike the distributed implementation, we first run the col-
oring algorithm and then all pebbles in Cj make a step. This requires a kind of
synchronization and global knowledge that is easily available in NC but not in the
distributed model.
Each run of the (A + 1)-coloring algorithm requires O(log3 n log log n) time
[Lub88] and we can have at most O(log2 n/ log A) runs before all non-root pebbles
in the ruling forest (whose trees have depth O(log2 n/ log A)) are removed. Paths
and even cycles can be easily colored in NC in O(log n) time; hence, we can state
the following theorem.
Theorem 7 A graph G can be A-colored in the PRAM model of computation
with tinearly many processors in O(log5 n log log n/ log A) time if and only if G is
neither a clique nor an odd cycle.
61
3.5.2 A Sublinear Time Distributed Algorithm
Problems like MIS and (A + 1)--Hcoloring can be solved in O(d(n) . c(n)) time dis-
tributively, given a (d(n), c(n))-decomposition of G. The generic algorithm for such
problems, given a network decomposition, will iterate through the color classes,
clusters of color 1 being "processed" first in parallel, clusters of color 2 being pro-
cessed next, and so on. Inside each cluster the following trivial algoritlim can be
used: the maximum ID vertex within the cluster is elected as leader, which then
collects information about all vertices in the cluster, solves the problem by itself,
and then distributes the solution to all vertices in the cluster. The bounds on
the cluster diameter and the number of colors used, yield the bound on the time
complexity of this generic algorithm.
It is known how to compute an (nE(fl), nE(fl))?decomposition distributively in
O(n?(?)) time where ?(n) = O(1/4mogn) [PS92b]. Such a decomposition can be
used to give a deterministic and distributed implementation of Step 1 and Step 3
of our A--Hcoloring algorithm.
Step 1, computing a (A + 1)-coloring, can be implemented with the generic al-
gorithm outlined above: cycle through the color classes, and when processing color
class c, extend the partial (A + 1)-coloring to all clusters of color c. The extension
to the coloring in each cluster of color c can be computed by the leader of the clus-
ter by means of global communication inside the cluster, with the time complexity
being proportional to the diameter of the cluster. Since both the number of colors
and cluster diameter are o(nE(n)), the total cost of this implementation is O(n?(?)).
A naive implementation of the reduction of Step 3 is as follows. Let t(n) =
O(log2 nI log A) be the maximum tree depth of the ruling forest, d(n) =
be an upper bound on the diameter of each cluster of the network decomposition,
and let c(n) = o(nc(n)) be the number of colors used in the network decomposition.
Then, for? = 1,2,... , t(n) and for c = 1,2,... , c(n) do the following: each leader in
clusters of color c schedules the motion of the pebbles inside the cluster until they
62
are either removed or step outside the boundary of the cluster. Inside each cluster
the trivial algorithm outlined above is used. This takes O(t(n)c(n)d(n)) =
time, where e(n) = O(1/?gn).
The main observation is that each time a cluster is activated, each pebble in the
cluster is either removed or makes at least one step. So, it is sufficient to activate
each cluster t(n) times to remove all non-root pebbles.
The correctness of the implementations of Step 1 and Step 3 follows from the
fact that a node in a cluster C cannot be adjacent to a node in a cluster C, whose
color is the same as that of 0. Step 4 of the algorithm can be implemented with
a BFS of depth O(log n/ log A) at most. The following theorem summarizes the
whole discussion.
Theorem 8 A nice graph G is A--Hcolorable in O(n?(?)) time in the distributed
model of computation, where e(n) = O(1/?ogn).
3.5.3 A Sublinear Time Las Vegas Algorithm
The proof of the existence of a T-path of O(#) length can be modified to give a
randomized procedure that with probability at least 1/2 succeeds at removing the
pebble.
As in the proof, we first generate an initial stem of length 4 and then apply the
Spawning Lemma with ? = 6#, to spawn off an tbranch S. S is subdiveded into
2# contiguous blocks, each made of three consecutive edges. Of these blocks, we
arbitrarily choose ? of them to form a collection!3. If we applied the Spawning
Lemma in each block to spawn off an ?branch at least one block would generate
a T-path or find a path to a sanctuary. We mark this "successful" block, remove
it from the collection B, and insert a new "unused" block in the collection. Again
one of them must succeed, since we have ? blocks. We mark this new block
and repeat. At the end of this procedure we have marked #n successful blocks.
llence if we pick one block at random among the initial 2?n-many blocks we pick
63
a successful block with probability at least 1/2.
The failure probability can be made arbitrarily small by repeated runs of the
algorithm.
The running time of this procedure is proportional to the length of the --H
branches that are generated, which is O(#).
3.6 Lower Bounds for some Distributed
Coloring Problems
In this section, we prove an ?(diameter(G)) lower bound for edge coloring bipartite
graphs optimally (i.e. with A colors) in the distributed model of computation, and
then show that the same lower bound applies even when the processors are allowed
randomness. When A = 2, coloring the edges is the same as coloring the vertices.
Linial has proved lower--Hbounds for computing various types of vertex colorings
distributively, using different techniques ?Lin92]
3.6.1 Deterministic Coloring of Paths
We first analyze the simpler case of coloring paths and then we will deal with
general bipartite graphs. In this case, two--Hcoloring the edges is the same as two--H
coloring the vertices. We will describe our lower-bounds in terms of vertex coloring.
Theorem 9 Let t(n) = o(n), and G be a connected graph with A = 2. Then,
there is no distributed protocol that computes a two--Hcoloring of the vertices of G
in O(t(n)) time.
PROOF. We consider the case where G is a path; the proof is similar if G is
an even cycle. The motivation for this result is that two--Hcoloring a path amounts
to computing the parity of a given vertex.
64
Let s(i, 1) be the state of vertex i (i is the ID of the vertex) at time 1. From the
definition of our computation model, it follows that for any path-coloring protocol,
s(i, t) = f(t, ?, ?L, ?R, s(i, 1 --H 1), 3(?L, 1 --H 1), 5(iR, I --H 1)),
where ?L and iR are the ID's of the neighbors of i. Also, s(i, 0) is the same for all
vertices i. The above equation implies that if d(i, j) is the distance between two
vertices i and j, any information starting from i needs d(i, j) steps to reach j. This
observation is the basis for the proof. Let 1 = 1(n) be the worst--Hcase complexity
of a protocol P for tw?coloring paths, and assume that 1(n) = o(n). Consider a
path A : vi, .., V2?, .., Vi?t, .., Vj, ..Vi+t, ..Vn?i, v?; notice that Vj is surrounded by a
neighborhood of radius 1 and it is at least 31 t 1 far away from vi. Consider now
the path where v? is inserted somewhere between v2t and v?--Ht; this changes the
parity of the path, i.e. the coloring of vi and Vj in A must be different from that in
B, when the protocol P is used. But from the state transition function it follows
that
sA(i,k)=sB(i,k) AsA(vi,k)=sB(vi,k) O<k<1
where the subscripts A and B denote the paths A and B. In other words, for any
sublinear 1(n), the states of Vj and vi in A and B will be exactly the same and
hence they will receive the same colors in both cases, contradicting the presumed
correctness of P.			0
3.6.2 Optimal Edge Coloring of Bipartite Graphs
In this subsection, we prove the same lower bound of ?(diame1er(G)) for edge
coloring general bipartite graphs optimally, i.e., with A colors. The idea of the
proof is the same as before; if a protocol is constrained to finish within 1 steps, then
a vertex cannot "tell the difference" between two situations where the topology
65
Figure 3.10: A 5-widget
of the network is the same in a neighborhood of radius t, but not outside this
neighborhood.
Theorem 10 For any A > 2 there is no subdiametric time deterministic protocol
for edge coloring an arbitrary graph of maximum degree A with A colors, in the
distributed modeL
PROOF. The proof is by contradiction. Our graph G will be made by linking
together in a chain-like manner certain subgraphs. Each subgraph is defined as
follows. Consider a complete bipartite graph KA?1,A?1 and let b1, .., bA?i be the
vertices on one side of the bipartition and ci, .., cA?1 the other side. Connect all
of the bj's to a vertex a and all of the c?'s to another vertex d. Finally, connect d
to another vertex e. Such a graph will be called a A-widget. A 5-widget is shown
in figure 3.10.
The widget is such that it forces a color on edge (d, e), as follows. Recall that
since a A-widget is a degree A bipartite graph, it has a A--Hedge coloring. Without
66
Figure 3.11: A chain of widgets
loss of generality, suppose we use colors 1,.., A --H 1 for the edges incident on vertex
a. This implies that if we consider the remaining edges incident on any bi, then
exactly one of them must use color A, which means that each Cj has exactly one
edge (ci, bj) colored A. In turn, this means that none of the edges incident on
vertex d can use color A and this forces to color the color A on edge (d, e).
If we connect A-widgets in a chain-like manner so that the e vertex of one
widget coincides with the a vertex of the next widget, we are going to have a
degree A bipartite graph.
We are now ready to prove our claim. Suppose there is a protocol ? to A--H
edge color bipartite networks with n vertices and with maximum degree A, with
subdiametric worst--Hcase time I = t(n, A). Consider a chain A with at least 31 + 3
A-widgets. Let W? be the ith-widget starting from the left, and let a?, 6??, Cij, di be
its vertices. Without loss of generality, assume that all edges (di, aj+i) are colored
with color A. We now modify the chain A by removing any vertex v from the last
67
widget W3t+3 from the left and inserting v between dt and at+1; we have the two
new edges (dt, v) and (v, atti). Let this be chain B. Clearly, the insertion of v
implies that color A cannot be used on both edges incident on v; this implies that
the coloring of the two subchains at the left and right side of v must be different.
However, if we consider widgets W1 and W2t+2, they will behave exactly the same
as they did in chain A since they are at distance greater than t from v and from
W3t+3, and this will cause a conflict of colors somewhere in the chain.
3.6.3 Randomized Coloring
We now prove that the same ?(diam(C)) lower bound applies when the processors
are allowed to use random bits. At each step of the protocol, each processor can
flip a fair coin independently any number of times; this is equivalent to assuming
that each processor is given all of its random bits at the beginning of the protocol.
There are two types of randomized protocols: Monte Carlo and Las Vegas. The
definition of acceptance for a Monte Carlo protocol is that the protocol should find
a A--Hedge coloring in any degree A bipartite graph with probability at least 1/2.
On the other hand, a Las Vegas protocol always computes the correct answer, but
its running time is a random variable.
Theorern 11 There zs no Monte Carlo distributed protocol that finds a two--Hcoloring
of a path with n vertices in worst--Hcase time o(n).
PROOF. We use the same strategy as for the previous proof. Assume that
there is a protocol ? which violates the assumptions of the theorem; given a path
A where ? is supposed to work, construct a new path B by changing the parity of
two vertices. If we take these two vertices far enough they will behave in the same
way in both chains, and the resulting coloring in B will be invalid.
Let A be a path such that n/4 > 2t (where t = o(n) is the worst--Hcase time
complexity of ?), and let us subdivide it into 4 parts of equal length:
68
A1			A2			A3			A4
A = . . .Vn/4V(n/4)+1.. .V2n/4V(2n/4)+1.. .V3n/4V(3n/4)+1.. .V?.
Let the parts be A1, A2, A3, and A4. Let b be the number of random bits
assigned to each processor. Given the random assignments to the processors
bits to each of the n processors) we form a string of length bn by concatenating the
assignments; we call this a string assignment. Since the protocol works on A with
probability at least 1/2, there must be at least 2bn--H1 string assignments that ?nd
a right coloring; let 5 be this set of string assignments. Consider H1 = A1 u A3
and H2 = A2 u A4. Let Si = tai 0 a3 E tO, 1Y?'2 I ?x, y Ei tO, 1Jbn/4 ?s ? 5 s =
aioxoa3oyj and let ni = 1511. Similarly, let 52 = ta?0a? Ei t0,1lbn/2 I ?x,y E
t?, 11bn/4 ?? E 5 5 = X 0 a? 0 y 0 a?J and n? = 52. Without loss of generality,
suppose ni > n2; since 151 < ISi x 521, nin2 > 2bn--H1 ?? follows that
bn--H1
ni > 2?.
Let us now construct a new path B by removing V?, the last vertex of A4, and
inserting it in the middle of A2. We now claim that the probability that in B the
vertices in H1 compute exactly the same colors for themselves as they compute in
A is at least 1/72. This would give us the claim.
Notice that since V? is at distance greater than t from the vertices in H1, the
vertices in H1 will compute an invalid coloring when given any sequence of random
strings from 51, for any assignment of random strings to H2. This happens with
probability
which is the desired probability.
bn 2?n--H1
ni?f > 24$ = 1
$2'
c
Theorem 12 There is no Las Vegas distributed protocol that finds a two--Hcoloring
of a path with n vertices in expected time o(n).
69
PROOF. Assume that there is such a protocol ? with expected running time
at most T(n) = o(n). Given a path, run on it for 2 T(n) steps; by Markov's
inequality, a valid two--Hcoloring would have been found with probability at least
1/2 in time 2 T(n) = o(n), violating Theorem 11.			0
Corollary 4 There is neither a Monte Carlo distributed protocol that finds a A--H
edge coloring of an arbitrary degree A bipartite graph in subdiametric time, nor is
there a Las Vegas distributed protocol that finds a A--Hedge coloring of an arbitrary
degree A bipartite graph in subdiametric expected time.
PROOF. The same arguments as in Theorems 11 and 12 go through if we use
the chain of widgets of Theorem 10 instead of a path.			0
Chapter 4
Fast Randomized Edge Coloring
Via an Extension of the
Chernoff-Hoeffding Bounds
4.1 Introduction
The edge coloring problem can be used to model certain types of jobshop schedul-
ing, packet routing, and resource allocation problems in a distributed setting. For
example, the problem of scheduling 1/0 operations in a certain parallel architecture
can be modeled as follows [JSWB92]. We are given a set of processes ? and a set of
resources I? such that each process p e ? needs a subset f(p) C ? of the resources
where: (i) each process p needs every resource in f(p) for a unit of time each, and
(ii) ri can use the resources in f(p) in any order. From this, we can construct a
bipartite graph Up,? = (?,?,Pp,?) where Ep? = ?(p,r)j p E ? Ar E f(p)J.
An edge coloring of Up,? with c colors yields a schednle for the processes to use the
resources within c time units. Optimal colorings correspond to optimal schedules.
Edge coloring can also be used in distributed models in situations where broad-
casts are infeasible or undesirable: an edge coloring of the network results in a
70
71
schedule for each processor to communicate with at most one neighbor at every
step; at time step i' processors communicate via the edges colored i only. Using a
"small" number of colors reduces the wastage of time in this schedule.
Note that A colors are necessary to edge color a graph with maximum degree
A. Vizing showed that it is always possible to edge color a graph with A + 1 colors
and gave a polynomial time algorithm to compute such a coloring [BolT9i. Efforts
to parallelize Vizing's theorem have failed so far. The best known algorithm is an
R?NC algorithm of Karloff & Shmoys that uses A + o(A) colors. The algorithm
can be derandomized in NC by standard techniques by Berger & Itompd, and
Motwani, Naor & Naor [BR91,MNN89]. In the distributed model, the best known
edge coloring algorithm is to apply a vertex coloring algorithm to the line graph
L(G) of G. We saw in Chapter 3 that there are fast (polylogarithmic) randomized
(A + 1)--H and A--Hvertex coloring algorithms which translate to (2A --H 1)- and (2A --H
2)--H edge coloring algorithms respectively [Lub88,PS92b]. There are no known
polylogarithmic deterministic algorithms in the distributed setting for (2A --H 1)--H
edge coloring [AGLP89,PS92b]. Moreover, as we saw in Chapter 3, distributed
A--Hedge coloring for bipartite graphs requires ?(diameIer(C)) time, even with
randomization[PS92b].
In this chapter, we present fast and simple randomized algorithms to edge color
G with at most (1.6 + ?)A + 0.4 log2+? n colors with high probability for any fixed
t, ?> 0, where A is the maximum degree of the vertices of G. At the heart of our
analysis is an extension of the Chernoff--HHoeffding bounds, which are key tools in
bounding the tail probabilities of certain random variables [Che52,lloe63,Rag90].
The edge coloring algorithm is based on a very simple randomized algorithm
to color bipartite graphs. The algorithm can be explained in a few lines. Given
a bipartite graph G (A, B, E) with maximum degree A, each vertex in B picks
distinct colors from (1,2,... , Al at random for its edges without replacement, i.e.,
edges incident to the same vertex get different colors. Then, each vertex v E A
72
checks, for each color ?, if more than one of its incident edges has color a and
if so, chooses one of them at random as the winner, and all the other edges of
color a which are incident to v are decolored. The key claim is that for every
vertex, the number of decolored edges incident to it is at most A(1 t e)/e with
high probability (e is the Euler number). Assuming that this holds, we can repeat
the above iteration with a set of A(1 t e)/e fresh colors, and so on. In spite
of its simplicity the algorithm requires an interesting probabilistic analysis which
requires requires an extension of the Chernoff-lloeffding bounds to a certain case of
dependence among the random variables that we call A--Hcorrelation. This extension
has several other applications besides the ones given in this thesis [Sri93].
4.2 Notation
In this section we introduce some notation we will be using in the paper.
Given an undirected graph G (V, E) we denote by A its maximum degree,
i.e., the maximum number of edges incident with any node; by d? we denote the
degree of vertex u, and by 6(u) we denote the set of edges incident with vertex u.
Given a positive integer n, [n] denotes the set ?1,2,... , nJ. Given a matrix
M with c columns and r rows where c ? r, the generic entry of M is denoted
as Mik where i denotes the column number and k the row number. (This is the
reverse of the usual definition but it is more natural when we define the permanent
of M.) The permanent of a (possibly non--Hsquare) matrix M is defined as the
natural extension of the permanent of square matrices. Let ? = : [c] H
7r is one--Hto--Honej. Then,
perm(M) = ? M1,?(1) M2,?(2) ... Mc,ir(c).
7rEP
An event A is said to happen with high probability (w.h.p.) if Pr(A) > 1--H1/f(n)
for some superpolynomial function f(n) (i.e., nc = o(f(n)) for all c > 0, and the
failure probability of A is bounded by 1/f(n)).
73
4.3 The Chernoff-Hoeffding Bounds Extension
In this section we introduce our extension of the Chernoff-lloeffding bounds which
are important tools used in estimating the tail probabilities of random variables.
Given n independent random variables Xi, ..... . , Xn, these bounds are used in
deriving an upper bound on the upper tail probability Pr(X > (1 + e)?), where
X ?2?=1 Xj, [t = E[X], and e > 0. We extend these bounds to a certain case
of dependency among the Xj's, which we call A--Hcorrelation. Since the general
definition of A--Hcorrelation is involved, we first introduce this notion in the case
where the Xi`s are 0-1 random variables and prove our extension of the Chernoff-
lloeffding bounds. This special case is simpler to handle and more intuitive but
contains all the essential difficulties of the general case. After the special case is
dealt with, we will take care of the more general case.
Before giving our extension of the Chernoff-lloeffding bounds let us review
Chernoff's classical approach to upper bound the upper tail probability of a random
variable X, which is the sum of independentbinary random variables x1, x2,... ,
[Che52]. Chernoff's basic idea is to use Markov's inequality on the random variable
etX for an arbitrary, but fixed, t > 0, and minimize with respect to t. That is, to
use the fact that
Pr(X > (1 + e)ji) = pr(etX > et(i+c)P)
E [etX]
and minimize the last ratio for t > 0 [Che52]. This is achieved by finding a good
upper bound for the numerator E[etX] by using the fact that X is the sum of
independent random variables. It is standard (see, e.g., R?aghavan[Rag90]) to use
this for showing that in this case,
E[e?X] ? F			=			e?
m?>wet(l+()? --H			?			(1 + e)i+e
(4.1)
74
Hoeffding [Hoe63] considers a more general case where X is the sum of n inde-
pendent and bounded random variables Xi ? [aj, bi], and uses the above approach
to show that if E[X] = j, then for e> 0,
mi E[etX]			2?2e2
?>0n et(l+()I ? G(#1 e, a? b) = exp --H			--H ai)2			(4.2)
- ZiE[n](bi
Equations 4.1 and 4.2 will be used in our proofs. Henceforth, we refer to these
bounds of Chernoff and Hoeffding as the CH-bounds. If e is a fixed positive quantity
no greater than 1 (which will be true in all our applications), then F(u, e) ?
Hence, if ? = ?(log1+? n), then F(u, e) is the inverse of a superpolynomial function
of n, for any fixed 6 > 0 (similar considerations apply to G(jt, e, a, b?). This fact
makes the Cli-bounds a powerful tool for deriving strong performance guarantees
for randomized algorithms and will be used repeatedly in this paper.
4.3.1 0-1 Random Variables
We now present our extension of the CH-bounds to the case of 0-1 A--Hcorrelated
random variables. We begin by defining A--Hcorrelation1 for 0-1 random variables.
Definition 7 Let X1, ..... . , Xn be binary random variables. The Xj `s are A--H
correlated if, for all nonempty I C
Pr AXi=1 ?ARPr(Xj=1).
i?I
Before proving the extension of the CH-bounds, let us develop some intuition
by giving an example of A--Hcorrelation. Suppose we have A balls that are thrown
uniformly and independently at random into A bins. Let Bj be an indicator random
variable that is 1 if bin i is empty and 0 otherwise. These variables are 1--Hcorrelated.
To see this, consider a subset J C [A] of bins and suppose that all these bins are
empty. Then, given this information, the prnbability that some other bin, say z,
remains empty decreases. That is, for all J c [A],
`This was defined as "self?weakening with parameter A" in [PS92a].
75
Pr(Bj =11 A Bj =1) =
j?J
1?A)H
Pr(Bi = 1)
by straightforward induction, this implies that for all nonempty I C [A],
iEJ			=1 ??tH?iP?(Bi=l)
The Cll-bounds say that if x is the sum of n independent 0-1 random variables
then Pr(X > (1 + e)jt) ? F(1i, e). The next theorem shows that if the Xj's are
A--Hcorrelated then Pr(X > (it e)ji) ? A F(jt, e). In the statement of the following
theorem, jt is an upper bound on E[X]; this is sufficient because we are upper
bounding the upper tail probability of X.
Theorem 13 Let X be the sum of A-correlated 0-1 random variables, and let
E[X] ? ?. Then,
Pr(X > (1 + e)?) < A F(i, e).
PROOF. As in the classical proof, we start by introducing a positive parameter
t and by applying Markov's inequality to the variable etX
Pr(A' > (1 t e)j?) = pr(etX >
< E[etX]
To find a good upper bound on the numerator (we cannot use independence
of the Xi `s) we wright down its Taylor expansion, and use the fact that etX is
bounded by e?? to apply linearity of expectation to the infinite series:
76
E[etX] = E
oo tkxk
oc tkE[xk]
Focus on the generic term E[xk] of this senes. From the fact that X Z Xj and
that the Xj's are 0-1 random variables it follows that E[xk] is a linear combination
of the form
F[xk]			z ?? P? A Xj = 1
for I # ?, and where all o?i's are non-negative (this can be verified by unfolding Xk
and by observing that Xf = Xj for all i and all c ? 0). In order to upper bound
the term F[xk] we introduce n twin 0-1 random variables Xj which have the same
distribution as the Xj's except that they are independent. From the hypothesis of
A--Hcorrelation and from the fact the twin variables are independent it follows that
ilence,
Pr )1Xi=1
A HPr(Xi=1)
i?I
= A HPr(%=1)
i?I
= APr Aki = 1
i?I
p[xk] ? A k[kk].
By upper bounding each term of the series expansion of E[etX] we get
? A E[etkl.
Notice now that k is the sum of n independent random variables and F[kl =
E[Xl < ?. Hence, by equation 4.1
77
Pr(X> (1 + e)?) <
E[etX]
A E[etk]
A F(a,e).
To see an application of this theorem, let us go back to the balls-and-bins
experiment, and suppose we are interested in estimating the number of empty bins
after the balls are thrown; this number is given by the random variable B ?j Bj.
By linearity of expectation and by the fact that the balls are thrown at random
independently of each other,
E[B]			? E[Bjl
z
iE[Aj
A
e
We already saw that the Bj's are 1--Hcorrelated. llence, by Theorem 13,
Pr(B > (1 + e)A/e) < F(A/e, e).
Remark. Jain has proved the following lemma [Ram90]: Let ai,a2,... , a? be n
random trials (not necessarily independent) suHi that the probability that trial a?
`succeeds' is bounded above by p regardless of the outcomes of the other trials. Then
if X is the random vartable that represents the number of `successes tn these n
trials, and Y is a binomial variable will' parameters (n,p), then: Pr[X >
Pr[Y > k], 0 ? k ?
The assumptions of Jain's lemma are strictly stronger than those of i--Hcorrelation.
For instance, in the balls and bins example,
78
Pr(B?=1I			A Bl=o)??A;il,
?E[A--H1]
which, for A > 3, is greater than Pr(B? 1) ( 1/e).
4.3.2 The General Case
In this section we introduce the more general definition of A-correlation and prove
the more general extension of the Cli-bounds.
The proof of Theorem 13 is based on the observation that if we can upper
bound each term E[xk] of the Taylor expansion of E[etX] by A E[xk] where X is
the sum of independent random variables, and if E[etk] ? B, then F[etX] ? A B.
This motivates the following definition.
Definition 8 Let X1,X2,... , Xn be bounded random variables such that Xj E
[a?, bi] and let X ZiE[n] Xj. The Xj's are A-correlated if there exists a collection
of independent twin random variables Xi E [a?, bi] such that,
o+ E[X] ? E[k], where k Zi?[n1 Xj, and
o+ for all nonempty I C [n] and positive integers ?1,?2,?? ,
F[HXs.i] ? A HEKsi].
jEl			jEl
If we apply this definition to the case of 0-1 random variables we see that to
have A--Hcorrelated 0-1 random variables, it suffices to find twin variables Xj such
that E[X] ? E[k] and, for all nonempty I C
Pr AXil ?ARPr(A?--H1)
jEl
which reduces to definition 7 when Pr(Xj 1) Pr(X? = 1). Notice that for this
to hold, it is not necessary that Pr(Xi = 1) = Pr(ki = 1); in typical applications
79
we do not know the exact value of Pr(Xi 1) but only an approximation (usually
an upper bound; we wilt see examples in later sections), but this is good enough
to apply the Cil-bounds extension.
Theorem 14 Let X be the sum of A--Hcorrelated random variables Xi,X2,. . .
where Xi C [ai, bi] and let X be the sum of the n twin variables Xi. Then,
Pr(X > (it e) E[k]) ? A G(?,e,afflb).
PROOF. Let jt = E[X]. As in the proof of Theorem 13, we start with the
inequality
Pr(X > (1 + e)?) = Pr(etX >
E[etX]
By the hypotheses of boundedness and of A--Hcorrelation it follows that
E[e?X] =
oc 1k?k
E ?
tkE[Xk]
ffl tkp[Xk]
= AE[etX]
By the already discussed result of lloeffding [lloe63j, when k is the sum of n
independent bounded random variables Xi E [ai, bi]
min E[etk]
?>O et(l+?)tt ? C(n,e,a,b).
E]
In this paper we will use the special case of Theorem 14 where Xi C [0, 1],
i e [n]. In this case, F(?, e) is also an upper bound for the upper tail of X.
80
Corollary 5 Let X be the sum of n A--Hcorrelated random variables Xj E [0,1].
Then,
Pr(X > (1 + e)E[X]) < A F(jt, e).
PROOF. Let E[X] jt. When X is the sum of n independent random variables
Xi Ei [0, 1] lloeffding shows that, for I Ei (0,1 --H i/n), (cf. Theorem 1 of [lloe63])
n-1-nt
Pr X$iL>t
By			setting e			nt/n and by applying the standard approximation 1 + r ? ez for
x			nt/(n --H			--H ni) we get
Pr(X > (1+ e)jt) ? (1+			F(jt, e).
4.4 The Edge Coloring Algorithm
We now present our randomized distributed edge coloring algorithm. The algo-
rithm uses an idea of Karloff & Shmoys to reduce the problem of edge coloring
general graphs to that of edge coloring a special class of bipartite graphs [KS87].
The Karloff & Shmoys algorithm was proposed for the PRAM model and is
as follows. The input to the algorithm is a graph G (V, E) of maximum degree
A(G) -- A.
1. Compute a random partition of V into black and white vertices (all vertices
flip a fair coin independently and in parallel). Let G[B] be the subgraph
induced by the black vertices, &[W] be the subgraph induced by the white
vertices, and G[B, W] be the bipartite subgraph formed by the edges having
endpoints of different colors.
81
2. Optimallyedge color G[B, W], i.e., with A(G[B, W]) colors, using the PR?AM
algorithm of Lev, Pippenger & Valiant [LPV81]
3. Recurse on C[B] and U[W] using the same set of fresh new colors on both
graphs.
The key fact is that, as long as A is large enough, the maximum degree of
the three subgraphs is at most A/2 + A'12+? A/2 + o(A) w.h.p., for any fixed
e> 0. (This can be proved by applying the standard CH-bounds.) Note that the
same set of fresh new colors can be used on both G[B] and G[W] because G[B, W]
completely separates the two graphs. The recurrence for the number of colors used
is
C(A) ?			+ Al/2+c) + c			+ Ai12+?)
< A+o(A)
and holds w.h.p. The "high probability" statement holds as long as A ?(log1+? n),
for any fixed ? > 0. In this case, the failure probability given by the CH-bounds
is the inverse of a superpolynomial function. For the case when the degree goes
below the ?(l0gi+?) threshold, Karloff & Shmoys give a PRAM algorithm to
(A + 1)--Hedge color graphs of this low maximum degree [KSs7].
The non--Hdistributed parts of the Karloff & Shmoys scheme are the subroutine
which colors the bipartite graph optimally (it can be shown that this is impossible
in the distributed model of computation [PS92b]), and the handling of the case
A ? log1+? n. The latter can be handled by Luby's randomized distributed (A+1)--H
vertex coloring algorithm [Lub88]. Luby's algorithm is simulated on the line graph
of the given graph G thus giving a 2A(&) --H 1 edge coloring2. Hence, if we could
2When A(G) = O(polyJog(n)) we can compute a 2A(G) --H 1 colorings deterministicalty in
O(polylog(n)) time with an algorithm based on the idea of removing maximal matchings. We
prefer to use Luby's algorithm here for conciseness.
82
get a good distributed algorithm for bipartite edge coloring then we could use the
Karloff & Shmoys scheme to get a good coloring of any graph. We now show how
this can be achieved.
4.4.1 Distributed Edge Coloring of Bipartite Graphs
We now present a simple Monte Carlo algorithm for edge coloring the special type
of bipartite graphs generated by the Karloff & Shmoys scheme.
Given a bipartite graph G (A, B, E) we assume that each vertex knows
whether it belongs to A or B. This is an important assumption because such
information cannot be computed fast distributively [PS92b], but it is verified by
the bipartite graphs generated by the Karloff & Shmoys scheme. From now on, we
will refer to vertices in A as the top vertices and to the vertices in B as the bottom
vertices.
The algorithm takes O(log n) time and colors a bipartite graph C of maximum
degree A with at most 1.6A + i?1og2+? n colors w.h.p., for any ? > 0, which is
approximately 1.6A + log2 n (the failure probability will depend on ?). In the
algorithm we use a variable Ac that is initialized to A(C). During any iteration
of the algorithm, Ac is meant to be an upper bound on the degree of the current
graph; we will prove later that this holds w.h.p. Recall that ?(u) denotes the set
of edges incident on vertex u, and let THRESHOLD log2+? n.
The algorithm is given two parameters ?, ?> 0, and is as follows:
1. PART I: Repeat until Ac <THRESHOLD:
Let Cc be the current graph. Pick a set ? of Ac fresh new colors.
o+ (Random proposal of bottom vertices) In parallel and independently of
the other vertices in B, each vertex v E B assigns a temporary color
to each edge in ?(v) with uniform probability without replacement, i.e.
edge el is assigned color a ? ? with probability 1/Ac, e? is assigned
p e --H faj with probability i/(Ac --H 1) and so on.
83
o+ (Lottery of top vertices) (Comment: The coloring so far is consistent
around any vertex v ? B but can be inconsistent around a vertex u E A.)
For each u E A, let Cn(o) be the set of edges in ?(u) with temporary
color ?. Each vertex u E A selects a winner uniformly at random
in C?(o), for each nonempty Cu(o). All other edges, the losers, are
decolored and assigned I.
o+ Set Ac := Ac(1 t e)/e. G1, the subgraph of Gc induced by the losers
(i.e., by the I-edges), becomes the new current graph.
2. PART II: Let Ur be the remaining graph. Edge color Gr with 2A(&r) --H 1 <
2 THRESHOLD colors by executing Luby's vertex coloring algorithm on the
line graph of Ur [Lub88]
The key claim is that in every iteration of part (I) above, the maximum degree of
the graph shrinks by a factor of at least (lte)/e w.h.p., as long as A > THRESHOLD.
That is
A(&1) ?(1+e)
e
with high probability, for any ?xed e> 0. The condition Ac > THRESHOLD ensures
that the failure probability given by the extension of the CH-bounds is the inverse
of a superpolynomial function. The reason for setting THRESHOLD = log2+? n will
be apparent from the probabilistic analysis.
Once the key claim is established, we can bound both the total number of colors
used and the running time of the algorithm. To bound the number of colors used
observe that if the degree of the graph shrinks at every iteration by at least a
(1 + e)/e factor w.h.p. then the maximum degree of Cr is at most log2+? n w.h.p.
Hence, w.h.p., the number of colors used is at most
C(A) ? A+$(1 +e)+ ...+)(i +e)k+2l0g2+?n,
84
where k is the smallest integer such that A(1+?)k/ek < log2+? n. The total number
of colors used is at most (here, ?? depends on ? and can be made arbitrarily small)
C(A) ?
? 1.59A
1 +			A + (2 --H			1 --H E')			log2t5n
1.585 A +0.4 log2+6 n
when A > 8 log2+? n. The running time of the algorithm is O(log n) because
part (I) takes O(logA) time (by the key claim), and part (11), i.e., Luby's algo-
rithm, takes O(log n) time.
If A > 8 log2t? n and we use our distributed subroutine for bipartite graphs
in the Karloff & Shmoys algorithm the total number of colors is given by the
recurrence
T(A) < C () + Al/2+c) + T ($ + A'12+?)
? 1.59 A + o(A)
? 1.6A
If A ? 8 log2+? n, we apply Luby's algorithm directly to get an edge coloring with
2 A --H 1 ? 16 log2$? n Hence, the total number of colors to color any graph is at
most 1.6 A + 16 log2+? n, for any ? > 0, which is approximately 1.6 A + log2 n
colors.
The rest of the paper is devoted to establishing the key claim.
4.5 Probabilistic Analysis
This section is devoted to proving the key claim of the preceding section, namely
that given a graph G and A such that A > A(C) and A > THRESHOLD then,
85
after one iteration of Part (I) of the bipartite algorithm, the maximum degree of
the new graph, A(G1), is at most (1 t e)A/e w.h.p., for any fixed e> 0.
It turns out that the analysis is considerably easier for the top vertices than for
the bottom vertices. We begin with the easy part.
4.5.1 Analysis of Top Vertices
Let u be a generic top vertex with neighbors Vj? ? E [du], and incident edges
i = (u, vi). Recall that a loser is an edge which, after having got a tentative color
in the random proposal, lost the lottery and got decolored. So, the new degree of
u is given by the number of losers incident with it.
From the point of view of a top vertex, the random proposal and the lottery
are equivalent to the following random experiment. For each edge i incident on u
we introduce a ball i, and for each color k we introduce a bin k; the assignment of
a tentative color to an edge by the algorithm is equivalent to throwing each ball
into one of the A bins independently and uniformly at random, since the bottom
vertices assign tentative colors with uniform probability and independently of the
other bottom vertices. Recalling that we have at most A balls and exactly A bins:
?Thsers = ?balls --H ?winners
? ?bins --H ?nonempty bins
--H ?empty bins.
Let X be a random variable denoting the number of losers. To estimate X and
its tail distribution we will study the random variable B = ?emp1y bins. For this
purpose, we introduce A many indicator random variables
1 bin i is empty
Bj =
0 otherwise
and hence B --H ZiE[A] Bj. Notice that X ? B always. The variable B was
studied in Section 4.3 where it was shown that E[B] < A/e and that the Bj's
86
are 1-correlated, which implies that Pr(B > (1 + e)A/e) ? F(A/e, e). Since
E[X] < E[B] and Pr(X > (1 + e)A/e) ? Pr(B> (1 + e)A/e) we get
Theorem 15 Let u be a top vertex and X be the random variable denoting the
number of losers incident on it. Then, E[X] ? A/e and
Pr(X > (1 + e)A/e) ? F(A/e, e).
4.5.2 Analysis of Bottom Vertices
In this section we analyze what happens to the new degree of a generic bottom ver-
tex vB. This case is considerably harder to handle than the previous one, because
of the way in which the random variables describing the process are correlated. For
a top vertex, the dependency among the variables was playing for us; given that
some edges incident on a top vertex are losers, the probability of having another
loser decreases. For a bottom vertex the situation is reversed: having some edges
losing the lottery might even make the probability of having another loser increase.
The problem can be seen in Figure 4.1. Suppose we are given that el got tentative
color a and lost the lottery, and that e? got tentative color ?; we will argue intu-
itively that given this, the probability of e? losing the lottery has increased. Since
e? lost, the probability of e? getting tentative color a increases, which implies that
the probability of e? getting tentative color P also increases, and this increases the
probability of e2 losing the lottery.
For the sake of the analysis we modify the algorithm as follows: instead of per-
forming all random proposals in parallel, suppose that the bottom vertices perform
their random proposals sequentially, one after the other. This does not modify the
probability distributions because the choices are still done independently. We want
to focus our attention on the last vertex vB performing the random proposal. We
will use the fact that when vB performs its random proposal, all edges not inci-
dent on vB already have a tentative color. By symmetry, any upper bound on the
probabilities we can find for vB will hold for all bottom vertices.
87
TOP
e1			e3
BOTTOM
Figure 4.1: x(ei) c, x(e?) = ?, e? lost the lottery
Let i Ei [dv] denote an edge incident with the bottom vertex vB, with the other
endpoint of? being Uj. We introduce the indicator random variables
1 i loses the lottery
Xj =
0 otherwise
and want to study the expectation and tail probability distribution of the random
variable X = ZiE[dv] Xi. Computing the expectation is easy.
Lemma 8 E[X] ? A/e.
PROOF. Let vB be the bottom vertex. It is sufficient to show that Pr(Xj =
1) ? lIe, for all ? e [dv]. From the analysis of the top vertices, we know that
the expected number of losers incident with Uj is at most A/e and hence that
ZjE?(ui) Pr(j loses) ? A/e. By symmetry, Pr(j) < 1/e for all j E ?(ut), and
hence Pr(Xi = 1) ? l/e.
88
We now study the tail probability distribution of X. Our goal is to show that
X < (1 + ?)?/e w.h.p., for any fixed e> 0. Establishing this claim will take several
lemmas.
For technical reasons that will be clear later, we subdivide the edges incident
on VB into ? groups, each of at most ? edges. Let 0 be any such group and
define Xc = ?iE? X?; we would like to show that Xc is the sum of A--Hcorrelated
random variables, so that
V?> 0, Pr(Xc > (1 + ?)?/e) ? A F(?/e, e).
Rather than proving this claim, we will prove a weaker one namely that Xc is
the sum of A--Hcorrelated random variables with hi9h probability! More precisely, we
will define an event A such that A happens w.h.p., and such that, given that A
occurs, then Xc is the sum of A--Hcorrelated random variables. That is,
Pr(Xc > (1 + ?)?/e			A) < A F(?/e, e).
Showing this weaker claim is sufficient because
Pr(X > (1 + ?)?/e) < Pr(?Xc> (1 +
= Pr(30 Xc > (1+ ?)#A/e A) Pr(A) +
Pr(?C Xc > (1 + e)?/e I AC) Pr(Ac)
< Pr(30 Xc > (1 + ?)#A/e A) + Pr(Ac)
_ #A Pr(Xc > (1 + ?)#A/e I A) + Pr(Ac)
_ ? A F(?/e, ?) + Pr(Ac).
Another interesting point in the analysis is that A will not be a constant term but
an exponentially growing function. llowever we will be able to show the rate of
growth of A is much slower than that of l/F(?/e, ?). In other words, we will
show that A F(?/e, f) goes to zero superpolynomially fast.
89
We now turn to the task of showing that XG is the sum of A-correlated random
variables w.h.p. Recall from the definition that the Xi `s, z E G, are A-correlated if,
for all nonempty I C &
Pr AXi=1
iEI
Consider then a generic subset I c c of edges incident on the bottom vertex
vB, and let us see how to compute Pr (AjEl Xi = 1). Recall that we are analyzing
the situation where vB is the last vertex to perform its random proposal. This
means that prior to the assignment of a tentative color to edge? = (v, uj), all other
edges incident on Uj already have their tentative color. Using the balls-and-bins
language, we can say that prior to throwing ball i at random into one of the bins,
all balls coming from the other neighbors of Uj have been thrown. We will think
of i as a red ball and of the other edges at uj as white balls. Once the red ball
is thrown in, say, bin k E [?], a winner ?s selected uniformly at random among
all (i.e., red and white) balls in bin k. All other balls, the losers, are discarded.
Notice that the probability of discarding the red ball is itself a random variable
which depends on the particular placement of the white balls prior to throwing the
red ball.
Given any placement of white balls at Ui, we construct a vector of probabilities
Ci as follows. Let ?ik denote the number of white balls in bin k E [A] of vertex Ui,
and let Pik = aik/(l + ai?) denote the probability that the red ball loses the lottery
given that it was thrown in bin k (equivalently, Pik is the probability that edge
i loses, given that it got tentative color k). For each neighbor Ui of our bottom
vertex vB, we construct the corresponding vector Ci = (pii , ....... , PiA). Given
a set I of neighbors of vB we construct the matrix M1 whose i-th column is the
vector Ci. The next lemma explains why this matrix is relevant to us. From now
on, letp(m,k) = m(m--H 1)...(m--H k+ 1).
90
Lemma 9 Let I C C and k = IJ. Then,
Pr j)1Xi=1 = perm(Mi)
p(A,k) -
PROOF. The random proposal of vB is equivalent to choosing an one-to-one
function r : I H [?] uniformly at random among the set ? of all such functions.
Recall that the entry Mik of M1 is the probability Pik that edge i loses given that
it is given color k. llence,
Pr(Ai?iXi = 1) = ? Pr(Ai?iXi = 11 ris selected) Pr(ris selected)
rEP
=			Z (Ml,rU)			Mk,r(k)) (A1 A)1 o+... A--H1k+1)
rEP
--H			perm(Mj)
p(A,k)
E
We now want to ?nd a good upper bound of perm(Mi). The following lemma
gives a simple upper bound that is sufficient for our purposes. Given a matrix M
let Si denote the sum of the elements of Ci, the i-th column of M
Lemma 10 Let M = [Mik] be a matri? with c columns and r rows, and non--H
negative entries Mik. Then, perm(M) < HiE[c] 5i
PROOF. Let ? = r: [c] H [r], ir is one-to-one). Then,
perm(M) =
- fiSi.
iE[c]
Z Ml,r(1) M2,r(2) ... Mc,,r(c)
rEP
(M1,1 .... + Mi,r) (Th,i ?... +M2,r)... (Mc,i ?... + Mc,r)
91
f(p) =			k+kl
c
The next lemma relates the value of Si to that of 1/e > Pr(i loses), i E ?(v). It
is an application of the most general definition of A--Hcorrelation. Before the proof
of the lemma, we establish
m
r m--Hr
pq
Proposition 1 If 0 < p < 1, q = 1 --H p and rn is a positive integer, then
m			--H r _			(1--Hqm+l)
m pq			r+1 --H			p(m+1)
PROOF. Let
rn
r
r=1			r
Integrating both sides of the binomial expansion
between 0 and p, we get
m
k+1
1--Hqm+l
mtl
from which the proposition follows.			E
We now return to our scenario where VII is the last bottom vertex to pick tentative
colors for its edges, with its edges having been split into groups. Recall that we
are focusing on a particular group G, and that we want a good upper bound on
Pr (AjEl Xj = 1), for any nonempty I C 0. Lemma 9 gives the upper-bound
Pr (AjEl Xi = 1) ? perm(Mi)/p(A, Ill), and Lemma 10 gives the upper-bound
perm(M1) < llSi. So, a good upper-bound of S? will hopefully translate into a
good upper-bound for Pr (AiEi Xi = 1). The next lemma says that Si < A/e
92
w.h.p. Recall that we are analyzing the situation where Vfl is to give tentative
colors for its edges and that all other edges have already been assigned tentative
colors. If we focus on a particular edge i = (vB, uj) the situation is equivalent to
throwing a red ball i uniformly at random into A bins where each bin has some
number of white balls in it (possibly zero). Notice that the number of white balls
in each bin is itself a random variable.
Lemma 11 Let? = (v,ui) be any edge in ICC, andSi be the sum of the elements
of Ci, the i-th column of M1. Then, E[Si] ? A/e = ? and
Ve1 > 0, Pr(Sj > (1 + ei)ti) ?
PROOF. Let Zk be the random variable denoting the number of white balls in
bin k and Yk = Zk/(Zk + 1) be the random variable denoting the probability that
the red ball loses the lottery given that it ends in bin k. Then, Si = Y = ?k ?k
Note that the ?k'5 are bounded random variables with values in [0, 1]. We will
show that E[Y] < A/e and that the Yk'5 are 1--Hcorrelated (under the most general
definition of A--Hcorrelation), which will give our claim.
Firstly, we may assume that we have d = A --H 1 white balls (i.e., the degree of
Uj is A): Pr(Y > (1 + ei)A/e) is maximized at d = A --H 1, as d varies from 1 to
A --H 1. (To see this, assume d = A --H 1 --H k < A --H 1. Add k yellow balls to the
white balls, and run two experiments. In one experiment throw the white and red
balls and compute the probability that the red ball loses the lottery. In the other
experiment, throw white, yellow and red balls and again compute the probability
that the red ball loses. In both experiments, let us look at the bin where the red
ball fell. The probability that the red ball loses is b/(b+ 1) for the first experiment,
and (b + y)/(b + y + 1) for the second, where b and y are, respectively, the number
of white and yellow balls in the bin. Since y > 0, b/(b + 1) < (b + y)/(b + y + 1).
If Yi(d) indicates the variable Yj when Uj has degree d, then Yj(d) < Yi(A) for all
i ? [A] and d E [A].)
93
First, we will show that, for all i, E[Yj] < lIe and then we will show that, for
any set of k indices I C [?] and strictly positive integers Sj?
[ll ySi] ? liek			(4.3)
jEl
Given this we can apply Corollary 5 by introducing n independent twin 0-1
random variables Yj such that E[Yj] = Pr(Yj = 1) = lIe. Since the Yj's are binary,
equation 4.3 is the same as
E[fI ysi] < ? EKi] = fi EKi5?],
tEl			jEl			jEl
which is to say that the Yj's are 1--Hcorrelated. Noting that 0 < Yj < 1, it suffices
to show that
E[EYi] ? lie -
jEl			k			(4.4)
Without loss of generality we can assume I = [k]. We will prove inequality 4.4
by induction on k> 1 when k = 1,
E[Y1] =
Dl ?--Hl			___
0 r
= (1 --H lIA)? < l/e,
where the second equality follows from Proposition 1. Notice that for all j ? [A],
E[Yj] = E[Y1] < 1/e. When k> 1, the law of conditional probabilities gives
E[HYi]=E[YlY2.Yk?lE[Yk I YlY2--Yk-l]]
iE[k]
Suppose we show that for all non-zero Cj ? [0, 1] with i E [k --H 1],
k--Hl			1
?)? Yi=ci]?<?e;
(4.5)
E[YkI			(4.6)
94
t(r,a)			r
r+l
then, since the product ....... Yk--Hi in equation (4.5) is zero when any Cj is zero,
we see by induction on k that
A-i-a
A--Hi--Ha--Hr
pq
r
k
E[llYi]
i=i
- E[Y1Y2.. ?k-i E[Yk I Y1Y2?? Yk-1]]
e1E[\t'Yi]
i=i
_ 7k
Hence, the claim follows if we can show that inequality (4.6) holds.
If a? denotes the number of white balls that fell into bin z, then Cj = aj/(ai +1).
Let a = zk?i a'> k --H 1, p = 1/(A --H k + 1), and q = 1 --H p. Then
where
It is easy to check that t(r, a) > t(r, a + 1). As a consequence, the maximum
value of E[YkI Ak. il Yj = ci] is attained at a = k --H 1, in which case we have
A--Hi--Ha			A--Hk
? t(r,a) = Zt(r,k--H1)
r=i			r=i
Azk A-k
r
A-k+1 <
q _ lie,
prqA?k?r__r
r+l
k--Hi			k--Hi
E[YkIAYi=cil = E[YkIAZi=ai]
i=i			i=i
Aa
r
--H t(r,a),
95
by Proposition 1.
Putting all the pieces together (recall that k = I 11):
Pr(Ai?iXi = 1) --H
perin(Mi)
HiElSi
0
Ak			1
? (1+q)k ek p(A,k)
Ak			1
< (1tq)k p(A,k) ek'
for any el > 0. The first line follows from Lemma 9, the second from Lemma 10, and
the third from Lemma 11 and holds w.h.p. The last line is just a re-writing of the
third one. Given ei > 0, let A?1 be the event that, for all i e G, S? < (1 + ei)A/e.
Lemma 11 says that AE1 happens w.h.p., for any ?i > 0. The last line of the above
chain of inequalities means that, given that AE1 occurred, the Xj's are A-correlated
with A = (1 + ?i)kAk/p(A, k). (Alternatively, we can say that the Xj's are A--H
correlated w.h.p.) We now want an estimate of A. The next lemma explains why
we subdivided the edges incident on vB into groups.
Lemma 12 Forposilive integers I and k, tk/p(t,k) < ek2it. if k <1/2.
PROOF. We first note that ln(1 --H > --H2x, for 0 < x < 1/2. This is true,
since if we define f(x) ln(1 --H x) + 2x, then f(O) =0 and f'(x) --H \ffi?,which is
non-negative for 0 <x <1/2. Now,
P(tIk?k) --H kll?1(I?i)
t=1
= exp
ex
> exp			(since k <1/2)
96
k2
e t.
E]
For t = A and k = I < ? this lemma implies that Ak/p(A,k) < e and
hence, given Ac1, the Xj are A--Hcorrelated with A = (1 + ?i)ke < el+El#A, via the
standard approximation 1 + x < eZ. Though A goes to infinity superpolynomially
fast as A goes to infinity, we can still upper bound Pr(X? > (1 + e)A/e) by a
term that goes to zero superpolynomially fast, as follows, (Here, ? =
= Pr(X?>(l+?)? IA?1)Pr(A?1) +
Pr(Xc > (1 + ?)it I AfC1) Pr(A?C1)
? Pr(Xc > (1 + ?i I Aq) + Pr(A?C1)
? AF(;,? + Pr(?GX?>(1+q)?)
? el+El? F(?, ?) + ? F(jt, ?i)
? el+cl? e??2?13 + ? F(jt, ?i)
< + #& F(?, ?`).
Since ?i can be chosen arbitrarily small, we can make the above term go to zero
exponentially fast by choosing ?i such that ei --H ?2/3 < 0
Remark. We can now see why the parameter THRESHOLD must be ?(log2+6 n):
the above failure probability goes to zero superpolynomially fast as long as A =
?(log2+? n), for any fixed ?>0.
Hence, we have proven that XG < (1 + ?)?/e w.h.p., for any fixed e> 0. In
turn, this implies that the new degree of vB after one iteration of the algorithm is
at most (1 + ?)A/e w.h.p. To see this,
97
Pr(X> (1 + e)A/e) ? Pr(?G XG > (1 +
_ MA Pr(X? > (1+ e)MA/e)
MA (el+(ei--H?2/a)? + MAF(?, ci))
This inequality holds for any e and ci. llence the new degree of all the vertices is
at most (1 + e)A/e w.h.p. and we get
Theorem 16 The new degree of the graph after one iteration of Part (I) of the
algorithm is at most (1 + e)A/e w.h.p., for any fixed e > 0.
Bibliography
[AB186]
[AGllP92]
[AGLP89]
[AP90]
N. Alon, L. Babai, and A. Itai. A fast and simple randomized par-
allel algorithm for the maximal independent set problem. Journal of
Algorithms, 7:567--H583, 1986.
N. Alon, 0. Goldreich, J. Hastad, and R. Peralta. Simple constructions
of almost k--Hwise independent random variables. Random Structures
and Algorithms, 3(3):289--H303, 1992.
B. Awerbuch, A. V. Goldberg, M. Luby, and 5. A. Plotkin. Network
decomposition and locality in distributed computation. In Proceedings
of the IEEE Symposium on Foundations of Computer Science, pages
364--H369, 1989.
B. Awerbuch and D. Peleg. Sparse partitions. In Proceedings of the
IFEE Symposium on Foundations of Computer Science, pages 503--H513,
1990.
[AP92] B. Awerbuch and D. Peleg. Routing with polynomial communication-
space trade-off. SlAM J. on Discrete Mathematics, 5(2):151--H162, 1992.
[A5E92] N. Alon, J. Spencer, and P. Erdo's. The Probabilistic Method. Wiley--H
Interscience Series, John Wiley & Sons, Inc., New York, 1992.
[AweSS] B. Awerbuch. Complexity of network synchronization. J. Assoc. Cam-
put. Mach., 32:804--H823, 1985.
[BC] B. Berger and L. Cowen. Fast deterministic constructions of low-
diameter network decompositions. MIT-LCS Technical Memo #460,
April 1991
[B083]
[Bol79]
M. Ben-Or. Another advantage of free choice: Completely asyn-
chronous agreement protocols. In Proceedings of the ACM Symposium
on Principles of Distributed Computing, pages 27--H30,1983.
B. Bolloba's. Graph Theory. Springer Verlag, New York, 1979.
98
99
[BR91] B. Berger and J. Rompel. Simulating (logc n)-wise independence in
NC. J. Assoc. Comput. Mach., 38(4):1026--H1046, 1991.
[Bro41] R.L. Brooks. On colouring the nodes of a network. Proc. Cambridge
Phil. Soc., 37:194--H197, 1941.
[Che52]
[CS92]
H. Chernoff. A measure of asymptotic efficiency for tests of a hypothesis
based on the sum of observations. Annals of Mathematical Statistics,
23:493--H509,1952.
M. Choy and A.K. Singh. Efficient fault tolerant algorithms for re-
source allocation in distributed systems. In Proceedings of the ACM
Symposium on Theory of Computing, pages 593--H602, 1992.
[CV86] R. Cole and U. Vishkin. Deterministic coin tossing with applications to
optimal parallel list ranking. Information and Control, 70:32--H53,1986.
[ES73] P. Erd6s and J.L. Seifridge. On a combinatorial game. Journal of
Combinatorial Theory, Series A, 14:298--H301,1973.
[FLP85]
[GJ79j
M.J. Fischer, N. Lynch, and M. Paterson. Impossibility of distributed
consensus with one faulty process. Journal of the ACM, 32:374--H382,
1985.
M. R. Garey and D. 5. Johnson. Computers and Intractability: A Guide
to the Theory of NP-Completeness. W. II. Freeman and Company,
1979.
[GPS89] A. V. Goldberg, 5. A. Plotkin, and G. E. Shannon. Parallel symmetry-
breaking in sparse graphs. SIAM J. Disc. Math., 1:434--H446, 1989.
[GS] M. Goldberg and T. Spencer. A new parallel algorithm for the maximal
independent set problem. SIAM Journal of Computing, pages 419--H428.
[lioe63] W. Hoeffding. Probability inequalities for sums of bounded random
variables. American Statistical Association Journal, pages 13--H30, 1963.
[H590] P. llajnal and E. Szemere'di. Brooks coloring in parallel. SIAM J. Disc.
Math, 3(1):74--H80, 1990.
[1186] A. Israeli and A. Itai. A fast and simple randomized parallel algorithm
for maximal matching. Information Processing Letters, 22(2): 77--H80,
1986.
[JSWB92]
R. Jain, K. Somalwar, J. Werth, and J. C. Browne. Scheduling par-
allel i/o operations in multiple bus systems. Journal of Parallel and
Distributed Computing, 16(4):352--H362, 1992.
100
[Kar86] II. Karloff. A las vegas RNC algorithm for maximum matching. Corn-
binatorica, 6:187--H191,1986.
[Kar89] H. Karloff. An NC algorithm for Brooks' theorem. Theoretical Corn-
puter Science, 68:89--H103,1989.
[Kar91] R. M. Karp. Probabilistic recurrence relations. In Proceedings of the
ACM Symposium on Theory of Computing, pages 190--H197, 1991.
[KB87] H. Karloff and J. Boyar. Coloring planar graphs in parallel. Journal of
Algorithms, 8:470--H479,1987.
[KN88] M. Karchmer and J. Naor. A fast parallel algorithm to color a graph
with A colors. Journal of Algorithms, 9:83--H91,1988.
[KS87] II. J. Karloff and D. B. Shmoys. Efficient parallel algorithms for edge
coloring problems. Journal of Algorithms, 8:39--H52, 1987.
[KW85]
[Lei92]
R. M. Karp and A. Wigderson. A fast parallel algorithm for the max-
imal independent set problem. J. Assoc. Cornput. Mach., 32:762--H773,
1985.
T. Leighton. Methods for message routing in parallel machines. In
Proceedings of the A CM Symposium on Theory of Computing, pages
77--H96,1992.
[Lin92? N. Linial. Locality in distributed graph algorithms. SlAM J. Cornput.,
21(1):193--H201, 1992.
[LPV8l]
[LS9l]
G. F. Lev, N. Pippenger, and L. G. Valiant. A fast parallel algorithm for
routing in permutation networks. IEEE Transactions on Computers,
30:93--H100, 1981.
N. Linial and M. Saks. Decomposing graphs into regions of small di-
ameter. In Proceedings of the ACM/SIAM Symposium on Discrete Al-
gorithms, pages 320--H330, 1991.
[Lub86] M. Luby. A fast parallel algorithm for the maximal independent set
problem. SlAM J. Comput., 15(4):1036--H1053, 1986.
M. Luby. Removing randomness in parallel computation without a pro-
cessor penalty. In Proceedings of the IEEE Symposium on Foundations
of Computer Science, pages 162--H173,1988. To appear in a special issue
of Journal of Computer and System Sciences, devoted to FOCS 1988.
[Lub88]
101
[Lyn8l]
N. A. Lynch.
tributed system.
278,1981.
Upper bounds for static resource allocation in a dis-
Journal of Computer and System Sciences, 23:254--H
[MNN89] R. Motwani, J. Naor, and M. Naor. The probabilistic method yields de-
terministic parallel algorithms. In Proceedings of the IEEE Symposium
on Foundations of Computer Science, pages 8--H13, 1989.
[Nao91a] J. Naor. A fast parallel coloring of planar graphs with five colors. SlAM
Journal Disc. Math., 4(3):409--H412, 1991.
[Nao91b] M. Naor. A lower bound on probabilistic algorithms for distributed
ring coloring. SlAM Journal Disc. Math., 4(3):409--H412, 1991.
[NN9O] J. Naor and M. Naor. Small--Hbias probability spaces: efficient con-
structions and applications. In Proceedings of the ACM Symposium on
Theory of Computing, pages 213--H223, 1990.
[NS92j
[Pan]
[PS92a]
[PS92b]
[PS92c]
[Rag90]
[Ram9O]
M. Naor and L. Stockmeyer. What can be computed locally? Unpub-
lished manuscript, to appear in STOC 93, 1992.
A. Panconesi. Unpublished manuscript.
A. Panconesi and A. Srinivasan. Fast randomized algorithms for dis-
tributed edge coloring. In Proceedings of the A CM Symposium on Prin-
ciples of Distributed Computing, pages 251--H262,1992.
A. Panconesi and A. Srinivasan. Improved distributed algorithms for
coloring and network decomposition problems. In Proceedings of the
ACM Symposium on Theory of Computing, pages 581--H592, 1992.
A. Panconesi and A. Srinivasan. The local nature of ?--Hcolorings and
its algorithmic applications. Submitted for publication to COMBINA-
TORICA, 1992.
P. Raghavan. Lecture notes on randomized algorithms. Technical Re-
port RC 15340 (#68237), IBM T.J.Watson Research Center, January
1990.
R. Raman. The power of Collision: Randomized parallel algorithms for
chaining and integer sorting. In Proceedings, 10th Annual FST & TCS
Conference, Lecture Notes in Computer Science # 472, pages 161--H175.
Springer-Verlag, Berlin, December 1990.
[Spe87] J. Spencer. Ten Lectures on the Probabilistic Method. SlAM, Philadel-
phia, 1987.
102
[Sri]
[Sri93]
A. Srinivasan. Personal Communication.
A. Srinivasan. Techniques for probabilistic analysis and randomness-
efficient computation. Ph.D. dissertation, Cornell University, August
1993.
E. Styer and G.L. Peterson. Improved algorithms for distributed re-
source allocation. In Proceedings of the ACM Symposium on Principles
of Distributed Computing, pages 105--H116, 1988.
[SP88]
