BIB-VERSION:: CS-TR-v2.0
ID:: CORNELLCS//TR92-1272
ENTRY:: 1993-10-14
ORGANIZATION:: Cornell University, Computer Science Department
LANGUAGE:: English
TITLE:: Fault-Tolerant Broadcasts and Multicasts: The Problem of 
        Inconsistency and Contamination
AUTHOR:: Gopal, Ajei Sarat  
DATE:: March 1992
PAGES:: 150
COPYRIGHT:: Ajei Sarat Gopal 1992 - All Rights Reserved
ABSTRACT::
An increasingly important paradigm for designing fault-tolerant applications 
for distributed systems is based on processes that communicate exclusively 
via fault-tolerant broadcasts and multicasts. Most broadcasts that are 
described in the literature, such as reliable broadcast, causal broadcast, 
atomic broadcast and the corresponding multicasts, specify the behavior of 
correct processes, but do not impose requirements on the behavior of faulty 
processes. Such specifications allow a process that fails during a broadcast 
to reach an ``inconsistent'' state (e.g., by omitting the delivery of a 
message), and to continue execution from that state. This faulty process may 
later broadcast messages that ``contaminate'' the correct processes.

In this thesis, we argue that such inconsistency and contamination can 
complicate the design of applications, and we present fault-tolerant 
broadcast and multicast protocols that prevent inconsistency and contamination.

We begin by formally defining a hierarchy of different types of process 
inconsistency; these definitions are general, and hence are valid for any 
broadcast specification. Intuitively, contamination is the ``spread'' of 
inconsistency from faulty processes to correct processes. We formalize this 
concept and show that only two forms of contamination arise from our 
hierarchy of types of inconsistency.

Atomic broadcast and atomic multicast are powerful communication abstractions 
that are central to many systems (e.g., Isis, and IBM's HAS), and to 
Lamport's state machine approach to fault-tolerance. Using our general 
definitions of inconsistency and contamination, we derive necessary and 
sufficient conditions to prevent inconsistency and/or contamination when 
processes communicate using atomic broadcast. We also derive similar 
conditions for atomic multicast. Based on these conditions, we develop atomic 
broadcast protocols that prevent inconsistency and/or contamination.

In general, the prevention of inconsistency is a stronger requirement (and 
more difficult and more expensive to enforce) than the prevention of 
contamination. We characterize a class of problems for which the prevention 
of contamination is as good as the prevention of inconsistency. We show that 
an application that solves a problem in this class under the simplifying 
assumption that both inconsistency and contamination are prevented remains 
correct even if it uses a (less expensive) broadcast protocol that only 
prevents contamination.
END:: CORNELLCS//TR92-1272
BODY::
Fault-Tolerant Broadcasts and Multicasts:
The Problem of Inconsistency and
Contamination
Ajel Sarat Gopal
Ph.D Thesis
92-1272
March 1992
Department of Computer Science
Cornell University
Ithaca, NY 14853-7501
iAUIT-TOLER?ANT BR?OADCASTS AND MULTICASTS:
THE PR?OBLEM OF INCONSISTENCY AND
CONTAMINATION
A Dissertation
Presented to the Faculty of the Graduate School
of Cornell University
in Partial Fulfillment of the Requirements for the Degree of
Doctor of Philosophy
by
Ajei Sarat Gopal
January 1992
Qc Ajei Sarat Gopal 1992
ALL RIGHTS RESERVED
FAULT-TOLERANT BROADCASTS AND MULTICASTS:
THE PROBLEM OF INCONSISTENCY AND CONTAMINATION
Ajei Sarat Gopal, Ph.D.
Cornell University 1992
An increasingly important paradigm for designing fault-tolerant applications for
distributed systems is based on processes that communicate exclusively via fault-
tolerant broadcasts and multicasts. Most broadcasts that are described in the lit-
erature, such as reliable broadcast, causal broadcast, atomic broadcast and the cor-
responding multicasts, specify the behavior of correct processes, but do not impose
requirements on the behavior of faulty processes. Such specifications allow a process
that fails during a broadcast to reach an "inconsistent" state (e.g., by omitting the
delivery of a message), and to continue execution from that state. This faulty process
may later broadcast messages that "contaminate" the correct processes.
In this thesis, we argue that such inconsistency and contamination can complicate
the design of applications, and we present fault-tolerant broadcast and multicast
protocols that prevent inconsistency and contamination.
We begin by formally defining a hierarchy of different types of process inconsis-
tency; these definitions are general, and hence are valid for any broadcast specifica-
tion. Intuitively, contamination is the "spread" of inconsistency from faulty processes
to correct processes. We formalize this concept, and show that only two forms of
contamination arise from our hierarchy of types of inconsistency.
Atomic broadcast and atomic multicast are powerful communication abstractions
that are central to many systems (e.g., Isis, and IBM's HAS), and to Lamport's state
machine approach to fault-tolerance. Using our general definitions of inconsistency
and contamination, we derive necessary and sufficient conditions to prevent inconsis-
tency and/or contamination when processes communicate using atomic broadcast.
We also derive similar conditions for atomic multicast. Based on these conditions,
we develop atomic broadcast protocols and atomic multicast protocols that prevent
inconsistency andi or contamination.
In general, the prevention of inconsistency is a stronger requirement (and more
difficult and more expensive to enforce) than the prevention of contamination. We
characterize a class of problems for which the prevention of contamination is as
good as the prevention of inconsistency. We show that an application that solves
a problem in this class under the simplifying assumption that both inconsistency
and contamination are prevented, remains correct even if it uses a (less expensive)
broadcast protocol that only prevents contamination.
Biographical Sketch
Ajei Gopal was born in Madras, India. He attended several schools on his way to
receiving a Doctorate from Cornell University--HMount St. Mary's in New Delhi,
Windsor Grammar School in Windsor, England, the Indian Institute of Technology
in Bombay, and the University of Arizona in Tucson. He has worked for Bellcore,
and currently works for IBM.
His most notable achievement is jumping out of a plane with a malfunctioning
parachute. .. and surviving!
iii
To my family, especially to Mom and Dad.
Acknowledgements
I am deeply grateful to Sam Toueg, my friend and advisor, who has taught me to
"do research" and to "write research." It has been a pleasure and a privilege to work
with Sam, and this thesis is a result of our collaboration. I could not have asked for
a better advisor.
Many thanks to Keith Marzullo and Joe Mitchell for serving on my committee.
Thanks to David Shmoys for proxying for Joe at my defense.
Many people helped me in my stay at Cornell, and in my research. I thank my
colleagues at Cornell and IBM who have influenced my thesis work, in particular
Ken Birman, Navin Budhiraja, Tushar Chandra, Vassos Hadzilacos, Keith Marzullo,
Gil Neiger, Prakash Panangaden, Fred Schneider, Pat Stephenson, Ray Strong and
Mark Wood. I thank all my friends at Cornell, in particular Navin, Tushar and their
housemates, and Samir Khuller and his housemates for providing me with a place to
sleep when it became too late to go home, and my office-mates Steve Mitchell and
Mark Wood for making our office such a pleasant place. I also thank Jan Batzer,
Debbie Smith and the rest of the staff of the Cornell CS Department.
Thanks to IBM for awarding me a Fellowship that paid for the last two years of
my PhD, and for allowing me to complete this thesis after coming on board. Thanks
also to the NSF for indirectly funding me through Sam's grants.
I am very grateful to Gideon Frieder, Dean of the CIS Department at Syracuse
University, for permitting me to work at SU; this saved me many, many miles of
driving. Thanks also to Anand Rangachari for the coffees and the lunches.
Finally, I would like to thank all the members of my family. Minni's parents,
Leela and Babu, for treating me like a son, and for their love and support. Oogles
for rolling over four times in one day! Aasha for being the sister I always wanted.
Inder, my first teacher and advisor, for always being there for me. Mom and Dad,
for their love, understanding, confidence and constant encouragement; it is to them
I owe everything. Minni for making it all worthwhile, now and forever.
vii
Table of Contents
2
3
Introduction
1.1			Specifications of fault-tolerant broadcasts: Inconsistency and			contam-
ination. . . .			.			4
1.2			Implementations of fault-tolerant broadcasts: The			consequences of
layering. .			6
1.3			Organization of the thesis .			. .			. . . . . . .			. . .			8
An			Execution Model for Application			Protocols
2.1			Formal model . .			. .			. 						. 			. .			. .			.
2.1.1			Process state			. .			. . 						.
2.1.2			Broadcasting and			delivering			messages			. 			. .			. .			.
2.1.3			Application protocols			. 			.
2.1.4			Premature halting .			. 			. 			. 			. .			.			. .
2.1.5			Execution of an application			protocol
2.1.6			History functions			.			. . 			. 						. .			.			.			. .
2.1.7			Histories . . .			.			. 			. .
2.1.8			Failure sets .			.			. 			. .
2.1.9			Notation . . . .			. . 			. 						. .			.
2.2			Broadcast specifications						. 			. .			. .			.
2.3			Modeling executions . . .			. .			.
2.4			Unsuccessful broadcasts
2.5			Successful deliveries . 						. 			.			. .			.			. .
2.6			Causal precedence with
2.7			Application specifications			. 			.			. .			.			. .
broadcasts
Defining Inconsistency and Contamination
3.1			Crash behavior			. .			. . .			.
3.1.1			Crash behavior in a broadcast history . .			.
3.1.2 Crash consistent behavior in an application history
3.1.3			Solving problems assuming crash behavior .			.
3.2 Correctness conditions: Restrictions on faulty processes
3.2.1			An anomaly with crash behavior			. . .			.
ix
10
10
10
10
11
11
12
13
13
14
14
15
16
17
17
17
19
20
21
21
23
24
27
27
3.3
3.4
3.5
3.6
3.2.2			Delivery correctness
3.2.3 Visible-broadcast/delivery correctness . . .
3.2.4			Broadcast/delivery correctness
3.2.5			X-correctness. . .			. . .
The correctness hierarchy
3.3.1 The independence and fifo-independence properties
3.3.2 Hierarchy theorems
Process consistency: Restrictions on faulty processes
Process contamination: Restrictions on correct processes
3.5.1 Contamination in an application history .
Consistent and contamination-free histories
Solving Correct Restricted Problems
4.1			The choice property of broadcast specifications .
4.2			The SD-closed property
4.3			The maximal causal prefix . . . .
4.3.1			Notation
4.3.2			Defining the maximal causal prefix
4.4			The history hierarchy . . .
4.5			Correct restricted problems . . . . . . 			.
Inconsistency and Contamination witb Atomic Broadcast
5.1			Atomic broadcast			. . .			. .			. . .
5.1.1			Notation
5.1.2			Consistency with atomic broadcast			. .			. .			.
5.1.3			Contamination with atomic			broadcast			.			.
5.2			Timed			atomic broadcasts . . .			. . . .			.			. .			. .
5.3			The independence property			.			.			.
5.4			The choice property			. . . .			.			. .			. .
4
5
6
Extending the Formal Model
6.1			Extended model . . . . . .			. .			. .			. . .			. . .
6.1.1			Process state
6.1.2			Sending and receive			messages			. .			. . . .			. . .
6.1.3			Broadcast protocols
6.1.4			Premature halting . .			. .			. .
6.1.5 Execution of an application protocol using a broadcast protocol
6.2			Point-to-point message passing			. . .			. . .
6.3			Extended histories
6.4			Systems and process failures .			. . .			. . . .			. . .
6.5			Consistency and contamination in the extended model
6.6 Broadcast protocols implementing a broadcast specification . .
29
31
33
34
35
36
38
40
40
42
42
43
44
45
51
53
53
56
57
59
59
60
60
68
70
71
74
75
76
76
76
77
77
78
79
80
81
82
84
84
7
8
A
6.7 Solving a problem
Atomic Broadcast Protocols
7.1			Preliminaries . . .
7.1.1			Communication . . .
7.1.2			Overview of results
7.2			Atomic broadcast.
7.3			Atomic broadcast with no contamination
7.4			Atomic broadcast with no inconsistency
7.4.1			Simple protocol: latency A + A
7.4.2			Optimal protocol: latency A
7.4.3			Ensuring BD-consistency
Atomic Multicast
8.1			Preliminaries . .			. . .
8.2 Inconsistency and contamination with respect to pairwise atomic mul
ticast
8.3			Overview of the pairwise atomic multicast			protocols
8.4			Atomic			multicast. .
8.5			Pairwise atomic multicast with no contamination			.			. .
8.5.1			Protocol 1 . . . . .			. . .
8.5.2			Protocol 2 . . . . .			. . . . .			. .			. .
8.5.3			Lower bounds . . . .			.
8.6			Pairwise atomic multicast with no inconsistency			. .			.
Broadcasts
Other Fault-tolerant
A.1 Reliable broadcast
A.2 Causal broadcast . .
A.3 Causal atomic broadcast
Bibliography
86
86
87
88
89
90
93
95
99
112
114
114
117
120
122
122
124
128
139
142
144
144
145
146
148
List of Figures
1.1
1.2
1.3
1.4
1.5
Executing an application protocol			. .
Executing an application protocol			in most systems .
Example of processes communicating via atomic broadcast
Example of inconsistency with atomic broadcast
Example of contamination with atomic broadcast
3
4
5
6
2.1			The execution of II by process p			. . .			. . .			. .
Illustrating the definition of crash behavior			. .
A segment of p's history			.			.			. .			. .
Protocol that uses timed reliable broadcast with latency ?
Portion of broadcast history B			.			.			.			. . .			. .
Portion of broadcast history			B'
Broadcast histories B and ??			. .			. .			.			. .			. .
The correctness hierarchy . .			. .
The contamination hierarchy			. .			.			.			. .			.
3.1
3.2
3.3
3.4
3.5
3.6
3.7
3.8
4.1
4.2
4.3
Illustrating the proof of Theorem 4.5
Illustrating the maximal causal prefix
The history hierarchy
6.1 The execution of fi by process p
7.1
7.2
7.3
7.4
7.5
8.1
8.2
8.3
8.4
8.5
that prevents BD-contamination
that prevents VBD-inconsistency
Atomic broadcast protocol			. 			. .
Atomic broadcast protocol
Atomic broadcast protocol
Witness broadcast protocol			. . 			. .
Atomic broadcast protocol that prevents VBD-inconsistency
Message delivery according to local atomic multicasts. . . .
Message delivery cycle in pairwise atomic multicast
Pairwise atomic multicast. . . .			. . .			.
Pairwise atomic multicast protocol .			.			.
Pairwise atomic multicast protocol that prevents contamination
12
22
23
25
28
29
32
35
41
49
51
56
78
89
91
96
101
107
116
117
120
122
126
xii
8.6
8.7
8.8
8.9
Strong rnulticast protocol			. . .			.
Pairwise atomic multicast protocol that prevents contamination .
Illustration of the lower bound on fault tolerance
Pairwise atomic multicast protocol that prevents inconsistency . .
130
135
140
143
Chapter 1
I'
r			---			--H?
Introduction
An increasingly important paradigm for designing fault-tolerant applications for
distributed systems is based on processes that communicate exclusively via fault-
tolerant broadcasts and multicasts. Such a paradigm, illustrated in Figure 1.1, is
supported by the IBM HAS project [Cri87], Isis [BJ87j, the Lazy Replication scheme
[LLS9O] and Psync [PBS89].
p
Application protocol
broadcast(m)
f
q
Application protocol
deliver(m)
Broadcast
Delivery
Interface
Communications subsystem
Figure 1.1: Executing an application protocol
In systems where processes are subject to arbitrary failures (i.e., malicious or
Byzantine failures, [LSP82]), a faulty process can arbitrarily change state to an
"inconsistent" (i.e., erroneous) state. This faulty process may later "spread" its
inconsistency to the rest of the processes by broadcasting a message that reflects
its erroneous state. By delivering this message and changing state accordingly, the
2
correct processes incorporate the inconsistency of the faulty process into their own
state, thereby becoming "contaminated." Clearly, having to tolerate such inconsis-
tency and contamination complicates the design of fault-tolerant applications.
In this thesis, we concentrate on systems where processes are subject to "benign"
failures, such as crash failures, omission failures [Had84,PT84] and timing failures
[CASD85]. In such systems, faulty processes do not arbitrarily change state. Thus,
the current state of every process is a function of the application being executed, the
process's initial state and all the messages that the process has delivered. Further-
more, every process uses its current state to determine the messages that it must
broadcast next.
Surprisingly, even in systems subject to benign failures, faulty processes may be-
come inconsistent, and subsequently contaminate the correct processes. Consider
the specifications of fault-tolerant broadcasts. Most broadcasts that are described in
the literature specify the behavior of correct processes, but do not impose require-
ments on the behavior of faulty processes. Such specifications allow a process that
fails during a broadcast to reach an inconsistent'' state (for example by omitting the
delivery of a message), and to continue execution from that state. This faulty process
may later broadcast messages that "contaminate" the correct processes. Surprisingly,
the usual specifications of reliable broadcast [LSP82], caus& broadcast [BJ87,PBS89],
atomic broadcast [Lam84,BJ87,CASD85] (and of the corresponding multicasts) that
can be found in the literature permit both inconsistency and contamination.
Although the occurrence of inconsistency or contamination is not explicitly pre-
cluded by the specifications of most fault-tolerant broadcasts, can inconsistency and
contamination actually occur in systems subject to benign failures, such as crash
failures? As we show below, this is indeed the case. Hence, applications that are
based on such broadcasts must be designed to overcome the problems associated
with inconsistency and contamination, even in systems subject to benign failures.
Consider the implementations of fault-tolerant broadcasts. Such broadcasts are
usually implemented by broadcast protocols. Thus, when a process executes an ap-
plication protocol that is based on a fault-tolerant broadcast, that process must
also execute a corresponding broadcast protocol. In Figure 1.2, we illustrate such
a "layered" structure for a system where the broadcast protocol uses send/receive
message-passing primitives across point-to-point communications links.
With such a structure, a benign failure at the send/receive interface (between
Tin general, broadcast specifications that restrict the behavior of faulty processes
cannot be implemented in systems where processes may fail arbitrarily. However,
in the systems we consider in this thesis, it is possible to enforce restrictions on the
behavior of faulty processes.
3
p
Application protocol
broadcast(m)
q
Application protocol
deliver(m)
t
Broadcast protocol			Broadcast protocol
send(m)			receive(m)
I'
Broadcast
Delivery
Interface
Send
Recei
Interface
Communications network
Figure 1.2: Executing an application protocol--Hin most systems
the broadcast protocol and the communications network) may not result in the same
type of failure at the broadcast/delivery interface (between the application protocol
and the broadcast protocol). For example, an omission to receive a message does not
necessarily result in a corresponding omission to deliver a message. This undesirable
behavior will be illustrated with atomic broadcast. Informally, atomic broadcast re-
quires that all correct processes deliver the same messages in the same order. We will
show that a well-known implementation of atomic broadcast can exacerbate the effect
of a process's failure to receive a message, by causing that process to deliver messages
out-of-order. Furthermore, we will describe another atomic broadcast protocol that
allows the following non-intuitive behavior: a process may behave correctly until it
crashes, and still deliver messages out-of-order. This surprising observation seems to
contradict the definition of a crash failure.
Thus, we conclude that in general, inconsistency and contamination are possible
even in systems subject to benign failures.
In summary, two problems may complicate the design of application protocols
that rely on fault-tolerant broadcasts:
o+ The specifications of most fault-tolerant broadcasts allow faulty processes to
become inconsistent and to contaminate correct processes.
4
Some implementations of fault-tolerant broadcasts convert "benign" process
failures at the send/receive interface into more "severe" failures at the broad-
cast/delivery interface.
In this thesis, we first give a general definition of process inconsistency and con-
tamination; i. e., we define these concepts with respect to any broadcast specification.
We then use this general definition to derive specific definitions of inconsistency and
contamination with respect to atomic broadcast and to atomic multicast. Finally,
we show how to prevent inconsistency and/or contamination with respect to atomic
broadcast and atomic multicast.
1.1 Specifications of fault-tolerant broadcasts:
Inconsistency and contamination
Most broadcasts that are described in the literature specify the behavior of correct
processes, but do not impose requirements on the behavior of faulty processes. Such
specifications allow a process that fails during a broadcast to reach an "inconsistent"
state, and to continue execution from that state. This faulty process may later
"spread" its inconsistency to the rest of the processes by broadcasting a message
that reflects its erroneous state. Such a message is "corrupted," and by delivering
this message and changing state accordingly, the correct processes incorporate the
inconsistency of the faulty process into their own state. Thus, the correct processes
become "contaminated."
p
q
broadcast(x := x + 1)			delive?x			x + 1) delive?x			2x)
=			= 12?
=			=			= 12?
broadcast(x			2x)			delive?x			x + 1)			delive?x			2x)
Figure 1.3: Example of processes communicating via atomic broadcast
We use atomic broadcast as an example to illustrate inconsistency and contam-
ination. Informally, atomic broadcast requires that all correct processes deliver the
5
same messages in the same order. Suppose a variable x is replicated at two correct
processes p and q. Suppose that p broadcasts an instruction to increment x, and
q broadcasts an instruction to double x (Figure 1.3). Atomic broadcast guarantees
that these instructions are delivered in the same order everywhere; hence, all copies
of x remain equal.
However, atomic broadcast does not specify the behavior of faulty processes.
Thus, a faulty process may reach an inconsistent state in several ways: for example,
by omitting to deliver a message m that is delivered by the correct processes, or by
delivering an extra message m that is not delivered by the correct processes, or by
delivering messages out-of-order. 2
p
q
r
Faulty
broadcast(x := x + 1)			deliver(x := x + 1) delive'(x			2x)
= 12?
= 12?
broadcast(x := 2x)			del?ver(x := x + 1)			del?ver(x := 2x)
r becomes inconsistent			= 1O?
delive'(x := 2x)
Figure 1.4: Example of inconsistency with atomic broadcast
An example of inconsistency with atomic broadcast is shown in Figure 1.4. The
variable x is replicated at processes p, q and r. As in the previous example, p and
q broadcast instructions to update x. However, r is faulty, and fails to deliver p's
instruction to increment x, but delivers q's instruction to double x. By skipping the
increment instruction, r becomes inconsistent and computes an incorrect value of x.
Note that since r is faulty, and the specification of atomic broadcast only restricts the
behavior of correct processes, the execution shown in Figure 1.4 does indeed satisfy
the specification of atomic broadcast.
Once r is inconsistent, it can broadcast messages that are based on its inconsistent
state and thus contaminate the correct processes. An example is shown in Figure 1.5.
Process r uses its value of x to compute and broadcast the value of the replicated
2Chapter 5 formally describes the possible sources of inconsistency with respect
to atomic broadcast.
6
1.2 Implementations of fault-tolerant
broadcasts: The consequences of layering
Fault-tolerant broadcasts are usually implemented by a broadcastprotocolthat uses
lower-level communications primitives, such as point-to-point message sendsand re-
ceives (Figure 1.2). Often, the broadcastand/or deliveryof a message by a broadcast
protocol requires the execution of many instructions, including several sends and re-
ceives. When a process p attempts to broadcast a message, a "thread" of execution is
"forked" within the broadcast protocol layer. This thread executes (or appears to ex-
ecute) concurrently with p's application protocol, and the broadcast is not complete
until the thread terminates.
Note that r becomes inconsistent by committing a "benign" failure--Hskipping
the delivery of a single message. However, as a result of this undetected failure, r
is allowed to subsequently broadcast the erroneous message y 30 instead of the
correct y 36. At this point, r appears to commit a "Byzantine-like" failure, even
though it actually fails benignly by omission. This clearly complicates the design
of application protocols that rely on these broadcasts.
variable II, which is supposed to be 3x. Since r is inconsistent and has incorrectly
computed x to be 10, r broadcasts y 30 instead of the correct y 36. When p
and q deliver the message y 30 and update their copies of y to be 30, they become
contaminated.
Figure 1.5: Example of contamination with atomic broadcast
y := 3x
= 30)
broadcast(y := 30)			deliver(y			30)
p
q
r
Faulty
= 12)
= 12)
deliver(y := 30)
p become?contaminated			(y			30)
q becomes contaminated			(y			30)
deliver(y := 30)
= 10)
7
With such a layered structure, the failure of a process at the send/receive interface
(between the broadcast protocol and the communications network) may not result in
the same type of failure at the broadcast/delivery interface (between the application
protocol and the broadcast protocol). We first show this undesirable behavior with
Skeen's atomic broadcast protocol [BJ87]. With this protocol, a faulty process that
fails to receive a point-to-point message (at the send/receive interface) may deliver
messages out-of?order (at the broadcast/delivery interface). This broadcast protocol
is sketched below.
When a process intends to broadcast a message m, it sends m to all processes.
When a process receives rn, it sends a tentative seqtence number for m to all pro-
cesses. Every process chooses m`s final sequence number to be the m?imum of all
the tentative sequence numbers for m that it receives. Processes deliver messages in
order of increasing final sequence numbers. Thus, if a faulty process fails to receive
one of the tentative sequence numbers for rn, it computes an incorrect final order
number for m, and can deliver m out-of-order.
A coordinator-based atomic broadcast protocol illustrates a more surprising be-
havior: even if a faulty process behaves correctly until it crashes, it may still deliver
messages out-of-order before it crashes. The protocol is as follows. When a process
intends to broadcast a message m, it first sends m to a coordinaton The coordinator
delivers messages in the order in which it receives them, and periodically informs
the other processes of this message delivery order. Other processes deliver messages
according to this order. When the coordinator crashes, another process takes over
as coordinator.
Suppose a coordinator delivers m before m' and then crashes before informing
any other process that m should be delivered before m'. The new coordinator cannot
determine the order chosen by the faulty coordinator, and may decide that m1 should
be delivered before m. In this scenario, the faulty coordinator delivered messages
out-of-order before it crashed.
The above example shows that a faulty process may become inconsistent be-
fore crashing, even though it executes correctly until it crashes. Furthermore, from
the time that such a process becomes inconsistent to the time that it crashes, it
may broadcast messages and thus contaminate the correct processes. Thus, even if
processes can only fail by crashing, inconsistency and contamination can occur.
3blearl - the prevention of inconsistency and contamination is much easier with
crash failures, than with omission or timing failures.
8
1.3 Organization of the thesis
In Chapter 2, we define a formal model of execution for an application protocol that
is based on processes that communicate exclusively via a fault-tolerant broadcast.
Informally, such a protocol relies upon the specification of this broadcast, and not
on additional properties that may be provided by some particular impiementation of
the broadcast.
In Chapter 3, we give a general definition of inconsistency that applies to any
fault-tolerant broadcast. In fact, we identify a hierarchy of three levels of process
"consistency," and each one results in a different severity of inconsistency. In this
chapter we also define the concept of contamination.
In Chapter 4, we characterize a class of problems for which the prevention of
contamination is "as good" as the prevention of inconsistency. To solve a problem in
this class, an application protocol can be designed with the simplifying assumption
that it will use a broadcast protocol that prevents both inconsistency and contam-
ination. The application protocol remains correct even if it uses a (less expensive)
broadcast protocol that only prevents contamination.
In Chapter 5, we use our general definitions of inconsistency and contamination
to derive specific definitions for the case of atomic broadcast.
In Chapter 6, we extend our model to include the execution of an underlying
broadcast protocol. We define when a broadcast protocol implements a broadcast
specification in a particular system. We also formally define when a process falls
by omission, the failure mode tolerated by the broadcast protocols presented in
Chapters 7 and 8.
In Chapter 7, we show how inconsistency and contamination can be prevented
by focusing on atomic broadcast. We describe an atomic broadcast protocol that
prevents contamination in systems where processes may fail by crashing, or by inter-
mittently omitting to send or receive messages. This protocol has optimal message
delivery time, and it only requires a small increase in message size over known atomic
broadcast protocols that do not prevent contamination. F'urthermore, this protocol
does not require an upper bound on the number of processes that may fall. In Chap-
ter 7, we also describe an atomic broadcast protocol that prevents inconsistency.
However, this protocol requires that a majority of processes in the system remain
correct. We show that this requirement is necessary, and hence the prevention of
inconsistency is intrinsically harder than the prevention of contamination.
In Chapter 8, we concentrate on atomic mttticast. Informally, an atomic mul-
ticast is an atomic broadcast in which messages are targeted to specific subsets of
processes. We begin by defining a hierarchy of three natural types of atomic mul-
9
ticast. We then concentrate on pa?ru'tse atom?c multicast, and define inconsistency
and contamination with respect to such a multicast. We also present a pairwise
atomic multicast protocol that prevent inconsistency and pairwise atomic multicast
protocols that prevent contamination.
Finally, in Appendix A, we define inconsistency and contamination with reliable
broadcast, causal broadcast and causal atomic broadcast.
Chapter 2
An Execution Model for
Application Protocols
This chapter defines a formal model of execution for application protocols, which
are protocols that assume that processes communicate exclusively via fault-tolerant
broadcasts. The model introduced in this chapter is used to formally define incon-
sistency and contamination with respect to the specification of any fault-tolerant
broadcast.
2.1 Formal model
Let ? denote the set of processes in a distributed system. The processes communicate
by broadcasting messages. They execute an application protocol which assumes that
the processes communicate exclusively via fault-tolerant broadcasts.
2.1.1 Process state
Each process maintains a local clock taken from the set of positive integers, denoted
I. Each process has an application state (sometimes abbreviated to state) taken from
the set Q. The set of states includes the special state I, called the premature halt
state.
2.1.2 Broadcasting and delivering messages
The processes communicate via a primitive to broadcast messages, and a primitive
to deliver messages.
10
11
A message is a tuple of the form m = (p, c, data), where p is the broadcaster
of the message, c (the timestamp, denoted ts(m)) is the time on p's clock at which
m was broadcast, and data is the information that p wishes to broadcast. If p is
the broadcaster of a message m, we write p = bc(m). Let M denote the set of all
messages, and let M+ denote the set of all sequences of messages.
If a process p invokes broadcast (m) at clock time c, we say that p broadcasts m
at c. If the result of process p invoking the deliver primitive at clock time c is the
sequence of messages D = (mi. .mk?, then for each message m in D, we say that
p delivers m at c. We write "p broadcasts (delivers) ? at c," to mean that p did not
broadcast (deliver) any message at c.
2.1.3 Application protocols
Processes execute an application protocol (sometimes abbreviated to protocol) II.
Informally, II specifies the messages to be broadcast by the processes, and the state
transitions made by the processes. A protocol II consists of two functions: the
messa?e function, denoted 11m, and the state transition function, denoted 11r
The message function determines the message to be broadcast next; formally,
Tim : ? x Ix Q x M+ MU?. If at time c, p is in state 5 and delivers the
sequence of messages D, then p should broadcast message 11m(p, c, 5,
The state transition function determines the next state of a process; formally,
11r : x I x Q x H Q. If at time c, p is in state s and delivers the sequence
of messages D, then p should enter state Tir(p, c, 5,
2.1.4 Premature halting
Informally, a process may prematurely halt execution at any time; when it halts, it
makes no further state transitions, does not deliver any more messages, and does not
broadcast any more messages.
Such a premature halt is modeled by a state transition to the premature halt
state 1. Intuitively, a process can enter the state 1 only after a failure. Thus, no
process reaches 1 as the result of a state transition specified by the protocol:
o+ Vp,Vc,Vs#I?VD: llr(p,c,s,D)$l.
Once a process enters the halt state (i.e., halts), it remains in this state:
o+ Vp,Vc,VD: IIr(p,c,I,D)=I.
Furthermore, once a process halts, it does not broadcast messages.
0 Vp,Vc,VD: llm(P,QI,D)=?
12
2.1.5 Execution of an application protocol
The execution of a protocol II is illustrated in Figure 2.1. The state of each process is
initialized to 5init and its clock to 1. At each clock tick, a process delivers a sequence
of messages, broadcasts a message, changes state and increments its clock.
Initialization			*1
s			5init ? I; c := 1			p initializes its state and clock
Main Loop
do forever
5 := s ? I
D			deliver(;
5 :--H--H s ? I
broadcast(llm(p, c, 5,
llr(p,c,s,D) ? I
c+1
s
c
od
*1
/* p's state may change to I due to a failure
/* Vm ? D : p delivers m at c
/* p's state may change to I due to a failure
p broadcasts a message at c
p changes its state according to II or halts
p increments its clock
Figure 2.1: The execution of II by process p
*1
As mentioned above, a process may prematurely halt execution at any time as a
result of a failure. This is modeled by the non-deterministic assignment statements
used in Figure 2.1. The statement 5 := ?I ? I indicates that the state 5 non-
deterministically changes either to 5' (to model a state transition) or to I (to model
a halt). The statement 5 := s ? I indicates that the state 5 non-deterministically
changes to I (to model a halt) or remains unchanged.
To simplify the presentation, we assume that a process may halt due to a failure
either during initialization, or during execution at the following times:
o+ before delivering messages, or
o+ after delivering and before broadcasting, or
o+ after broadcasting a message.
This assumption does not, however, affect the generality of our results.
To simplify the presentation, we have assumed that II is deterministic; it is
straightforward to extend our model and results for non-deterministic application
protocols.
13
2.1.6 History functions
The following functions describe the execution of an application protocol II:
o+ The function a : ? x I H Q is the history state function--Ha(p, c) is process p's
state when p's clock changes to c; e.g., a(p, 1) = ?init
o+ The function ? : ? x I H (M U b) is the history broadcast function--HP(p, c) is
the message that p broadcasts at clock time c (i.e., when p's clock equals c).
o+ The function ? : ? x I H M+ is the history delivery function--H?(p, c) is the
sequence of messages that p delivers at clock time c.
For all p and c, if ?p, c) = (... ... .?, we write m E ?p, c). We say p delivers
m before m' if m ? S(p, c), m' ? ?p, c') and either c < c1 or (c = c' and ?p, c) --H
....... . .... .?). The sequence of messages S(p, 1) ?(p, 2) ... S(p, c) is the
sequence of messages that p delivers by time c. The sequence of messages S(p, 1)
?(p, 2) ... ?(p, c) ... is the sequence of messages that p eventually delivers.
2.1.7 Histories
Let II be an application protocol, and a, ? and S be history state, broadcast and
delivery functions respectively. The tuple H = (11, a, ?, S) is a history of II. The
tuple A = (a, P, ?) is the application history of H. The tuple B = (P, S) is the
broadcast history of H; B is also the broadcast history of A.
A broadcast history B = (?, ?) is well-formed if:
o+ ? is a history broadcast function and S is a history delivery function.
o+ At all times c, every process p either broadcasts a message of the form (p, c, --H)
or p does not broadcast a message.
Vp, Vc: P(p, c) = (p, c, --H) or P(p, c) =
o+ At all times c, if a process p delivers a message m at time c, then m is times-
tamped less than c.
Vp,Vc: m ? S(p,c) ? ts(m) < c.
Let `3 denote the set of all well-formed broadcast histories.
An application history A = (a, P, ?) is well-formed if:
o+ B = (P, S) is a well-formed broadcast history.
o+ a is a history state function.
o+ Every process is initialized to state ?init or halts before initialization.
Vp: a(p, 1) = 5init or a(p, 1) = I.
14
o+ Once a process halts, it remains halted, and performs no further broadcasts or
deliveries.
= I ? ?(p,ct1) = I andfl(p,c) = ?and?p,c) =
Let A denote the set of all well-formed application histories.
A history H = (11, a, ?, S) is well-formed if:
o+ A = (a, P, ?) is a well-formed application history.
o+ II is a protocol.
o+ At all times c, every process broadcasts messages according to II, or halts before
broadcasting.
Vp,Vc: ?(p,c) = 11m(p,c,o(p,c),?(p,c))] or [P(p,c) = ? A a(p,c+ 1) =
o+ Every process changes state according to II, or halts.
Vp,Vc: [a(p,c+ 1) = llr(p,c,a(p,c),?(p,c))] or [a(p,c+ 1) = I].
Let It denote the set of all well-formed histories.
Let H be a well-formed history of II. intuitively, until a process p halts, p's state
and the messages it broadcasts both depend on the protocol II, p's initial state, and
the messages that p delivers. Thus, a well-formed history models an execution where
faulty processes cannot arbitrarily change state, or broadcast arbitrary messages.
2.1.8 Failure sets
A fa?ltre set is a subset of the set of processes and is denoted by the symbol F. If
p ? F, we say p is faulty; otherwise, p is correct. We use F to denote the set of
correct processes t.e., F = --H
2.1.9 Notation
Let B = (P, ?) be a broadcast history. Silent(B,p, c) is a predicate that holds if p does
not broadcast any messages at or after time c in B formally Vc' > c P(p, c') =
Deaf(B, p, c) is a predicate that holds if p does not deliver any messages at or after
time c in B--Hformally, Vc' > c : S(p, c') =
Let B' = (pi, S') be another broadcast history, X be a subset of processes, and
c be a clock time. For all processes p ? X, for all c' < c if p broadcasts the same
message in B' at c' as it does in B at c', we write Px' =? Px. Formally:
pix=Cpx ? VpcX, Vc', 1<c' <c: P'(p,c')=P(p,c)
If Px' =? Px and c = 00, we omit c and write Pix = Px; similarly, if X = ?, we omit
X and write P' =? P. We define ??` =? Sx analogously, and write Bx' =? Bx if and
???y?fp?I=Cp? A ?=??
15
For all processes p Ei X, for all c' < c if p broadcasts the same message in B'
at c? as it does in B at c', and p does not broadcast any message after time c in 131,
we write Px' =? Px I For all processes p e X, for all c' ? c if p delivers the same
messages in 131 at cl as it does in B at c', and p does not deliver any message after
time c in 13', we write Sx' =? SxIb. Formally:
ptx=cpxI? pxI=cpxandyp?x: Silent(B',p,c+1)
sxl=C?xI? ? six=c?xandvp?x Deaf(B',p,c+1)
We write Bx?=CBx;b ifandonlyif pxl=CpxI? and ?xl=C?xIb
We write X to denote the set ? --H X. When it is clear from the context, we write
p to denote the singleton set fp?, and p to denote fp?.
2.2 Broadcast specifications
A b-oadcast specification ?B is a predicate on broadcast histories and failure sets. If
the predicate ?B(13, F) is satisfied, we say that broadcast history B satisfies broadcast
specification ?B with respect to failure set F.
Reliable broadcast is an example of a broadcast specification. Informally, such a
broadcast requires that all correct processes deliver the same set of messages, includ-
ing all messages broadcast by correct processes. More precisely, reliable broadcast
is defined to be the conjunction of the following validity, agreement, and integrity
properties:
o+ Validity: If a correct process broadcasts a message m, then it eventually delivers
o+ Agreement: If a correct process delivers a message m, then all correct processes
eventually deliver m.
o+ Integrity: For any message m, any process delivers m at most once, and only
if m was broadcast by some process.
Formally, Reliable Broadcast is a predicate on a broadcast history 13 = (p, ?) and
a failure set F, defined as the following conjunction:
Reliable broadcast(B, F) =--H Validity(B, F) A Agreement(B, F) A Integrity(B, F)
Validity(B,F)			=--H			Vp?F\VQVrn#?: m?P(p,c)
Agreement(B, F)			=			Vp ? F, Vc, Vm ? ?: m ? S(p, c) ?
Vq ? F,?c' : m Ei S(q,c')
Integrity(B, F)			=--H			Vp, Vc, Vm $ ?: m ? q?, c)
Vc' $ c: m ? S(p,c') and ?q,ac' : m ? P(q,c')
16
We assume the following: Every broadcast specification requires both the in-
tegrity and the validity properties.
Informally, a timed reliable broadcast is a reliable broadcast with the additional
requirement that message delivery time is bounded. More precisely, timed reliable
broadcast is defined to be the conjunction of the validity, agreement and integrity
properties and the following A-timeliness property.
o+ A- Timeliness: If any process p delivers a message m at time c on p's clock,
then c is at most ts(m) t A.
More formally1 Timed Reliable Broadcast is a predicate on a broadcast history B --H
(p, ?) and a failure set F, defined as the following conjunction:
Timed Reliable Broadcast(B, F) =--H Reliable Broadcast(B, F) A
A- Timeliness(B, F)
A-Timeliness(B,F) Vp,Vc,VmE ?p,c) : c < ts(m)+ A
Note that any broadcast specification ?B can be augmented with the oo-timeliness
property without altering the semantics of the broadcast--Hi. e., VB, VF: ?B(B, F)
if and only if lSB(B, F) A oo-Timeliness(B, F). Thus, without loss of generality, we
assume that all broadcast specifications require the oo-timeliness property.
A broadcast specification ?B that requires the A-timeliness property is said to
have a latency of A. If ?B has a finite latency, it is a timed broadcast; otherwise it is
an untimed broadcast. We assume that all broadcasts have a latency that is greater
than 0.
2.3 Modeling executions
A history models the behavior of the processes in an execution of an application
protocol. However, a single history may model more than one execution. This is
illustrated below for a broadcast history.
Consider a system with two processes, p and q. Let B be a broadcast history in
which p broadcasts m and subsequently delivers m. Suppose however that q never
delivers m.
From the specification of reliable broadcast, Reliable Broadcast(B, fq?) is satis-
fied. This models an execution in which p correctly broadcasts and delivers m but
q fails to deliver m. However, Reliable Broadcast(B, fp?) is also satisfied. Thus, B
also models an execution in which p failed to broadcast m correctly.
Note that B does not model an execution in which both p and q are correct. This
is because Reliable Broadcast(B, ?) is not satisfied.
17
2.4 Unsuccessful broadcasts
Informally, if a process broadcasts a message in a broadcast history, it only means
that the process attempts to broadcast the message. It does not imply that the
process svccessf?lly broadcasts the message.
To explain this more formally, let B = (p, S) be a well-formed broadcast history
that satisfies the specifications of reliable broadcast when the processes in F are
assumed to be faulty; that is, Reliable Broadcast(B, F). Clearly, a faulty process p
p ? can broadcast a message m at some time c, even though no correct
process delivers m; i.e., P(p, c) = m, and for all processes q ? F, for all times c',
m ?
2.5 Successful deliveries
Intuitively, if a process delivers a message, then some process previously broadcast
that message:
Lemma 2.1 Let ?B be a broadcast specification, B = (p, ?) ? B and F be such that
?B(B, F). If some process q delivers a message m of the form (p, c, --H) in B, then
p broadcasts m at time c in B. That is, m = (p, c, --H) and m E S(q, --H) implies that
P(p,c) =
Proof: Suppose process q delivers m = (p, c,--H) in B; i.e., m = (p, c,--H) E
Since ?B requires the integrity property (by assumption) and ?B(B, F) is sat-
isfied, Integrity(B, F) is also satisfied. Since m ? ?(q, --H), the integrity property
requires that for some r and c?, P(r, c') = m; i.e., m was broadcast by some process
r at some local time c' in B. Since B is well-formed and P(r, c') = (p, c, --H), r = p
and ?? = c. Thus, P(p,c) = (p,c,--H); i.e., p broadcasts mat time c in B. 0
2.6 Causal precedence with broadcasts
Lamport defined causal precedence in systems in which processes communicate by
sending and receiving point-to-point messages [Lam78]. His definition must be mod-
ified to define causal precedence in our model, where processes communicate via
broadcasts. Informally, process p at time c causally precedes process q at time c' if
p broadcasts a message at c, and q delivers that message at c', or if p equals q and
c ? c1.
18
Formally, we define causal precedence in broadcast history B as a relation HB
between tuples of the form (p, c), where p is a process, and c is a time.
Definition Let B be a broadcast history. For all tuples (p, c) and (q, c'), (p, c)
causally precedes (q, c') in B, denoted (p, c) HB (q, c') if and onl? if:
o+ p = q and c _ or
o+ P(p, c) $ b and p(p, c) ? S(q, c'), or
o+ ?(r,?) : (p,c) HB (r,c*) and(r,c*) ?B
Note that the HE relation is reflexive-i.e., (p, c) HE (p, c). Other than this
trivial case, the Hj3 relation does not contain "cycles;" this is a consequence of the
following lemma.
Lemma 2.2 Let B ?B. If p $ q and (p,c) HE (q,c'), then c <
Proof: Follows from the well-formedness of the broadcast history B.
E
Lemma 2.3 Let B = (P,?) ? B. For all c', for all processes p, q, p $ q, if c is the
largest time such that (p, c) HE (q, c'), then some process r $ p delivers a message
of the form (p,c,--H).
Proof: Suppose for contradiction that the lemma is false. Let c' be the smallest
time such that for some p, q, p $ q, for some c:
Assumption 1: c is the largest time such that (p, c) HE (q, c').
and furthermore:
Assumption 2: For all processes r $ p, process r does not deliver a message of
the form (p, c,
Since (p, c) HE (q, c') and p $ q, there are two cases to consider. Either:
o+ for some c < c* c+ ? c' : P(p,c*) $ band P(p,c*) ? ?q,c+), or
o+ for some (r, c*), p $ r, q $ r : (p, c) HE (r, c*) and (r, c*) HE (q, c').
The proof precedes by a case analysis.
Case: For some c < c* c+ ? : P(p,c*) $ b and P(p,c*) E S(q,c+): Since
P(p,c*) e ?(q,c+), (p,c*) HE
Subcase: c < ? Since (p,c*) HE (q,c+) and c+ < c', (p,c*) HE (q,c'). Thus,
the assumption that c < c* contradicts Assumption 1, that c is the largest time such
that (p,c) HE (q,c'). Thus, c* must equal c.
Subcase: c+ < c1: Since c is the largest time such that (p, c) HE (q, c'), and
? c' and (p,c) HE (q,c+), c is also the largest time such that (p,c) HE
From this and from Assumption 2, we conclude that c+ contradicts the choice of c'
19
(as the smallest time such that Assumptions 1 and 2 are satisfied). Thus c+ must
equal c'.
Subcase: c* c and c+ --H CI. Thus, P(p, c) $ ? and p(p, c) c ?(q, c'). Since
B is well-formed and p(p, c) $ ?, P(p, c) = (p, c, --H) and hence (p, c, ? ?(q, c').
Thus, we have shown that q delivers a message of the form (p, c, --H), contradicting
Assumption 2.
Case: For some (r,c*), p $ r, q $ r : (p,c) HB (r,c*) and (r,c*) HE
Since q $ r and (r, c*) HE (q, c'), Lemma 2.2 implies that c* < CI. By an argument
similar to the one used in the above "subcase" of c+ < cl we can reach a contradic-
tion to the choice of c'.			c
2.7 Application specifications
A problem is specified by an application specification, a predicate ?A on application
histories and failure sets. If the predicate ?A(A, F) is true, we say that application
history A satisfies specification ?A with respect to failure set F.
When does a protocol solve a particular problem using a particular fault-tolerant
broadcast? For example, what does the assertion "this protocol solves the consensus
problem using reliable broadcast" mean? Intuitively, this means that for all exe-
cutions of the protocol in which processes communicate via reliable broadcast, the
specification of the consensus problem is satisfied. This also means that the proto-
col relies only on the properties required by the specification of reliable broadcast,
and not on the additional properties provided by any particular implementation of
reliable broadcast.
We write II solves a problem with application specification ?A using broadcast
specification ?E (or II solves ?A using ?E) if the following is true. Let H be any
history of II, with broadcast history B and application history A. If F is any subset
of processes, such that B satisfies ?E when the processes in F are assumed to be
faulty, then A also satisfies ?A when the processes in F are assumed to be faulty.
Formally:
Definition Protocol II solves problem ?A using ?E if:
VH = (11, ?, P, S) ? TL: [VF: ?E((P, ?), F) ? ?A((Q P, S), F)]
Chapter 3
Defining Inconsistency and
Contamination
Recall that an application protocol is executed in a "layered" manner; a process ex-
ecutes the application protocol and concurrently executes a broadcast protocol that
uses point-to-point message-passing (Figure 1.2). The behavior of processes at the
send/receive interface is usually specified by the system in which the processes exe-
cute, and cannot be controlled by the application protocol or the broadcast protocol.
Most broadcasts that are described in the literature specify the behavior of correct
processes (at the broadcast/delivery interface), but do not impose any requirements
on the behavior of faulty processes. However, the behavior of the faulty processes at
the broadcast/delivery interface can often be additionally constrained by the broad-
cast protocol being executed. The designer of an application protocol can only take
advantage of such additional restrictions on faulty processes if the specification of the
broadcast used by the protocol is augmented to include these restrictions.
This chapter concentrates on defining general constraints on the behavior of pro-
cesses at the broadcast/delivery interface with respect to the specification of any
fault-tolerant broadcast. In Chapter 5, we use the general definitions presented in
this chapter to derive corresponding specific constraints on the behavior of processes
with respect to atomic broadcast.
In Section 3.1, we define crash behavior; informally, such behavior defines the
broadcast/delivery behavior of processes when they communicate via a fault-tolerant
broadcast in a system where processes may fail by crashing; i.e., every process sends
and receives correctly until it halts. Although crash behavior restricts the possible
broadcast/delivery behavior of the faulty processes, it also permits some undesirably
severe faulty behavior. For example, when processes communicate via atomic broad-
cast, a faulty process that exhibits crash behavior may deliver messages out-of-order
20
21
before it halts.
In Section 3.2, we define a hierarchy of three increasingly stringent "correctness
conditions" (with respect to any broadcast specification) that further restrict the
behavior of faulty processes at the broadcast/delivery interface. In Section 3.3, we
derive the relationship between crash behavior and these correctness conditions.
Informally, when a process violates one of the correctness conditions, we say
it becomes inconsistent. In Section 3.4, we define a hierarchy of three forms of
inconsistency with respect to any broadcast specification.
So far, we have discussed restricting the behavior of a faulty process at its own
broadcast/delivery interface. However, the erroneous behavior of a faulty process
can sometimes be manifested at the broadcast/delivery interface of a correct process;
for example, a correct process may deliver a message that was broadcast by an
inconsistent process. This "spread" of inconsistency from the faulty processes to the
correct processes is called contamination, and is defined in Section 3.5.
3.1 Crash behavior
In this section, we define what it means for a process to exhibit crash behavior
until a particular time in an execution. The formal definition uses the execution's
broadcast history (which describes the broadcast/delivery behavior of the processes)
and a failure set (which defines a subset of processes assumed to be faulty).
We then extend the above definition to describe when a process is crash consistent
in an execution; informally, such a process exhibits crash behavior until it halts. The
formal definition of crash consistency uses the execution's application history (which
includes the broadcast history, and describes the states of the processes) and a failure
set.
We say an application history is a crash history with respect to a failure set if it
models an execution in which every process is crash consistent in that application
history with respect to that failure set. Finally, we define what it means to solve a
problem "assuming crash behavior."
3.1.1 Crash behavior in a broadcast history
Let ?B be a broadcast specification. Let B be a broadcast history, and F be a
failure set such that B satisfies ?B with respect to F. Informally, for all processes
p, for all times c, we say p exhibits crash behavior until time c in B with respect to
p
q
r
22
p exhibits crash behavior until c in B with respect to ?B and F
Broadcast history B, ?B(B, F)
Broadcast history B', ?B(B', F --H
Figure 3.1: Illustrating the definition of crash behavior
?B and F if the following intuitive condition is satisfied: the "global state" 1 of all
the processes at time c in B can be reached in an execution (of a broadcast protocol
that implements ?B) when the processes in F --H p are assumed to be faulty (i.e., the
processes in F and process p are assumed to be correct).
Definition Let B E 13 and F be sttch that ?B(B, F). Process p exhibits crash
behavior until time c in B with respect to ?B and F if there is a Ei 13 such that
? B and ?B(B', F --H p). B' is called a crash extension of (B,p, c) with respect
to ?B and F
If B' is a crash extension of (B,p, c) with respect to ?B and F, then B and B'
are indistinguishable until time c (and hence reach the same "global state" at time
c). Furthermore, B' satisfies ?B when the processes in F and process p are assumed
to
be correct.
We sometimes write "p exhibits crash behavior until c" instead of "p exhibits
crash behavior until c in B with respect to ?B and F" when the broadcast history
B, the broadcast specification ?B and the failure set F are clear from the context.
1Clearly, the global state at time c is given by the broadcast/delivery behavior of
the processes until time c.
23
3.1.2 Crash consistent behavior in an application history
We now define what it means for a process to be crash consistent in an application
history with respect to a broadcast specification and a failure set, and for an appli-
cation history to be a crash history with respect to a broadcast specification and a
failure set.
Let A = (a, P, ?) and B = (P, S) be an application and broadcast history respec-
tively. LB(B,p) is a function that denotes the time of p's last broadcast in B, and
is defined as follows:
LB(B,p) =
A
ctock
a
6:
p
oo ifpbroadcastsinfinitelyoften
minfc Silent(B,p,c+ 1)? otherwise
(a,P, 6:)
H?t(A, p)
23
20			21			22			___			24
LB(B,p)			LVT(A,p)
Figure 3.2: A segment of p's history
Halt(A, p) is a function that denotes the time at which p halts, and is defined as
follows:
HitA			-			0o			if p does not halt
a,p --H minfc a(p, c +1) = 1)1 otherwise
Figure 3.2 illustrates these times in a sample history (LVT(A, p) is defined below).
The symbol "--H" denotes a message or set of messages not equal to ?, or a state not
equal to I.
Let p be a process that halts at time c in A--Hi. e., Halt(A,p) = c. For p to be
crash consistent in A, it is natural to require that p exhibit crash behavior until time
c. However, if p halts before broadcasting at time c, it does not change its state to
incorporate any messages that it delivers at time c. In other words, p's actions while
24
its clock reads c cannot affect the state of any process, including p's own state. In
such a case, it is reasonable to require that p only exhibit crash behavior until time
c --H 1.
The time max(Halt(A,p) --H 1, LB(B,p)) is called p's last visible time in A and
denoted LVT(A,p). Note that if Halt(A,p) = 00, then LVT(A,p) = 00. Note
also that if p broadcasts a message at some time c, and then halts at time c, then
LVT(A,p) is defined to be c.
Definition Let A ? A, B = (P, ?) be the broadcast history of A, and F be such that
?B(B, F).
Process p is crash consistent in A with respect to ?B and F if for all c _
LVT(A,p), p exhibits crash behavior until time c in B with respect to ?B and F.
A is a crash consistent history (crash history) with respect to ?B and F iffor all
p, p `s crash consistent in A with respect to ?B and F
3.1.3 Solving problems assuming crash behavior
In Chapter 2, we gave a formal definition of a protocol solving a problem using
a broadcast specification. We extend that definition to formalize the meaning of
statements such as "this protocol solves a particular problem using a particular
fault-tolerant broadcast assuming crash consistent behavior." Formally, we say:
Definition Protocol II solves ?A using ?B assuming crash consistent behavior if:
VH = (WQP,? ?
WF: ?B((P,?),Th
and
(a, P, ?) is a crash history with respect to ?B and F
??((a, p, ?), F)]
To illustrate the definitions presented so far, we present a simple application
protocol that solves the problem of atomic updates using timed reliable broadcast
assuming crash consistent behavior. In the atomic updates problem, any process
may propose an update to a replicated variable at any time, and processes commit
updates as follows:
A U-1 : If a correct process proposes u, then it eventually commits u.
A U-2 : If a process commits u, then all correct processes eventually commit u.
A U-3 : If a process commits u before u', and any process p commits u1, then p also
commits u before u1.
A more formal description of the atomic update problem, as a predicate Atomtc
Update(A, F) on application histories A and failure sets F, is omitted for simplicity.
25
/* Process p commits updates at time c
U := fu p delivered (--H,c--H A,u)i
L			sort(U)			/? in lexicographic order
commit each u ? L in order
/? Process p proposes update u at time c
broadcast (p, c, /* using timed reliable broadcast
*1
*1
Figure 3.3: Protocol that uses timed reliable broadcast with latency A
Executed by process p at time c
Figure 3.3 describes an application protocol that solves the atomic update prob-
lem when it uses timed reliable broadcast with latency A and assumes crash consis-
tent behavior. Although the protocol is defined operationally, it can also be described
in terms of a protocol transition function and a protocol message function. Such a
description is omitted for simplicity.
The protocol relies upon the following constraint on the behavior of the processes.
If any process p commits an update u at time c, then for all correct processes q, the
set of messages with timestamp c --H A that p delivers by time c is identical to the set
of messages with timestamp c --H A that q delivers by time c.
Lemma 3.1 shows that the above constraint is indeed a consequence of crash
consistent behavior with respect to timed reliable broadcast. The proof of correctness
of the protocol in Figure 3.3 follows easily from Lemma 3.1, and hence is omitted.
Lemma 3.1 Let ?B be the specification of timed reliable broadcast with latency A.
Let H be a well-formed history of the protocol in Figure 3.3, with application history
A and broadcast history B. Suppose F is a failure set such that A is a crash history
with respect to ?B and F, and ?B(B, F) is satisfied. For all processes p, ifp commits
an update at some time c in A, then for all correct processes q (i.e., q ? F), the set
of messages with timestamp c --H A that p delivers by time c is identical to the set of
messages with timestamp c --H A that q delivers by time c.
Proof: Suppose some process p commits an update at some time c in A. Thus,
p changes state at time c, and hence c ? LVT(A,p).
Since A is a crash history and c ? LVT(A,p), p exhibits crash behavior until c in
B with respect to ?B and F. By definition of crash behavior, there is a well-formed
26
history B' such that B' =? B and ?B(B', F --H
Let q be any correct process (i.e., q ? F). Let Mp and Mq be the sets of messages
with timestamp c --H A that are delivered in B' by time c by p and q respectively. We
first show that Mp Mq.
Let m be any message in M?; i.e., p delivers m by time c in L?' and ts(m) = c--HA.
Since ?B(B', F --H p) is satisfied, and neither p nor q is in F --H p, the agreement
property required by ?B implies that q also delivers m in B'. Furthermore, the A-
timeliness property required by ?B implies that q delivers m by time c in B'. Thus
m ? Mq. Similarly, if m C Mq, we can show that m ? Mp. Hence, Mp = Mq.
Since B' =? B, until time c every process delivers the same sequence of messages
at the same time in B and B'. Thus, Mp and Mq are also the sets of messages with
timestamp c --H A that are delivered in B by time c by p and q respectively. Since
Mp = Mq, we conclude that the set of messages with timestamp c --H A that p delivers
by time c is identical to the set of messages with timestamp c --H A that q delivers by
time c.			0
The protocol in Figure 3.3 does not solve the atomic update problem if processes
are not crash consistent. Specifically, there is a well-formed history H of the protocol,
with application history A and broadcast history B, and a failure set F, such that
Timed Reliable Broadcast(B, F) is satisfied, A is not a crash history with respect to
?B and F, and Atomic Update(A, F) is not satisfied.
Consider the following execution of the protocol in a system with two processes
p and q. Process q proposes u at time c by broadcasting m = (q, c, u), and proposes
u' at time c' > c by broadcasting m' = (q, c', u'). Process g delivers m at some time
after c, delivers m' at some time after c', and hence commits u at c + A and u' at
time c' + A. Process p does not deliver m, and delivers message m1 at some time
after c'; hence p does not commit u and commits ?? at time c' + A.
Let ?B be the specification of timed reliable broadcast with latency A. Let H be
the history of the above execution; it is easy to show that H is well-formed. Let A
and B be the application and broadcast histories of the history H, and let F = fp?.
By construction, it is clear that ?B(B, F) is satisfied.
Since p commits u' at time c' + A, c' + A < LVT(A,p); since c < c', c + A <
LVT(A,p). We show below that A is not a crash history with respect to ?B and F
by proving that p does not exhibit crash behavior until c + A in B with respect to
?B and F.
Consider any well-formed broadcast history B' such that B' c+A B. In both
B and B', q broadcasts m at time c, q delivers rn by time c + A and p does not
deliver m by time c + A. Thus, the agreement and A-timeliness properties required
27
by ?B imply that p and q cannot both be correct. Since q is correct (i.e., q ?
?B(B', F --H p) cannot be satisfied.
Thus, for all B' such that B' c+=A B, the predicate ?B(B', F --H p) is not satisfied.
Hence, p does not exhibit crash behavior until c + A in B with respect to ?B and
F, and A is not a crash history with respect to ?B and F.
Finally, we show that the application history A does not satisfy the specification
of the atomic update problem with respect to the failure set F. From the description
of the execution given earlier, it is clear that q commits u before u', and p commits
u' and never commits u. Thus, the predicate A U?S(A, F) is not satisfied, and hence
the predicate Atomic Update(A, F) is also not satisfied.
3.2 Correctness conditions: Restrictions on
faulty processes
Although crash behavior does indeed restrict the broadcast/delivery behavior of the
faulty processes, it also permits some unexpectedly severe faulty behavior. Intu-
itively, this is because even if a process has already failed by time c, its failure may
not be detectable in the "global state" of the processes at time c.
In this section, we define a hierarchy of three increasingly stringent "correctness
conditions" that proscribe the undesirable behavior permitted by crash behavior.
Intuitively, these conditions are based on the following observation: the failure of a
process p by time c may become obvious when p's behavior until time c is compared
with the behavior of the correct processes in the entire execution, even if p's failure
cannot be detected in the "global state" at time c.
3.2.1 An anomaly with crash behavior
When processes communicate via atomic broadcast, a "natural" assumption is that if
a process p exhibits crash behavior until some time c, then p cannot deliver messages
out-of-order. However, this assumption is incorrect, as we demonstrate below in
an execution of a coordinator-based atomic broadcast protocol. We show that even
though a faulty process p exhibits crash behavior until some time c, p's deliveries by
time c are out-of-order; that is, p's failure is manifested by time c.
Consider the coordinator-based atomic broadcast protocol informally outlined
below. When a process intends to broadcast a message m, it first sends m to a coor-
dinator. The coordinator delivers messages (in any order it chooses), and periodically
informs the other processes of this message delivery order; the other processes deliver
messages according to this order. When the coordinator crashes, another process
28
takes over as coordinator. The new coordinator must first order any messages that
the old coordinator did not order, and then begin ordering new messages.
Such coordinator-based atomic broadcast protocols are frequently used in practice
because they are message efficient in the absence of failures [BSS9O]. However, such
a protocol can lead to an execution in which a faulty process delivers a message m
before a message even though the correct processes deliver m' before m.
Consider the following execution of the protocol where the coordinator p is the
only faulty process: p delivers m before m' and then prematurely halts before in-
forming any other process that m should be delivered before m'. Thus, the new
coordinator, q, cannot know the delivery order chosen by p (or even if p actually
delivered m or m'). Hence, q decides that m' should be delivered before m, and in-
structs the other processes to do the same. In this execution, the faulty coordinator
delivers messages out-of-order.
Let B be the broadcast history of the above execution (partially illustrated in
Figure 3.4), and let F = fpl It is clear that B satisfies atomic broadcast with
respect to F. We show below that p exhibits crash behavior until time c in B with
respect to atomic broadcast and F
Coordinator de1?ver(m)			deliver(m')
p
q
New Coordinator			deliver(m)
deliver(m')
p exhibits crash behavior until cin 1?
Figure 3.4: Portion of broadcast history B
Suppose B`is a well-formed broadcast history, such that B' =? B, and every
process other than p delivers m after time c, and then delivers m'. B' is partially
illustrated in Figure 3.5. Clearly, B' satisfies atomic broadcast when there are no
faulty processes. Thus, B' is a crash extension of (B, p,c) with respect to atomic
broadcast and F = fp?, and hence p exhibits crash behavior until time c in B with
respect to atomic broadcast and F.
The broadcast history B' used in the above argument actually models the follow-
29
p
q
deliver(m)			dehver(m')
c			?ehver(m)
Figure 3.5: Portion of broadcast history B'
deliver(m')
ing execution of the coordinator-based atomic broadcast protocol: process p does not
halt in the execution described earlier, and instructs the correct processes to deliver
m before m'. In such an execution, all the correct processes deliver m before m', and
hence p does not deliver messages out-oorder by time c.
In summary, we have presented a broadcast history B that satisfies atomic broad-
cast with respect to a failure set F. We show that a faulty process p Ei F exhibits
crash behavior until some time c in B with respect to atomic broadcast and F, even
though the message sequence p delivers by time c is "clearly faulty." Intuitively, this
shortcoming exists because crash behavior examines the broadcast/delivery behavior
of the process only until time c, and does not consider any deliveries made after time
3.2.2 Delivery correctness
We propose delivery correctness (the weakest of the three correctness conditions we
define) to preclude anomalies with crash behavior, such as the one discussed in the
previous section.
Let B be a broadcast history that satisfies a broadcast specification ?B with re-
spect to a failure set F. Informally, process p delivers correctly tntil time c in B with
respect to ?B and F if the sequence of messages p delivers by time c is "consistent"
with the sequence of messages that the correct processes eventually deliver in B.
That is, when compared with the deliveries made by the correct processes during the
entire history, p's deliveries until time c could have been made by a correct process.
As in the case of crash behavior, the formal definition of delivery correctness is
given using another broadcast history B'. Intuitively, B' is restricted in such a way
that if ?B(B', F --H p) is satisfied, we can conclude that p delivers correctly until time
c in B with respect to ?B and F. That is, B' is used to "verify" that p delivers
30
correctly in B.
We say p delivers correctly until c in B with respect to ?B and F if there is a
well-formed broadcast history B' that satisfies ?B with respect to (F --H p) (i.e., p
remains correct in B'), and:
1. The correct processes broadcast and deliver the same messages at the same
time in both Band B'--Hi.e., BF' = BF.
2. Until time c, the faulty processes, except p, broadcast and deliver the same
messages at the same time in both B and B'--Hi. e., BF'?p =? BF?p.
3. Until time c, process p delivers the same messages at the same time in both B
and B'--Hi. e., =?
4. Process p's broadcasts in B' are a subset of its broadcasts in B--Hi.e., Vc'
P'(p, c') = P(p, c') or
Intuitively, since we are determining if p delivers correctly in B, an unsuc-
cessful broadcast by p in B need not be included in B'. In general, all of p's
unsuccessful broadcasts in 1? cannot be omitted from B1.
For example, suppose that p broadcasts m in B, and m is delivered by a faulty
process q # p by time c. Since =? ?-?, q also delivers rn in B1. Since B'
is well-formed, Lemma 2.1 implies that p broadcasts m in B'
5. Suppose ?B is a timed broadcast with latency A. Until time c--HA, p broadcasts
the same messages at the same time in both B and ???????, P!p CA
Since ?B is a timed broadcast with latency A, ?B requires both the validity
and the A-timeliness properties (by assumption). Intuitively, stipulating that
?pICCA ?p corresponds to "verifying" that p's deliveries by time c satisfy both
these properties. This is explained below.
If ?B(B', F --H p) is satisfied, then the validity and A-timeliness properties imply
that p's deliveries by time c in B' must include all the messages that p broadcast
by time c --HA in B'. Since =? Sp (item 3) and Pp' c--H=A ??, we conclude that
p's deliveries by time c in B include all the messages that p broadcast by time
c --H A in B. That is, p's deliveries by time c in B satisfy the validity and
A-timeliness properties.
Definition Let ?B be a broadcast specification with latency A 2 Let B = (p, S) Ei 13
and F be such that ?B(B, F). Processp is delivery correct (D-correct) until time c ?n
2Recall an untimed broadcast has an infinite latency.
31
B with respect to ?B and F if there is a = (?`, ?) ? 13 such that ?B(B', F --H
and:
o+ B?F' = BF and BF'?p =? BF?p
o+ ?=c&p and P1p CA p? and Vc'>c--HA:P'(p,c')=P(p,c') or?.
B' is called a D-extension of (B,p, c) with respect to ?B and F
Let B' be a D-extension of (B, p, c) with respect to ?B and F. If p is correct
? F) then B?F' = BF and BF' =? BF.
3.2.3 Visible-broadcast/delivery correctness
Suppose a process p broadcasts a message m before broadcasting a message
Intuitively, if the broadcast of m' is successful, it is desirable that the broadcast of m
should also be successful. Indeed, it is reasonable to expect such behavior in practice.
? However, a process may be "correct" by the definition of delivery correctness, may
fail to successfully broadcast a message m, and may successfully broadcast a later
message m'. This undesirable behavior, illustrated below, is prevented by visible-
broadcast/delivery correctness.
Consider a system with two processes, p and q. Let B be a broadcast history
(see Figure 3.6) in which by time c, p broadcasts message m followed by message
Suppose that p and q eventually deliver m', and neither p nor q delivers m. Clearly,
B satisfies the specification of atomic broadcast when F = (p?--Hi.e., process p is
faulty. The broadcast history B' illustrated in Figure 3.6 is a D-extension of (B, p, c)
with respect to ?B and F. Thus, p is D-correct until c in B with respect to ?B
and F, even though it fails to successfully broadcast a message m but successfully
broadcasts a later message
Let B be a broadcast history that satisfies ?B with respect to failure set F. In-
formally, process p is visible-broadcast/delivery correct until time c in B if, whenever
any message p broadcasts by time c is delivered by a process other than p (i.e., "vis-
ible" to a process other than p), then all of p's previous broadcasts are successfully
delivered. Formally, there is a broadcast history B' ? 13 that satisfies ?B with re-
spect to (F--Hp), where B1isa D-extension of (B,p, c) with the additional restrictions
that:
1. Process p's broadcasts in ?? are a prefix of p's broadcasts in B.
3Broadcast protocols sometimes enforce such behavior by "piggybacking"
messages.
32
Broadcast history B
Faulty broadcast(m) broadcast(m')
p
delive?m')
q
Correct
c delive?m')
Broadcast history B': a D-extension of (B,p, c) with respect to ?B and F
Correct			broadcast(m')			delive?m')
p
q
Correct
c delive?m')
Figure 3.6: Broadcast histories B and B'
2. If, by time c, p broadcasts a message rn that is delivered by any process q $ p,
then m is included in B'
Let Visible(B, p, c) denote the largest time ?? < c, such that some process delivers
a message that process p broadcasts at time c' Formally, if B = (P, S), then:
Visible(B,p,c) = maxfc' c' < c, P(p,c') $ ? and ?q $ p: P(p,c') ?
Definition Let B = (p, ?) ? 13 and F be such that ?B(B, F). Process p is visible-
broadcast/delivery correct (VBD-correct) until time c in B with respect to ?B and F
if there is a = (p',S') ? 13 such that ?B(B',F--Hp) and:
o+ B?F' = BF and BF'?p =? 1?F-p
o+ =? ?? and Pp' =c* p?;? 4, where c* > max(c--H A, Visible(B,p,c)).
4Recall: ?pI=C$?p? ? ?pI=C*?p and Silent(B',p,c*+1).
33
B' is called a VBD?extension of (B,p, c) with respect to ?B and F
Let B' be a VBD-extension of (B, p, c) with respect to ?B and F. The following
lemma generalizes the observation that if p broadcasts m at any time c' in B, and
m is delivered by a correct process q in B, then until time c', p broadcasts the same
messages at the same time in B and B1.
Lemma 3.2 Let B = (P,?) ? B and F be such that ?B(B,F). Let B' =
be a VBD-extension of(B,p,c) with respect to ?B and F. Let q $ p and c' be such
that either q ? F or c' < c. If q delivers a messa?e of the form (p, c', --H) in B, then
3c+ > c' such that Pp' ?=+ PpI?
Proof: Suppose q $ p delivers m = (p, c',--H) in B. Since ?B(B, F) is satisfied,
Lemma 2.1 implies that p broadcasts m at time c' in B--Hi.e., P(p, c') =
Case: c' < c: By hypothesis, B' is a VBD-extension of (B,p, c), and hence
Pp' ffiffi p?? for some c* > Visible(B,p, c). Since p broadcasts mat c' < c in Band m
is delivered by process q $ p in B, Visible(B,p, c) > c'. Since c* > Visible(B,p, c),
c* > c'. Thus, (by choosing c+ = c*) we conclude that there is a time c+ > c' such
thatP' C+p ?
Case: q ? F: By hypothesis, B1is a VBD-extension of (B, p, c), and hence SLF = SF
and ?E(B1, F --H p). Since q delivers m in Band SLF = 8F and q ? F, q delivers m in
B1. Since ?B(B', F --H p), Lemma 2.1 implies that P1(p,?) =
By hypothesis, Pp' =? Ppt? for some c*; that is, Pp' ?? pp and Silent(B',p,c* + 1).
Since p'(p, c') = m $ ? and Silent(B',p, c* + 1), c1 < c*. Thus, (by choosing c+ =
we conclude that there is a time c+ > c' such that Pp' %ffi PpI? El
3.2.4 Broadcast/delivery correctness
Broadcast/delivery correctness, the most stringent correctness condition we define, is
a natural strengthening of visible-broadcast/delivery correctness. Informally, process
p is broadcast/delivery correct until time c in B with respect to ?B and F if p's
broadcast/delivery behavior by time c is "consistent" with the broadcast/delivery
behavior of the correct processes in the entire broadcast history B. Formally:
Definition Let B = (p, 5) Ei 13 and F be such that ?B(B, F). Process p is broad-
cast/delivery correct (BD-correct) until time c in B with respect to ?B and F if there
is a = (P',S') ? 13 such that ?B(B',F --Hp) and:
34
o+			and BF'?p =? BF?p, and
o+ BpI=CBp and ??>c:
B' is called a BD-extension of (B,p, c) with respect to ?B and F.
Note that if B' is a BD-extension of (B, p, c) with respect to ?B and F, then
B?F' B?F and BF' =? BF.
Let B be a broadcast history and F be such that ?B(B, F). If p is a correct
process (p ? F) then p is always broadcast/delivery correct (and hence also visible-
broadcast/delivery and delivery correct).
Lemma 3.3 Let B ? 13 and F be such that ?B(B, F). For all correct processes p
? F), for all c, p is BD-correct until c in B with respect to ?B and F.
Proof: Consider any correct process p and any time c. Clearly, B is a BD-
extension of (B,p, c) with respect to ?B and F. Thus, p is BD-correct until c in B
with respect to ?B and F.			0
Although it is desirable to require that all processes are BD-correct, such a require-
ment may be unreasonable in practice. In most broadcast protocols, the broadcast
of a message by a broadcast protocol requires the execution of many instructions,
including several sends and receives. Thus, when a process p attempts to broad-
cast a message, a "thread" of execution is "forked" within the broadcast protocol
layer. This thread executes (or appears to execute) concurrently with p's application
protocol, and the broadcast is not complete until the thread terminates.
Thus in any execution, a process can broadcast a message at time c, update local
variables, and then halt before the successful termination of the broadcast thread;
i. e., before the successful completion of the broadcast. In general, such a process
would not be BD-correct after time c.
Informally, to ensure that processes are BD-correct, whenever a process p broad-
casts a message m, p must be "prevented" from changing state until it is guaranteed
that the correct processes will eventually deliver m. Such "blocking" protocols are
often slow, and hence undesirable.
3.2.5 X-correctness
Where appropriate, we present our results in terms of "X-correctness," a "generic"
correctness condition. Thus, any result stated in terms of X-correctness is valid for
all three correctness conditions defined earlier.
35
3.3 The correctness hierarchy
This section examines the relationship between crash behavior, delivery correotness,
visible-broadcast/delivery correctness and broadcast/delivery correctness.
Let B e B and F be such that ?B(B, F). Figure 3.7 illustrates the correctness
hierarchy for process p at time c. For example, consider the arrow labeled "Fifo-
independence property" and "Theorem 3.5" from the box marked "VBD-correct be-
havior?' to the box marked "crash behavior." This indicates that Theorem 3.5 proves
that if ?B has the fifo-independence property (defined later on), and if p is VBD-
correct until c in B with respect to ?B and F, then p exhibits crash behavior until
c in B with respect to ?B and F.
By definition
BD-correct
By definition
-? __ Fifo-independence property
Theorem 3.4 - -
VBD-correct			Theorem 3.5
By definition
D-correct			Theorem 3.6
property
Independence
Crash behavior
Figure 3.7: The correctness hierarchy
When it is clear from context which broadcast specification ?B, broadcast history
B, and failure set F are being considered, we write "p is X-correct until c" when we
mean "p is x-correct until c in B with respect to ?B and F."
Suppose pis BD-correct until c. From the definitions of the correctness conditions
and of crash behavior, we conclude that pis both VBD-correct and D-correct until c,
and p also exhibits crash behavior until c.
Suppose p is VBD-correct until c. Clearly, p is also D-correct until c. Theorem
3.4 shows that p is also BD-correct until c in certain scenarios; for example, if p
broadcasts a message m at time c in B, and some correct process q delivers m in B.
36
This is indicated by the dashed line in Figure 3.7. Theorern 3.5 shows that p also
exhibits crash behavior until c provided the broadcast specification ?B satisfies the
fifo-independence property. This property is described below in Section 3.3.1.
Suppose p is D?correct until c. Theorem 3.6 shows that p also exhibits crash
behavior until c provided ?B satisfies the independence property. This property is
described below in Section 3.3.1.
3.3.1 The independence and fifo-independence properties
Intuitively,anuntimedbroadcast specification?B hasthe independence property if
any correct process may broadcast a message at any time, and the order in which any
correct process delivers messages is "independent" of the order in which the messages
were broadcast,
More formally, let B be a broadcast history and F be a failure set such that
?B(B, F) is satisfied. Suppose a correct process p does not broadcast a message at
time c in B. Then, for all c' > c, there is another broadcast history B1, such that
?B(B', F) and:
o All processes broadcast the same messages at the same time in B and B',
except that p also broadcasts a new message m = (p,c,--H) at time c in B1.
o+ Until time c?, the processes deliver the same messages at the same time in B
and B'
The first item above corresponds to the intuitive assertion that a correct process
can broadcast a message at any time. This is because there are no restrictions on the
correct process p, or the time c, or the message m, other than those inherent in the
model namely, at time c, p can broadcast at most one message, and that message
must be of the form (p,c,
We now show that the above informal definition corresponds to the intuitive
assertion that the delivery order is independent of the broadcast order. In particular
we show that a correct process q may deliver some message m' before m in the
broadcast history B', even though a correct process pbroadcasts m before m1in B'.
Suppose that p broadcasts some message m' after time c in B. The first bullet
item above implies that p also broadcasts m' after time c in B'; thus p broadcasts
m before m' in B'
Suppose some correct process q delivers m1 by time c in B. The second bullet
item above implies that q also delivers m1 by time c' in B1. Suppose that q also
delivers m in B1. ? Clearly, the second bullet item above implies that q delivers m
5Since p and q are both correct, and ?B(B', F) is satisfied, if ?B requires the
agreement property, then q must deliver m in B'
37
after time c in B'; i.e., after delivering m1.
Hence, in the broadcast history B', a correct process q delivers m' before m, even
though a correct process p broadcasts m before m'. This corresponds to the intuitive
assertion that the delivery order is independent of the broadcast order.
Examples of untimed broadcast specifications that have the independence prop
erty are reliable broadcast and atomic broadcast. (In Chapter 5, we prove that
atomic broadcast does indeed have the independence property.)
Now suppose ?B is a timed broadcast. Intuitively, ?B has the independence
property if the following condition is satisfied. When a process p broadcasts m and
m', the delivery order is independent of the broadcast order provided that m and m'
were broadcast less than A time apart.
The independence property is formally defined below as a "closure property" of
broadcast histories that satisfy ?B We use the following notation:
P except P'(p, c) =
P'(p, c) = m and V(q, c') $ (p, c) : P'(q, c') = P(q, c')
Definition Broadcast specification ?B has the independence property if for all B --H
(P, ?) Ei 13, for all F such that ?B(B, F), for all correct processes p (p ? F), for
all times c such that P(p, c) = ?, for all messages m of the form (p, c, --H), and for
all times c' c ? c' < c + A --H 1, there is a = (?`, ?) e 13, such that ?B(B', F),
? and [P' = P except P'(p,c) =
Let B', ?? and m be as in the above definition. If m is delivered in B' by some
process q, then q delivers m after time c' in B'.
The fifo-independence property is weaker than the independence property. As
in the case of the independence property, a broadcast specification has the fifo-
independence property if any process may broadcast a message at any time. How-
ever, with fifo-independence, if a correct process p broadcasts m before m1, then a
correct process q cannot deliver ?? before m. Clearly, any broadcast specification
that has the independence property also has the fifo-independence property. Exam-
ples of broadcast specifications that have the fifo-independence property are causal
broadcast and causal atomic broadcast.
Definition Broadcast specification ?B has the fifo-independence property if for all
B = (p, ?) Ei 13, for all F such that ?B(13, F), for all correct processes p (p ?
for all times c such that Silent(B,p, c), for all messages m of the form (p, c, --H), and
for all times c', c < c' < c + A --H 1, there is a B' = (P', 8') ? 13, such that ?B(B', F),
8' =? 8 and [P' = P except P'(p, c) =
38
3.3.2 Hierarchy theorems
Theorem 3.4 defines when a VBD-correct process is BD-correct.
Theorem 3.4 Let B = (p, S) Ei 13 and F be such that ?B(B, F) For all processes p,
for all times c, if p is VBD-correct until c and some process q $ p delivers a message
m of the form (p, c, --H), then p is BD-correct until c.
Proof: Suppose p is VBD-correct until c in B. Let B' = (P', S') be the VBD-
extension of (B,p, c) given by the definition of p being VBD-correct until c in B.
Thus, B' ? 13, ??(B',F--Hp), B?F'= BF, BF'?p =? BF?p and Sp' =?
Suppose q $ p delivers m = (p, c, ) in B; Lemma 3.2 implies that there is a
c* > c such that Pp' ffiffi PpI? Since = BF and BF'?p =? BF?p and =? Sp
and Pp' =c? P?I? , we conclude that B' is a BD-extension of (B,p,c), and that p is
BD-correct until c in B.			0
Suppose that the processes communicate using a fault-tolerant broadcast that
has the fifo-independence property. Theorem 3.5 shows that VBD-correct behavior
implies crash behavior; i. e., VBD-correctness is stronger than crash behavior. Re-
call that a broadcast specification that has the independence property also has the
fifo-independence property. Thus Theorem 3.5 is also valid for those broadcast spec-
ifications that have the independence property.
Theorem 3.5 Suppose ?B has the fifo-independence property. Let B = (p, S) E 13
and F be such that ?B(B, F). If p is VBD-correct until time c, then p exhibits crash
behavior at time c.
Proof: Suppose p is VBD-correct until time c in B. To show that p exhibits crash
behavior at time c in B, we prove the following claim:
Claim: If there is a = (p*, S*) ? 13, such that:
o+ ??(B*,F?p)and?p?*=C?andS*=CSand??=C*?pJ? for some c* > c --H A
then there is a 13 such that B+ =? B and ?B(B+, F --H p). In other words,
B+ is a crash extension of (B,p, c) and p exhibits crash behavior until c in B with
respect to ?B and F
Suppose the claim is true. It is straightforward to show that the VBD-extension
of (B,p, c) (given by the definition of p being VBD-correct until c in B) satisfies the
hypothesis of the above claim. Thus, p exhibits crash behavior until c in B with
respect to ?B and F.
39
Proof of claim: Let B* and c* be given by the hypothesis of the claim. The proof is
by induction on k = c --H
Basis: Suppose k ? 0 i e_ c* > c. Since =? P?I? (by the claim's hypothesis)
and c* > c, Pp* =? Pp Since S* =? S and Pp?* =? ? (by the claim's hypothesis), B* =? B.
By the claim's hypothesis ?B(B*, F --H p) and B* ? !3, and hence we conclude that
choosing B+ = B* proves the claim.
Ind?ction Hypothesis: Assume the claim holds for c --H c* < k where k> 1.
Indtction: Suppose c --H c* = k. We show that there is a B' = (p,, S1), such that
? !3 and ?B(B', F --H p) and Pp?' =? ? and S' =? S and Pp? ?=+i ???? (3.1)
Since c --H (c* + 1) < k, the claim follows from the induction hypothesis.
There are two cases to consider.
Case: P(p,c* + 1) = & Sincep; ?*PpI? andP(p,c*+1) = ?Qp* C%+lppI?. By
choosing B' = B* we satisfy equation 3.1, thus completing the proof.
Case: P(p, c* + 1) $ ?: B' is constructed as follows. Since =c* p??? ,p does not
broadcast any messages after time c* in B*--Hi.e., Silent(B*,p, c*+l). Since c*+1 < c
(by induction) and c < c* + A (by the claim's hypothesis), c* + 1 < c < c* + A; i.e.,
(c* + 1) < c < (c* + 1) + A --H 1.
Since B is well-formed and P(p, c* + 1) $ ?, P(p, c* + 1) is of the form (p, c* +
1,--H). By the theorem's hypothesis, ?B has the fifo-independence property. Since
?B(B*, F --H p) and Silent(B*,p, c* + 1), and (c* + 1) < c < (c* + 1) + A --H 1, the
definition of fifo-independence implies that there is a broadcast history I?' E B such
that ?B(B', F --H p), and ? =? ? and [P1 = P* except P'(p, c* + 1) = P(p, c* + 1)].
The proof is complete if we show that B' satisfies equation 3.1. By choice of B'j
Band ?B(B', F --H p). Since ? =? ? and ? =? ? and [P1 = P* except P1(p, c* +
1) = P(p,c* + 1)] and Pp* =? ?, it is easy to show that ? =? ? and PpL =C
Furthermore, since cffi PpI? (by the claim's hypothesis), we can also show that
Pp' =c* p?, P'(p, c* + 1) = P(p, c* + 1)] and Silent(B',p, c* + 2); that is Pp' c*=+1 PpI?
We have shown that B' satisfies equation 3.1, thus completing the proof. C
Suppose the processes communicate by a broadcast that satisfies the indepen-
dence property. Theorem 3.6 shows that D-correct behavior implies crash behavior;
i. e., D-correctness is stronger than crash behavior.
40
Theorem 3.6 Suppose ?B satisfies the independence property. Let B ? B and F
be such that ?B(B, F). If p is D-correct until time c, then p exhibits crash behavior
at time c.
Proof: Similar to Theorem 3.5.
3.4 Process consistency: Restrictions on faulty
processes
0
Intuitively, a process p becomes inconsistent when its state reflects p's violation of
a correctness condition. Thus, we formally define inconsistency using application
histories, which contain both state information and broadcast/delivery information.
Let A (a, P, ?) and B = (p, ?) be an application and broadcast history re
spectively. Let p be a process that halts at time c in A--Hi.e., Halt(A,p) = c. For
p to be X-consistent in A, it is natural to require that p be X-correct until time c.
However, as in the case of crash behavior, if p halts before broadcasting at time c, it
is reasonable to require that p be X-correct only until time c --H 1. This leads to the
following definition:
Definition Let A ? A, B = (p, ?) be the broadcast history of A, and F be such that
?B(B, F). Process p is X-consistent in A with respect to ?B and F if and only if
for all c ? LVT(A,p), p is X-correct until time c in B with respect to ?B and F.
3.5 Process contamination: Restrictions on
correct processes
Intuitively, contamination is the "spread" of "incorrectness" from faulty processes
to correct processes. Informally, a correct process p is x-contamin&ed at time c if
the causal past of the tuple (p, c) includes a process that is not X-correct. That is,
a correct process p is X-contaminated at time c in B with respect to ?B and F if
and only if there is a process q and a time c' such that (q, c') ?B (p, c) and q is not
X-correct until c' in B with respect to ?B and F. Note that a faulty process can
never be contaminated.
Definition Let B Ei 13, F be such that ?B(B,F), and p be a correct process. Let
B Ei 13 and F be such that ?B(B, F). Suppose a process p is correct.
Process p becomes X-contaminated at time c in B with respect to ?B and F by
delivering a message m if and only if:
41
o+ there is a process q and a time c', such that (q,c') ?B (bc(m),ts(m)) and q
is not X-correct until c' in B with respect to ?B and F, and
o+ for all messages m1 that p delivers before m, for all processes q and times c1,
such that (q,c') ?+B (bc(m),ts(m)), q is X-correct until c' in B with respect to
?B andF.
Process p is x-contaminated at time c in B with respect to ?B and F if and only if p
becomes X-contaminated at time c' < c in B with respect to ?B and F by delivering
a message m.
VBD-conta?inated
D-contaminated
Figure 3.8: The contamination hierarchy
Recall the X-correctness denotes a "generic" correctness condition. Thus, the
above definition leads to three forms of contamination--HD-contamination, VBD-contamination
and BD-contamination--Hcorresponding to our three definitions of correctness D-
correctness, VBD-correctness and BD-correctness. Figure 3.8 illustrates the relation-
ships between these three forms of contamination for any correct process p at any
time c in any broadcast history B with respect to any broadcast specification ?B
and any failure set F. This is called the contamination hierarchy.
Theorem 3.7 Let B = (p,S) ? 13 and F be such that ?B(B,F). For all correct
processes p, for all times c:
o+ If p is D-contaminated at c, then p is VBD-contaminated at c.
o+ p is VBD-contaminated at c if and only if p is BD-contaminated at c.
Proof: Suppose a correct process p is D-contaminated at some time c in B. From
the definition of D-contaminated, there is a process q that is not D-correct until time
c', and (q,c?) ?B (p,c). Since q is not D-correct until ??, q is not VBD-correct until
c'. Hence, p is also VBD-contaminated at time c in B.
42
Suppose a correct process p is VBD-contaminated at some time c in B. Erom an
argument similar to the one above, p is BD-contaminated at time c in
Suppose a correct process p is BD-contaminated at some time c in B. Thus, there
is a faulty process q $ p that is not BD-correct until some time c', and (q, c') HB
Let c" be the largest time such that (q, c") ?B (p, c). Since q $ p, Lemma 2.3
implies that some process r $ q delivers a message of the form (q, c",--H). By choice
of c", c" ? c' and hence q is not BD-correct until c1' in B. Theorem 3.4 implies that
q is not VBD-correct until time c", and hence p is VBD-contaminated at time c in B. 0
3.5.1 Contamination in an application history
We now define x-contamination-free processes in application histories.
Definition Let A Ei A, B = (P, ?) be the broadcast history of A, and F be such that
?B(B, F). Process p is x-contamination-free in A with respect to ?B and F if and
only if for all c, p is not x-contaminated at time c in B with respect to ?B and F.
Note that a faulty process is, by definition, x-contamination?free.
3.6 Consistent and contamination-free histories
This section defines x-consistent and x-contaminationjree histories, in terms of x-
consistent and x-contamination-free processes.
Definition Let A Ei A, B = (p, S) be the broadcast history of A, and F be such
that ?B(B, F). A is x-consistent (x-contamination-free) with respect to ?B and F
if and only if for all processes p, p is X-consistent (x-contamination-free) in A with
respect to ?B and F.
If A is not X-consistent, we say A is x-inconsistent.
Finally, we use the following definition:
Definition Protocol II solves ?A using ?B assumtng X-consistent behavior if and
only if:
VH = (ll,Q?,? ??Y:
[VF: ?B((P, ?), F)
(a,P,?) is X-consistent with respect to ?B and F
??((a,P, ?), F)]
and
The word "x-consistent" can be replaced with "x-contamination-free" to yield
an analogous definition for x-contamination-free behavior.
Chapter 4
Solving Correct Restricted
Problems
In Chapter 3, we defined when a faulty process is inconsistent and when a correct
process is contaminated. We showed that preventing inconsistency also prevents
contamination; however, preventing contamination does not prevent inconsistency.
In this chapter, we characterize a class of problems, called correctrestrictedprob-
lems (cr-problems), which are specified by imposing restrictions on the behavior of
correct processes, but not on the behavior of faulty processes. We prove the following
"substitution theorem" which shows that for cr-problems, the prevention of contam-
ination is "as good" as the prevention of inconsistency. To solve a cr-problem, an
application protocol can be designed with the simplifying assumption that it will use
a broadcast protocol that prevents both inconsistency and contamination. The appli-
cation protocol remains correct even if it uses a broadcast protocol that only prevents
contamination. Since the prevention of contamination is often less expensive than
the prevention of both inconsistency and contamination, such a substitution often
improves the performance of the application.
The above "substitution theorem" is valid for broadcast specifications that have
the choice property. Intuitively, if processes communicate via such a broadcast,
then any process may choose to stop broadcasting messages at any time, or any
faultyprocess may choose to stop delivering messages at any time. Most broadcasts
considered in the literature, such as reliable broadcast, causal broadcast and atomic
broadcast have this property.
The "substitution theorem" is valid for any form of inconsistency that is due to a
"violation" of a "correctness condition," providing the latter satisfies the SD-closed
property. We formally define this property, and prove that D-correctness, vBD-
correctness and BD-correctness all have the SD-closed property. Thus, our results
43
44
are valid for the three types of inconsistency introduced in Chapter 3.
Some of the proofs presented in this chapter are given using the maximal caisal
prefix of a history. Let A be a well-formed application history, and F be a failure set.
Since all communication between processes is by message broadcasts and deliveries,
only a portion of the history is "visible" to (i.e., lies in the causal past of) the correct
processes. The application history that is derived in the natural manner from this
visible portion of A is called the maximal causal prefix of A with respect to F.
Intuitively, we show that if A is contamination-free with respect to ?B and F,
then the maximal causal prefix of A with respect to F is consistent with respect to ?B
and F. This result is used to prove the main result of this chapter, the "substitution"
theorem: If an application protocol solves a cr-problem ?A using a broadcast that
satisfies ?B and assumes consistent behavior, then the protocol solves ?A using ?B
assuming contamination-free behavior.
4.1 The choice property of broadcast
specifications
The results presented in this chapter are valid for those broadcast specifications that
have the choice property.
Let ?B be any broadcast specification. Informally, ?B has the choice property if
for all B ? 13, for all F such that ?B(B, F):
o+ For all processes p and all times c, p unilaterally decides whether to broadcast
a message at c.
More formally, let p be any process and c be any time such that p broadcasts
a message m at c in B. If B' is the broadcast history that is identical to B,
except that p does not broadcast m, and no process delivers m, then B1is
well-formed and ?B(B', F) is satisfied.
o+ For all fault? processes p and all times c, p unilaterally decides to stop delivering
messages after any time c.
More formally, for all faulty processes p (p E F) and times c, if B' is the
broadcast history that is identical to B, except that p does not deliver any
messages after time c in B', then B' is well-formed and ?B(B', F) is satisfied.
The formal definition of the choice property uses the Subtract and Deafen operations,
defined below using the broadcast histories B = (P, ?) and B' = (p', S1).
= Subtract(B, M): If M is a set of messages, B' is created by removing
from B the broadcast and deliveries of all messages in the set M.
Formally, B' = Subtract(B, M) if and only if:
45
o+ S'(p,c) =			mE S(p,c) and m ? M? and
o+ P(p,c)?M ? P'(p,c)=P(p,c),andP(p,c)EM ?
= Deafen(B, X): Suppose X is a set of tuples of the form (p, c), such that
(p, c) C X . (p, c' $ c) ? X. B' is identical to B, except that for all
(p, c) E X, p does not deliver any messages after c in B'.
Formally, B' = Deafen(B, X) if and only if for all p:
o+ (p,--H)?X Bp' = Bp and
o+ ?? (p,c)EX ? ppI=ppand?=C&p?
Definition Broadcast specification ?B has the choice property iffor all B = (p, ?) E
13, for all F such that ??(B,F):
o+ For all m, if B' = Subtract(B, fm?), then B' E 13 and ?B(B', F), and
o+ For all p E F, for all c, if B' = Deafen(B,?(p,c)?), then B' E 13 and
?B(B', F).
Lemmas 4.1 and 4.2 are generalizations of the above definition; the proofs are
straightforward and hence omitted.
Lemma 4.1 Suppose ?B has the choice property. Let B E 13 and F be such that
?3(B,F). Let M be any set of messa9es. If B' = Subtract(B,M), then B' E 13 and
?D(B', F).
Lemma 4.2 Suppose ?B has the choice property. Let B E 13 and F be such that
?B(B, F). Let X be a set of tuples of the form (process, time), such that if (p, c)
ts zn X, then p is faulty, and (p,c' $ c) is not in x; i.e., X C ? x I, such that if
(p,c) E X thenpE F, and (p,c1 $ c) ?X.
If B' = Deafen(B,X) then B' E 13 and ?B(B',F).
Assumption All broadcast specifications we consider have the choice property.
Most broadcasts considered in the literature, such as reliable broadcast, causal
broadcast, atomic broadcast etc., have the choice property. In Chapter 5, we prove
that atomic broadcast has this property.
4.2 The SD-closed property
We will show that for cr-problems, the prevention of inconsistency is "as good as" the
prevention of contamination. This result is valid for the inconsistency that results
46
from the "violation" of any correctness condition that has the SD-closed property.
We formally define this property, and prove that D-correctness, VBD-correctness and
BD-correctness all have the SD-closed property. Thus, the "substitution theorem" is
valid for the three types of inconsistency defined in Chapter 3, and the corresponding
forms of contamination.
Suppose ?B has the choice property. Let B = (p, ?) E t3 and F be such that
?B(B, F). Let process p and time c be such that p is X-correct until time c in B
with respect to ?B and F.
Informally, X-correctness is subtract closed (5-closed) with respect to ?B if the
following condition holds: for all faulty processes r and times c', if B' is the
broadcast history derived from B by "subtracting" all the broadcasts r makes
after ??, then p remains x-correct until c in B'.
Formally, for all r E F, for all c', if M = fp(r, c") c" > c'?, then p is X-correct
until c in Subtract(B, M) with respect to ?B and F.
Informally, x-correctness is deafen closed (D-closed) with respect to ?B if the
following condition holds. Suppose that either p is not faulty, or p broadcasts
some message at or after time c in B. For all faulty processes r, for all times
c' at or after r's last broadcast in B, if B' is a broadcast history derived from
B by "deafening" r at c', then p remains x-correct until c in B'.
Formally, if either p ? F or c ? LB(B,p), then for all r ? F, for all c' >
LB(B, r), p is X-correct until c in Deafen(B, f(r, c')?) with respect to ?B and
Suppose however that p is faulty and p does not broadcast any message in B
at or after time c. In this case, we can choose r = p and c' < c (in the above
definition of B'), to give a broadcast history B' in which p is "deafened" at
some time before c. In such a case, it is easy to construct a scenario in which
p is BD-correct until time c in B with respect to ?B and F, but p is not even
D-correct until time c in B' with respect to ?B and F.
Definition Suppose ?B has the choice property. X-correctness is subtract-deafen
closed (SD-closed) with respect to ?B if and only if X-correctness is both 5-closed
and D-closed with respect to ?B
Suppose X-correctness is SD-closed with respect to ?B Let p be a process that
is x-correct until some time c in a broadcast history B with respect to ?B and
some failure set F. Informally, Lemma 4.3 shows that the broadcast history B',
obtained by subtracting a suffix of messages broadcast by each one of a subset of
faulty processes, is such that p is x-correct until c in B'
47
Note that Lemma 4.3 is true even if X-correctness is only 5-closed, and not SD-
closed. Thus, this lemma is a generalization of the definition of a S-closed correctness
condition.
Lemma 4.3 Suppose X-correctness is SD-closed with respect to ?B Let B --H
(P, S) ? 13 and F be such that ?B(B, F). Let p be a process that is X-correct until
some time c in B with respect to ?B and F. Let X C ? x I, such that if (q, c') e X
then q ? F, and(q,c" $ c') ?X.
If M = fP(q, c") (q, c') Ei X and c" > c1i, then p is X-correct until c in
Subtract(B, M) with respect to ?B and F.
Proof: Let M = ?P(q,c11) (q,c') Ei X and c11 > c1i. The proof that p is
X-correct until c in Subtract(B, M) with respect to ?B and F is by induction on the
cardinality of the set X
Basis: Suppose ;xi = 0. Thus, M = ? and B = Subtract(B, M). Since p is X-correct
until c in B (by the lemma's hypothesis), the lemma is trivially true.
Induction Hypothesis: Suppose the lemma is true for all IX < k, for some k > 1.
Induction:
and B' be
Suppose XI = k. Let (r, c') be any tuple in X, M' = ?P(r, c") c1, >
the broadcast history Subtract(B, M'). By the lemma's hypothesis, x-
correctness is SD-closed, ?B(B, F), and r ? F. By the definition of x-correctness
being SD-closed, p is x-correct until time c in B'. Since (by assumption) ?B has
the choice property, Lemma 4.1 implies that B' ? 13 and ?B(B', F).
Let X' = X --H ?(r,c')i, and M" = ??(q,?) (q,c") Ei X' and c* > c11i. Since
? 13, ?B(B', F), and p is X-correct until c in B', we conclude from the induction
hypothesis that p is X-correct untiL c in Subtract(B', M'1).
Since M = M'U M" and B' = Subtract(B, M'), it is easy to show that Subtract(B, M) =
Subtract(B', M11). Sincep is X-correct until c in the broadcast history Subtract(B', M11),
we conclude that p is X-correct until c in Subtract(B, M) with respect to ?B and F. ?
Lemma 4.4 generalizes the definition of a D-closed correctness condition (just as
the previous lemma generalized the definition of 5-closed correctness condition).
Lemma 4.4 Suppose X-correctness is SD-closed with respect to ?B Let B ? 13
and F be such that ?B(B, F). Let p be a process that is x-correct until some time
c in B. Let X C ? x I, such that if(q,c') ? X then q ? F c? > LB(B,q) and
(q, c" $ c') ?
If either p ? F or c ? LB(B, p), then p is X-correct until c in Deafen(B, X) with
respect to ?B and F.
48
Proof: Suppose that either p ? F or c < LB(B, p). The proof that p is X-correct
until c in Deafen(B, X) with respect to ?B and F is by induction on the cardinality
of the set X.
Basis: Suppose X = 0. Thus, B = Deafen(B, M), and the lemma is trivially true.
Induction Hypothesis: Suppose the lemma is true for all X < k, for some k > 1.
Induction: Suppose XI = k. Let (r, c') be any tuple in X, and B' be the broadcast
history Deafen(B, f(r, c')?). By the lemma's hypothesis, X-correctness is SD-closed,
either p ? F or c ? LB(B,p), ?B(B,F), r ? F and c' > LB(B,r). By the
definition of X-correctness being SD-closed, p is X-correct until time c in B'. Since
(by assumption) ?B has the choice property, Lemma 4.2 implies that B' ? 13 and
?B(B', F).
Since the Deafen operation does not affect any broadcasts (i.e., P' = P), for all
q, LB(B,q) = LB(B',q). Therefore, we conclude that eitherp ? For c ? LB(B',p)
(since p ? F or c < LB(B,p)) and that for all tuples (q,c") ? X : c" > LB(B',q)
(since c1, > LB(B, q)).
Let X' = X --H f(r,c')l. Since B' ? 13, ?B(B',F), and p is X-correct until
c in B', we conclude from the induction hypothesis that p is X-correct until c in
Deafen(B', X').
Since X = X'U f(r, c')l and B' = Deafen(B, ((r, c')?), it is easy to show that
Deafen(B, X) = Deafen(B', X'). Since p is X-correct until c in Deafen(B', X'), we
conclude that p is X-correct until c in Deafen(B, X) with respect to ?B and F. 0
We now show that VBD-correctness is SD-closed with respect to any broadcast
specification that has the choice property. The proofs that D-correctness and BD-
correctness are also SD-closed are similar to the one presented below, and hence are
omitted.
Theorem 4.5 Suppose ?B has the choice property. VBD-correctness is SD-closed
with respect to ?B
Proof: Let B = (p, S) ?13 and F be such that ?B(B, F). Suppose a process
p is VBD-correct until some time c in B. Let B* = (p*, ?*) be a VBD-extension of
(B,p,c). Thus, B* ? 13, ??(B*,F--Hp), and:
o+ B-* = B??and BF*?p =? BF?p
F and * ffiffi P?I? for some c* > max(c --H A, Visible(B,p, c)) (4.1)
o+ =c
We first show that VBD-correctness is 5-closed with respect to ?B (Claim 1), and
then show that VBD-correctness is D-closed with respect to ?B (Claim 2).
49
Cla?m 1: For all r ? F, for all c', if M = fP(r,c") I c" > c'?, then p is X-correct
until c in Subtract(B,M) with respect to ?B and F.
Proofofclaim: Let B' = (p',S') be the broadcast history Subtract(B,M). Since ?B
has the choice property, Lemma 4.1 implies that B' ? B and ?B(B', F) is satisfied.
Let B" = (P", ?`) = Subtract(B*,M). Since ?B(B*, F --H p), Lemma 4.1 implies that
? B and ?B(B11, F --H p) is satisfied.
It is straightforward to combine equation 4.1 with the expressions for B'in terms
of B, and for B" in terms of B* (given by the Subtractoperation), to show that B"
is a VED-extension of (B', p,c) with respect to ?B and F. Since B' is the broadcast
history Subtract(B,F), we conclude that p is VBD-correct until c in Subtract(B,F),
thus proving the claim.
B			VBD?extension
Subtract
VBD-extension
Subtract
Figure 4.1: Illustrating the proof of Theorem 4.5
Figure 4.1 illustrates the structure of the proof; e.g., ,the arrow from B to B*
indicates that B* is a VBD-extension of (I?, p,c) with respect to lSB and F.
The proof that B" is a VBD-extension of (B', p,c) with respect to ?B and F is a
case analysis. Since the proofs of the cases are similar, to avoid repetition, we only
present the most interesting case: when p = r, we show that Pp11 = PpI? for some
c* > max(c--H A, Visible(B',p,c)).
Suppose p = r. Since B' is obtained from B by removing the broadcasts made
by p at or after c', and also by removing the corresponding deliveries, it is easy to
show that VisThle(B*,p,c) >--H Visible(B',p,c).
Note that ? p?j? for some c* > max(c --H A, Visible(B,p,c)) (by equation
4.1), and Pp' =?` PpI? and Pp" =?` P*pI? (by assumption, since p =
o+ If c* ? c' then Pp' ffiffip? and Pp11 ffiffi P*p, and hence we get p;' =c? P'pI?
o+ If c* > c', then =? ??; since for all c'1 > c', P"(p,c") = ? and P'(p,c") =
we get Pp11 = P'p
Since Visible(B,p,*c) > Visible(B',p,c), and c* > max(c--H A, Visible(B',p,c)), we
conclude that Pp11 =? P'pI? for c* > max(c--H A, Visible(B',p,c)).
50
Claim 2: If either p ? F or c ? LB(B,p), then for all r ? F, for all c' > LB(B,r),
p is VBD-correct until time c in Deafen(B, ?(r, c')?).
Proof of claim: Suppose p ? F or c < LB(B, p). Let r be any faulty process, and
let c' > LB(B, r). Let B' = (p', S') be the broadcast history Deafen(B, f(r, c')i).
Since ?B has the choice property, Lemma 4.2 implies that B' E `3 and ?B(B', F) is
satisfied.
The proof is a case analysis.
Case: p = r:			Since r ? F, p ? F and c' > LB(B,p). Since c <
c < c'. B' = Deafen(B,f(p,c')?) implies that p' = p, Sp?' = Sp, and			=? SpI?
furthermore, c ? c' implies that =? ?. From this and from equation 4.1, we
conclude that B* is a VBD?extension of (B', p, c) with respect to ?B and F. Since
= Deafen(B, ?(p, c')i) and p = r, p is VBD-correct until c in Deafen(B, f(r, c')J).
Case:
p r: Let B" = (P", S") = Deafen(B*, f(r, c')?). Since ?B has the choice
property, and ?B(B*, F --H p), Lemma 4.2 implies that B" ? `3and ?B(B?, F --H p) is
satisfied.
From the definition of Deafen:
o+ Bw" = B-? and Br' = Br?.
o+ Pr11=P; and Pr'=Pr and ?=Cls*rI? and SrI=CISrI?
It is straightforward to combine the above with the equation 4.1, to show that B"
is a VBD-extension of (B',p, c) with respect to ?B and F. Since B' = Deafen(B, ((r, c')?),
p is VBD-correct until c in Deafen(B, t(r, c')?), thus proving the claim.
The proof that B" is a VBD-extension of (B', p, c) with respect to ?B and F is
further divided into several parts. For brevity, we only present the most interesting
part: we show that q =? ?.
Note that ? =? 6?r (equation 4.1, since r e F); thus, the definition of Deafen
implies that ? =?` ?rI# and ? =?` SrI?.
o+ If c < c' then S?? =? S*r and ?r' =? Sr, and we conclude that Sr? =? S'r
o+ If c > c?, then Sr* =? 5r; since for all c" > c', S"(r,c") = ? and S'(r,c") = ?, we
get ?r" = S'r. Thus, we conclude that ?r? =? S'r
0
51
4.3 The maximal causal prefix
Let H be a well-formed history, and F be a failure set. Since all communication
between processes is by message broadcasts and deliveries, only a portion of the
history is "visible" to (i.e., lies in the causal past of) the correct processes (i.e., the
processes not in F). The history that is derived in the natural manner from this
visible portion of H is called the maximal catsal prefix of H with respect to F and
denoted McP(H, F) (see Figure 4.2).
History H and failure set F = fr, s?. The maximal causal prefix of H with respect
to F is derived from the part of H to the left and above the dashed line.
deliver(m')
p
q
r
s
deliver(m)
deliver(m')
broad?ast(m')
broadcast(m")
br?adcast(m)			deliver(m")
Figure 4.2: Illustrating the maximal causal prefix
Let H be a well-formed history, B be the broadcast history of H, and F be a
failure set such that ?B(B, F) is satisfied. Suppose H? = McP(H, F), and B? is
the broadcast history of H?. We show that ?B(Bm, F) is also satisfied (Lemma
4.9). Furthermore, if p is X-correct until some time c in B with respect to ?B and F,
and the tuple (p, c) lies in the causal past of any correct process, then p is X-correct
until c in B? with respect to ?B and F. This result is the basis of the main result
of this chapter, proven in Section 4.5.
For all processes p, let T'p(B,F) denote the greatest time such that (p, Fp(B,F))
52
lies in the causal past of the correct processes in B. For convenience, the formal
definition of Fp(B,F) uses a function PVC (for "past visible to correct processes").
Definition Let B = (p, ?) be a broadcast history, and F be a failure set. For all
processes p:
PVCp(B,F)
Fp(B,F)
fc 3q ? F, ?c' : (p,c) HB
max(PVCp(B,F)) if PVCp(B,F) is finite and nonempty
0 if PVCp(B,F) is empty
otherwise
00
Recall that for all broadcast histories B, for all processes p, for all times c,
(p, c) HB (p, c). Therefore, for all B, for all failure sets F, for all correct processes p
? F), PVCp(B,F)I = oo and Fp(B,F) = 00.
To simplify the notation, we write Pp to denote Fp(B,F), F?m to denote Fp(B'n,F),
and so on.
The following lemmas are straightforward consequences of the above definition.
Lemma 4.6 Let B = (p, S) be a broadcast history and F be a failure set.
o+ For all p, if p is correct (p ? F), then Fp = 00.
o+ For all p, if I'p $ 00, then p is faulty (p ? F) and p broadcasts a message at
time Fp in B (i.e., P(p,F?) $ ?).
o+ For all p, if p is faulty (p ? F) and f'p = 00, then for all c, there is a correct
process q and a time ?? such that (p, c) ?B (q, c'). Furthermore, p broadcasts
an infinite number of messages in B (i.e., LB(B,p) = 00).
Lemma 4.7 Let B be a broadcast history and F be a failure set. For allp, if I'p $ 00,
then:
o+ ForallqsuchthatFq$00, (p,Fp+l)#?(q,Fq)
o+ For all q such that T'q = 00, for all c, (p,Fp + 1) #+B (q,c).
Proof: Suppose p is a process such that l'p $ 00. We prove the first part of
the lemma below. The proof of the second part of the lemma is similar, and hence
omitted.
Suppose for contradiction that the first part of the lemma is falsei. e., ?q such
that Fq $ 00, and (p,F? + 1) ?B (q,Fq). By definition of Pq, there is a correct
process r and time c, such that (q, Pq) ?B (r, c). Therefore, (p, Fp + 1) ?B (r, c),
contradicting the definition of Pp.			E
53
4.3.1 Notation
LetA=(a,P,?,B=(P,S),A'=(a',P',?)andB'=(P',?). LetXCPandletc
be any time. If for all processes p E X, for all c' < c p is in the same state in A' at c'
as it is in A at c', we write ax' =? ax. Formally: a'x =? ax ? Vp E X, (Vc', 1 <
c: a'(p,c1) =
If ax' =? ax and the processes in X are in the halt state at time c +1 in A', we
write ax' =? axI. Formally: ax' =? axil ? ax' =? ax and (Vp c X,Vc'> c:
a(p,c') = I).
We write A'x =? Ax if and only if Bx' =? Bx A ax' =? ax. We write Ax' =? AxI?
if?fl?onlyffB??=C??? and ax?=caxII.
4.3.2 Defining the maximal causal prefix
The maximal causal prefix of a history is formally defined below.
Definition Let H = (ll,a,P,S) and H? = (11, am, pm, Sin) be histories with appli-
cation and broadcast histories of A and B, and A? and Bin respectively. Let F be a
failure set.
B? is the maximal causal prefix of B with respect to F, denoted McP(B, F), if
for all p, Pp=oo ? B;??=B? and Fp$oo ? B;??PBp;?
A? is the maximal causal prefix of A with respect to F, denoted MoP(A, F), if
for all p, T'p = 00 ? Apin = Ap and Fp $ Co Apin Fp ApI?.
Hin is the maximal causal prefix of H with respect to F, denoted MoP(H, F), if
Am = McP(A,F).
The following lemma is a consequence of the above definition of maximal causal
prefix.
Lemma 4.8 Let H be a well-formed history, and A and B be the application and
broadcast history of H respectively. Let F be a failure set. Let Hin = McP(H, F)
and A? and B? be the application and broadcast history of Hm respectively.
o+ H?, Am and Bin are well-formed.
o+ AinF =
o+ For alip, if Fp $ Co, then Halt(Am,p) = Fpin = Fp.
Proof: The well-formedness of H?, Am and B? follow directly from the definition
of well-formed histories.
Let p be any correct process (i.e., p ? F). Thus Fp = 00, and by the definition
of maximal causal prefix, Apin = Ap. This proves the second part of the lemma.
54
The third part of the lemma follows from the definitions of Fp and F?rn, and the
definition of the maximal causal prefix.			0
Let B e 13 and F be such that ?B(B, F). We show below that B?, the maximal
causal prefix of B with respect to F, can be derived from B by the following Subtract
and Deafen operations:
o+ For all (faulty) processes p with Fp $ 00, the Subtract operation removes all
the broadcasts p makes after time l'p, and the deliveries of these messages by
the other processes.
o+ For all (faulty) processes p with Fp $ 00, the Deafen operation removes all of
p's deliveries after time Fp.
We show that B? is a well-formed broadcast history, and that ?B(Bm, F) is satisfied.
Lemma 4.9 Let B (p, S) ? 13 and F be such that ?B(B, F). Let M = fP(p, c) Fp $
00 and c > Fpl andX = f(p,Fp) l'p $ 00?. If B+ --H Subtract(B,M) and
= Deafen(B+,X), then:
o+ Ei 13, ?B(B*, F), and B* is the maximal causal prefix of B with respect to
F (i.e.,B* = McP(B,F)).
o+ For all q Ei F : Fq = LB(B+,q).
Proof: Let B+ = (P+, ?+) and B* = (p*, ?) be as in the statement of the
lemma. Consider the first part of the lemma. To prove that i?* = McP(B, F), we
show that for all q, Fq =00 ?			= B, and Pq $ 00			Fj Bqj?.
Let q be any process. The proof is a case analysis.
Case: l'q = 00: Since Fq = 00 and the Subtract operation only affects the
broadcasts of all processes r with Pr $ 00, Pq+ = Pq. For all m ? M, since m was
broadcast by some process p after time Fp and Fp $ 00, Lemma 4.7 implies that
q never delivers m in B. Thus, Sq+ = ?, and we get that Bq+ = Bq. Since the
Deafen operation only affects all processes p with Fp $ 00, q is unaffected and we
get Bq* = Bq+ Thus, we conclude that = Bq.
Case: Fq $ 00: Since l'q $ 00, the Subtract operation removes all broadcasts by
q after time l'q, and hence Pq+ r? . For all m ? M, since m was broadcast by
--H pqI?
some process p after time Fp and Fp $ 00, Lemma 4.7 implies that q does not deliver
m at or before Fq in B. Thus, Sq+ r4 ? Since the Deafen operation does not affect
broadcasts, and removes all deliveries by q after time r'q? p; = Pq+ and S* rq ?+ ?
Combining the above, we conclude that B_ ?
55
Thus, we have shown that B* = MoP(B, F). We now show that B* ? `3 and
?B(B*, F).
By choice of M, for all q such that a message of the form (q,--H, ) ?
Fq $ 00, and hence (by Lemma 4.6) q is faulty (i.e., q ? F). Since ?B(B, F) and
= Subtract(B, M), Lemma 4.1 implies that B+ e `3 and ?B(B+, F). By choice
of X, for all tuples (q,--H) ? X, Fq $ 00, and hence (by Lemma 4.6) q is faulty.
Since ?B(B+, F) and B* = Deafen(B+, X), Lemma 4.2 implies that B* ? `3 and
?B(B*, F).
Thus, we have shown that B* = McP(B, F), B* ? `3and ?B(B*, F), completing
the proof of the first part of the lemma.
Let q be any process in F. To prove the second part of the lemma, we show
Fq = LB(B+,q).
Case: T'q = 00: By Lemma 4.6, LB(B,q) = 00; ?.e., q broadcasts an infinite
number of messages in B. Since 1?q+ = Bq, LB(B+,q) =00.
Case:			Fq $ 00:			By Lemma 4.6, P(q,Fq) $ ?. Since Pq+ Fj Pq1? , for all
c> Fq: P+(q,c) = ? That is, LB(B+,q) = Fg.			0
The following lemma relates the SD-closed property and maximal causal prefixes.
Lemma 4.10 Suppose X-correctness is SD-closed with respect to ?B Let B ? `3
and F be such that ?B(B, F). For all processes p, for all times c < Fp, if p `s
x-correct until some time c in B with respect to ?B and F, then p is X-correct until
time c in McP(B, F) with respect to ?B and F
Proof: Suppose a process p is X-correct until some time c < Fp in B. Let
M = ?P(q, c1) Fq $ 00 and c1 > FqJ (as in the hypothesis of Lemma 4.9). Thus, for
all messages (q,--H,--H) ? M, Pq $ 00, and (from Lemma 4.6) q ?
Let B+ = Subtract(B, M). Lemma 4.1 and Lemma 4.3 imply that B+ e'3 and
?B(B+F) is satisfied, and that p is x-correct until c in B+ with respect to ?B and
Lemma 4.9 implies that if p ? F, then Fp = LB(B+,p). Since c < Fp, we
conclude that either p ? F or c < LB(B+,p).
Let X = ?(q, l'q) l'q $ 00? (as in the hypothesis of Lemma 4.9). Thus, for all
tuples (q, ) ? X, Fq $ 00, and (from Lemma 4.6) q E F
Let B* = Deafen(B+,X). Since B+ ? `3, ?B(B+F), and p is X-correct until c
in B+, and either p ? F or c < LB(B+,p), Lemma 4.4 implies that p is X-correct
until c in B* with respect to ?B and F.
56
By construction of B*, from
Since p is X-correct until c in B*
is X-correct until c in McP(B, F).
4.4 The history hierarchy
Lemma 4.9 we conclude that B* = McP(B, F).
with respect to ?B and F, we have shown that p
0
Let ?B be a broadcast specification. Let A be a well-formed application history, B
be the broadcast history of A, and F be such that ?B(B, F) is true.
Suppose A is BD-consistent with respect to ?B and F. From the correctness
hierarchy (Chapter 3), A is VBD-consistent with respect to ?B and F. This is
illustrated in Figure 4.3 by the arrow from the box marked "BD-consistent" to the
box marked "VBD-consistent." From the contamination hierarchy (Chapter 3), A
is BD-contamination-free with respect to ?B and F. This is illustrated in Figure
4.3 by the arrow from the box marked "BD-consistent" to the box marked "BD-
contamination-free."
Suppose A is BD-contamination-free with respect to ?B and F. Lemma 4.11
(given below) proves that the maximal causal prefix of A with respect to F is BD-
consistent with respect to ?B and F.
Figure 4.3 illustrates the rest of the history hierarchy.
Maximal causal prefix
HD-contarnination-free
VBD?consistent
Maximalcausalprefix
VBD-con?amination-fre
Maximal causal prefix
D-con?amination-free
Figure 4.3: The history hierarchy
Lemma 4.11 Suppose x-correctness is SD-closed with respect to ?B Let A E A,
B = (p,?) be the broadcast history of A, and F be such that ?B(B,F). If A is x-
contamination-free with respect to ?B and F, then McP(A, F) is X-consistent with
respect to ?B and F
57
Proof: Suppose A is x-contamination-free with respect to ?B and F. Let
Am McP(A, F), and let Bm = (P?, S?) be the broadcast history of Am. Since A is
well-formed, Lemma 4.8 implies that Am and B? are well-formed. Since ?B(B, F),
Lemma 4.9 implies that ?B(Bm, F) is also true.
Let p be any process and c be any time < LVT(Am,p). To show that Am is
X-consistent with respect to ?B and F, we must show that p is X-correct until c in
B? with respect to ?B and F.
Note that c < 1'p This is clearly the case if Fp = 00. If l'p $ 00, then Lemma
4.8 implies that Halt(Am,p) = Fp. Since LVT(Am,p) < Halt(Am,p) and c _
LVT(Am,p), c < T'p
Claim: Process p is X-correct until c in B with respect to ?B and F.
Proof of claim: The proof is a case analysis.
Case: Fp $ 00: Since c < Pp, to prove the claim it is enough to show that p
is X-correct until Fp in B with respect to ?B and F. By definition of I'p, for some
process q ? F and some time c', (p, Fp) ?B (q, c'). Since A is x-contamination-free
with respect to ?B and F, p is X-correct until t'p in B.
Case: Fp = 00: Since Fp = 00, Lemma 4.6 implies that for some process q ?
and some time c?, (p, c) (q, c'). As above, since A is x-contamination-free with
respect to ?B and F, p is X-correct until c in B.
Since X-correctness is SD-closed with respect to ?B, ?B(B, F) and p is X-correct
until time c in B with respect to ?B and F, we conclude from Lemma 4.10 that p is
X-correct until time c in B?.			0
4.5 Correct restricted problems
We now characterize a class of problems, called correct restricted problems (cr-
problems), for which the prevention of contamination is "as good" as the prevention
of inconsistency. Intuitively, the specifications of such problems ignore the behav-
ior of faulty processes, and concentrate only on the behavior of correct processes.
Formally:
Definition ?A is a correct restricted specification (cr-specification) if and only if:
VA ? A, VA' ? A: VF, [?A(A, F) and A# = AF ? ?A(A', F)]
58
We now prove the main result of this chapter. Suppose X-correctness is SD-
closed with respect to ?B, a broadcast specification with the choice property. If an
application protocol II solves a cr-specification ?A using ?B assuming X-consistent
behavior, then II solves ?A using ?B assuming x-contamination-free behavior.
In other words, II can be designed with the simplifying assumption that both
inconsistency and contamination will be prevented. However, II will solve ?A using
even if only contamination is prevented. Since the prevention of contamination
is often less expensive than the prevention of both inconsistency and contamination,
such a "substitution" often improves the performance of the application.
Theorem 4.12 Suppose X-correctness is SD-closed with respect to ?B If II solves
a cr-specification ?A with ?B as5v??fl9 X-consistent behavior, then II solves ?A with
?B assuming x-contamination-free behavior.
Proof: Suppose II solves ?A with ?B assuming X-consistent behavior. Let
H = (H,a,P,?) be a well-formed history, A = (Qp,S), B = (P,S) and let F be any
failure set such that ?B(B, F). Suppose A is x-contamination-free with respect to
?B and F. To prove the lemma, we show that ?A(A, F) is satisfied.
Let H? = (11, a?, ??, S?) = MoP(H, F), A',' = (a?, p?, sm) and B? = (p?, S?).
Lemma 4.8 implies that H? is well-formed. Lemma 4.9 implies that ?B(B?, F) is
satisfied. Since X-correctness is SD-closed with respect to ?B, and ?B(B, F) and A
is x-contamination-free with respect to ?B and F, Lemma 4.11 implies that Am is
X-consistent with respect to ?B and F.
Since H? is a well-formed history, and ?B(1??, F) is satisfied, and A? is x-
consistent with respect to ?B and F, and II solves ?A using ?B assuming x-
consistent behavior, ?A(A?, F) is satisfied. Lemma 4.8 implies that Aw = AF.
Since ?A is a cr-specification, ?A(A, F) is also satisfied. 0
Recall that the three correctness conditions we defined in Chapter 3 have the
SD-closed property. Recall also that a VBD-contamination-free history is equivalent
to a BD-contamination-free history. Therefore, if II is designed with the strong
assumption of BD-consistent behavior, then II works correctly when executed under
conditions that prevent VBD-contamination.
Corollary 4.13 Suppose X-correctness is SD-closed with respect to ?B If II solves
a cr-specification ?A with ?B assuming BD-consistent behavior, then II solves ?A
with ?B assuming VBD-contamination-free behavior.
Chapter 5
Inconsistency and Contamination
with Atomic Broadcast
In this chapter, we present definitions of inconsistency and contamination with
atomic broadcast and timed atomic broadcast. These definitions are derived from
the general definitions of inconsistency and contamination given in Chapter 3.
5.1 Atomic broadcast
Informally, with atomic broadcast, any process may broadcast a message at any time,
such that:
o+ All correct processes deliver the same messages in the same order.
o+ Every correct process delivers all the messages broadcast by all correct pro-
cesses.
Formally, atomic broadcast is defined to be the conjunction of the validity, agreement
and integrity properties (defined in Chapter 2 and repeated below) and the total order
property defined below. 1
o+ Validit?: If a correct process broadcasts a message m, then it eventually delivers
o+ Agreement: If a correct process delivers a message m, then all correct processes
eventually deliver m.
o+ integrity: For any message m, any process delivers m at most once, and only
if m was broadcast by some process.
1Recall that such properties are more formally expressed as predicates of broad-
cast histories and failure sets.
59
60
O Total order: If correct processes p and q deliver messages m and m', then p
delivers m before m1 if and only if q delivers m before m1.
Atomic broadcast is one the most powerful fault-tolerant broadcasts used for
communication. It is the basis for Lamport's state machine approach to fault toler-
ance [Sch86]. It is also used in the Isis system [BJ87] and the IBM Highly Available
System [Cri87].
5.1.1 Notation
Suppose B (P, ?) is a broadcast history. BCASTpC(B) denotes the set of messages
that process p broadcasts by time c in B; formally, BCASTpC(B) = Uc'<c p(p, c').
DLVDpC(B) denotes the sequence of messages that p delivers by time c (defined in Chap-
ter 2). DLVDpC(B) denotes the set of messages that p delivers by time c; DLVDpC(B) =
m ? DLVDpC(B)? The superscript c is omitted when referring to all the messages
broadcast or delivered during a complete history.
Lemma 5.1 Let B (P, ?) ? 13 and F be a failure set, such that B satisfies atomic
broadcast with respect to F.
o+ All correct processes deliver the same sequence of messages:
Vp ? F, Vq ? F: D?vDp(B)			DT+VDq(B).
o+ If a correct process broadcasts a message m, then all correct processes eventu-
ally deliver m:
Vp ? F, Vq ? F: BCASTp(B) C DLVDq(B)
Proof:			Directly from the definition of atomic broadcast.			E
Suppose B is a broadcast history and F is a failure set such that 13 satisfies
atomic broadcast with respect to F. DLv?(B) and DLVDF(13) respectively denote
the sequence of messages and the set of messages that the correct processes deliver
during B.
5.1.2 Consistency with atomic broadcast
We show that delivery consistency is ensured with atomic broadcast, if the follow-
ing condition is satisfied (Lemma 5.4): the sequence of messages delivered by every
process (correct or faulty) is a prefix of the sequence of messages delivered by the
correct processes during the entire execution. This "prefix" property, in conjunc-
tion with fifo message delivery order, is enough to ensure visible-broadcast/delivery
consistency (Lemma 5.7).
61
However, the "prefix" property is not enough to ensure broadcast/delivery con-
sistency. As was mentioned in Chapter 3, broadcast/delivery consistency can be
ensured only if the following (informal) condition holds: a process may only broad-
cast a message after all of its previous broadcasts are successfully completed. In
other words, broadcast/deliver consistency can only be ensured with atomic broad-
cast using a "blocking" protocol.
We begin by defining D-correctness, VBD-correctness and BD-correctness with
respect to atomic broadcast. We show that these correctness conditions lead to the
"prefix" definition of consistency that was mentioned earlier.
Delivery correctness and consistency
The following lemma describes when a process is D-correct with respect to atomic
broadcast.
Lemma 5.2 Let B = (p, ?) ? !3 and F be such that B satisfies atomic broadcast
with respect to F. Process p is D-correct until c with respect to atomic broadcast and
F if and only if:
1. The sequence of messages p delivers by time c is a prefix of the sequence of
messages the correct processes eventually deliver: D?vDpC(B) ? DT+vD?(B).
2. If p broadcasts a message m that some process q delivers by time c, then
all the correct processes eventually deliver m: if p = bc(m) and m E DLVDqC(B),
then m ? DLVDF(B)
The first condition (labeled 1) in Lemma 5.2 is the "prefix" property that was
already mentioned. The second condition (labeled 2) is more subtle. Suppose a
broadcast history B satisfies atomic broadcast with respect to a failure set F. Sup-
pose that a process p broadcasts a message m, a faulty process q delivers m by some
time c and no correct process delivers m. In such a scenario (which would not be
permitted by the second condition), p cannot be D-correct until c in B with respect
to atomic broadcast and F; in particular, there is no D-extension of (B, p, c).
To see this, suppose for contradiction that B' = (P1, S1) is a D-extension of (B,p, c)
with respect to atomic broadcast and F. Since, by definition, ??F' = SF and SF' =? SF,
the faulty process q delivers m by time c in B', and no correct process delivers m in
If p broadcasts m in B', then it is clear that B' does not satisfy atomic broadcast
with respect to F --H p; i.e., p and the processes in F cannot all be correct.
Suppose p does not broadcast m in B'. Since m was broadcast by p in B, m is
of the form (p, c, --H). Thus, by the well-formedness of ??, no process broadcasts
62
m in B'. Since q delivers m, we conclude that B' does not satisfy the integrity
property.
Thus, we have shown that ?? does not satisfy atomic broadcast with respect to
F --H p, contradicting the assumption that B1is a D-extension of (B, p, c) with respect
to atomic broadcast and F.
Note that the second condition in Lemma 5.2 is not required if the specification
of atomic broadcast is weakened to include only the validity, agreement and total
order properties.
We now present the proof of Lemma 5.2.
Proof of the "only if" assertion of Lemma 5.2: Suppose p is D-correct until
c in B with respect to atomic broadcast and F. If p is correct, then the lemma is
trivially true.
Suppose that p is faulty (p ? F). Let B' = (P', S') be a D-extension of (B, p, c)
with respect to atomic broadcast and F given by the definition of D-correctness; thus
= and ?F' =? SF, and B' satisfies atomic broadcast with respect to F --H
1. Let q be any correct process (q ? F). Since SLF = SF and q ? F, D?vDq(B') =
DLVDq(B). Since B' satisfies atomic broadcast with respect to F --H p, Lemma 5.1
implies that D?vDp(B') = D?vDq(Th); hence Drvwc(B?) ? D?vDq(B'). Since S?' =?
D?vDpC(B) = D?vDpC(BI) Thus, we conclude that D?vDpC(B) ? D?vDq(B). Since q is
correct, Lemma 5.1 implies that D?vDpC(B) ?
2. Suppose that for some process q, and for some message m that was broadcast by
m ? DLVDqC(B) We show that m E Di;VD?F(B).
Suppose q is correct (q ? F). Then, by Lemma 5.1, m ? DLVD?F(B).
Suppose q is faulty (q ? F). Since 5F' =? 5F and m E DLVDqC(B), m E DLVDqC(BI)
Hence, by Lemma 2.1, p broadcasts m in B'. Since B' satisfies atomic broadcast with
respect to F --H p, and m ? BCASTp(B'), Lemma 5.1 implies that m E DLvD?(B').
Since SF' 5F' we conclude that m ? DLVD?F(B)
This completes the proof of the first part of the lemma.
Proof of the "if" assertion of Lemma 5.2: If p is correct, then the lemma is
trivially true. Hence, suppose that p is faulty (p e
Suppose the following conditions are satisfied:
o+ D?vDpC(B) ?
o+ For all processes q, for all messages m, if p broadcasts m and m ? DLVDqC(B),
then m ? DLVDF(B)
63
We prove by construction that p is D-correct until c in B with respect to atomic
broadcast and F--Hin particular, we construct a broadcast history ?? = (?`, ??) such
that B1is a D-extension of (B, p, c) with respect to atomic broadcast and F. B' is
derived from B as follows:
o+ The correct processes broadcast and deliver the same messages at the same
time in both B and B'. Formally: BF' = B?F.
o+ The faulty processes, not including p, broadcast the same messages at the same
time in both B and B'. Formally: PF?-p = PF-p.
o+ Until time c, the faulty processes, not including p, deliver the same messages
at the same time in both B and B'. After time c in B', the faulty processes,
not including p, do not deliver any messages. Formally: SF'?, =? SF?pI?
o+ Process p broadcasts in B' exactly the subset of its broadcasts that were de-
livered by the correct processes in B.
Thus: Vc' : P'(p,c1) --H P(p,c) if P(p,c) ? DLVD?F(B)
otherwise
Until time c, process p delivers the same messages at the same time in both B
and B'; i.e., Sp' =? &p. Process p's deliveries after time c in B' are obtained by
the procedure described below.
Recall that by assumption Drvwc(B) ? ??vi>??(B). Therefore, there is some
sequence of messages S = (mi, m2, . ?i,?i+1? .? such that:
[Drvwc(B) . s =
Informally, for all messages m ? S, p delivers m after time ts(m) in B'; fur-
thermore, p delivers all the messages in S in sequence order. Formally, for all
rflj ? S, p delivers message m? at time ci in B', as defined below:
Process p delivers mi at time ts(mi) + 1, or at time c + 1, whichever is
later; i.e., c? = max(ts(mi) + 1,c + 1).
For all m?, i > 1, in S, p delivers m? at time ts(mj) + 1, or at time cj--Hi,
whichever is later; i.e., cj = max(ts(m?) + 1,ci--Hi).
Claim: BF' = BF, BF'?p =? BF?p, SpI=C &p, P'p c-=A Pr and Vc1>c--HA:P'(p,c')=
P(p,c') or
Proof of claim: Directly from the construction, BF' = BF, 1?F1-p =? BF?p, Sp' =?
and Vc' : ?`(p, c') = P(p, c?) or ?. Since A = oo for an untimed broadcast, it follows
64
trivially that ?IpC=A pp This proves the claim.
Claim: B' is well-formed.
Proof of claim: We show below that B' satisfies the three properties required of a
well-formed history (as defined in Chapter 2).
1. It is clear that P' is a history broadcast function and S' is a history delivery
function.
2. Suppose for some process q, time c', and message m $ ?, m E S'(q, c'). We prove
that ts(m) < c'.
Suppose q $ p. Since SLF = ?F and ?F'_ =? 8F--HpI? , and m ? ?(q,c'), either
q ? F or c' < c. In either case, m ? 6(q, c'). Since B is well-formed, ts(m) < c'.
Suppose q = p and c' <c. Since =? &p (by construction), mE S(q, c1). Since B
is well-formed, ts(m) <
Suppose q = p and c' > c. From the construction of 1?', if m E S(p, c1) then
c1 > ts(m) + 1; i.e., ts(m) <
3. Consider any message m $ ? broadcast by any process q at any time c in
We prove that m is of the form (q,c',--H).
From the construction, if P1(q, c1) $ ?, then P'(q, c') = P(q, c'). Since B is well-
formed and m $ ?, m = (q, c', ).
Thus, B' is well-formed.			0
Claim: B' satisfies atomic broadcast with respect to F --H
Proof of claim: We show that B' satisfies each of the properties of atomic broadcast
when the processes in F --H p are assumed to be faulty.
Agreement and total order: To prove that B' satisfies both the agreement and the
total order properties, we show that all processes not in (F --H p) deliver the same
sequence of messages in B' (i?e., Vq ? F --H p: D?vDq(B') =
Let q be any process ? F --H p. Suppose q $ p; thus q ? F, and by construc-
tion, DffiVDq(B') = D?vDq(B) = ??v?(B). Suppose q = p; by construction of B',
D?vDq(B') =
Validity: Let q be any process ? F --H p. To prove that B' satisfies the validity
property, we show that q delivers all of its own broadcasts in B'; i.e., BOASTq(B1) C
DLVDq(B')
65
Suppose q # p; thus q ? F. Since B satisfies atomic broadcast when the pro-
cesses in F are assumed to be faulty, BCASTq(B) C DLVDq(B). By construction,
BCASTq(B') = BCASTq(B), and DLVDq(B') = DLVDq(B). Thus, BCASTq(B') C
DLVDq(B').
Suppose q = p. By construction, BCASTp(B') C DiNDF(B) Since D?vDp(B') --H
??v?(B), BCASTp(B') C DLVDp(B').
Integrity: Let q be any process. To prove that B' satisfies the integrity prop-
erty, we must show that q does not deliver any message more than once. This is
straightforward, and hence the proof is omitted.
To prove that B' satisfies the integrity property, we must also show that if q
delivers a message, then that message is actually broadcast by some process. The
proof is a case analysis and we only show one case below.
Case: Process q Ei F --H p delivers m = (p, --H, --H) by time c in B': Since ?F1--Hp ?
?F--Hp? q also delivers m by time c in B (i.e., m ? DLVDCq(B)) We show below that p
also broadcasts m in B'.
Since q delivers m in B, and B satisfies the integrity property, p broadcasts m in
B. Since m ? DLVDqC(B), by hypothesis, m E DLVD?F(B). By construction, since p
broadcasts m in B and m ? DLVDF(B), p also broadcasts m in R1.
Thus, B' is a D-extension of (B, p, c) with respect to atomic broadcast and F,
and hence we have proven that p is D-correct until c in B with respect to atomic
broadcastandF			0
Lemma 5.2 above gave necessary and sufficient conditions for a process to be D-
correct with respect to atomic broadcast. A weaker but more "elegant" formulation
is given in Lemma 5.3. It is stated using the following uniform agreement property,
which informally requires that all processes, correct or faulty, satisfy the agreement
property:
o+ Uniform agreement: If any process p delivers a message m, then all correct
processes eventually deliver m.
Uniform agreement is also called uniformity in the literature [NT9(),GT89]
Lemma 5.3 Let I? = (p, S) ? B and F be such that B satisfies atomic broadcast
with respect to F Suppose B satisfies the uniform agreement property with respect
to F.
Process p is D-correct until c with respect to atomic broadcast and F if and only
if n?vDpC(B) ?
66
Proof:			From Lemma 5.2.			0
Having stated what it means for a process to be D-correct, we can now prove a
sufficient condition for an application history to be delivery consistent.
Lemma 5.4 Let A ? A, B be the broadcast history of A, and F be such that B
satisfies atomic broadcast with respect to F. If for all processes p, for all times c,
D?vDpC(B) ? ??v?(B), then A is a D-consistent history with respect to atomic
broadcast and F.
Proof: Suppose for all processes p, for all times c, ??vi:??c(B) ?
Clearly, this implies that any message delivered by any process (correct or faulty) at
any time, is also eventually delivered by all the correct processes; i.e., B satisfies the
uniform agreement property when the processes in F are assumed to be faulty.
To complete the proof, we show that for all processes p, for all c < LVT(A, p), p
is D-correct until time c in B with respect to atomic broadcast and F.
Let p be any process. If p is correct, then it is trivial to show that for all c, p is
D-correct until time c in B with respect to atomic broadcast and F.
Suppose p is faulty (p e F). Let c be any time < LVT(A,p). Since D?vDpC(B) ?
DLvD?(B), and B satisfies the uniform agreement property with respect to F, Lemma
5.3 implies that p is D-correct until c in B with respect to atomic broadcast and F. 0
Note that the converse of the lemma is false. That is, if A is a D-consistent history
with respect to atomic broadcast and F, then there may be a process p and a time
c such that D?vDpC(B) ? ??v??(B). This is because a D-consistent history permits
the following scenario:
o+ A faulty process delivers a message m at some time c, and halts before broad
casting a message at time c, or changing state at time c.
o+ No correct process ever delivers m.
Thus, the sequence of messages that process p delivers by time c is not a prefix of
the messages delivered by the correct processes during the entire execution. Note
however that p's "erroneous" delivery at time c is not reflected in p's state, nor in
the states of the other processes in the system.
Visible-broadcast/delivery correctness and consistency
The following lemma describes when a process is VBD-correct with respect to atomic
broadcast. The proof is similar to that of Lemma 5.2, and hence is omitted.
67
Lemma 5.5 Let B = (p, ?) c B and F be such that B satisfies atomic broadcast
with respect to F Process p is VBD-correct until c with respect to atomic broadcast
and F if and only if:
o+ The sequence of messages p delivers by time c is a prefix of the sequence of
messages the correct processes deliver during the entire execution:
D?vDpC(B) ?
o+ If a faulty process q (i.e., q ? F) delivers a message m that p broadcasts at
< c then a correct process eventually delivers m.
o+ If a correct process q (i.e., q ? F) delivers a message m that p broadcasts at
any time, then q eventually delivers all the messages that p broadcasts before
However, a weaker, but more convenient form of the above lemma is given below
using the following fifo order property:
o+ FIFO Order: If any process broadcasts a message m before it broadcasts a
message m', and a correct process p delivers m', then p delivers m before m'.
Consider an execution in which processes communicate via atomic broadcast,
such that the uniform agreement and fifo order properties are also satisfied. Lemma
5.6 states that a process p is VBD-correct until time c if and only if the sequence of
messages p delivers by time c is a prefix of the sequence of messages that the correct
processes eventually deliver.
Lemma 5.6 Let B = (p, S) ? !3 and F be such that B satisfies atomic broadcast
with respect to F. Suppose B satisfies the uniform agreement and fifo order properties
with respect to F.
Process p is VBD-correct until c with respect to atomic broadcast and F if and
only if D?vDpC(B) ?
Proof: Omitted for brevity.
E]
Lemma 5.7 Let A e A, B be the broadcast history of A, and F be such that B
satisfies atomic broadcast with respect to F. If B satisfies the fifo order property with
respect to F, and for all processes p, for all times c, Drvwc(B) ? D?vH?B), then A
is a VBD-consistent history with respect to atomic broadcast and F.
Proof: From Lemma 5.5.
c
Since it is easy to enforce the fifo order property (e.g., using message sequence
numbers), Lemmas 5.4 and 5.7 show that providing visible-broadcast/delivery con-
sistency is only slightly more expensive than providing delivery consistency.
68
Broadcast/delivery correctness and consistency
Finally, we state the conditions when a process is BD-correct with respect to atomic
broadcast.
Lemma 5.8 Let 13 = (p, ?) Ei B and F be such that 13 satisfies atomic broadcast
with respect to F. Process p is BD-correct until c with respect to atomic broadcast if
and only if:
o+ D?vDpC(13) ?
o+ BCASTpC(13) c DLVD?F(13)
o+ If a correct process q (i.e., q ? F) delivers a message m that p broadcasts,
then for all messages m' that p broadcasts before m, q eventually delivers m'.
The following lemma is weaker, but more "elegant."
Lemma 5.9 Let B = (P, ?) ? 13 and F be such that B satisfies atomic broadcast with
respect to F. Suppose 13 satisfies the fifo order property with respect to F. Process p
is BD?correct until c with respect to atomic broadcast if and only if:
o+ D?vDpC(13) ?
o+ BOASTpC(13) c DLVDF(13)
Finally, we state a sufficient condition for a BD-consistent application history:
Lemma 5.10 Let A ? A, B be the broadcast history of A, and F be such that 13
satisfies atomic broadcast with respect to F. If B satisfies the fifo order property with
respect to F, and for all processes p, for all times c, D?vDpC(13) ? ??v?(?), and
BCASTpC(13) c DLVDF(13), then A is a BD-consistent history with respect to atomic
broadcast and F
5.1.3 Contamination with atomic broadcast
In this section, we study contamination with respect to atomic broadcasts. Recall
that contamination is the spread of "incorrectness" from the faulty processes to
the correct processes. We defined three forms of contamination in Chapter 3, one
corresponding to each of the three correctness conditions we proposed. We showed
that BD?contamination and VBD-contamination are equivalent. We also showed that
a process that is D-contaminated is also BD-contaminated; however, if a process is
BD-contaminated, then it is not necessarily D-contaminated. In other words, if BD-
contamination is prevented, then VBD?contamination and D-contamination are also
prevented. Hence, we concentrate on the prevention of BD-contamination.
In Theorem 5.11, we show when a process is BD-contamination-free with respect
to atomic broadcast.
69
Tbeorem 5.11 Let B = (p, S) Ei 13 and F be such that B satisfies atomic broadcast
with respect to F. Suppose B satisfies fifo order with respect to F. A correct process
p is BD-contamination-free until c with respect to atomic broadcast and F if and only
if:
o+ ifp delivers any message m at or before time c, then ? DT+vD_(B).
Proof of the aonly if" assertion of Theorem 5.11: Suppose a correct process
p is BD-contamination-free until c with respect to atomic broadcast and F. Suppose
that p delivers m = (q, c', --H) by time c. Suppose for contradiction that D?vDqCI(B) ?
DLVDp(B).
Lemma 5.8 implies that if q is BD-coued until c1, then D?vDqCI(B) ? D?vDp(B)
Since D?vDqCI(B) ? D?vDp(B), q is not BD-correct at c1. Thus, by delivering m by
time c, p is BD-contaminated by time c, a contradiction.
Proof of the aif" assertion of Theorem 5.11:
Suppose whenever a correct process p delivers any message m* at or before some
time c, DLyDt6s??mW?)?(B) D? D?vDp(B)
Suppose for contradiction that p becomes BD-contaminated by delivering some
message m at some time c1 < c. Let q = bc(m). By hypothesis, ??y?tqS(?)(?) ?
DLVDp(B). The proof continues as a case analysis.
Case: g is not BD-correct until ts(m): Since p is correct and delivers m, the fifo
order property implies that for all messages m1 that q broadcasts before time ts(m),
p delivers m' before m. Thus, ??A?TtqS(Tfl)(B) C DLVDp(B).
Since p is correct, Lemma 5.1 implies that D?vDp(B) = ??v?(B). Since ?????tqS(lfl)(?) c
DLVDp(B) and ?i4?tqS(lfl)(?) ? D?vDp(B), we conclude that C DLvD??(B),
and ? n?v???(B). Furthermore, since B satisfies the fifo order prop-
erty when the processes in F are assumed to be faulty, Lemma 5.9 implies that q is
BD-correct until ts(m), a contradiction.
Case: q is BD-correct until ts(m) and ?(r, c*) HE (q, ts(m)), and r is not BD-
correct until c*: Thus, q delivers some message ?? at some time < ts(m)
(i.e., m1 ? D?vDtqS(lfl)(B)), and (r,c*) HE (bc(m'),ts(m')). Since by hypothesis,
? D?vDp(B), p also delivers m? in B.
By the well-formedness of B, if any process delivers m, then it delivers m after
time ts(m); in particular, m ?			Since p delivers m and m1, and
e DrvD;s(m)(B), and			? D?vDp(B), we conclude that p delivers m'
before m.
70
Therefore, by choice of m', p becomes BD-contaminated by delivering m' (before
delivering m). This contradicts the choice of m, and completes the proof. 0
5.2 Timed atomic broadcasts
Atomic broadcast does not place any restriction on the time a correct process may
take to deliver a message. However, by augmenting the specifications of atomic
broadcast with the following A-timeliness property, we can bound the time required
for message delivery.
o+ A- Timeliness: If any process p delivers a message m at time c on p's clock,
then c is at most ts(m) + A.
The constant A is called the latency of the broadcast. We assume that the latency
of any broadcast must be greater than 0.
Processes that communicate via timed atomic broadcasts are subject to two
sources of incorrectness that are not present with atomic broadcast.
Suppose a correct process delivers a message m, and p does not deliver m by
time ts(m) + A. If p never delivers m, then p "violates" the agreement property.
Furthermore, if p delivers m after ts(m) + A, then p "violates" the A-timeliness
property.
Now suppose that p broadcasts a message m, and does not deliver m by time
ts(m) + A. In this case, if p never delivers m, then p "violates" the validity property.
As before, ifp delivers m after ts(m)+A, then p "violates" the A-timeliness property.
From this, it is straightforward to derive a sufficient condition for an application
history to be VBD-consistent with respect to timed atomic broadcast.
Lemma 5.12 Let A ? A, B be the broadcast history of A, and F be such that B
satisfies timed atomic broadcast with respect to F. Suppose B satisfies the fifo order
property with respect to F. If for all processes p, for all times c:
o+ D?vDpC(B) ? ??v??(B) (as with atomic broadcast), and
o+ p is not alatefl at c in Bj that is:
- p delivers its own messages (BCASTpC?A(B) c DLVDpC(B)), and
--H p delivers messages delivered by the correct processes (Vm ? DLVDF(B)
ts(m) ? c --H A ? m ? DLVDpC(B))
then A is a VBD-consistent history with respect to timed atomic broadcast and F.
It is also straightforward to derive necessary and sufficient conditions for a history
to be BD-contamination-free with respect to timed atomic broadcast.
71
Lemma 5.13 Let B = (P, ?) ? 13 and F be such that B satisfies timed atomic
broadcast with respect to F. Suppose B satisfies the fifo order property with respect
to F. A correct process p is BD-contamination-free until c if and only if:
o+ if p delivers a message m at or before time c, then
D?vDt?8c(???)?(B)
-			? ??v? (B) (as before), and
--H For all c' < ts(m), bc(m) doe,s not appear to p to be "late" at c'; that is:
o+ BcAsTbCW(rn$(B) c DLVDb%(m)(B)
o+ Vm E Di;VD?c'(B): ts(m) < c' --H A ? m ? DiNDb%'(rn)(B)
5.3 The independence property
In Chapter 3, we defined the independence and fifo independence properties. We
showed that if a broadcast specification ?B has the independence property, then a
process p that is D-correct until some time c with respect to ?B and a failure set F
also exhibits crash behavior until c. We also showed that if ?B only has the weaker
fifo independence property, then p exhibits crash behavior until c if it is VBD-correct
until c.
In this section we show that timed atomic broadcast has the independence prop-
erty; thus atomic broadcast also has the independence property. Thus, the results
presented in Chapter 3 imply that with (timed) atomic broadcast, crash behavior,
D-correctness, VBD-correctness and BD-correctness form a hierarchy of increasingly
stringent restrictions on the behavior of faulty processes during execution.
For convenience, we repeat the definition of the independence property:
Definition Broadcast specification ?B has the independence property if for all B =
(P, ?) ? 13, for all F such that ?B(B, F), for all correct processes p (p ? F), for
all times c such that p(p, c) = ?, for all messages m of the form (p, c, --H), and for
all times c', c < c' < c + A --H 1, there is a B' = (p', ?) Ei 13, such that ?B(B1, F),
? and [p1 = P except P'(p,c) =
Theorem 5.14 Timed atomic broadcast has the independence propert?.
Proof: Let B = (P, S) ? 13 and F be such that B satisfies atomic broadcast with
respect to F. Let p be a correct process, and c be a time such that p(p, c) = ?. Let
m be any message of the form (p,c,--H) and let c1 be any time c < c1 < c+ A --H 1.
To show that atomic broadcast has the independence property, we construct a --H
(p', ?) ? 13, such that ?B(B', F) where ? =? S and (P' = P except P'(p, c) =
The construction of B' is as follows:
72
1.
All processes broadcast the same messages at the same time in B and B', ex-
cept that p also broadcasts the message m at time c in BI. Formally: P' --H
P except P'(p,c) =
2. The faulty processes deliver the same messages at the same time in B and B'.
Until time cl, the correct processes deliver the same messages at the same time in B
and B'. Formally: 5F' = SF and 5F1 =?
3. After time c', the correct processes deliver messages as follows:
Let x be a correct process that has delivered the most messages by time c'; i.e.,
? F and vq ? F : D?vDqCl(B) ? DT+vD:C'(B) Thus, for all correct processes q
? F), ??vDqCt(B) 5q = ??v?cJ(B)], for some sequence of messages 5q Note that
5: is the empty sequence.
Let q be any correct process ? F). Informally, at time c' + 1, q first "catches
up" with the messages that process x has delivered by time c', and then delivers m.
After delivering rn, q delivers the same messages in B' that x delivers in B. Formally:
S'(q,c' +1) = 5q (m) S(x,c' + 1)
Vc" > c' + 1 S'(q, c") = S(x, c11).
Claim: B1 is well-formed.
Proof of claim: We show below that B' satisfies the three properties required of a
well-formed history (as defined in Chapter 2).
1. It is clear that P' is a history broadcast function and 5' is a history delivery
function.
2. Suppose for some process q, time c*, and message m* $ ?, m* e S'(q, c*). We
prove that ts(m*) <c*.
Suppose that q is faulty (q E F), or that c* < c'. Since SF' = 5F and 5F =? 5F'
51(q, c*) = 5(q, c*). Since B is well-formed (by hypothesis), ts(m*) < c*.
Suppose that q is correct, c* > c', and m* $ m. From the construction, process
x delivered m* at c* in B. Since B is well-formed (by hypothesis), ts(m*) < c*
Suppose that q is correct, c* > c', and m* = m. From the construction, q delivers
m at time c' + 1; i.e., c* = c' + 1. Since ts(m) = c, and c < c' and c* = c1 + 1,
ts(m) < c*.
3. Consider any message m* $ ? broadcast by any process q at any time c*in B1.
From the construction, and the well-formedness of B, it is straightforward to show
that m* is of the form (q, c*, --H).
73
Claim: ? =? ? and (P' = P except P'(p, c) =
Proof of claim: Directly from the construction.
Claim: B' satisfies timed atomic broadcast with respect to F.
Proof of claim: We show that B' satisfies each of the properties of aton?ic broadcast
when the processes in F are assumed to be faulty.
Agreement and total order: To prove that ?? satisfies both the agreement and the
total order properties, we show that any correct process delivers the same sequence
of messages in B' as any other correct process.
Let q and r be correct processes (i.e., q ? F,r ? F). From the construction,
DU+VDq(B') = ??v??(B) (m? S(x, c' +1)..., and D?vDr(B') = ??v?2c'(B)
?(x, c' ....... Thus, D?vDq(B') = D?vDq(B')
Validity: Let q be any correct process (q ? F). To prove that 1?' satisfies the validity
property, we show that q delivers all of its own broadcasts in B'; i.e., BCASTq(B') C
DLVDq(B').
Since B satisfies atomic broadcast with respect to F, BCASTq(B) C DLVDq(B).
Note also that by construction, DLVDq(B') = DLVDq(B)U fm?.
If q $ p, then by construction, BCASTq(B') = BCASTq(B), and hence
BCASTq(B') C DLVDq(B').
If q = p, then by construction, BOASTq(B') = BCASTq(B)U ?m?, and hence
BCASTq(B') C DLVDq(B').
Integrity: It is straightforward to show that the integrity property is satisfied.
A- Timeliness: Let q be any process, and suppose q delivers some message m* at
time c* in B'. To prove that B' satisfies the A-timeliness property, we show that
< ts(m*) + A.
Suppose q is faulty. Since = ?, q delivers m* at time c* in B. Since B satisfies
timed atomic broadcast (i.e., B satisfies A-timeliness), c* < ts(m*) + A.
Suppose q is correct, and m* $ m. From the construction, at least one of
process q and process r delivers m* at time c* in B. Since B satisfies A-timeliness,
? ts(m*) + A.
Suppose q is correct, and m* = m. By construction, m is delivered at time c' + 1,
where c ? c' ? c + A --H 1. Since ts(m) = c, and c* = c? + 1, c* ? ts(m) + A.
74
Corollary 5.15 Atomic broadcast has the independence property.
Proof: From Theorem 5.14.
5.4 The choice property
0
In Chapter 4, we defined the choice property. We proved that when solving a certain
class of problems using broadcasts that satisfy the choice property, the prevention of
contamination is "as good as" the prevention of inconsistency.
For convenience, the definition of choice property is repeated below:
Definition Broadcast specification ?B has the choice property iffor all B = (p, S) ?
13, for all F such that ?B(B, F):
o+ For all m, if B' = Subtract(B, (mi), then B' e 13 and ?B(B', F), and
o+ For all p e F, for all c, if B' = Deafen(B, f(p, c)?), then B' E 13 and
?B(B', F).
Atomic broadcast has the choice property. To prove this, we must show that
when processes communicate via atomic broadcast:
o+ Any process p can unilaterally decide whether or not to broadcast a message
at any time c.
o+ Any faulty process p can unilaterally decide whether or not to stop delivering
messages after any time c.
Since timed atomic broadcast and atomic broadcast place no restriction on when
a process can broadcast a message, or on the behavior of faulty processes, (other
than the integrity and A-timeliness properties), it is straightforward to show that
both the broadcasts have the choice property. In the interest of brevity, we omit the
proofs.
Theorem 5.16 Timed atomic broadcast and atomic broadcast have the choice prop-
erty.
Chapter 6
Extending the Formal Model
Few distributed systems provide fault-tolerant broadcasts as "built-in" communi-
cations primitives. Hence fault-tolerant broadcasts are implemented by protocols
that use the low-level communications mechanisms available in the system--Hsuch
protocols are called broadcastprotocols.
This chapter extends the model introduced in Chapter 2 to describe the execution
of protocols based on fault-tolerant broadcasts in systems where processes commu-
nicate via point-to-point messages. We only consider systems where processes have
perfectly synchronized clocks and link delays are bounded. We assume that processes
may only fail by omiss?on; i.e., by prematurely halting, or by intermittently omitting
to send and/or receive messages.
We define what it means for a broadcast protocol 0- to imptement a broadcast
specification ?B in a system 5 and prevent inconsistency and/or contamination.
Informally, such a protocol only permits broadcast histories that satisfy ?B and
satisfy consistent and/or contamination-free behavior with respect to the processes
that actttallyfailed during execution; t.e., when the processes that are assumed faulty
are exactly those processes that are actually faulty. To prevent a faulty process from
becoming inconsistent (e.g., by delivering an incorrect message m), the broadcast
protocol is permitted to "force" such a process to halt. However, if the protocol halts
a process, then we require that the process must already have failed (for example,
by omitting to send or receive a point-to-point message).
Finally, we show that if protocol II solves a problem using a broadcast specifi-
cation assuming consistent or contamination-free behavior, and broadcast protocol
e implements the broadcast in a system 5, ensuring consistency or contamination-
freedom (as appropriate), then II solves the problem using e in 5.
75
76
6.1 Extended model
In this section, we extend the formal model of computation presented in Section 2.1.
Where possible, we will not repeat any information already presented in that section.
Let ? denote the set of processes in a distributed system. The processes com-
municate by message passing along bidirectional communication links (sometimes
abbreviated to links). The processes execute an application protocol which assumes
that the processes communicate exclusively via a fault-tolerant broadcast. The pro-
cesses concurrently execute a broadcast protocol, a protocol that implements the
broadcast when processes communicate via point-to-point message passing.
Each process maintains a local clock taken from the set of positive integers, de-
noted 1. In general, a process's clock time may be different from the real time. Let
1Z denote the set of real times.
6.1.1 Process state
The state of each process consists of a application state and a communication state.
The application state is taken from the set Q?, and informally corresponds to the
part of the process state that is "associated" with the application protocol that the
processes are executing. The communication state is taken from the set Qe, and
informally corresponds to the part of the processes state that is "associated" with
the broadcast protocol that the processes are executing. Both the set of application
states and the set of communication states include the special state 1, called the
premature halt state.
6.1.2 Sending and receive messages
The processes communicate by sending and receiving point-to-point messages. A
point-to-point message is a tuple of the form (p, q, c, data), where p is the sender of
the message (denoted snd(m)), q is the intended recipient of the message (denoted
rpt(m)), c (the timestamp) is the time on p's clock at which the message was broad-
cast (denoted ts(m)), and data is a sequence of bits. Let s denote the set of all
point-to-point messages, and let S+ --H 2? denote the set of all subsets of s.
If a process p invokes send (S) at clock time c, where S is a set of point-to-point
messages, then for each m E S, we say that p sent m at c. If the result of process p
invoking the receive primitive at time c is the set of point-to-point messages S, then
for each message m E S, we say that p receives m at c. This is further described in
Section 6.2.
77
6.1.3 Broadcast protocols
Processes execute a broadcast protocol, 0, which specifies the messages to be deliv-
ered by the processes, the point-to-point messages to be sent by the processes and
the communication state transitions made by the processes. A broadcast protocol
consists of four functions: the delivery function, denoted 0d? the sending function,
denoted 0?, the communication state transition function, denoted 0r and the halt-
decision function, denoted Oh
The delivery function determines the next messages to be delivered; formally,
0d : ? X I x Qe x S+ M+. If at time c, p is in communication state 5 and
receives the set of point-to-point messages S, then p should deliver the sequence of
messages 0d(P, c, 5,
The sending function determines the next messages to be sent; formally, 08
? x I x Qe >c x (M U ? S+. If at time c, p is in communication state 5,
receives the set of point-to-point messages 5, and wishes to broadcast the message
m, then p should send the set of point-to-point messages 0?(p, c, 5, 5, m).
The communication state transition function determines the next communication
state; formally, Or :			x I x Qe x			x (MU?)			Qe. If at time c, p is in
communication state 5, receives the set of point-to-point messages 5, and broadcasts
the message m, then p should enter communication state Or(p, c, 5, 5, m).
6.1.4 Premature halting
The set of possible communication states includes the special state I, the (pre-
mature) halt state. Recall that Chapter 2 stipulated that a protocol II cannot
halt a process by changing its state to 1 from a state other than I. In con-
trast, a broadcast protocol's halt-decision function 0h can halt a process. Formally:
0h : ? x Ix Qe x H fOK,I?. If at time c, p is in communication state
and receives the set of point-to-point messages 5, then p should prematurely halt if
0h(P, c, 5, 5) is 1.
Once a process halts, it remains halted:
o+ Vp,Vc,V5,Vm: Or(p,c,I,S,m)=I.
Once a process halts, it does not subsequently send any point-to-point messages:
o+ Vp,Vc,V5,Vm: Os(p,c,I,5,m)=?.
Once a process halts, it does not subsequently deliver any messages:
o+ Vp,Vc,VS: 0?(p,c,I,S)=&
78
6.1.5 Execution of an application protocol using a
broadcast protocol
7			Initialization
(sn,se) (5n,init,se,init) ? (1,1)
c := 1
Main Loop			*1
do forever
(sn,se)			(sn,se) ? (1,1)
5 = receive() /? e receivesmessages from other processes */
if 0"?(p??se?5) = I
then (sn,se)			(I, I)			/* e decides if p should halt
D --H= deliver() = ??(p, c,?e,			5)			/* e "tells" II to delivermessages			*1
(sn,se) := (sn,se) ? (1,1)
m := Hm(p, c,s11,D))
broadcast m			/? 11 "tells" e to broadcasta message			*1
(sn,se) := (sn,se) ? (1,1)
send (Oa(p,c,?e, 5, m))			/* e sends messages to other processes			*1
(s11, se) := (flr(p, c,?ll,D),			O"r(p,c,?e, 5, m)) ? (I, I)
c := ct 1
od
Figure 6.1: The execution of II by process p
The execution of a protocol II using broadcast protocol e is illustrated in Figure
6.1 (compare with Figure 2.1 in Chapter 2). Each process is initialized to application
state ?H,init, communication state se,???t and to a clock value of 1. At each clock tick,
the broadcast protocol layer receives point-to?point messages from other processes,
and determines whether or not the process must halt. The broadcast protocol layer
then instructs the application protocol layer to deliver a sequence of messages. The
application protocol layer subsequently informs the broadcast protocol layer if a
message is to be broadcast, Then, the broadcast protocol layer sends a set of point?to-
point messages to other processes. Finally, the process changes state and increments
its clock.
79
A process may also non-deterministically halt during execution as the result of
a failure; such a halt is denoted (sii, se) := (5n, se) ? (I, I). As indicated in
Figure 6.1, we stipulate that such a failure may only occur at certain points during
execution. This assumption does not, however, affect the generality of our results.
6.2 Point-to-point message passing
We assume that point-to-point message passing is reliable. Informally, a point-to-
point message sent by a correct process reaches its destination without corruption or
duplication. Furthermore, if any process receives a point-to-point message, then that
message was actually sent by some process. We also assume that a correct process
always receives a point-to-point message sent by another correct process.
Let the link delay A be an upper bound on the time required for a point-to-point
message sent by a correct process to be received by another correct process.
To model reliable point-to-point message passing, we assume that each process p
has an inp?t and an ottpvt buffer for point-to-point messages, and that both buffers
are initially empty. Let IBp?c denote the contents of p's input buffer immediately
after p's clock changes to c, and let OBp,c denote the contents of p's output buffer
immediately before p's clock changes to c + 1. We assume that for all processes p,
IBp,1 and OBp?o are
o+ Sending: Suppose process p invokes send(S) at time c: If m = (p, q, --H, --H) E 5,
then m should be added to OB?,?; i.e., m E OBp,c. However, if p fails during
its invocation of the send primitive, only a subset of 5 may be added to OBp,c.
If m ? 5 and m is added to OBp,c, then we say p actvally sends m at time c.
o+ Reliable transmission: Suppose point-to-point message m = (p, q, --H, --H) is
added OB?,c (at time c); i.e., m ? OBp,e?i and m ? OB?,c: At some time
cl, c < c' < c + A, m is removed from OBp,c and added to IBq,c? as a single,
indivisible action; i.e., m ? IBq,c? and m ? OBpct.
o+ Receiving: Suppose process p invokes the receive primitive at time c: This
should result in the following indivisible actions--Hthe receive primitive returns
5, the set of point-to-point messages in IBp,c, and resets the input buffer to ?;
i.e., IBp,c is returned, and Vm: m ? IBp,c, ? m ? IBp,c+i.
If 5 is the result of p invoking receive () at time c, for all m ? 5, we say p
receives m at time c.
If p fails during the receive primitive, the set of messages returned by the
receive primitive may only be a subset of IBp,c. Furthermore, for all messages
m ? IBp,c, if m is not returned by the receive primitive, then p never receives
m; z.e., after the receive primitive, the input buffer is reset to ?.
80
6.3 Extended histories
The following functions describe the execution of an application protocol II using a
broadcast protocol e.
o+ The functions ?, P and S, which were defined in Chapter 2.
o+ The function ae : ? x I H Qe is called the history communication state
function--Hae(p, c) is process p's communication state when its clock changes
to c.
The function ? : ? x I H S+ is called the history sendin? funchon--H?(p, c)
is the set of point-to-point messages that p actually sent (added to its output
buffer) at clock time c; i.e., ii(p, c) = OBp,c --H OBp,c--Hi.
o+ The function p : ? x I H S+ is called the history receivin9 finction--Hp(p, c)
is the set of point-to-point messages that p actually receives at clock time c.
o+ The function T : x 1? H I is called the history real time function--Hr(p, t) is
p's clock time at real time t.
Let II be an application protocol, e be a broadcast protocol, a, ? and S be
history state, broadcast and delivery functions respectively, and ae, ? and p be
history communication state, sending and receiving functions respectively. The tuple
E = ((II, a, ?, ?), (?, ae, ?, p, T)) is an extended history of protocol II using broadcast
protocol e (or the history of II usin9 e). The tuple H = (11, a, ?, S) is the protocol
history of E. The tuple A = (a, P, S) is the application history of E (also of H).
The tuple C = ((P, S), (e, ae, 1, P, r)) is the communication history of E. The
tuple B = (P, ?) is the broadcast history of E (also of H and of C). If C --H
((P, S), (e, ae, ?, p, r)), then C is a communication history usin9 e (C uses
A communication history C = ((P, ?), (?, ae, 1, P, r)) is well-formed if:
o+ 0- is a broadcast protocol, ae is a history communication state function, ? is
a history sending function, and p is a history receiving function.
o+ 1? = (P, ?) is a well-formed broadcast history.
o+ Every process initializes its communication state to ?e,???t or halts before ini-
tialization.
Vp: ae(p, 1) = se,???t or ae(p, 1) = I.
o+ Once p halts, it remains halted, and sends and receives no further point-to-
point messages, and broadcasts and delivers no further messages:
Vp,Vc: ae(p,c)=I ae(p,c+l)=tand
?(p, c) = p(p, c) = P(p, c) = S(p, c) =
81
o+ Once p is "forced" to halt, it remains halted, and sends no further point-to-
point messages, and broadcasts and delivers no further messages:
Vp7Vc: ??(p,c,ae(p,c),p(p,c)) = I
ae(p,c+ 1) = land `t(p,c) = P(p,c) = S(p,c) =
o+ Every process changes communication state according to 0-, or halts:
Vp,Vc: ae(p,c+ 1) = ?7(p,c,ae(p,c),p(p,c),P(p,c)) or ae(p,c+ 1) = I.
o+ Every process delivers messages according to 0, or halts:
Vp, Vc: ?(p, c) = 0d(P, c, ae(p, c), p(p, c)) or (S(p, c) = ?, ae(p, c + 1) = I).
o+ At all times c, if a process sends or receives a point-to-point message, then the
point-to-point message is of the form (p, --H, c,
Vp,VQVrn#?: m??p,c)orm?p(p,c)
o+ No process sends "extra" point-to-point messages:
Vp,Vc: t(p,c) C 0s(p,?ae(p,c),p(p,c),P(p,c)).
o+ Every process receives any point-to-point message m at most once:
Vp,Vc: Vm?p(p,c) ? Vc'#c:m?p(p,c').
o+ If any process receives a point-to-point message, then the message was sent:
Vp,Vc: Vm?p(p,c) ? ?q,?c':mE4q,c').
o+ Message receipt does not take "zero time," and takes no longer than A time:
Vp,Vc:Vm?p(p,c) ? O<c--Hts(m)<?A.
o+ All processes have perfectly synchronized clocks:
Vp,Vq,Vt: r(p,t) =
Note that a well-formed communication history corresponds to the communication
history of an execution in which processes are not subject to Byzantine (i.e., arbi-
trary) failures [LSP82], message passing is reliable, and clocks are perfectly synchro-
nized.
An extended history E = ((II, ? p, S), (0, ae, ?, p, r)) is well-formed if:
o+ c = ((P,S),(0,?e,?,p,r)) is well-formed.
o+ H = (ll,a,P,S) is well-formed.
o+ When a process halts, both the communication state and the application state
are set to I:
Vp,Vc: ?e(p,c)=l ? a(p,c)=l.
6.4 Systems and process failures
A system 5 is a set of well-formed communication histories. We say a communication
history C is a (communication) history of system 5, if C ? 5. An extended history
E is a history of 5 (denoted E ? S) if C, the communication history of E, is such
82
that C ? s. A broadcast history B is a (broadcast) history of S (denoted B E S)
if B is the broadcast history of a communication history C and C c s. A protocol
history H is a protocol history of S, (denoted H c 5) if B, the broadcast history of
H, is such that B c 5.
Suppose C = ((P, ?), (?, ae, 1, P, T)) is a well-formed communication history. We
first define what it means for a process p to fail to send and receive messages at time
o+ p sends correctly at c in C if ?(p, c) = ?8(p, c, ae(p, c), p(p, c), P(p, c)).
o+ p receives correctly at c inC if Vq,Vm : m Ei ?(q,c--H A) A p = rpt(m) ?
act <c: m ? p(p,c')
The above definitions lead to the following definitions:
o+ A process p is correct at c in C if p does not halt by time c (??(p, c + 1) $ I),
and for all c' < c p sends and receives messages correctly at time c' in C.
o+ A process p fails by omission by time c in C if p halts by time c, or there is a
time c' ? c, such that p fails to send or receive messages correctly at time c'in
Note that p is not correct at c in C if and only if p fails by omission by c in C. Since
we are assuming that all systems are sets of well-formed histories, the above implies
that we only consider systems in which processes have perfectly synchronized clocks,
and may fail by omission. In particular, we do not consider systems that are subject
to timing failures [CASD85].
Let ?cc denote the set of processes that are not correct at time c in C (?cc = (p I p
is not correct at c in C?), and let ?c denote the set Uc>0 ?Cc.
If E is an extended history, and C is E's communication history, then for all
times c, ?EC = ?cc, and hence ?E =
The maximum number of faulty processes in a system 5, denoted f, is equal to
max?I?cI I C ?
6.5 Consistency and contamination in the
extended model
In this section, we define consistency and contamination-freedom in communication
histories.
Let C = ((P,?),(O,ae,?,p,?)) be any communication history, and let B --H
(P, ?) For the rest of this section we will be using this C and B.
Definition Process p is X-contamination-free in C with respect to ?B and 5 if for
all c, p is not X-contaminated at time c in 13 with respect to ?B and ?c
83
We define Halt(C,p) to be the time at which p halts in C, and LVT(C,p) to be
the time of p's last "visible action" in C. Our definitions are analogous to those in
Chapter 3 of Halt(A, p) and LvT(A, p) for an application history A.
co			if p does not halt
Halt(C,p)			=			min(c I ae(p,c + 1) = I)? otherwise
LVT(C,p)			=			max (Halt(C,p) --H 1, LB(B,p))
Thus, we define a consistent process as follows:
Definition Process p is X-consistent in C with respect to ?B and 5 if for all c <
LVT(C,p), p is X-correct until time c in B with respect to ?B and ?c.
We now define consistent and contamination-free communication histories.
Definition C is X-consistent (x-contamination-free) with respect to ?B and 5 if
and only if for all p, p is X-consistent (x-contamination-free) in C with respect to
?B and 5.
Recall that C = ((P,?),(?,ae,?,p,r)) and B = (p,S). Consider any extended
history E, such that C is E's communication history. Suppose E = ((II, a, ?, S), (e, ae, 1, P,
Let A = (a, P, ?) Lemmas 6.1 and 6.2 compare x-consistent and X-contamination-
freedom in the two histories C and A.
Lemma 6.1 Process p is X-consistent in C with respect to ?B and 5 if and only if
p is X-consistent in A with respect to ?B and ?c.
Proof: If p is X-consistent in C with respect to ?B and 5, then for all c _
LVT(C,p), p is X-correct until time c in B with respect to ?B and ?c. If p is x-
consistent in A with respect to ?B and ?c, then for all c < LVT(A,p), p is X-correct
until time c in B with respect to ?B and ?c. Thus, the lemma follows if we show
that LVT(A,p) equals LVT(C,p).
Since E is well-formed, Vp,Vc : a(p,c) = I ae(p,c) = I. Thus,
Halt(A,p) = Halt(C,p), and hence LVT(A,p) = LVT(C,p). c
6.2 Process p is X-contamination-free in C with respect to ?B and 5 if and
is x-contamination-free in A with respect to ?B and ?c.
Lemma
only if p
The following lemma follows from definitions and the above lemmas:
Lemma 6.3 C is X-consistent (x-contamination-free) with respect to ?B and 5
if and only if A is x-consistent (x-contamination-free) with respect to ?B and
Failed(C, A).
84
6.6 Broadcast protocols implementing a
broadcast specification
So far, we have not excluded trivial broadcast protocols, such as one which forces
all processes, correct and faulty, to halt. To preclude such protocols, we first define
what it means for a broadcast protocol e to respect correctness in a communication
history C. Informally, such a broadcast protocol does not "force" a correct process
to halt. In other words, if e "forces" a process p to halt at time c, then p failed
by the time the halt-decision function was evaluated at time c; i.e., p failed by time
c --H 1, or p failed to receive messages at time c.
Definition Broadcast protocol 0- respects correctness in comm?nication history C --H
((P,?),(O,ae,t,p,?)) if for all p and c svch that ??(p,c,?e(p,c),p(p,c)) = I,
either p ? ?cc?l, or p does not receive correctly at c in C.
The following definitions state what it means for a broadcast protocol to imple-
ment a broadcast specification in a system.
Definition Broadcast protocol 0- implements broadcast specification ?B in system 5
if for all C = ((P, ?), (?, oe, ?, p, T)) e 5, ?B((P, ?), ?c) and e respects correctness
in C
Definition Broadcast protocol 0 implements broadcast specification ?B and ensires
x- consistent (x- contamination-free) behavior in 5 if
o+ 0- implements ?B in 5, and
o+ for all commttnication histories C e 5 svch that C vses e, c is X-consistent
(x-contamination-free) with respect to ?B and 5.
6.7 Solving a problem
We have already defined what it means for an application protocol to solve a problem
using a broadcast specification (Chapter 2). We now define what it means for an
application protocol to solve a problem using a particular broadcast protocol in a
system.
Definition Application protocol II solves problem ?A ltsin9 broadcast protocol e in
systemS if and only ifVE = ((ll,a,P,?), (?,ae,?,p,r)) e 5, ?A((QP,S),?E).
Finally, we prove the following theorem that relates the two concepts mentioned
above: solving a problem using a broadcast specification, and solving a problem
using a broadcast protocol in a system.
85
Theorem 6.4 If an application protocol II solves a problem ?A using a broad-
cast specification ?B assuming X-consistent (x-contamination-free) behavior, and
a broadcast protocol e implements ?B and ensures x-consistent (x-contamination-
free) behavior in some system 5, then II solves ?A using e in 5.
Proof: Suppose II solves ?A using ?B assuming x-consistent (x-contamination-
free) behavior, and e implements ?B and ensures X-consistent (X-contamination-
free) behavior in 5.
Suppose E = ((II, ? p, S), (e, oe, tL? P, r)) is any extended histories of II using 0-
in system 5. Let C, H, A and B be E's communication, protocol, application and
broadcast histories respectively. To prove the theorem, we must show that ?A(A, FE)
is satisfied.
Since E ? 5, C ? 5. Since 0- implements ?B, ?B(B, Fc). Since e also ensures
x-consistent (x-contamination-free) behavior, C is X-consistent (x-contamination-
free) with respect to ?B and 5. By Lemma 6.3, A is X-consistent tx-contamination-
free) with respect to ?B and Fo.
Since E ? 5, H ? 5. Since II solves ?A using ?B assuming x-consistent (x-
contamination-free) behavior, for all failure sets F, if ?B(B, F) and A is X-consistent
(x-contamination-free) with respect to ?B and F, then ?A(A, F) is satisfied.
Since ?B(B, Fc) is satisfied, and A is X-consistent (x-contamination-free) with
respect to ?B and Fc, ?A(A, Fc) is satisfied. Recall FE = Fc, and hence ?A(A, FE)
is satisfied.
We have shown that II solves ?A using e in 5, thus proving the theorem. E)
Chapter 7
Atomic Broadcast Protocols
In Chapter 5, we derived definitions of inconsistency and contamination with respect
to atomic broadcast. In this chapter, we present atomic broadcast protocols that
prevent inconsistency and/or contamination in systems where processes are subject
to omission failures.
7.1 Preliminaries
When informally describing a protocol, we "follow" the broadcast of a particular
message m, by describing all the actions taken by the broadcaster of m (to broadcast
m), and the actions taken by any other process to deliver m.
A broadcast protocol is formally described using the four functions provided by
a broadcast protocol--Hthe delivery function, the sending function, the transition
function and the halt-decision function (See Chapter 6). However, for simplicity, we
avoid such a formal description.
We describe a broadcast protocol operationally (in pseudo-code), by giving the
actions taken by any process p at any instant of clock time c. In general, at time c,
p receives all incoming point-to-point messages, determines whether or not to halt,
1 delivers messages, and then sends point-to-point messages. To make the protocols
more readable, we choose to follow a more "natural" presentation; for example, by
giving the pseudo-code required to broadcast a message before giving the pseudo-
code required to deliver a message. However, the actual sequence of operations is as
outlined above and formally defined in Figure 6.1 in Chapter 6.
To prove that a broadcast protocol implements broadcast specification ?B in a
1Recall that a broadcast protocol is permitted to force a faulty process to halt
before it becomes inconsistent.
86
87
system S, we prove that for all primitive histories C of the protocol in system S, the
corresponding broadcast history B satisfies ?B with respect to the processes that
are actually faulty; i.e., ?B(B, Jc) is satisfied. In this context, a faulty process p is
one that is in ?c, and a correct process q is one that is not in ?c.
7.1.1 Communication
Recall that in the extended model (Chapter 6), we assumed that processes commu-
nicate via reliable message passing (Section 6.2). We write "p sends m to q at time
c" when we mean that p sends the point-to-point message (p, q, m, c) at time c. We
write "p sends m to all at time c" when we mean that for all processes q, p sends the
point-to-point messages (p, q, m,
Without loss of generality, we assume that the message sends and receives obey
the following fifolinks property:
Fifo-links: If p sends m to q before p sends m' to q, and q receives m' at time
??, then q receives m at time c ?
If p sends m and m' to q at time c, and q receives m' at time c', then q also
receives m at time c'.
Using "piggybacking" techniques and/or sequence numbers, it is straightforward to
implement send and receive primitives with the above fifo-links property.
To simplify the presentation, our atomic broadcast protocols use (as a subroutine)
a fault-tolerant broadcast called a basic broadcast. Formally, a basic broadcast is a
timed reliable broadcast that also satisfies the following uniform fifo order property:
Uniform Fifo Order: If any process p broadcasts a message m before it broad-
casts a message m', and any process q delivers m', then q delivers m before
Note that p and q may both be faulty in the above definition.
Thus, a basic broadcast is one that satisfies the validity, agreement, integrity,
A-timeliness and uniform fifo order properties.
It is straightforward to implement a basic broadcast with a latency of A = (f+1)A
in a system subject to omission failures, where A is the link delay and f is the
maximum number of faulty processes in the system.
To avoid possible confusion (since we use more than one type of broadcast in
our protocols), we "tag" each broadcast and delivery with the broadcast type--H
for example, we write broadcast (atomic, m) to mean that m is broadcast using an
atomic broadcast protocol.
88
When informally describing an atomic broadcast protocol, or in a proof of cor-
rectness, we sometimes write "p atomically broadcasts (delivers) m" instead of "p
broadcasts (delivers) m using an atomic broadcast;" similarly we sometimes write
"p basic broadcasts (delivers) m" instead of "p broadcasts (delivers) m using a basic
broadcast."
7.1.2 Overview of results
We first present a simple atomic broadcast protocol. This protocol has a latency of
A; that is, if a process atomically delivers a message m at time c in any execution
of the protocol, then c --H ts(m) < A.
We present an atomic broadcast protocol that ensures BD-contamination-free be-
havior, and has a latency of A (which is known to be optimal). 2 We also show how
to reduce the message complexity of this protocol. Indeed, similar message reduc-
tion techniques can be used to reduce the message complexity of all the subsequent
protocols.
Recall that the prevention of BD-contamination is a more stringent restriction
than the prevention of other forms of contamination. Therefore, our protocols pre-
vent D-contamination, VBD-contamination and BD-contamination with respect to
atomic broadcast.
We then consider the problem of inconsistency, and present a protocol that en-
sures VED-consistent behavior with respect to atomic broadcast. This protocol has a
latency of A t A. Based on this protocol, we develop another protocol that prevents
VED-inconsistency with respect to atomic broadcast, and has optimal latency of A.
Clearly, the above protocols also ensure D-consistent behavior with respect to
atomic broadcast. However, they do not ensure BD-consistent behavior with respect
to atomic broadcast. Indeed, we prove that ED-consistent behavior with respect
to atomic broadcast cannot be enforced by a "non-blocking" protocol in a system
subject to omission failures.
Since we concentrate on ED-contamination and on VED-consistency, we use "con-
tamination" to mean ED-contamination, and we use "consistency" to mean VED-
consistency in this chapter.
2Although the protocol has a latency of A, it does not prevent contamination
with respect to timed atomic broadcast. The latency only serves as a measure of
efficiency of the protocol.
89
7.2 Atomic broadcast
The simple atomic broadcast protocol shown in Figure 7.1 is well-known [Lam84].
The protocol in Figure 7.1 is informally described below, by "following" the
broadcast of a message m by a process p. To atomically broadcast a message m --H
c, data) at time c, process p broadcasts (basic, m). If by time c + A, a process
q delivers (basic, m), then q atomically delivers m at time c + A (and at no other
time). Furthermore, if q atomically delivers another message ?? at time c + A,
then q atomically delivers m before m1 if and only if m precedes m' in lexicographic
orde?t.e., an order that corresponds to the timestamp of messages, with ties broken
using process names and message contents. Such an ordering of messages is more
commonly known as timestamp order.
The protocol also ensures the additional property that all processes which atom-
ically deliver a message m do so at the same instant of local time; this is called the
simultaneity property:
o+ Simultaneity: If processes p and q deliver message m, then they both deliver
m at the same local time.
1* p atomically broadcasts a message m = (p, c, data) at time c
To broadcast (atomic, m = (p, c, data)) p broadcasts (basic, m)
p delivers basic broadcast messages at c
MpC = fm ts(m) = c --H A, p delivered (basic, rn)?
p atomically delivers messages at c
D =--H sort MpC in timestamp order
p delivers (atomic, D)
Figure 7.1: Atomic broadcast protocol
Executed by process p at time c
Formally, the protocol is described in pseudo-code in Figure 7.1 by giving the
actions taken by any process p at any instant of clock time c. If p wishes to atomically
broadcast a message m, p broadcasts (basic, m). Let MpC denote the set of messages
timestamped c --H A that p has delivered using the basic broadcast. Process p sorts
90
this set of messages in timestamp order (i.e., lexicographic order), and atomically
delivers this sorted sequence of messages.
The properties provided by the basic broadcast imply that the protocol guarantees
the validity, agreement, integrity and total order properties, and hence the protocol
is an atomic broadcast protocol. It is also straightforward to prove that the protocol
satisfies the A-timeliness property (i.e., has a latency of A), and guarantees message
delivery in timestamp order.
Theorem 7.1 The protocol in Figure 7.1 is an atomic broadcast protocol that toler-
ates omission failures, and has a latency of A.
7.3 Atomic broadcast with no contamination
We now present an atomic broadcast protocol that ensures BD-contamination-freedom
(Figure 7.2). We then describe an optimization to reduce the message complexity of
the protocol.
To atomically broadcast m = (p, c, data) at c, process p broadcasts (basic, b),
where b is the tuple b = (p, c, (data, D?vDpC)), and D?vDpC is the sequence of messages
that p has atomically delivered by c. If process q delivers (basic, b) by time c + A,
then q atomically delivers (p, c, data) at c+ A if and only if DLVDCq = DLVDpC In other
words, q atomically delivers m only if p and q atomically deliver the same sequence
of messages by time ts(m).
We show that this protocol is an atomic broadcast protocol--Hi.e., it satisfies the
validity (Lemma 7.5), agreement, total order (Lemma 7.4), and integrity properties.
Since the protocol also satisfies the fifo order (Lemma 7.8) property, and the delivery
of message = (q, c', --H) by p is conditional upon whether ? D?vDp (Lemma
7.7), Theorem 5.11 in Chapter 5 implies that the protocol prevents BD-contamination
with respect to atomic broadcast.
Lemma 7.2 If any process atomically delivers m, then it atomically delivers m at
time ts(m) + A.
Lemma 7.3 If a correct process atomically delivers m, then all correct processes
atomically deliver m.
Proof: Suppose for contradiction that the lemma is false. Let c be the earliest
time such that some correct process q atomically delivers a message m at time c + A,
m was broadcast at time c, and another correct process r does not atomically deliver
m. Thus, q and r atomically delivered the same set of messages by time c, and
Lemma 7.2 implies that D?vDrC = D?vDCq
91
p atomically broadcasts a message m = (p, c, data) at time c *1
To broadcast (atomic, m = (p, c, data))
p broadcasts (basic, b = (p, c, (data, DT+VDpC)))
/* p delivers basic broadcast messages at c
MpC =--H ?b ts(b) = c --H A, p delivered (basic, b)?
1* p atomically delivers messages at c
T --H= ?(q, c --H A, data') (q, c --H A, (data', DU+VDCq?A) ? Mf,
p			q
D =--H sort T in timestamp order
p delivers (atomic, D)
Figure 7.2: Atomic broadcast protocol that prevents BD-contamination
Executed by p at c
Suppose p is the broadcaster of m (i.e., p = bc(m)). Since q atomically delivered
m, q delivered (basic, b), b = (p, c, (rn, D?vDpC)) by time c + A, and D?vDpC = DLVDCq
Since q and r are correct, the properties of broadcast (basic, b) and deliver (basic, b)
ensure that also delivered (basic, b) by time c + A. Since r did not atomically
deliver m, DLVDpC $ D?VDrC Thus, D?VDrC $ D?vDqC, a contradiction. c
Lemma 7.4 For all correct processes p and q, and all times c, D?vDpC = D?VDCq
Proof: From Lemmas 7.2 and 7.3.
0
Lemma 7.5 If the broadcaster of a message m is correct, then all correct processes
atomicallV deliver m.
Proof: It is clear from the protocol that if a correct process p atomically broad-
casts m, then p atomically delivers m. The lemma follows from Lemma 7.3 0
Lemma 7.6 is a consequence of the fact that any process p can only atomically
deliver m at time ts(m) + A, and at no other time.
92
Lemma 7.6 If p and q atomically deliver messages m and m', then p atomically
delivers m before m' if and only if q atomically delivers m before m1.
Lemma 7.7 For all p, q and times c, if D?vDCq = D?vDpC then D?vDCq ? DT+VDp
Lemma 7.8 If p atomically broadcasts m before m', and any process q atomically
delivers m1, then q atomically delivers m before m'
Proof: The proof is by contradiction. Suppose for contradiction that p atomically
broadcasts m before m' (i.e., ts(m) < ts(m?), and q atomically delivers m', but does
not atomically deliver m by the time it atomically delivers m1. Let c denote ts(m)
and c' denote ts(m1).
Since q atomically delivers m' at c' + A, DT?VDpC' = D?vDCq' Since c < c' Lemma
7.2 implies that D?vDpC = D?vDCq
Since q atomically delivers m', q delivers (basic, m') at or before time c' + A.
Hence, by the fifo order and A-timeliness properties satisfied by basic broadcast and
delivery, q delivers (basic, m) by time c + A. Since q did not atomically deliver m,
we conclude that D?vDpC $ D?vDCq			This contradiction completes the proof of the
lemma.			0
Theorem 7.9 The protocol in Figure 7.2 is an atomic broadcast protocol that pre-
vents the contamination of correct processes, tolerates omission failures and has a
latency of A.
Proof: It is clear that the integrity property is satisfied. Hence Lemmas 7.4, 7.5,
and 7.6 imply that the protocol is an atomic broadcast protocol. Lemma 7.8 implies
that the fifo order property is satisfied. The protocol implies that if p atomically
delivers m = (q, c, ), then D?vDCq = D?vDpC, and hence (by Lemma 7.7) DU+VDqC??
DLVD?. Thus, Theorem 5.11 in Chapter 5 implies that the protocol prevents BD-
contamination with respect to atomic broadcast.
The latency of A follows directly from the protocol.			0
The simple protocol is expensive in terms of message size. However, the fifo order
and simultaneity properties allow us to reduce the size of messages. We now describe
an `mproved version of the above protocol in which the broadcaster p of a message
m at time c uses an efficient encoding of D?vDpC, as explained below.
For all processes p and all times c, let M5gspC be a vector that contains the following
information about the number of messages atomically delivered by p: for all processes
93
q let Ms95pC(q) = k if and only if, by time c, p has atomically delivered k messages
broadcast by q.
Suppose for processes p, q and r and time c, MsgspC(q) = Msgs?C(q) = k. The
fifo order of messages delivery implies that at time c, both p and r have atomically
delivered the first k messages that q atomically broadcast (and no other messages
broadcast by q). Furthermore, the simultaneity property implies that they atomically
delivered these messages at the same times. Thus, Ms9spC = Ms9srC if and only if
DLV = DLVDCr Hence, the vector MsgspC can replace the message sequence DLV wC
in the simple protocol (Figure 7.2).
If c' < c, then Ms9spC equals MsgspC' plus an increment" vector Inc?C?C', where
IncpC?C (q) is the number of messages originated by qthat p atomically delivered after
time c' and by time c. With the improved protocol, to atomically broadcast m at
time c, process p broadcasts (basic,b), where bis the tuple (p,c,(data,IncpC?C')), and
c' is the time of p's previous atomic broadcast. Thus, each message that patomically
delivers between c and c1 can increase the size of IncpC?C' by at most one bit. This is
in contrast to the broadcast (basic,(p,c,(data,DLVDpC))) required by the protocol in
Figure 7.2, where the broadcast message includes the sequence of all the messages
that p atomically delivered by time c.
7.4 Atomic broadcast with no inconsistency
In this section, we present broadcast protocols that implement atomic broadcast and
ensure VBD-consistent behavior.
In order to prevent contamination, each process q need only perform a "local"
check (e.g., comparing ?Ly'1)??ts(m) with ??`v?tqS(Tfl)) prior to atomically delivering a
message m atomically broadcast by p. Such a local check is not sufficient to prevent
inconsistency. This is illustrated by the following example.
Let p be a correct process, q be a faulty process, and c be a time such that
DLVDpC = DLVDCq Suppose p atomically broadcasts a message m at c, and subse-
quently atomically delivers some message m'before atomica$ly delivering m. Suppose
that q fails to atomically deliver m'. Although D?VDpC = DLVDqC, q cannot atomically
deliver m; if it did, it would become inconsistent because it would not have atomically
delivered m' before m.
Informally, to ensure consistency, a process patomically delivers monly if it agrees
with the correct processes on the sequence of messages that should be atomically
delivered before m. Our protocol only works if no more than half the processes in
the system can be faulty. We show that the requirement n> 2f is necessary for any
94
protocol that prevents inconsistency with respect to atomic broadcast.
Theorem 7.10 Any atomic broadcast protocot that ensnres D-consistency in a sys-
tem subject to omission failures requires n > 2f
Proof: Suppose for contradiction that the theorem is false, and that there is
an atomic broadcast protocol that ensures D-consistency in a system with f faulty
processes and n = 2f processes. Let A and B be any two sets of processes, such that
IA = IB = f, and AflB = ? Let p be any process in set A.
Suppose p atomically broadcasts a message m at some time c. We consider two
possible executions of the protocol. In both these executions, we assume that no
message is received in less than A time; thus either a message is received exactly
A time after it was sent, or it is never received. Furthermore, we assume that if a
message is supposed to be sent by a process in A to another process in A, then the
message is actually sent and is received. Similarly, if a message is supposed to be
sent by a process in B to another process in B, then the message is actually sent
and is received.
Scenario 1: Suppose the processes in A are correct, and the processes in B are
faulty. Suppose that whenever a process a in A sends a message to a process b in B,
b fails to receive the message. Suppose also that whenever a process b in B sends a
message to a process a in A, b fails to send the message. In this scenario, it is clear
that the processes in A must eventually atomically deliver m, and the processes in
B never atomically deliver m.
Scenario 2: Suppose that the processes in A are faulty, and the processes in B are
correct. Suppose that whenever a in A sends a message to b in B, a fails to send the
message; whenever b in B sends a message to a in A, a fails to receive the message.
The two scenarios are indistinguishable to the processes in A and B. Hence, as
in Scenario 1, the processes in A atomically deliver m and the processes in B never
atomically deliver m. However, in this execution (unlike the execution described in
Scenario 1), the processes in A are faulty. It is clear that by atomically delivering
m (a message that the correct processes never deliver), a process a in A becomes
D-inconsistent with respect to atomic broadcast and the failure set of A. Thus, this
execution of the atomic broadcast protocol contradicts the assumption that the pr?
tocol ensures D-consistency.			0
We now present a simple atomic broadcast protocol that prevents inconsistency
and has a latency of A t A (Figure 7.3). We then present a protocol that has a
latency of A which is optimal (Figure 7.5).
95
7.4.1 Simple protocol: latency A + A
Informally, our simple atomic broadcast protocol that prevents inconsistency works
as follows. To atomically broadcast m = (p, c, --H) at time c, p broadcasts (basic, m)
at time c.
If a process q delivers (basic, m) by time c + A, then q sends the message
m, AllMCq+A) to all processes at time c + A, where AllMCq+A denotes the set of all
messages m1, timestamped c or less, such that q delivered (basic, m').
At time c + A + A, if a process q receives at least n --H f messages of the form
m, AllMCr+A), with AllMCr+A --H AllMCq+A, then q delivers (atomic,m). In this
scenario, all the correct processes agree with q that m should be atomically delivered,
and also agree with q on the sequence of messages that should be atomically delivered
before m.
However, if q does not receive at least n --H f messages of the form described above,
then q halts. In this scenario, q is faulty and if it were to atomically deliver m, it
would become inconsistent.
The protocol is formally described in Figure 7.3. We show that the protocol is an
atomic broadcast protocol--Hi. e. it satisfies the validity, agreement, total order and
integrity properties. We also show that the protocol respects correctness (i.e., does
not halt a correct process), and ensures VBD.consistency. To do the latter, we show
that the protocol ensures both the uniform agreement and the fifo order properties,
and that the messages atomically delivered by any process are always a prefix of the
messages atomically delivered by the correct processes. Thus, Lemma 5.7 in Chapter
5 implies that the protocol prevents VBD-inconsistency.
Lemma 7.11 If any process atomically delivers a message m, then it does so at time
ts(m) + A.
Lemma 7.12 If no correct process halts before time c, then for all correct processes
q and r, AllMCq = AllMCr
Proof: From the properties of the basic broadcast, and the observation that if
a correct process halts at time c, it does so after receiving messages and delivering
basic broadcasts.
Lemma 7.13 Suppose m is atomically broadcast, and no correct process halts before
time ts(m) + A + A. If any correct process delivers (basic, m), then for every correct
process q, Rq,m > n --H
96
p atomically broadcasts a message m = (p, c, data) at time c
To broadcast (atomic, m = (p, c, data)) p broadcasts (basic, m)
p delivers basic broadcast messages at c
MpC = (m ts(m) = c --H A, p delivered (basic,m)i
AllMpC =--H Uc'<c ??
p sends messages at c *1
For all m ? MpC then p sends (p,m,AuMpC) to all
p receives all messages at c
14,m = p received (q,m, AUMqC?A), m ? MpC?A, AtlMpC?A = AllMc7A?
p checks whether it should halt */
If am e MpC?A such that IJ4,mI ? n --H f then p halts
1* p atomically delivers messages at c
D =--H sort MpC?A in timestamp order
p delivers (atomic, D)
Figure 7.3: Atomic broadcast protocol that prevents VBD-inconsistency
Executed by process p at time c
Proof: Suppose some correct process delivers (basic, m). From the properties
of the basic broadcast, all correct processes deliver (basic, m) by time c, where c
denotes the time ts(m) + A.
Let q be any correct process and let A denote the set of messages AUMCq
Since every correct process delivers (basic, m) by time c, the protocol and Lemma
7.12 imply that every correct process sends the message (m, --H,A) to all processes
at time c.
Since (by hypothesis) no correct process halts before time c + A, any correct pro-
cess that halts, does so after delivering basic broadcast messages at time c + A. Since
there are at least n --H f correct processes, q receives at least n --H f messages of the
form (rn, --H,A) by time c + A, and hence Rq,rn > n --H f. 0
97
Lemma 7.14 A correct process never halts.
Proof: Suppose for contradiction that the lemma is false. Let c be the earliest
time such that some correct process q halts at time c + A.
The protocol in Figure 7.3 implies that there is a message m, with ts(m) = cA,
such that q delivers (basic, m) by time c, and jRq,mI < n --H
Since q is correct and delivers (basic, m), Lemma 7.13 implies that IRq,mI > n--Hf.
This contradiction completes the proof.			0
Lemma 7.15 If a correct process atomically delivers message m, then all correct
processes atomically deliver m.
Proof: If a correct process atomically delivers m, then it delivers (basic, m);
thus, all correct processes deliver (basic, m). From the protocol, if any process basic
delivers a message m, then it either atomically delivers m at time ts(m) + A + A, or
it halts by time ts(m) + A + A, before atomically delivering any messages at time
ts(m) + A + A. Lemma 7.14 states that no correct process halts. Thus, we conclude
that all correct processes atomically deliver m at time ts(m) + A. 0
Lemma 7.16 For all correct processes p and q, and all times c, D?vDpC = D?vDCq
Proof: Follows from Lemmas 7.15 and 7.14.
0
Lemma 7.17 If q delivers (basic, m), and q does not atomically deliver m, then q
halts at or before ts(m) + A + A.
Proof:			Directly from the protocol.			0
Lemma 7.18 If a correct process atomically broadcasts message m, then all correct
processes atomically deliver m.
Proof: Suppose a correct process p atomically broadcasts a message m at time
c. From the protocol, p broadcasts (basic, m) at time c. The properties of the basic
broadcast imply that all correct processes deliver (basic, m).
Let q be any correct process. Since q is correct, and hence cannot halt (Lemma
7.14), Lemma 7.17 implies that q atomically delivers m. By Lemma 7.15 all correct
processes atomically deliver m.			0
98
Lemma 7.19 If p atomically broadcasts m before m', and any process q atomicatty
delivers m1, then q atomically delivers m before m'.
Proof: The proof is by contradiction. Suppose for contradiction that p atomically
broadcasts m before m' (i.e., ts(m) < ts(m?)), and q atomically delivers m', but does
not atomically deliver m by the time it atomically delivers m'. Let c denote ts(m)+A,
and c' denote ts(m') + A. From the protocol, q atomically delivers m' at c' + A.
Since p atomically broadcasts m before m', p broadcasts (basic, m) before p
broadcasts (basic, m'). Since q atomically delivers ??, q delivers (basic, m') at or
before time c'. The fifo and A-timeliness properties satisfied by the basic broadcast
imply that q delivers (basic, m) by time c.
Since q did not atomically deliver m, Lemma 7.17 implies that q halts at or before
c + A. Hence q does not atomically deliver any messages at or after time c + A and, in
particular, q does not atomically deliver m' at c' + A. This contradiction completes
the proof.			c
Lemma 7.20 If any process atomically delivers m, then ati correct processes atom-
ically deliver m.
Proof: Suppose some process q atomically delivers m. From the protocol,
IRq,mI > n --H f. Since n > 2f, there is at least one correct process r in Rq?m.
Since r ? Rq,m, r delivered (basic, m). The agreement property of the basic
broadcast implies that all correct processes also deliver (basic, m). Since correct
processes do not halt (Lemma 7.14), Lemma 7.17 implies that all correct processes
atomically deliver m.			c
Lemma 7.21 For all q, for all c, for all correct r, D?vDCq ? D?vDr
Proof: The proof is by contradiction. Suppose q is any process, r is a correct
process, and c is any time. Since the protocol ensures the simultaneity property,
DLVDqC ? DLVDr only if q skips a message; i.e., r atomically delivers a message m
before a message m1, and q atomically delivers m1, but does not atomically deliver
m. Assume for contradiction that this is the case.
Let c denote ts(m) + A and c' denote ts(m') + A. Since r atomically delivers m
before m' c < c'.
Since c < c' and q did not atomically deliver m and atomically delivered m'
Lemma 7.17 implies that q does not deliver (basic, m); hence m ? AllMCq
99
Since r atomically delivers m before m', r delivers (basic, m), and hence m ?
AllMrC Since correct pr,ocesses do not halt, Lemma 7.12 implies that for all correct
processes 5, m ? AllMC8. Thus at least n --H f processes do not agree with q's value
of ?llA??.
Since Rq,m? only consists of (a subset of) the processes that agree with q's value
of AflMCqI, and since n> 2f, Rq,in? is less than n --H f. Hence, q halts before atomi-
cally delivering m'. This is a contradiction.			0
Theorem 7.22 Suppose n > 2f. The protocol in Fi?ure 7.3 is an atomic broadcast
protocol that ensures VBD-inconsistent behavior, tolerates omission failures and has
a latency ofA+A.
Proof: It is clear that the integrity property is satisfied. Lemmas 7.18 and 7.16
imply that the validity, agreement, and total order properties are satisfied. Lemma
7.14 shows that the protocol respects correctness. Lemmas 7.19 and 7.20 imply that
the fifo order and uniform agreement properties are satisfied. Since Lemma 7.21
shows that for all q, for all c, for all correct processes r, D?vDqC ? DU+VDr, Lemma 5.6
in Chapter 5 implies that that no process becomes VBD-inconsistent. This completes
the proof.			0
7.4.2 Optimal protocol: latency A
The simple protocol described above consists of two parts. In the first part, a process
p broadcasts (basic, m) to all processes. The second part of the protocol begins at
time ts(m) + A. At that time, every process q is supposed to send an "acknowledge-
ment" which includes information about all the messages that q intends to atomically
deliver before m. These acknowledgements are received by the other processes by
time ts(m) + A + A. Based on the acknowledgements it receives, a process decides
whether to atomically deliver m or to halt.
This section describes an optimization of the protocol given in Figure 7.3. The
new protocol is optimal in that it permits a process to decide whether to atomically
deliver a message m at time ts(m) + A; that is, unlike the simple protocol which has
a latency of A + A, the new protocol has an optimal latency of A.
As in the simple protocol, the optimal protocol also consists of two parts, taking
A and A time respectively. The difference is that in the broadcast of a message
m, the second part of the optimal protocol begins at time ts(m) + A --H A, not at
time ts(m) + A (as in the simple protocol). Thus, the two parts of the protocol are
"pipelined" and the entire protocol takes at most A time to terminate.
loo
The optimal protocol assumes that f, the maximum number of faulty processes
in the system, is greater than 1. If this is not the case, then there are other (simpler)
atomic broadcast protocols that ensure consistency.
Suppose p wants to atomically broadcast m = (p, c, data) at time c. The first
part of the protocol, called a witness broadcast, is used to "disseminate" the message
? to all processes. Furthermore, each process q determines a set of witnesses to
the broadcast of m. Informally, a process r is a witness to the broadcast of m, if
r "believes" that p correctly disseminated the message m. At the end of the first
part of the protocol, all the correct processes agree on whether the cardinality of the
witness set for m is less than n --H
In the second part of the protocol, a process uses its witness sets to determine
if it should continue with message delivery or if it should halt before becoming
inconsistent. (Compare this with the AilM sets used in the simple protocol). The
second part of the protocol is explained in more detail later on in the text.
Witness broadcast
Any process p may witness-broadcast message m = (p, c, data) at any time c; this is
denoted p broadcasts (witness, m) at time c.
Informally, a witness broadcast is a basic broadcast, with the addition that every
process q maintains a monotone witness set for each message m that it witness-
delivers. Process q's witness set for a message m at time c is the set of processes
that, according to q's knowledge at time c, certify that the broadcaster of the message
m correctly broadcast m at time ts(m).
Let WitpC,m denotes p's witness set for message m at time c, immediately before
p's clock changes to c + 1. Informally, for any message m, at time ts(m) + A, the
correct processes agree on whether the cardinality of the witness set for m is at least
n --H f. Furthermore, at time ts(m) + A, if any process (correct or faulty) determines
that the cardinality of the witness set is at least n --H f, then all correct processes
witness-deliver message m by time ts(m) + A --H A.
For simplicity of presentation, we assume the following. Suppose a process p
crashes at time c. For all messages m, for all c' > c p's witness set for m at time c'
is identical to its witness set for m immediately before it crashed.
Informally, to witness-broadcast a message m = (p, c, data) at time c, p sends
(p, m) to all processes. Process p also immediately adds itself to WitpC,m. When p
first receives a message of the form (r, m), it adds r to its witness set for m; note
that it does not relay (r, m).
101
p witness-broadcasts a message m = (p, c, data) at time c *1
To broadcast (witness, m = (p, c, data))
p sends (p,m) to all
WitpC,m =
p witness-delivers its own broadcast ?/
For all m : p witness-broadcast m at c --H 1, p delivers (witness, m)
p receives all messages at c
ApC = ?(q,m) p received (q,m)?
RpC = (mi (--H,m) e ApC?
1* p updates its witness sets for messages it broadcast ?/
For all (q,m) ? ApC
if c ? ts(m) + A and p = bc(m) then WitpC,m = WitpC,mU fq?
p witness-delivers messages broadcast by other processes
For all m E RpC if c < ts(m) + A and Wit%1 undefined then
/* Note that p # bc(m) *1
p delivers (witness, m)
WitpC?m = ?bc(m)?
if c< ts(m)+A then
witc = witc
p,m			p,m			P
p sends (p, m) to all
p relays messages for messages broadcast by other processes
For all (q,m) ? ApC
if c ? ts(m) + A, p $ bc(m) and p has not sent (q, m) before then
WitpC,m = WitpC,mU (q?
p sends (q,m) to all
Figure 7.4: Witness broadcast protocol
Executed by process p at time c
102
When a process q $ p first receives a message of the form (r, m) by time c + A, it
adds p and r to its witness sets for m, and relays (r, m) to all processes. Furthermore,
if q $ p first receives a message of the form (r, m) at some time c? < c + A, then at
time c', q adds itself to its own witness set, and sends (q,m) to all processes.
The protocol is more formally described in Figure 7.4, and satisfies the following
lemmas. For simplicity, when the set WitpC,m is undefined, we say that WitpC,rn =
Lemmas 7.23, 7.24 and 7.25 follow directly from the protocol.
Lemma 7.23 For all messages m, for all processes p $ bc(m), for all time c, p
delivers (witness, m) at time c if and only if the broadcaster of m is first added to
p's witness set at time c; i.e., bc(m) ? WitpC,rn and bc(m) ? witpc;ml.
Lemma 7.24 For all messages m, for all processes p, for all time c, the first member
of p's witness set form is the broadcaster of m; i.e., if WitpC,m $ ?, then bc(m) ?
witc
pm
Lemma 7.25 For all processes p, for all messages m, for all time c, the set WitpC?m
satisfies the following conditions:
o+ Suppose p = bc(m). Process p joins its own witness set form when it witness-
broadcasts m; i.e., p = bc(m) and p broadcasts (witness,m) at c if and only if
p ? WitpC? and p ? Witp%m1
o+ Suppose p $ bc(m). If p witness-delivers m at time c < ts(m) + A, then p
joins its witness set for m at time c; i.e., p $ bc(m) delivers (witness,m) at
c ? ts(m) + A if and only if p ? WitpC?m andp ? witpc:ml.
Lemma 7.26 For all messages m, for all correct processes p $ bc(m), for all correct
processes q, such that neither p nor q process halts by ts(m) + A, for all time c, the
following condition holds. If some process r is in p's witness set for m at timc
c < ts(m) + A --H A, then r will be in q's witness set form by time c + A; i.e., for all
r, if c ? ts(m) + A --H A and r ? WitpC,rn, then r E Wiffq+,?A
Proof: Suppose p and q are correct processes. For some message m suppose that
c is the smallest time by ts(m) + A --H A, such that for some process r, r ? witpc,m;
thus, r ? WitpC?ml The proof is a case analysis, and we will only prove one of the
cases (r bc(m)); the other case (r $ bc(m)) is very similar to the one shown.
Case: r = bc(m): In this case, p first received a message of the form (--H, m) at
time c. Thus, p relays this message to all processes. Since p and q are correct, and the
link delay is bounded by A, q receives the relay by time c+A. Since c+A ? ts(m)+A,
we conclude that r ? WitqC+,?A
103
u
Lemma 7.27 below is proved by a "pigeonhole" argument. This technique will be
used in the proof of several results in this chapter and in Chapter 8.
Lemma 7.27 If a process p witness-broadcast a message m, and some process r first
receives (p, m) at time c > ts(m) + iA for some integer i, then at least i + 1 processes
sent the message (p,m) by time ts(m) + iA.
Proof: The proof is a "pigeonhole" argument, and is by induction.
For all processes q, if q receives m, let cq denote the time when q first receives
the message m; otherwise let cq = 00.
Basis: Suppose i = 0. The claim asserts that at least 1 process has sent (p, m) by
time ts(m). Since p witness-broadcast m at time ts(m), and r received (p, m), we
conclude from the protocol that p sent (p, m) at time ts(m).
Induction Hypothesis: Suppose the claim is true for i = i*, for some i* > 0.
Induction: Suppose i = i* + 1. Since c > ts(m) + iA, process r first receives (p,m)
after time ts(m)+iA. Let q be the first process that receives m after timets(m)+iA;
t.e., cq > ts(m) + iA, and for all processes s, either Cj < ts(m) + iA or c8 > cq.
Suppose q first receives (p, m) from some process s. From the protocol, when s
first receives (p,m) (i.e., at time c.), it sends (p,m) to all processes, and 5 sends
(p, m) at no other time.
Since message receipt does not take 0 time, c8 < cq, and thus c8 < ts(m) + iA.
Message link delay is bounded by A, and hence c8 + A > cq. Since cq > ts(m) + iA,
c8 > ts(m) + (i --H 1)A.
Thus, by the induction hypothesis, at least i processes sent the message (p, m)
by time ts(m) + (i --H 1)A. Since 5 sent (p,m) after time ts(m) + (i --H 1)A (and by
time ts(m) + iA), and 5 does not send (p, m) more than once, we conclude that at
least i + 1 processes, not including r have sent (p, m) by time ts(m) + iA. 0
Lemma 7.28 Suppose some process r first enters a process p's witness set for a
message m at some time c > ts(m) + kA, for some integer k. Then, at least k
processes, not including the broadcaster of m, have r in their witness sets by time
ts(m) + kA.
That is, if r E WitpC,rn and r ? Witp%rn1 and c > ts(m) + kA, then jfq q $
bc(m),r ? wittqs?(m)+?A?j >--H k
104
Proof: Suppose some process r first enters a process p's witness set for a message
m at some time c > ts(m) + kA, for some integer k. The proof is a case analy-
sis. If r = bc(m), then the lemma follows from Lemma 7.27. If r $ bc(m), then the
proof is by a "pigeonhole" argument, similar to the one in the proof of Lemma 7.27. 0
Lemma 7.29 Suppose a process p sends a point-to-point message to another process
q at some time c, and that q receives this message. For all messages m1 that p did
not witness-broadcast, for all c' < c, WitpClmI C witc'+A.
--H			q,mI
Proof: Suppose p, q and c are as in the statement of the lemma. Let m' be any
message that p did not witness-broadcast. Let c' be the smallest time such that for
some process r, r ? WitpC',nt The proof is complete if we show that r e
If r is the broadcaster of m', then p sends a message of the form (--H, m') to all
processes at time c'. If r did not broadcast m1, then p sends a message of the form
(r, m') at time c'. In either case, since c' < c, and point-to-point message passing is
assumed to satisfy the fifo-link property, q receives the message p sent at time c' by
the time it receives the ,message sent at time c. Since link delay is bounded by A, we
conclude that r ? WitCq,+?A1			0
Lemma 7.30 If a correct process p witness-broadcasts m at time c, and no correct
process hatts before time c + 2A, then for all correct processes q, I WitqC+,2A? > n --H f
Proof:
Suppose a correct process p witness-broadcasts m at time c, and no correct
process halts before time c + 2A.
Let q be any correct process. From Lemma 7.23 and the monotonicity of the
witness sets, p ? WitpC+,,nA and q C Witpc$mA Since q does not halt before time c + 2A,
Lemma 7.26 implies that for all correct processes r, r E WitCq$?2A Since there are
n --H f correct processes, I WitqC+?2A > n --H f. 0
Lemma 7.31 Suppose that no correct process halts before time ts(m) + A. If p and
q are correct, then Witpt8,m(1fl)+AI > n --H f if and only if WitqtS,?(Tfl)+A > n --H f
Proof: The proof is a case analysis, where c denotes ts(m).
Case: bc(m) is correct: From Lemma 7.30, since the broadcaster of message m is
correct, for all correct processes p, IWitpC+,m2AI > n--Hf. Since A = (f+1)A, and f>1
(by assumption), A > 3A. Thus, the lemma follows because no correct process halts
before time ts(m) + A, and the witness sets are monotone.
105
Case: bc(m) is faulty: Let p be any correct process such that some process
s e WitpC+,4 Let q be any correct process. The lemma follows if we show that
s ? WitC+A.
q,m
If s ? WitpC+,mA?A, then from Lemma 7.26, 5 e WitCq$4 Hence, suppose 5
WitC+A?A
p,m			c+A
Since 5 Ei Witp,rn ,and A = (f+1)A, Lemma7.28 shows that ifr Jr $ bc(m),s Ei
WitrC$mA?A?I > f. Since bc(m) is faulty, there is at least one correct process r, such
that s ? WitCr+,mA?A By Lemma 7.26, s ? WitCq$?A This completes the proof. 0
Lemma 7.32 Ifp $ bc(m) is correct c < A --H A, and no correct process halts before
time c + A, then for all correct processes q, WitpC,rn C WitCq+?A
Proof:			Follows from Lemma 7.26.			0
Lemma 7.33 S?ppose a faulty process witness-broadcasts m. Let 5 be such that
for all correct processes p, 5 ? WitptSin(TTL)+A?2A Let q be any faulty process, q $
bc(m). If either s ? WittqS?(?)+A, or 5 ? Witpt8,m(?)+A for some correct p, then 5
Wittq5?(?)+A?2A
Proof: Let 5 be such that for all correct processes p, 5 ? WitpC+,vnA?2A, where
c = ts(m). Let q be any faulty process, q $ bc(m). The proof is a case analysis.
Case: 5 ? WitCq+,4: Let c* denote the smallest time such that 5 e WitCq,?
Suppose for contradiction that c* > A --H 2A.
Since 5 ? WitCq+,?A?2A and A = (f + 1)A, Lemma 7.28 implies that the set X --H
fr r $ bc(m), 5 e WitrC,+inA?2Al is such that JXJ > f --H 1. Since (by hypothesis) for
all correct processes p, s ? WitpC$mA?2A, p ? X. Thus, only faulty processes are in
the set X. Since bc(m) is faulty, there are only f --H 1 faulty processes not including
bc(m). Since JXJ > f --H 1, we conclude that for all faulty processes r $ bc(m),
5 ? Witrc+,mA?2A; in particular 5 ? WitCq+,?A?2A This is a contradiction.
Case: 5 ? WitpC+,4 for some correct p: Since 5 ? WitpC+,4?2A and A = (f + 1)A,
Lemma 7.28 implies that fr r $ bc(m),s Ei WitrC+,mA?2A? > f --H 1. Since bc(m) is
faulty, there are only f --H 1 faulty processes not including bc(m); thus, for all faulty
processes r $ bc(m), 5 Ei WitCr+mA?2A Since q is a faulty process, 5 E Witc+A--H2A
Lemma 7.34 If for some p, WitptSm(?)+AI > n --H f, and no correct process halts by
time ts(m) + 2A, then for all correct processes q, q delivers (witness, m) by time
ts(m) + 2A < ts(m) + A --H A.
106
Proof: If bc(m) is correct, then the lemma is trivially true. Suppose bc(m) is
faulty. Lemma 7.25 implies that at least one correct process delivers (witness, m)
by time ts(m) + A, and Lemma 7.26 implies that all correct processes witness.deliver
m by time ts(m) + 2A.
Since f > 1 and A = (f + 1)A, 2A < A --H A. Thus, all correct processes
deliver (witness,m) by time ts(m) + A --H A. 0
Atomic broadcast protocol to prevent inconsistency
Informally, to broadcast m at time c, p broadcasts (witness, m) at time c.
At time c+ 2A, suppose p believes that there are fewer than n --H f witnesses for m;
t. e., Witpc$m2A ? n --H f. In this scenario, p failed to properly broadcast the message
m to at least one correct process, and hence to avoid becoming inconsistent, p halts.
At time c + A --H A, if a process q has already delivered (witness, m), then q sends
the message (m, q, WitpC+,n??A) to all processes.
At time c + A, if a process q has already delivered (witness, m), then:
o+ RCq+?A denotes the set of processes that appear correct to q. That is, r ? RCq+,?A if
q received r's message (m, r, WitrC+,mA?A), and r has correctly relayed its witness
sets to q.
o+ If r ? RCq$?A, and if r believes that there are fewer than n --H f witnesses to the
broadcast of m, then r ? ACq+,4; otherwise r E BqC+,?A Note that if r ? BqC+?A,
then, intuitively, q "knows" that r "believes" that p correctly broadcast m.
After computing the R, A and B sets, q uses these sets to determine whether to halt:
o+ If q = p and BqC%?Aj < n --H f then q halts. In this scenario, at least one correct
process believes that q failed to broadcast m successfully. Thus, q is indeed
faulty and, to avoid becoming inconsistent, q halts.
o+ If p ? RCq+,?A, 1WitqC$?1 n --H f and ACq+,?A? < n --H f then q halts. Since
p ? RCq+,4, p did not halt at time c + 2A, and hence there are at least n --H f
witnesses to the broadcast of m. However, q does not believe that there are
n --H f such witnesses. Since jACq+,?A <n --H f, fewer than n --H f processes agree
with q on this (i.e., fewer than n --H f processes agree with q that there are
fewer than n --H f witnesses). Hence q erroneously believes that p is faulty, and
to avoid becoming inconsistent, q halts.
o+ If WitqC$A > n--Hf and jRqC$?A <n--Hf then q halts. Process q determines that
there are at least n --H f witnesses to the broadcast of m, but also determines
that it failed to correctly relay this information to the other processes. Thus,
q is faulty and halts.
107
p atomically broadcasts a message m = (p, c, data) at time c
To broadcast (atomic, m = (p, c, data)) p broadcasts (witness, m)
p delivers basic broadcast messages at c
Mf = c = ts(m) + A, p delivered (witness,m)?
LpC = ?m c = ts(m) + A --H A, p delivered (witness,m)?
p sends messages at c
E =--H sort LpC in timestamp order
for eacb m in E in order send (m, p, WitpC,m) to all
p receives all messages at c
RpC? = p received (m, q, Witqffin?), ts(m) = c --H A and
(either p = bc(m) or WitpC?,m2A C WitCq??A)?
ApC,m = fq?q ? RpC,m and Witq%?A ? n --H
BpC?m =?i?,m ApC,m
Figure 7.5: Atomic broadcast protocol that prevents VBD-inconsistency
Executed by process p at time c
Finally, if q does not halt, then q atomically delivers m at time c + A if and only if:
o+ q determines there are at least n --H f witnesses to the broadcast of m, and
o+ q has atomically delivered all the messages that p atomically broadcast before
it broadcast m.
Figure 7.5 formally describes our atomic broadcast protocol that prevents incon-
sistency and has optimal latency.
Lemma 7.35 Stppose m is an atomically broadcast message and let c be ts(m) + A.
Svppose no correct process halts before time c. If WitpC,mI > n --H f for some process
p, then for all correct processes q, q sends (m, q, WitCq?,?A) to all at time c --H A.
Proof: Suppose I WitpC,mI > n --H f for some process p. Let q be any correct
process. Lemma 7.34 states that q delivers (witness,m) by time c--H A. Hence (from
theprotocol)q sends (m, q, WitqC$?A).			5
108
/* p checks whether it should halt */
If 3m c --H ts(m) + 2A, p = bc(m) and I WitpC?I <n --H f then halt
If ?m C MpC, p = bc(m) and IBp$mI <n --H f then halt
If 3m ? MpC, bc(m) ? RpC?m, WitpC,mI <n --H f and IApC?m( <n --H f then halt
If am E Mf, WitpC,mI > n --H f and IRpC,mI <n --H f then halt
p atomically delivers messages at c
X --H= fm m ? MpC?A,IWitpC,mI > n--Hf,
bc(m) broadcast m' before m, and p delivered (atomic, m')i.
D =--H sort X in timestamp order
p delivers (atomic, D)
Figure 7.5: Atomic broadcast protocol that prevents VBD-inconsistency
(continued)			Executed by process p at time c
Lemma 7.36 A correct process never halts.
Proof: Suppose for contradiction that the lemma is false. Let c be the earliest
clock time such that some correct process p halts at clock time c.
Case: am, ts(m) = c --H 2A, p = bc(m) and WitpC,mI < n --H f: Since p is correct,
and broadcasts (witness, m) at time c --H 2A, and no correct process halts before time
c, Lemma 7.30 states that WitpC,mI > n --H f. This is a contradiction.
Case: am ? MpC, p = bc(m) and IBpC,mI ? n --H f: Let q be any correct process.
Since p is correct, Lemma 7.30 implies that I WitqCj?A > n --H f. Lemma 7.35 shows
that q sends (m, q, WitqC??A) to all processes at time c --H A. Since both p and q are
correct, q ? BpC,m.
Since there are at least n --H f correct processes, IBpC,mI > n --H f, a contradiction.
Case: am ? MpC, bc(m) E RpC,in, I WitpC,mI < n --H f and IApC,mI < n --H f: Let
q = bc(m). Since q sends (m,q, WitqC$?A), and since A --H A > 2A, q did not halt
at time ts(m) + 2A, and hence WittqS,?)+2? > n --H f. Since the witness sets are
monotone, WitqC,? > n --H
109
Let r be any correct process. Since no correct process halts before time c, Lemmas
7.34 and 7.35 imply that r delivers (witness, m) and sends (m, r, WitrC:n?) to all
processes by time c --H A. Since WitpC,mI ? n --H f and p is correct, Lemma 7.31 states
that for all correct processes r, WitrC,m I < n --H
Since both p and r are correct, process p receives (m, r, WitrCTn?) (from r) by time
c, with Witr%ii?I ? n --H f. Furthermore, Lemma 7.32 implies that r ? RpC,,n, and also
r C ApC,m
Since there are at least n --H f correct processes, IApC,mI > n --H f, a contradiction.
Case: Bm ? MpC,IWitpC,mI > n--Hf and IRpC,mI <n--Hf: Since IWitpC,mI > n--Hf,
from Lemma 7.35, each correct process r sends (m, r, WitrC:mA) By Lemmas 7.32 and
7.26, IRpC,mI > n --H f. This is a contradiction.
5
Lemma 7.37 If a correct process atomically broadcasts m then, all correct processes
eventually atomically deliver m.
Proof: Suppose for contradiction that the lemma is false. Let m be the first
message in timestamp order that is atomically broadcast by a correct process, and
not atomically delivered by a correct process p.
By choice of m, p atomically delivers all messages m? that bc(m) had atomically
broadcast before m. Since bc(m) is correct, witptsm(m)+AI > n --H f. Since no correct
process halts, p delivers (atomic,m) at time ts(m) + A. This is a contradiction. 5
Lemma 7.38 If a faulty process atomically broadcasts m, and some process p atom-
ically delivers m, then all correct processes atomically deliver m.
Proof: Suppose for contradiction that the lemma is false. Let m be the first mes-
sage in timestamp order that is atomically broadcast by a faulty process, such that
some process p delivers (atomic, m), and a correct process q does not deliver (atomic, m).
Let c denote ts(m) + A.
Since p delivers (atomic, m), the protocol shows that p atomically delivers all
the messages m' that bc(m) broadcast before m. Thus, by choice of m, all messages
m? that bc(m) broadcast before m were atomically delivered by both p and q.
Since p atomically delivers m, WitpC,mI > n --H f. Lemma 7.36 states that no
correct process halts. Lemmas 7.34 implies that all correct processes witness-deliver
m by time A --H A. Since q is correct, m C MqC?A Since q does not atomically deliver
m, jWitqC,? <n --H
110
The proof continues as a case analysis: Case: p is correct: If p and q are
correct and WitpC,mI > n --H f, Lemma 7.31 implies that WitqC,? > n --H f. This is a
contradiction.
Case: p is faulty and p = bc(m): Let r be any correct process. Since WitCq,? <
n --H f, from Lemma 7.26,1 Witr%mAI <n --H f. Hence, r cannot be included in Bp$m
Since there are at least n --H f correct processes, at least n --H f processes cannot
be included in Since n > 2f, IB??,?I ? n --H
Since p = bc(m) and IBpC,mI < n --H f, p halts before atomically delivering m, a
contradiction.
Case: p is faulty and p $ bc(m): We claim below that WitpC,m C WitCq?? If the
claim is true, since WitpC,mI > n --H f, WitqC,? > n --H f. This contradiction then
completes the proof of the lemma.
Claim: WitpC?m C
Proof of claim: Suppose for contradiction that the claim is false. Let s be a process,
such that 5 ? WitpC,m, and 5 ? WitqC,? Thus, for all correct processes r, s ? WitrC,,n
Since p and bc(m) are faulty, and p $ bc(m), from the above and from Lemma
7.33, s ? Witpc;m2A
Let r be any correct process. Since p $ bc(m), 5 ? WitpCvn2A and s ? WitrCwmA, r
cannot be included in p's RpC,m set.
Thus, no correct process can be included in R?,?, and hence I??,mI < n --H
Thus, p halts before atomically delivering m, a contradiction. E
Lemma 7.39 If p atomically broadcasts m before m', and q atomically delivers m',
then q atomically delivers m before m'.
Proof:			Directly from the protocol.			0
Lemma 7.40 If p atomically broadcasts m, then p either atomically delivers m or
halts at time <ts(m) + A.
Proof:			Directly from the protocol.			0
Lemma 7.41 For all processes p, for all messages m, if p atomically delivers m,
then by ts(m) + A, p receives a message sent by some correct processes at time
ts(m) + A --H A.
111
Proof: Suppose a process p atomically delivers a message m. Let c denote
ts(m) + A. Thus, from the protocol I WitpC,mI > n --H f and IRpC,mI > n --H f. Since
n > 2f, there is some correct process q ? RpC,rn. Thus, from the protocol, by time c
process p receives the message (m, q, WitqC??) sent by q at time c --HA. c
Lemma 7.42 For all processes p, for all correct processes q, for all times c, DLVDpC?
DLVDq
Proof: The proof is by contradiction. Consider any process p. Let m be the
earliest message (first message in timestamp order) such that p does not atomically
deliver m, and some correct process q atomically delivers m. Suppose for contra-
diction that q atomically delivers m before some message m' and that p atomically
delivers m'. Let c denote ts(m) + A and c' denote ts(m') + A. Since q atomically
delivers m before ??, c < c'.
From the protocol and the choice of m, all messages broadcast by bc(m) before
m are atomically delivered by both q and p.
Since p does not atomically deliver m, and atomically delivers m' at c' > c,
Lemma 7.40 implies that p $ bc(m).
Since p atomically delivered m', by Lemma 7.41, p received a message that some
correct process r sent at time c' --H A. Since c < c' Lemma 7.29 implies that witcr:mA c
witc
pm
We claim below that WitpC,m > n --H f. Suppose the claim is true. Then, from
the protocol, either p halts and does not atomically deliver any messages at time c or
later, or p atomically delivers m. Since p does not atomically deliver m and c < c',
p halts before atomically delivering m'. This contradiction completes the proof.
Claim: WitpC,mI > n --H f
Proof of claim: We showed above that W?trCwmA c WitpC,m If WitCrwmAI > n --H f, then
IWitpC,mI >n--Hf.
Hence, suppose that WitCr:mA I < n --H f. Since r is correct, Lemma 7.30 implies
that bc(m) is faulty.
Consider any process 5, 5 ? WitrC,m, 5 ? Witcr:mA Lemma 7.26 implies that for all
correct processes u, 5 ? WitC??,m2A Since p is faulty, bc(m) is faulty, and p $ bc(m),
Lemma 7.33 implies that 5 ? WitpC?,m2A Thus, (WitrC,m --H WitrC7n?) C WitpC:m2A
From the above and WitCr:mA c WitpC,m, we conclude that WitrC,m C WitpC,m Since
r atomically delivers m, WitrC,mI > n --H f, and hence WitpC,mI > n --H f. c
112
Theorem 7.43 Suppose n > 2f, and f> 1. The protocol in Figure 7.5 is an atomic
broadcast protocol that ensures VBD-inconsistent behavior, tolerates omission failures
and has a latency of A.
7.4.3 Ensuring BD-consistency
Finally, we show that no atomic broadcast protocol can ensure BD-consistency in a
system subject to omission failures.
Recall that in Chapter 6, we defined a broadcast protocol to be "non-blocking."
Informally, we assumed that a process is not permitted to initiate a broadcast, and
"suspend" its application protocol until the broadcast terminates. The impossibility
result below is only valid for such "non-blocking" protocols.
Theorem 7.44 No atomic broadcast protocol can ensure BD-consistency in a system
subject to omission failures.
Proof: Suppose for contradiction e is an atomic broadcast protocol that ensures
BD-consistency in a system 5 that is subject to omission failures.
We consider three possible executions of the protocol. In all these executions,
we assume that no message is received in less than A time; thus either a message is
received exactly A time after it was sent, or it is never received.
Scenario 1: Consider an execution of the protocol in 5, in which a process p
attempts to broadcast a message m at time c. Suppose that no other messages are
broadcast.
Suppose further that all processes are correct. Thus, p correctly broadcasts m,
and changes state at time c. Furthermore, by the properties of atomic broadcast, all
processes eventually atomically deliver m.
Scenario 2: Consider a second execution of the protocol in 5, in which p attempts
to broadcast m at time c. Suppose that no other messages are broadcast.
Suppose further that p is faulty, that all processes other than p are correct and
that p fails to send any messages to any process at time c. Since Scenarios 1 and
2 are indistinguishable to p until time c, p changes state at time c (as it does in
Scenario 1).
Now suppose that p halts at time c + 1. Since p atomically broadcasts m at
time c, and p changes state at time c, and e ensures BD-consistency with respect to
atomic broadcast, p is BD-correct at time c. Therefore, all processes other than p
eventually atomically deliver the message m. Intuitively, this is impossible because
they cannot "guess" the contents of the message m. This is formalized below.
113
Scenario 3: Consider a third execution of the protocol in 5, in which p atomically
broadcasts a message m' $ m at time c. Suppose that no other messages are broad-
cast. Suppose further that p is faulty and all processes other than p are correct,
Suppose that p fails to send any messages to any process at time c, and then halts
at timec+ 1.
Since Scenarios 2 and 3 are indistinguishable to any process q $ p, q must even-
tually atomically deliver m (as in Scenario 2). However, no process broadcasts m
in this scenario, and hence the execution violates the specifications of atomic broad-
cast. This contradicts the assumption that 0- is an atomic broadcast protocol, and
completes the proof.
Chapter 8
Atomic Multicast
In this chapter, we concentrate on atomic multicasts; these are atomic broadcasts
where the intended recipients of any broadcast message are a subset of the processes
in the system, called a group of processes.
We begin by defining a natural hierarchy of three types of atomic multicast.
Informally, all three of these atomic multicasts require that messages targeted to
a particular group are delivered in the same order by processes belonging to that
group. However, the multicasts differ in the requirements placed on the relative
order of delivery of messages multicast to different groups.
We concentrate on patrwtse atomic multicast, which requires that if two groups
g and g' overlap, then all messages that are multicast either to group 9 or to group
9 are delivered in the same order by the processes that belong to both groups 9
and g'. We define inconsistency and contamination with such a multicast. We
also describe pairwise atomic multicast protocols that prevent contamination and/or
ensure consistency in systems that are subject to omission failures.
8.1 Preliminaries
In some applications, the system is configured as a collection of (possibly overlapping)
groups, each consisting of a subset of processes. A multicast is a broadcast that is
targeted exclusively to the members of some particular group. We assume that
groups are static, and that each process knows the groups in the system, the groups
to which it belongs and the members of each of the groups. Let belongs(p) denote
the set of all the groups to which a process p belongs.
Formally, a group g is a name of a subset of the processes in the system. When
there is no ambiguity, we identify the name of a group with the set of processes that
114
115
belong to that group. Thus, we say p is in g C 9), when process p belongs to the
group named g.
Since each message is multicast to a particular group, we assume a multicast
message is a tuple of the form m = (p, c, 9, data), where p is the multicaster of the
message, c is the timestamp of the message, 9 is the group to which m is targeted,
and data is the information that p wishes to multicast; we write group(m) = 9.
Without loss of generality, we assume that if p multicasts a message m to group 9,
then p is in 9.
We consider three types of atomic multicasts, local, pairwise, and global, each one
of which satisfies the following properties:
o+ Validity: If a correct process multicasts a message m, then it eventually delivers
o+ Agreement: If a correct process delivers a message m, then all correct processes
in group(m) eventually deliver m.
o+ Integrnty: For any message m, a correct process p delivers m at most once, and
only if p is in group(m), and m was multicast by some process.
However, the three atomic multicasts require different message ordering properties.
These ordering properties form a hierarchy, each one corresponding to an increasing
degree of interaction between groups.
A local atomic mttlticast is a multicast that satisfies the above validity, agreement
and integrity properties as well as the following message ordering property:
o+ Local total order: If correct processes p and q both deliver messages m and m',
and group(m) = group(m'), then p delivers m before m' if and only if q delivers
m before m'.
Thus, only messages multicast to the same group have to be ordered with respect to
each other. Inconsistency and contamination are also defined on a per group basis
in the obvious manner.
If applications span several groups or interact with each other, the local total
order property (which was based on completely independent groups) is insufficient.
For example, consider a system with two groups 9 = fp, q, r? and g? = (p, q,
where all four processes (p, q, r, and s) are correct (Figure 8.1). Suppose messages
m'1 and m21 are multicast to gt and message m to 9. To satisfy local total order,
processes p and q must both deliver rn'1 and m21 in the same order. However, p is
allowed to deliver m before m'1 even though q delivers m after `n'?
We write m <m' if and only if some correct process delivers m before it delivers
m'. Thus, the above example shows that m < m'1 and m'1 < m. This may be
undesirable in certain applications, and is prevented by pairwise atomic m'ulticast, a
116
Figure 8.1: Message delivery according to local atomic multicasts
multicast that satisfies the validity, agreement, and integrity properties, as well as
the following property:
o Painvise total order: If correct processes p and q both deliver messages m and
m', then p delivers m before m' if and only if q delivers m before m'.
However, pairwise atomic multicast permits "global cycles" in message delivery
order. For example, consider a system with three groups, g = fp, q?, gi = fq, r?
and g" = fr, p? (Figure 8.2), where processes p, q and r are correct. Note that the
intersection of any two groups in the system consists of exactly one process. The
messages m, m1 and m" are multicast to groups g, g1 and g" respectively. Pairwise
total order allows process p to deliver m? before m, q to deliver m before ??, and r
to deliver m' before m11, leading to a global cycle in message delivery order.
Global atomic multicast, the strongest type of atomic multicast that we consider,
precludes such cycles, and reflects the intuition of atomic delivery being a "single,
indivisible" action. A global atomic multicast satisfies validity, agreement, and in-
tegrity properties, as well as the following property:
o+ Global total order: The relation < between messages is acyclic.
Note that the specifications of each of the atomic multicasts discussed above can
be augmented with the A-timeliness property resulting in three types of timed atomic
multicasts.
We focus on visible-broadcast/delivery inconsistency and contamination with re-
spect to pairwise atomic multicast. Thus, for the rest of this chapter, we will write
"inconsistency" when we mean VBD-inconsistency and "contamination" when we
mean VBD-contamination. 1
1Recall that in Chapter 3 we showed that the prevention of visible-
117
Figure 8.2: Message delivery cycle in pairwise atomic multicast
8.2 Inconsistency and contamination with
respect to pairwise atomic multicast
Recall that in Chapter 5 we derived necessary and sufficient conditions for the pre-
vention of inconsistency and contamination with respect to atomic broadcast from
the general definitions of inconsistency and contamination presented in Chapter 3.
However, these definitions were given for fault-tolerant broadcasts, and do not in-
clude any information about process groups. Hence, in general, they are not suitable
for fault-tolerant multicasts.
For example, with local atomic multicast, inconsistency and contamination is
defined on a per groups basis; thus, a process may be simultaneously inconsistent
with respect to a group g, and consistent with respect to another group g'. This
cannot be expressed in the formalism given in Chapter 3.
In contrast, with pairwise atomic multicast, a process is either inconsistent, or is
"consistent with all groups." Thus, inconsistency and contamination with respect to
pairwise atomic multicast can be expressed with minor changes to formal definitions
broadcast/delivery contamination with respect to any fault-tolerant broadcast is
equivalent to the prevention of broadcast/delivery contamination with respect to
that broadcast.
118
given in Chapter 3. In this section, we assume such changes have been made. In
particular, we assume that broadcast histories are extended to include annotations
about the groups to which messages are multicast, and that the set of well-formed
broadcast histories B is appropriately modified. However, a formal extension of the
material in Chapter 3 to include multicasts is beyond the scope of this thesis, and is
left for future work.
Informally, a process p is not VBD-correct until some time c with respect to
pairwise atomic multicast if p and some correct process q both belong to some group
9, and they disagree on the sequence of messages delivered in g. In addition, p is not
VBD-correct if there are two groups g and g1, and a correct process q, such that both
p and q belong to 9 and g', and they disagree on the interleaving of the messages
delivered in 9 and g'.
For all broadcast histories B, for all subsets of groups G, D?vDpC(B) restricted to G,
denoted D?vDpC(B)I&, is the sub-sequence of messages that process p delivers by time
c in any group 9 in &. We write D?vDpC(B)I9 as an abbreviation for Drvi??c(B)If9?.
When the broadcast history B is obvious from the context, we omit the "(B)" from
the above notation. For any two processes p and q, Gp?q is the set of all groups to
which both p and q belong; formally, Gp,q = ?gIp E 9, q ?
We now define visible-broadcast/delivery correctness with respect to pairwise
atomic multicast.
Definition Let B = (p, ?) ? 13 and F be such that B satisfies pairwise atomic
multicast with respect to F. Process p is VBD-correct until c with respect to pairwise
atomic multicast and F if and only if:
o+ For all correct processes q, the sequence of messages p delivers by time c
restricted to Gp?q is a prefix of the sequence of messages q delivers during the
?
entire execution also restricted to Gp,q: D?vDpC(B)?Gp,q --H q(B)f Gp,q.
o+ If a faulty process q (i.e., q ? F) delivers a message m that p multicasts at
c then a correct process in group(m) eventually delivers m.
o+ If a correct process q (i.e., q ? F) delivers a message m that p multicasts to
group g, then for all messages ?? that p multicasts to 9 before multicasting m,
q eventually delivers m'
Note that the above definition is a generalization of Lemma 5.5, which derives
necessary and sufficient conditions for a process p to be VBD-correct with respect
to atomic broadcast. The difference between the above definition and Lemma 5.5
arises from the following. With atomic broadcast, the correct processes all deliver
the same sequence of messages; however, with pairwise atomic multicast, the correct
119
processes may deliver different sequences of messages, depending on the groups to
which they belong. However, if there is only one group of processes in the system,
and that group includes all the processes in the system, then the above definition is
identical to Lemma 5.5.
In Lemma 8.1, we derive sufficient conditions for visible-broadcast/delivery con-
sistency. In Lemma 8.2, we derive necessary and sufficient conditions to ensure
visible-broadcast/delivery contamination-free histories. The lemmas are stated us-
ing the following fifo order property:
o+ FIFO order: Suppose a process multicasts a message m to a group g before it
multicasts a message m' to 9. If a correct process p delivers m', then p delivers
m before ??
We also use the following property:
o+ Uniform agreement: If any process delivers a message m, then all correct pro-
cesses in group(m) eventually deliver m.
Lemma 8.1 Let A ? A, B be the broadcast history of A, and F be such that B
satisfies pairwise atomic multicast with respect to F. If B satisfies the fifo order
property with respect to F, and for all processes p, for all times c, for all correct
processes q, D?vDpC(B)JGp,q ? D?vDq(B)IGp,q, then A is a VBD-consistent history
with respect to pairwise atomic multicast and F.
Proof: Suppose that B satisfies the fifo order property, and that for all processes p,
for all times c, for all correct processes q, D?vDpC(B)?Gp,q ? D?vDq(B)IGp,q Clearly,
this implies that any message delivered by any process (correct or faulty) in any
group at any time, is also eventually delivered by all the correct processes in that
group; i.e., B satisfies the uniform agreement property with respect to F.
To complete the proof, we show that for all processes p, for all c < LVT(A, p), p
is VBD-correct until time c in B with respect to pairwise atomic broadcast and F.
Let p be any process. If p is correct, then it is trivial to show that for all c, p is
VBD-correct until time c in B with respect to pairwise atomic multicast and F.
Suppose p is faulty (p e F). Let c be any time < LVT(A,p). Since for all cor-
rect processes q, D?v?C(B)?Gp,q ? D?vDq(B)j&p?q, and B satisfies both the uniform
agreement and the fifo order properties with respect to F, the definition of VBD-
correctness given above implies that p is VBD-correct until c in B with respect to
pairwise atomic multicast and F.			0
The following lemma derives necessary and sufficient conditions to ensure VBD-
contamination-freedom with respect to pairwise atomic multicast. The proof is sim-
ilar to the proof of Theorem 5.11.
120
Lemma 8.2 Let B = (p, ?) ? 13 and F be such that B satisfies pairwise atomtc
multicast with respect to F. Suppose B satisfies fifo order with respect to F. A
correct process p is VBD-contamination-free until c with respect to pairwise atomic
multicast and F if and only tf:
o+ if p delivers any message m at or before time c, then:
for all correct processes q, D?vDt?S7???)?(B)?Gq,????? ? D?vDq(B)IGq,?c?m?
9			9'
ts(m'1) < ts(m'2)
Figure 8.3: Pairwise atomic multicast
The following example illustrates inconsistency and contamination with respect
to pairwise atomic multicast. Consider a system with two groups, 9 and gi, such
that g = (p,q,r? and gflg' = (p,ri (Figure 8.3). Suppose messages m'1 and m'2
are atomically multicast to group 9'. Suppose, r atomically delivers m'1 followed by
m'2. Suppose that p is faulty; it atomically delivers m12 by some time c, but fails
to atomically deliver m11. Thus p is inconsistent at time c with respect to pairwise
atomic multicast. If the faulty process p atomically multicasts m to group 9 at time
c (i. e., after atomically delivering m'2), and process q delivers m at some time c',
then q is contaminated at time c' with respect to pairwise atomic multicast.
8.3 Overview of the pairwise atomic multicast
protocols
For any group g, let n9 denote the number of processes in group 9, and fg be an
upper bound on the maximum number of processes in group g that may fail. Let
121
A9 = (J9 + 1 )A, where A is the maximum link delay. For any two intersecting groups
g and 9t? let n9,9i denote the number of processes in the intersection of g and g1, and
f9,9' be an upper bound on the maximum number of processes in the intersection of
g and g' that may fail. We assume that for all groups 9, f9 > 1; that is, at least one
process may fail in every group.
Our multicast protocols have the desirable properties informally given below:
Group tocality: The processing of a multicast targeted to a particular group
does not involve any process that does not belong to that group. That is, the
processing of a multicast does not "spill over" to other groups.
Time Thcalit?: The latency of a multicast to group 9 is proportional to the
size of 9. Thus, a multicast to a smaller group takes less time than a multicast
to a larger group.
We present two different pairwise atomic multicast protocols that prevent the
contamination of correct processes:
o+ Protocol 1 (Section 8.5.1): This protocol requires fl9,9' > 2f9,9? for any two
intersecting groups 9 and 9'. It has a latency of A9 + A for every group 9.
o+ Protocol 2 (Section 8.5.2): This protocol requires n9 > 2f9 for every group 9,
and requires fl9,9' > f9,9t for any two intersecting groups 9 and 91 It has a
latency of 2A9,for every group 9.
Our protocol that prevents inconsistency (Section 8.6) is derived from Protocol
lit requires n99' > 2f9,9? for any two intersecting groups 9 and 9'. It has a latency
of A9 + A for every group 9.
As in the case of atomic broadcast (Chapter 7), we present our protocols mod-
ularly in terms of a communications abstraction called a basic multicast. Formally,
a basic multicast is a timed reliable multicast (i.e., a multicast that satisfies the va-
lidity, agreement, integrity and A9-timeliness properties for each group 9) that also
satisfies the following uniform fifo order property:
o+ Uniform fifo order: Suppose a process multicasts a message m to a group 9
before it multicasts a message m' to 9. If any process p delivers m', then p
delivers m before m'.
It is straightforward to derive a basic multicast protocol with a latency of A9 for
message delivery in any group 9.
122
8.4 Atomic multicast
Our simple atomic multicast protocol is very similar to our simple atomic broadcast
protocol (Figure 8.4). The difference is that if a process q basic-delivers a message
m in a group g, then q atomically delivers the message at time ts(m) + A9. Clearly,
q must also belong to 9.
1* p atomically multicasts a message m = (p, c, 9, data) at time c to group 9*1
To multicast (pairwise atomic, m = (p, c, 9, data)) p multicasts (basic, m)
p delivers basic multicast messages at c
For all groups 9 ? belongs (p)
MpC,9 = tm I c = ts(m) + A2,p delivered (basic,m)?
p atomically delivers messages at c
M --H= U???eiongs(p) Mp$9
D =--H sort M in timestamp order
p delivers (pairwise atomic, D)
Figure 8.4: Pairwise atomic multicast protocol
Executed by process p at time c
Theorem 8.3 The protocol in Figure 8.? is a pairu'ise atomic multicast protocol that
tolerates omission failures, and has a latency of A9 in each group 9.
8.5 Pairwise atomic multicast with no
contamination
Recall that the protocol to prevent contamination with atomic broadcast was based
on a "local" check for "mutual inconsistency." This local check is not sufficient to
prevent contamination with pairwise atomic multicast, as illustrated by the following
example.
Consider the system illustrated in Figure 8.3. Recall that the figure depicted a
system with two groups, g and g', such that g = (p,q,r? and gAg' = (p,r?. To
123
make the example more readable, we will use the words "broadcast" and "deliver"
when we mean "atomically broadcast" and "atomically deliver."
Suppose messages m'1 and m'2 are multicast to group 9'. Suppose, r delivers m'1
followed by m'2. Suppose that p is faulty; it delivers m'2 by some time c, but fails
to deliver m'1. Note that p is inconsistent at time c with respect to pairwise atomic
multicast. Finally, suppose that the faulty process p multicasts mto group 9 at time
c (?.e., after delivering m'2).
Since p and r disagree on the sequence of messages delivered to group gi by time
c, to avoid becoming contaminated, r cannot deliver the message m.
Suppose that q is correct. When q receives m from p, it cannot "notice" p's
inconsistency: q does not belong to group 9!? 50, by the group locality property, q is
not even aware of the message m'1 that p failed to deliver. Thus if q applies a local
consistency check to p, it will decide to deliver m. In such a case, q will disagree
with r on the delivery of m. This violates the requirement that all correct processes
in group g agree on the messages delivered in g.
Thus, to decide whether or not to deliver m, q cannot make a purely local de-
cision (as was the case with atomic broadcast), but must rely on "help" from the
processes in gflgi Consequently, it is clear that preventing contamination with re-
spect to pairwise atomic multicast is more difficult than preventing contamination
with atomic broadcast.
Suppose q "consults" with the processes in gfl9' (in this case processes p and r)
to determine whether p was inconsistent at the time it multicast m, and q delivers
m only if p was consistent. Thus, q must decide if the correct processes in group 9'
delivercd m'1 before m'2, or if they only delivered m'2; that is, q must decide whether
p is faulty or whether r is faulty. If r is correct, the agreement property of atomic
multicast implies that q must not deliver m. If p is correct, the validity property of
atomic multicast implies that q must deliver m.
To determine whether or not to deliver m, q cannot "consult" with processes
in group g' that are not also in group 9 (by the group locality property). Thus, it
appears that q cannot decide which of p and r is faulty unless more than half the
processes in the intersection of groups 9 and gi are correct. Surprisingly, this is not
the case.
We present two approaches by which q could determine which of por r is faulty.
The first approach, (Protocol 1, Section 8.5.1), requires a majority of correct pro-
cesses in the intersection of any two groups g and 9' (i.e., if gflg' $ ? then
fl99' > 2f9?r).
The second approach by which q could decide which of p or r is faulty does not
require a majority of correct processes in gflg? This is explained below.
124
Suppose that m'1 and m'2 had been multicast using a pairwise atomic multicast
that guaranteed both the simultaneity (Chapter 7) and the uniform agreement (de-
fined earlier in this chapter) properties.
With such a multicast, a faulty process cannot deliver "extra" messages. In
particular, if a faulty process delivers a message in a particular group, then all the
correct processes in that group will also deliver that message. Furthermore, if a
correct and a faulty process both deliver any message, then they both deliver the
message at the same local time. Thus, a faulty process cannot deliver messages
out-of-order. Indeed, the only way in which a faulty process may disagree with the
correct processes is by failing to deliver a message that is delivered by the correct
processes.
With such a multicast, it is clear that the scenario illustrated in Figure 8.3 can
arise only if p is faulty. In particular, since r delivers m'1 at some time c', the
uniformity and simultaneity properties imply that all correct processes in 9' also
deliver m'1 at time c'. Thus, since p does not deliver m'1 at time c', and p E 9', p
cannot be correct.
Protocol 2 (Section 8.5.2) is based on the approach discussed above. It requires
that if any two groups intersect, then there is at least one correct process in the
intersection of the two groups. It also requires that a majority of processes in each
group are correct; we show that this requirement is necessary for any pairwise atomic
multicast protocol that prevents the contamination of correct processes.
8.5.1 Protocol 1
To atomically multicast a message m = (p, c, 9, data) to group 9 at time c, process p
multicasts (basic, b) to 9, where b is the tuple b = (p, c,g, (data,D?vDpC)), and D?DpC
is the sequence of all the messages that p has atomically delivered by time c in all
the groups to which p belongs.
If process q delivers (basic, b), by time c + A9, then for each group 9' to which
both p and q belong, q does the following. If D?vDqC Ig' = D?vDpC Ig', then at time
c+ A9, q sends (q,b,g',ACK) to all processes in 9.
At time c + A9 + A, q checks to see if, for every group 9, to which p belongs, a
majority of processes in the intersection of 9 and 91 agree with p on the sequence of
messages atomically delivered in 9'. If so, then q atomically delivers (p, c, 9, data);
otherwise, q ignores the message.
In response to a process p's multicast of a message m, the protocol requires that a
process q send up to jGp,q messages of the form (q, --H, --H, ACK). It is straightforward
to modify the protocol so that q sends at most one such message. Furthermore,
125
techniques similar to those proposed in Section 7.3 can be used to further reduce
the message complexity of this protocol. For clarity, however, we present the less
efficient protocol informally described above (Figure 8.5).
To simplify the presentation, if b = (p, c, 9, (data,--H)) and m = (p, c, 9, data), we
write ts(b) to mean ts(m) (i.e., c) and bc(b) to mean bc(m) (i.e., p).
Lemma 8.4 If any process atomically delivers a message m, then it atomically de-
livers m at time ts(m) + Agroip(m) + A.
Lemma 8.5 If p and q atomically deliver messages m and m', then p atomically
delivers m before m1 if and only if q atomically delivers m before m
Lemma 8.6 If a correct process atomically delivers m, then all correct processes in
group(m) eventually atomically deliver m.
Proof: Suppose for contradiction that m is the earliest message (i.e., the smallest
message in lexicographic order) such that some correct process p atomically delivers
m, and some correct process q in group(m) does not atomically deliver m. Suppose
that m = (x,c,g,data) and let b denote (x,c,g,(data,??v??)).
Since p atomically delivers m, b ? BpC+Ail, and by the properties of the basic
multicast, for all correct processes r E g, b e BrC+A9; in particular, since q is correct
and q ? g, b ? ?qC+AS
Since q does not atomically deliver m at time c + A9 + A, x belongs to some group
g*, such that jacksq,b,9* < n9,9* --H f9,9?.
Since p atomically delivers m, for all groups gI to which x belongs, Iacksp??,9i
n9,9? --H f9,9'; in particular, acksp,?,9* > n9,9* --H fg,g*. Since n9,9* > 2f9,g* (by
assumption), at least one correct process r sent a message (r, b,g*, ACK); i.e., there
is at least one correct process r in g A g* such that D?vDzC Ig* = D?vDrC g*
Let s be any correct process in gflg*. By Lemma 8.4 and by choice of m (as
the earliest message on which the correct processes disagree), D?vDrC Jg* = ??v?c8 jg*.
Since s is correct, b ? B8c+A? (as we showed above), and hence 5 sends (s, b, g*, ACK)
at time c + A9.
Since process q is correct, q receives at least fl9,9* --H f9,9* messages of the form
(--H, b,g*, ACK); hence Jacksq,?,9* > n99* --H f9,9*. This is a contradiction to the
earlier assertion that acksq,?,9* < fl9?* --H f9,9*. c
Lemma 8.7 For all correctprocessesp andq, and all times c, D?vDpC?Gp,q = D?vDqC?Cp,q
Proof:			From Lemmas 8.4 and 8.6.			E]
126
p atomically multicasts a message m = (p, c, 9, data) at time c to group ?
To multicast (pairivise atomic, m = (p, c, 9, data))
p multicasts (basic,b = (p,c,9,(data,D?vDpC)))
p delivers basic multicast messages at c
BpC =--H ?b I ts(b) = c --H Agro?p(b), P delivered (basic, b)?
p sends messages at c
For all b ? BpC for all ? e Gp,bc(b)
if D?vDt?c(?b?? 19 = ?rv?s(b)I9 then p sends (p,b,g,ACK) to all in grcup(b)
/* p receives all messages at c
For all b ? BpC?A for all groups 9 ? be1on?s(bc(b))
ackp,?,g = fq p received (q,b,?,ACK)J
p atomically delivers messages at c
for all b Ei BpC?A, b =(q,c?A?Ag*,9*,(data,?))
if for all groups 9 ? be1on?s(bc(b)) : Iackp,?,9I > flgg* --H fg?* then
5 = SU f(q,c--H A --H A9*,?*,data)?
D =--H sort 5 in timestamp order
p delivers (pairwise atomic, D)
Figure 8.5: Pairwise atomic multicast protocol that prevents contamination
Executed by process p at time c
127
Lemma 8.8 If the mvlticaster of a messa?e m is correct, then all correct processes
in group(m) eventvally atomically deliver m.
Proof:
to
to
Suppose a correct process p atomically multicasts a message m at time c
group 9. Thus process p basic-multicasts the message b = (p, c, 9, (data, D?vDpC))
group 9 at time c.
Let q be any correct process in 9. The validity and agreement properties satisfied
by the basic multicast imply that q basic-delivers b by time c + A9. Therefore,
to prove the lemma it is enough to show that by time c + A9 + A, for all groups
9' c belongs(p), q receives at least fl9,9, --H f,,?' messages of the form (--H, b, ?`, ACK).
Let 9? be any group such that gi ? belongs(p). Let r be any correct process
in gfl 9'. The validity property of the basic multicast implies that b ? BrC+A9
Lemma 8.6 implies that DU+vDrC 19' = DLvwcI9?. Thus, r sends a message of the form
(r, b, gI, ACK) to all processes in group 9. Therefore, q receives at least n9,9, --H f9,9'
messages of the form (--H, b, g1, ACK).			E)
Lemma 8.9 If a correct process p atomically delivers a message m, then for all
correct processes q, D?vDt?Sf(?m)) IGq,bc(m) = DLVDtqJ(tfl)?Gq,?????
Proof: Suppose for contradiction that m is the earliest message, such that a
correct process p atomically delivers m, and DLVDt?S?(fnin)) IGq,bc(im) $ ??v?\;(?) IGq,bc(m)
for some correct process q. Let m = (x,c,g,data) and b = (:?c,9,(data,DLvD:C))
Lemma 8.5 implies that there is a group 9' C Gq,: such that DLvD:I9' $ D?vDqC?gI
Since 9' E Gq,:, and x belongs to both groups 9 and 9', 9fl9' $ ?.
Let r be any correct process in 9fl9'. Since q and r are correct, and both belong
to group 9', Lemma8.7 implies that D?vDrCI9I = D?vDqC?gI; thus, D?vD:CI# $ D?vDrCI9,
Thus, r does not send a message of the form (--H, b,9, ACK). Since ?9# > 2f9,9i, p
receives fewer than n9,9' --H f9,9' messages of the form (--H, b,9, ACK). Hence p does
not atomically deliver m. This is a contradiction. E]
Corollary 8.10 If a?rrect process p atomically delivers a message m, then for all
correct processes q, DLvDt?S?(???)?jGg,????? ? D?vDqlGq,6c?m?
Lemma 8.11 If p atomicall? multicasts a message m to a group 9 before atom?-
cally mvlticasting m' to 9, and any correct process q atomically delivers m', then q
atomically delivers m before m'.
Proof: The proof is by contradiction. Suppose for contradiction that p atomically
multicasts m to group 9 before m' to 9 (i.e., ts(m) < ts(m')), and a correct process
128
q atomically delivers m', but does not atomically deliver m by the time it atomically
delivers m'. Let c denote ts(m) and c' denote ts(m').
Since q atomically delivers m1, q delivers (basic, m') by time c' + A9. Hence, by
the fifo order and A-timeliness properties satisfied by basic multicast and delivery,
q delivers (basic, m) by time c + A9. Since q did not atomically deliver m, for some
group g* such that p belongs to g*, for all correct processes r in 9 flg*, D?vDrCl? #
DLVDpCl9*
Since q atomically delivers m' at c' + A9 + A, for all groups g' such that p belongs
to gi, for all correct processes r in gAg', DrvDc;lg, --H Drvwc' Ig' In particular, for all
correct processes r E gfl g*, D?vDrC' 19* = D?vDpC' 19*. Since c < c' this contradicts
the choice of the group g*, and completes the proof. 0
Theorem 8.12 Suppose for all 9roups g and 9', if9fl9' # ?, then n9,9' >
The protocol in Figure 8.5 is a pairwise atomic multicast protocol that prevents the
contamination of correct processes, tolerates omission failures and has a latency of
A9 + A in each group 9
Proof: It is clear that the integrity property is satisfied. Hence Lemmas 8.7, 8.8,
and 8.5 imply that the protocol is a pairwise atomic multicast protocol. Lemma 8.2,
together with Corollary 8.10 and Lemma 8.11, implies that the protocol prevents
contamination with respect to pairwise atomic multicast.
The latency of A9 + A follows directly from the protocol.			0
8.5.2 Protocol 2
The previous protocol requires that more than half the processes in the intersection
of any two groups are correct. This seems necessary to identify the faulty processes
in the intersection of the two groups. However, this is not necessary. To prevent
contamination, it is sufficient for each group in the system to have a majority of
correct processes. This is the case in the protocol in Figure 8.7, which requires that
n9 > 2f9 for all groups 9. As we mentioned in the introduction to this section,
the protocol ensures both the simultaneity (Lemma 8.23) and uniform agreement
(Lemma 8.27) properties.
The protocol uses (as subroutines) the following multicasts:
o+ Uniform multicast: This satisfies the validity, uniform agreement, integrity and
A9-timeliness properties. We do not present a uniform multicast protocol;
129
however, such a protocol is easily derived from our atomic broadcast protocol
that prevents inconsistency and has a latency of A (Chapter 5).
Strong uniform mutticast: This is a uniform multicast that also satisfies the
property stated informally below, and formalized in the next section:
Suppose any process delivers a message rn. Any process p that is "aware" of
the multicast of m either delivers m or halts by time ts(m) + A9ro?(m).
Strong uniform multicast
A strong uniform multicast (called a strong multicast for short) is a uniform multicast
that satisfies the following properties. For all groups g, for all processes p in g, for all
time c, the processes in g determine whether or not to allow p to multicast a message
at time c as follows:
o+ If a correct process q allows p to multicast a message to g at time c, then all
correct processes also allow p to multicast a message to g at time c.
o+ If p multicasts a message m to group g at time c, then p allows itself to multicast
a message to group g at time c.
o+ If any process delivers a message m = (p, c, 9,--H), then for all processes q in
g, either q delivers rn by time c + A9, or q halts by time c + A9, or q did not
allow p to multicast a message to group g at time c.
In our strong.multicast protocol (Figure 8.6), the variable allowq(p, c, g) denotes
whether q allows p to strong-multicast a message to group g at time c. We assume:
o+ allowq(p, c, g) is either defined by q at time c before any messages are sent or
multicast, or remains undefined.
o+ allowq(p, c, 9) is defined to be true if and only if q allows p to strong-multicast
a message to group 9 at time c.
o+ For all groups g, for all processes p in 9, for correct processes q and r in g, for
all c, allowq(p, c, g) = true if and only if allowr(p, c, 9) = true.
o+ If p strong-multicasts a message m at c to 9, allowp(p, c, 9) = &ue.
We also assume that at least one process in each group may be faulty (i.e.,
V9,f9>1).
To multicast (strong, m) to group 9 at time c, p sends m to all processes in 9.
Suppose a process q allows p to strong-multicast a message to 9 at time c; i.e.,
allowq(p, c, 9) is true. When q first receives m, q sends m to all processes in 9.
At time c + A9 --H A, q sends a message to all processes in 9, indicating whether
it has received a message m multicast by process p at time c (an "ack") or not (a
At time c + A9, q first determines whether it is faulty and, if so, it halts.
130
o+ If q does not receive at least n9 --H f9 of the messages sent at time c + A2 --H A,
then q has failed to receive messages from the correct processes; hence q halts.
o+ Suppose that either q = p or q receives m by time c + A9 --H 2A. If q does not
receive at least n9 --H f9 "ack" messages, then q has failed to send m properly;
hence q halts.
At time c + A9, if q has not halted, then q strong-delivers m.
p multicasts (strong, m = (p, c, 9, data)) at time c to group 9
/? For all processes q, allowq(p, c, 9) is a boolean that is known by time c
If allowp(p, c, 9) then at time c p sends m to all in g
/* Executed by any process q
If allowq(p, c, 9) then
when q # p first receives a message of the form m = (p, c, 9,--H):
q sends m to all in 9
At time c + A9 --H A
if q has received m = (p, c, 9,--H)
then q sends (q,(p,c,g),ACK) to all in 9
else q sends (q,(p,c,g),NAK) to all in 9
At time c + A9
nakq(p,c,g) := (rq received (r,(p,c,g),NAK)?
ackq(p, c, 9) := friq received (r, (p, c, 9), ACK)1
if iackq(p,c,g)i + inakq(p,c,g)l <n9 --H f9 then q halts
if(q=porqreceived(p,c,g,--H)byc+A9--H2A)
and iackq(p, c, 9)1 <n9 --H f9 then q halts
if q received m = (p, c, 9,--H) by c + A9 then q delivers (strong, m)
Figure 8.6: Strong multicast protocol
131
Lemma 8.13 If any process q strong-delivers a message of the form (p, c, g,
then allowq(p, c, g) is true.
Lemma 8.14 For all groups g, for all p and q in g, for all c, if allowq(p,c,g) is
true and allowr(p, c, g) is not true for some correct process r in g, then q halts by
time c + A9 before delivering any messages at time c + A9.
Proof: Suppose for some group g, for processes p and q in g, for some c,
allowq(p, c, g) is true and allowr(p, c, g) is not true for some correct process r in g.
In other words, q allows p to strong-multicast a message at time c, but some correct
process r in g does not.
Let s be any correct process in g. Since allowr(p, c, g) is not true, and r is
correct, allow?(p, c, g) is not true (by assumption). Therefore, at time c + A9 --H A, s
does not send a message of the form (s, (p, c, g), --H), and hence s ? ackq(p, c, g) and
5 ? nakq(p,c,g).
Since n9 > 2f9, process q receives at most f9 "acks? and "naks," and hence we
conclude that ackq(p,c,g) + nakq(p,c,g) < n9 --H f9. Therefore, q halts by time
c+A9.
Since q determines whether or not to halt itself at any time before delivering any
messages at that time (see Figure 6.1 of Chapter 6), we conclude that q halts by
time c + A9, before delivering any messages at time c + A9. E
Lemma 8.15 A correct process never halts.
Proof: Assume for contradiction that q is the first correct process to halt. Thus,
from the protocol, there is some group g to which q belongs, some process p in g,
and some time c such that allowq(p, c, g) is true, and q halts at time c + A9.
Case: ackq(p,c,g) + nakq(p,c,g)I < n9 --H f9 : Recall that by assumption, all
correct processes in group g agree on whether p is allowed to multicast a message
to group g at time c. Since q is correct and allowq(p, c,g) is true, for all correct
processes r in g, allowr(p, c,g) is true.
Since q is the first correct process that halts (at time c + A9), no correct process
halts by time c + A9 --H A. Thus, all correct processes in group g send a message of
the form (--H, (p, c, g), --H). Since q is correct, and n9 > 2f9, q receives at least n9 --H f9
such messages; i.e., ackq(p,c,g) + nakq(p,c,g) > n9 --H f9. This is a contradiction.
Case: (q = p or q received m = (p,c,g,--H) by c + A9 --H 2A) and ackq(p,c,g) <
n9 --H f9: Suppose q = p, or q received m by c + A9 --H 2A. In either case, (since
132
f9 > 1 and hence A9 --H 2A > 0) q sends the message m by time c + A9 --H 2A to all
processes in g.
Let r be any correct process in group g. Since q is correct, and allowq(p, c,
is true, allowr(p,c,g) is true. Thus, r receives m by time c + A9 --H A, and sends
the message (r, (p, c, 9), ACK). Furthermore, this message is received by q by time
c+ A9.
Since there are at least n9 --H f9 correct processes in group g, q receives at least
n9--Hf9 messages of the form (--H,(p,c,g),ACK); i.e., Iackq(p,c,g)I > n9 --Hf9. This
is a contradiction.			c
Lemma 8.16 If a correct process q receives m = (p,c,g, --H), and allowq(p,c,g) is
true, then all correct processes strong-deliver m by c + A9.
Proof: Suppose a correct process q receives m = (p, c, 9,--H), and allowq(p, c, g) is
true. By assumption, for all correct processes r in g, allowr(p, c, g) is also true. By a
standard "pigeonhole" argument, it is easy to show that all correct processes receive
m by time c + A9. Since correct processes cannot halt (Lemma 8.15), we conclude
that all correct processes strong-deliver m by time c + A9. 0
Lemma 8.17 If a correct process strong-multicasts m at time c to g, then:
All correct processes in g receive and relay m by time c + A.
All correct processes in 9 strong-deliver m by time c + A1.
0
0
Lemma 8.18 The protocol in Figure 8.6 satisfies the validity, agreement, A9-timeliness,
and integrity properties.
Lemma 8.19 If any process strong-delivers a message m = (p, c, g, --H), then for all
correct processes q in g, allowq(p, c,g) is true.
Proof: Suppose a process p strong-multicasts a message m at some time c to
a group g, and a process r strong-delivers m. Thus, allowr(p, c, g) is true. Since r
strong-delivers m, it does not halt before delivering messages at time c + A9. Hence,
Lemma 8.14 implies that for all correct process q in 9, allowq(p, c, g) is true. 0
Lemma 8.20 If any process strong-delivers a message m = (p, c, g, --H), then all
correct processes in g strong-deliver m at time c + A9.
133
Proof: Suppose p strong-multicasts message m at time c to group 9, and a process
q strong-delivers m. First note that allowq(p, c, 9) is true and that by Lemma 8.19,
for all correct processes r in 9, allowr(p, c, g) is true.
If p is correct, then the lemma follows from Lemma 8.17. If q is correct, then the
lemma follows from Lemma 8.16. Hence suppose p and q are faulty. The proof is a
case analysis.
Case: q = p or q receives m by time c + A9 --H 2A: Since q strong-delivers m
and does not halt, ackq(p, c,g)J > n9 --H f9. Since n9 > 2f9, some correct process
r ? ackq(p,c,g); i.e., r receives and relays m by timec+A9--HA. Since allowr(p,c,g)
is true, Lemma 8.16 implies that all correct processes in 9 strong-deliver m by time
c+A9.
Case:
q $ p and q receives m after time c + A9 --H 2A: By a standard "pigeonhole"
argument, at least f9 --H 1 processes, not including p nor q, receive and relay m by
time c + A9 --H 2A. Since p and q are faulty, at least one of those f9 --H 1 processes
is correct. Since for all correct processes r in 9, allowr(p, c, 9) is true, Lemma 8.16
implies that all correct processes in 9 strong-deliver m by time c + A9. 0
Lemma 8.21 Suppose a process strong-delivers a message m = (p,c,g, --H). If any
process q in 9 does not strong-deliver m at time c + A9, and allowq(p,c,g) is true,
then q halts by time c + A9, before delivering any messages at time c + A9.
Proof: Suppose p strong-multicasts m at time c in 9, and some process strong-
delivers m; by Lemmas 8.19 and 8.20, for all correct processes r in 9, allowr(p, c, 9)
is true, r receives m at time c + A9 and strong-delivers m by time c + A9.
Suppose for contradiction that some process q in 9 does not strong-deliver m at
time c+ A9, allowq(p, c, 9) is true and that q does not halt before delivering messages
at c + A9. Thus, q does not receive m and 1ackq(p,c,g)I + jnakq(p,c,g)1 > n9 --H f9.
Furthermore, since q does not strong-deliver m, and all correct processes strong-
deliver m, q is faulty.
Since q is faulty and does not receive m, and the correct processes in 9 receive
m, we conclude from the "pigeonhole" principle that at least one correct process in
9 receives m by time c + A9 --H 2A. Thus, by time c + A9--HA, all correct processes in
g send m to all processes in 9.
Let r be any correct process in 9. From the protocol, since allowr(p, c, 9) is true, r
sends a message of the form (r, (p, c, 9), ACK) to all processes in 9 at time c+ A9 --H A.
By the fifo-links property (Chapter 5), if q receives this message, q receives r 5 relay
of message m by time c + A9.
134
Therefore, q does not receive any message of the form (--H, (p, c, 9),--H) from any
correct process. Since n9 > f9, we conclude that ackq(p, c, 9)1 + inakq(p,c, 9)1
n9 --H f9. This is a contradiction.			0
Theorem 8.22 Suppose for all groups 9, n9 > 2f9 and f9 > 1. The protocol illus-
trated in Figure 8.6 is a strong multicast protocol that tolerates omission failures and
has a latency of A9 in each group 9.
Proof:			From Lemmas 8.18, 8.20 and 8.21.			0
With the strong multicast protocol presented above, for all groups 9, for all
processes p in 9, p may strong-multicast more than one message to 9 at any in-
stant of time using "piggybacking" techniques. Thus, if p wishes to strong-multicast
m = (p, c, 9, data) and m' = (p, c, 9, data1) at the same time, it actually strong-
multicasts the message m* = (p, c, 9, (data, data')). If a process q strong-delivers m*
at time c', for convenience, we say that q strong-delivers both m and m' at time c'.
This "transparent piggybacking" technique is assumed in the our pairwise atomic
multicast protocol that prevents contamination (Figure 8.7).
Pairwise atomic multicast protocol to prevent contamination
The pairwise atomic multicast protocol to prevent contamination, shown in Figure
8.7, is given using uniform multicast and strong multicast as subroutines.
To atomically multicast m = (p, c, 9, data) to group 9 at time c, process p
multicasts (uniform, b) to 9, where b is the tuple b = (p, c, 9, (data, D?vDpC)), and
??vD,? is the sequence of messages that p has atomically delivered by time c in all
the groups to which p belongs.
If a process q delivers (uniform, b) in 9 by time C+ A9, then q does the following
at time c + A9. If q atomically delivers a message m* by time c, p E group(m*),
and p does not atomically deliver m* by time c (i.e., there is a message m* that p
failed to atomically deliver), then q multicasts the message (q, c + A9, g, (b, NAK))
to group 9 using the strong multicast. Furthermore, for all processes r in 9, q allows
r to multicast a message (r,c+ A9,g,(b,NAK)); i.e., q sets allowq(r,c+ A2,g) to
true.
At time c+ 2A9, if q does not deliver (strong,(--H,c+ A9,g,(b,NAK))), then q
atomically delivers m; otherwise q ignores m.
Lemma 8.23 If any process atomically delivers a message m, then it atomically
delivers m at time ts(m) + 2A9roip(m).
135
Lemma 8.24 If for some group 9, allowq(--H,c + A9,g) is true for some process q
in 9 at time c+ A9, then q has unform?delivered a message b = (--H,c,g, (--H, --H)) by
time c + A9; i.e, b ? ?qC,+9AS
Proof: Directly from the protocol.
0
/? p atomically multicasts a message m = (p, c, 9, data) at time c to group g
To multicast (pairwise atomic, m = (p, c, 9, data))
p multicasts (uniform, b = (p, c, (data, D?vDpC)))
p delivers uniform multicast messages at c
For all 9 : BpC,9 =--H fb ts(m) = c --H A9, p delivered (uniform, b)?
p multicasts messages at time c using the strong multicast*I
For all 9 for all b ? BpC,9
for all q Ei 9 altowp(q,c,g)			true
if am* E D?vDpC?A9 and bc(b) ? group(m*) and m* ?			then
bc(b)
p multicasts (strong, (p, c, 9, (b, NAK)))
p atomically delivers messages at c
for all 9 for all b ? Bp%9AS, b=
if p did not deliver (strong, (--H, c --H A9,g, (b, NAK)))
and p atomically delivered q's previous multicasts to 9
then 5 = SU f(q,c--H 2A9,g,data)?
D =--H sort 5 in timestamp order
p delivers (pairwise atomic, D)
Figure 8.7: Pairwise atomic multicast protocol that prevents contamination
Executed by process p at time c
Lemmas 8.25 and 8.26 show that the assumptions about the "allow" variables
made for the strong multicast protocol are indeed satisfied.
?36
Lemma 8.25 For all groups 9, for all p ? 9, for all c, for all processes q in 9, if
allowq(p, c, 9) is true at time c, then:
o+ for all processes 5 in 9, allowq(s, c, 9) is true at time c, and
o+ if q is correct, then for all correct processes r in 9, allowr(p, c, 9) is true at
time c.
Suppose for some 9, for some p in 9, for some c, allowq(p, c, 9) is true
Lemma 8.24 implies that there is a message b = (--H, c --H
Proof:
for some process q in 9.
Ei Bq\g
Let 5 be any process in group 9. Since b ? BqC,g, q also sets allow q(s,c, 9) to true
at time c.
Suppose q is correct. Let r be any correct process in group 9. By the agreement
property of the uniform multicast, b e BrC,g Hence, from the protocol, r also sets
allowr(p, c, 9) to true at time c.
0
Lemma 8.26 If any process p strong-multicasts a message (p, c,g, (--H, NAK)), then
allowp(p, c, 9) = true.
Proof:			Directly from the protocol.			0
Lemma 8.27 If any process atomically delivers m, then all correct processes in
group(m) atomically deliver m at time ts(m) + 2A9.
Proof: Suppose a process q in some group 9 atomically delivers a message m
= (p, c --H A9,9, data) at time c + Ag. Thus by time c, q uniform-delivers the message
b = (p, c --H A9,g, (D?vDpC?AI, data)). Hence, from the protocol, for all processes 5 in
9, allowq(s,c,g) is true.
Since q uniform-delivers b, the uniform agreement property satisfied by uniform
multicast implies that for all correct processes r in 9, r also uniform-delivers b by
time c.
For all processes 5 in 9, since allowq(s, c, 9) is true, and q does not halt before
atomically delivering m at time ts(m) + A9, Lemma 8.14 implies that some correct
process allows 5 to strong-multicast a message at time c to group 9. Thus, from
Lemma 8.25, for all correct processes r in 9, for all processes 5 in 9, r allows 5 to
strong-multicast a message at time c to group 9; i.e., allowr(s,c,g) is true.
Since q atomically delivers m, q does not deliver (strong, (--H, c, g,(b, NAK))).
Since q does not halt before delivering messages at time c + A9, and for all sing,
137
allowq(s, c, 9) is true, Lemma 8.21 implies that for all correct processes r in g, r does
not deliver (strong, (--H, c, g, (b, NAK))).
Let r be any correct process in g. Since r urnform-deliv&s b by time c, and r
does not deliver (strong, (--H, c, g, (b, NAK))), we conclude that r atomically delivers
mat time c + A2.
Lemma 8.28 For all correct processes p and q, and all times c, D?v?C?Gp,q --H
D?vDCqjGp,q
Proof: Follows from Lemmas 8.23 and 8.27
0
Lemma 8.29 If the multicaster of a message m is correct, then all correct processes
in group(m) atomically deliver m at ts(m) + 2A2
Proof:
Suppose for contradiction that a correct process p atomically multicasts
a message m = (p, c, g, data) at time c to group g, and a correct process q E g does
not atomically deliver m.
Since p atomically multicasts m, and p is correct (and hence does not halt),
p also uniform-multicasts the message b = (p, c, g, (data, D?vDpC)) at time c. Since
p and q are correct, the validity and agreement property of the uniform multicast
implies that q uniform-delivers b by time c + A?; i.e., b E ?qC,+2?I Since q does not
atomically deliver m, q delivers (strong, (r, c + A2, g, (b, NAK))) for some process r
in 9. Therefore, there is a message m* such that r atomically delivers m* by time c,
p does not atomically deliver m* by time c, and p belongs to group(m*).
Since p is correct, and r atomically delivers m* by time c, Lemmas 8.23 and 8.27
imply that p also atomically delivers m* by time c. This is a contradiction. 0
Lemma 8.30 If p and q atomically deliver messages m and m', then p atomically
delivers m before m' if and only if q atomically delivers m before m'
Lemma 8.31 If a correct process p atomically delivers a message m, then for all
correct processes q,			IGq,bc(m) =			I Gq,bc(in).
Proof: Suppose for contradiction that m is the earliest message such that a
correct process p atomically delivers m, and DLVDt?c(??m)? IGq,bc(in) $ D?vi)??qS(Vfl) IGq,bc(m)
for some correct process q. Let m = (x,c,g,data) and b = (x,c,g,(data,D?VD:C))
Lemma 8.30 implies that there is a group g' ? Gq,: such that DT+VDC: Ig' $ D?vDqC?gI
138
In particular, Lemmas 8.23 and 8.27 imply that there is a message m*, such that
group(m*) =gI, m* ? DU+VDqC, x ?g', andm* ?D?vDC:
Since p atomically delivers m, p also uniform-delivers b; i.e., b E ?pC+gA? The
agreement property satisfied by uniform multicast implies that for all correct pro-
cesses r in g, b Ei ??C,+9A?, and hence for all processes 5 in 9, allowr(s,c + A9,9) is
true.
Since 9' ? Gq:,and x belongs to both groups 9 and g', gfl 9' $ ?. By assumption
fl9,9, > f9,9' and hence there is at least one correct process in gfl g'
Let r be any correct process in 9 A gi Since q and r are correct, and both belong
to group 9', D?vDrCIg, D?vDCq?g?; thus the message m* ? D?vDrC
Since r is a correct process in g, b c Br\+9A9, and for all processes 5 in 9,
allowr(s, c + A9,g) is true (as we showed above). Since there is a message m* e
D?vDrC, x ? group(m*), and m* ? D?vD:C, process r strong-multicasts the message
c + A9,9, (b, NAK)).
Since p and r are both correct, and allowr(r, c + A9, 9) is true, Lemma 8.25
states that allowp(r, c + A9, 9) is true, and hence process p strong-delivers the mes-
sage (r, c + A9,9, (b, NAK)). Therefore, p does not atomically deliver the message
m, a contradiction.			0
Corollary 8.32 If a?rrect process p atomically delivers a message m, then for all
correct processes q1 DLvDt?S?(???)??Gq,????? ? D?vDqIGq,?c?m?
Lemma 8.33 If p atomically mvlticasts a message m to a groip 9 before atomt-
cally multicasting m1 to 9, and any correct process q atomically delivers m', then q
atomically delivers m before ?`
Proof: Let p, 9, rn and m' be as in the statement of the lemma. Suppose a
correct process q in 9 atomically delivers m'. From the protocol, q also atomically
delivers all of p's previous multicasts to 9 by the time it delivers m'. In particular,
q atomically delivers m before m'.			0
Theorem 8.34 Svppose that for all grovps 9 and 9', if gflg' $ ?, then ?9# >
f9,9', and that for all groups 9, n9 > 2f9 and f9 > 1. The protocol in Figure 8.7
`s a painuise atomic multicast protocol that prevents the contamination of correct
processes, tolerates omission failures and has a latency of 2A9 for each group 9.
Proof: Lemmas 8.28 and 8.29 imply that the protocol is a pairwise atomic multi-
cast protocol. From Corollary 8.32 and 8.33, we conclude that the protocol prevents
139
the contamination of correct processes.
8.5.3 Lower bounds
E]
Recall that we require that the processing of a multicast targeted to a particular
group does not involve any process that does not belong to that group. We formalize
this as follows.
A multicast protocol has the group locality property in a system 5, if there is a
constant L such that, for all executions of the protocol in system 5, the following
conditions hold. Let Qp denote the sequence of messages, ordered in timestamp
order, that are multicast to any group to which p belongs.
o+ If Q? is empty, then p does not send or receive any messages during the exe-
cution.
o+ SupposeQp=(mi,m2,.. ,m5,...?. Then:
--H p does not send or receive any messages until time ts(mi).
--H if tQpI = k, then p does not send or receive any messages after time
ts(m?) + L.
--H for all i 1 < i < IQpI, for all c, ts(rnj) + L < c < ts(mi+i), p does not
send or receive any messages at c.
Recall that in Chapter 7, we presented an atomic broadcast protocol that pre-
vented contamination, and required n > f. We show that all pairwise atomic multi-
cast protocols that prevent contamination and have the group locality property in a
system subject to omission failures require n9 > 2f9 in every group g.
Theorem 8.35 In systems subject to omission failures, any pairu'ise atomic multi-
cast that prevents the contamination of correct processes and has the group locality
property requires n9 > 2f9 for all groups 9.
Proof: The proof is by contradiction. Suppose there is a pairwise atomic multicast
protocol that prevents contamination in a system subject to omission failures even
if for some group g, ng <2f9.
Consider the system shown in Figure 8.8 with two groups g and g', where g C g'
and ng = 2f9. Suppose g is partitioned into two disjoint subsets, A and B, IA =
IBI = f9. Suppose p Ei A and q ? B. Suppose r is some correct process that is in
group g', but not in group 9.
The proof considers four different possible executions of the hypothetical pairwise
atomic multicast protocol. In all the executions, we assume that no message is
received in less than A time; thus either a message is received exactly A time after it
was sent, or it is never received. We also assume that no process halts.
140
Figure 8.8: Illustration of the lower bound on fault tolerance
In the first two executions (Scenarios 1 and 2), we assume that if a message is
supposed to be sent by a process in A to another process in A, then the message
is actually sent and is received. Similarly, if a message is supposed to be sent by
a process in B to another process in B, then the message is actually sent and is
received.
Both Scenarios 1 and 2 are executions of the protocol in which no messages are
atomically multicast until time c. Suppose that at time c, p atomically multicasts
m? to group ? and that q atomically multicasts mq to group g. Suppose that no
other messages are multicast.
First note that there are no multicasts to group 9'. Hence, the group locality
property implies that any process in gi that is not also in g (for example, process r)
does not send or receive any message during the entire execution.
Scenario 1: Suppose that the processes in A are faulty and those in B are correct.
Suppose that whenever a process a ? A sends a message to a process b Ei B, a
performs a send omission failure, and whenever b sends a message to a, a performs
a receive omission failure.
Since the processes in B are correct, it is clear that the processes in B must
atomically deliver mq. Let cB denote the latest time at some process in B delivers
message mq.
Since there is no communication between the processes in A and B, it is clear
that the processes in A do not atomically deliver mq.
Scenario 2: Now suppose that the processes in A are correct and those in B are
faulty. Suppose that whenever a process a ? A sends a message to a process b E B, b
141
performs a receive omission failure, and whenever b sends a message to a, 6 performs
a send omission failure.
Since the processes in A are correct, it is clear that the processes in A must
atomically deliver m?. Let CA denote the latest time at which some process in A
delivers message m?.
Since there is no communication between the processes in A and B, it is clear
that the processes in B do not atomically deliver m?.
Scenarios 1 and 2 are indistinguishable to processes p and q. To see this, recall
that the group locality property implies that r neither sends nor receives any messages
in both scenarios. Thus, p and q are only permitted to communicate with the other
processes in group 9. Since, there are f9 processes in each of the two sets A and
B, and there is no communication between processes in A and B, we conclude that
Scenarios 1 and 2 are indistinguishable to processes p and q. Therefore, by time
max(c?, CB) in both scenarios, p atomically delivers m? and does not atomically
deliver mq, and q atomically delivers mq and does not atomically deliver m?.
Now consider two new scenarios as follows. Scenarios 3 and 4 are identical to
Scenarios 1 and 2 respectively until time c* = max(c + L + A, CA, CB) + 1. However,
in both Scenarios 3 and 4 p atomically multicasts a message m* to group g1 at time
c* + 1, and no further messages are multicast. Furthermore from time c* +1 onward,
no process performs a send or receive omission failure.
Since CA and CB are both less than c* (c* > max(c?, CB)), and Scenarios 1 and
2 are identical to Scenarios 3 and 4 respectively until time c*, the processes in A
deliver m? by time c* and omit mq, and the processes in B deliver mq by time C*
and omit m? in the latter two scenarios.
In Scenarios 3 and 4, since the first message multicast to group 9 is timestamped
c* +1, the group locality property implies that r does not send or receive any message
until time C* + 1. Since C* > c+ L + A, r does receive any message that a process in 9
sent by time C+ L. The group locality property also implies that no process in 9 sends
or receives a message between C + L and C* + 1. Since there are no omission failures
from time C* + 1 onward, we conclude that process r cannot distinguish between
Scenarios 3 and 4.
Since q is correct in Scenario 3, and p delivered mp and not mq, and q delivered
mq and not m?, it is clear that p is inconsistent at time C* + 1. Thus, by atomically
delivering the message m*, the correct process r would become contaminated; hence
r does not atomically deliver m*.
Since Scenarios 3 and 4 are indistinguishable to r, r does not deliver m* in Sce-
nario 4. However, in Scenario 4, the validity and agreement properties of the pairwise
142
atomic multicast imply that r must atomically deliver m* This is the required con-
tradiction.			0
8.6 Pairwise atomic multicast with no
inconsistency
Our protocol that prevents inconsistency with respect to pairwise atomic multicasts
is derived from the first protocol that prevents contamination.
To atomically multicast a message m = (p, c, 9, data) to group 9 at time c, process
p multicasts (basic, m) to 9'
If process q delivers (basic,m), by time c + Ag, then q sends the message
m, A11MqC) to all in 9, where AflMCq is the set of all messages that q has basic-
delivered by time c. Informally, as in the simple atomic broadcast protocol that
prevented inconsistency (see Chapter 5), the set AllMCq contains all the messages
that q intends to atomically deliver before it atomically delivers m.
At time c + A9 + A, q checks to see if, for every group 9' to which it belongs, a
majority of processes in the intersection of 9 and 91 agree with it on the sequence of
messages that should be atomically delivered in 9' before m is atomically delivered.
If so, then q atomically delivers (p, c, 9, data); otherwise, q is faulty, and to avoid
becoming inconsistent, q halts.
The protocol is formally presented in Figure 8.9. The proof of correctness is
similar to the proofs of correctness of our first pairwise atomic protocol that pre-
vented contamination (Figure 8.5) and our non-optimal atomic broadcast protocol
that ensured consistency (Figure 7.3), and hence is omitted.
Theorem 8.36 Suppose for all 9roups 9 and 9', if9fl91 $ ?, then fl99' > 2f9,9?.
The protocol in Fi?ure 8.9 is a pairu'ise atomic multicast protocol that prevents in-
consistency, tolerates omission failures and has a latency of Ag + A in each 9roup
9.
143
p atomically multicasts a message m = (p, c, 9, data) at time c to group 9*/
To multicast (pairwise atomic, m = c, 9, data))
p multicasts (basic,m)
p delivers basic multicast messages at c
MpC = fm ts(m) = c --H Agro?p(b), P delivered (basic,m)?
AllMpC = Uc'<c MpC
1* p sends messages at c
For all m ? MpC p sends (p, m, AllMpC) to all in group(m)
p receives all messages at c
For all m ? MpC?A for all groups 9 ? belongs(p)
ackp,,n,9 = (qj p received (q, m, A1jMqC?A),
and Vm*,group(m*) = g,m* ? AUMCq?A ? m* ? AUMpC?A?
p atomically delivers messages at c *1
For all m ? MpC?A, 9* = group(m)
if for any group 9 ? belongs(p) : lackp,m,g I <n99e --H f9,9* then halt
D =--H sort MpC?A in timestamp order
p delivers (pairwise atomic, D)
Figure 8.9: Pairwise atomic multicast protocol that prevents inconsistency
Executed by process p at time c
Appendix A
Other Fault-tolerant Broadcasts
In this appendix, we present sufficient conditions to prevent inconsistency and/or
contamination with respect to reliable broadcast, causal broadcast and causal atomic
broadcast. We only consider the prevention of D-inconsistency, VBD?inconsistency
and BD-contamination. Note that as with atomic broadcast, BD-consistency cannot
be ensured with these broadcasts.
A.1 Reliable broadcast
Informally, reliable broadcast requires that all correct processes deliver the same set
of messages, and this set must include all messages broadcast by correct processes.
Formally, reliable broadcast is defined to be the conjunction of the validity, agreement
and integrity properties (defined in Chapter 2 and repeated in Chapter 5).
We use the following notation (defined in Chapter 5 and repeated below). Suppose
B (P, ?) is a broadcast history. BCASTpC(B) denotes the set of messages that process
p broadcasts by time c in B; formally, BCASTpC(B) = Uc'<c P(p, c'). D?vDpC(B) denotes
the sequence of messages that p delivers by time c (defined in Chapter 2). DLVDpC(B)
denotes the set of messages that p delivers by time c; DLVDpC(B) = frn m ?
DLVDpC(B)? The superscript c is omitted when referring to all the messages broadcast
or delivered during a complete history.
Lemma A.1 Let B = (P, ?) Ei 13 and F be a failure set, such that B satisfies reliable
broadcast with respect to F.
o+ All correct processes deliver the same set of messages:
Vp ? F, vq ? F: DLVDp(B) = DLVDq(B)
o+ If a correct process broadcasts a message m1 then all correct processes eventu-
ally deliver m:
144
145
Vp ? F, Vq ? F: BOASTp(B) C DLVDq(B).
Proof' Directly from the definition of reliable broadcast.
0
Suppose B is a broadcast history and F is a failure set such that B satisfies
reliable broadcast with respect to F. DLv?(B) and DLVDF(B) respectively denote
the sequence of messages and the set of messages that the correct processes deliver
during B.
Lemma A.2 Let A E A, B be the broadcast history of A, and F be such that B
satisfies reliable broadcast with respect to F. If for all processes p, for all times c,
DLVDpC(B) C DLVD?F(B), then A is a D-consistent history with respect to reliable
broadcast and F
Note that although an application history A may be a D?consistent history with
respect to reliable broadcast and a failure set F, there may be a faulty process p, and
a time c, such that DLVDpC(B) ? DLVDF(B) Informally, this occurs when p delivers
a message at time c that no correct process delivers, and then halts before changing
state at time c.
Lemma A.3 Let A e A, B be the broadcast history of A, and F be such that B
satisfies reliable broadcast with respect to F. If B satisfies the fifo order property
with respect to F, and for all processes p, for all times c, DLVDpC(B) C DLVDF(B),
then A is a VBD-consistent history with respect to reliable broadcast and F
Lemma A.4 Let B = (p, S) e 13 and F be such that B satisfies reliable broadcast
with respect to F. Suppose B satisfies fifo order with respect to F. A correct process
p is BD-contamination-free until c with respect to reliable broadcast and F if and only
if:
o+ ifp delivers any message m at or before time c, then DiNDt?s7?mrn)?(B) C DLVDp(B)
A.2 Causal broadcast
Reliable broadcast does not impose any requirement on the order of message delivery;
thus, if the broadcast of a message m causally precedes the broadcast of a message
m' [Lam78], a correct process may deliver m before m' even though another correct
process delivers ?? before m. We now define causal broadcast, which orders message
delivery in a way that is consistent with the causal precedence relation. That is, if
the broadcast of a message m causally precedes the broadcast of a message m', and
a correct process delivers both m and m', then it delivers m before m'.
146
Causal broadcasts are sufficient for many applications [Sch88], and they are cen-
tral to many experimental systems such as Isis [BCJ+90], the Lazy Replication system
[LLS9o], and Psync [PBS89].
Formally, a causal broadcast is defined to be the conjunction of the validity,
agreement and integrity properties, as well as the causal order property defined
below for a broadcast history B and a failure set F:
o+ Causal order: For all correct processes p, if p delivers messages m and m',
and (bc(m),ts(m)) ?B (bc(m'),ts(m')), then p delivers m before m'
Lemma A.5 Let A ? A, B be the broadcast history of A7 and F be such that B
satisfies causal broadcast with respect to F. If for all processes p, for all times c:
o+ DLVDpC(B) C DLVDF(B), and
o+ if message ? D?vDpC(B)7 and there is a message m such that
(bc(m),ts(m)) HB (bc(m'),ts(m')), then p delivers m before m1
then A is a VBD-consistent history with respect to causal broadcast
and F
Lemma A.6 Let B = (P, ?) E 13 and F be such that B satisfies causal broadcast
with respect to F. A correct process p is BD-contamination-free until c with respect
to causal broadcast and F if and only if:
o+ if p delivers any message m = (q, c', --H) at or before time c, then:
- DLVDqCI(B) C DLVDp(B)7 and
- if m' ? D?vDCql(B)7 and there is a message m" E DLVDp(B) such that
(bc(m11),ts(m11)) ?B (q,c'), then q delivers ?Ii before r%I
A.3 Causal atomic broadcast
Neither atomic broadcast nor causal broadcast is a more stringent specification than
the other, In particular, ensuring that atomic broadcast's total order property is
satisfied is not equivalent to ensuring that causal broadcast's causal order property
is satisfied. Causal atomic broadcast is a fault-tolerant broadcast that is stronger
than both atomic broadcast and causal broadcast; formally, it is defined to be the
conjunction of the validity, agreement, integrity, total order and causal order prop-
erties.
Lemma A.7 Let A ? A, B be the broadcast history of A, and F be such that B
satisfies causal atomic broadcast with respect to F. Iffor all processes p, for all times
o+ D?vDpC(B) ? ??viy(B) then A is a VBD-consistent history (and hence also
D-consistent history) with respect to causal atomic broadcast and F
147
Lemma A.8 Let B = (p, ?) ? B and F be such that B satisfies causal
broadcast with respect to F. A correct process p is BD-contamination-free
with respect to causal broadcast and F if and only if:
o+ ifp delivers any message m at or before time c, then
atomic
until c
? DLVDp(B).
Bibliography
[BCJ+90j Kenneth P. Birman, Robert Cooper, Thomas A. Joseph, Kenneth P.
Kane, and Frank Bernhard Schmuck. 1515 - A Distributed Programming
Environment, June 1990.
[BGT90]
[BJ87]
[BSS9O]
[CASD85]
[Cri87]
[DRS86]
Navin Budhiraja, Ajei Gopal, and Sam Toueg. Early-stopping distributed
bidding and applications. In Proceedings of the Fourth International
Workshop on Distributed Algorithms. Springer-Verlag, September 1990.
In press.
Kenneth P. Birman and Thomas A. Joseph. Reliable communication
in the presence of failures. ACM ?ansactions on Computer Systems,
5(1):47--H76, February 1987.
Kenneth P. Birman, Andre Schiper, and Pat Stephenson. Fast causal
multicast. Technical Report 90-1105, Department of Computer Science,
Cornell University, April 1990.
Flaviu Cristian, Houtan Aghili, H. Raymond Strong, and Danny Dolev.
Atomic broadcast: From simple message diffusion to Byzantine agree-
ment. In Proceedings of the Fifteenth International Symposium on Fault-
Tolerant Computing, pages 200--H206, June 1985. A revised version appears
as IBM Research Laboratory Technical Report RJ5244 (April 1989).
Flaviu Cristian. Issues in the design of highly available computing ser-
vices. In Annual Symposium of the Canadian Information Processing
Society, pages 9--H16, July 1987. Also IBM Research Report RJ5856, July
1987.
Danny Dolev, Riidiger Reischuk, and H. Raymond Strong. Early stop-
ping in Byzantine agreement. Technical Report RJ5406, IBM Research
Laboratory, December 1986.
Michael J. Fischer, Nancy A. Lynch, and Michael 5. Paterson. Impos-
sibility of distributed consensus with one faulty process. Journal of the
ACM, 32(2):374--H382, April 1985.
[FLP85]
148
149
[GSTC9Oj
[GT89]
[GT9l]
[Had84]
Ajei Gopal, Ray Strong, Sam Toueg, and Flaviu Cristian. Early-delivery
atomic broadcast. In Proceedings of the Ninth ACM Symposium on Prin-
ciples of Distributed Computing, pages 297--H310, August 1990.
Ajei Gopal and Sam Toueg. Reliable broadcast in synchronous and asyn-
chronous environments (preliminary version). In J.-C. Bermond and
M. Raynal, editors, Proceedings of the Third International Workshop on
Distributed Algorithms, volume 392 of Lecture Notes on Computer Sci-
ence, pages 110--H123. Springer-Verlag, September 1989.
Ajei Gopal and Sam Toueg. Inconsistency and contamination (prelimi-
nary version). In Proceedings of the Tenth ACM Symposium on Principles
of Distributed Computing, pages 257--H272, August 1991.
Vassos Hadzilacos. Issues of Fault Tolerance in Concurrent Computa-
tions. Ph.D. dissertation, Harvard University, June 1984. Department of
Computer Science Technical Report 11-84.
[Lam78] Leslie Lamport. Time, clocks, and the ordering of events in a distributed
system. Communications of the ACM, 21(7):558--H565, July 1978.
[Lam84] Leslie Lamport. Using time instead of timeout for fault-tolerant dis-
tributed systems. ACM Transactions on Programming Languages and
Systems, 6(2):254--H280, April 1984.
[LLS9O]
Rivka Ladin, Barbara Liskov, and Liuba Shrira. Lazy replication: Ex-
ploiting the semantics of distributed services. In Proceedings of the Ninth
ACM Symposium on Principles of Distributed Computing, pages 43-58,
August 1990.
[LSP82] Leslie Lamport, Robert Shostak, and Marshall Pease. The Byzantine
generals problem. A CM Transactions on Programming Languages and
Systems, 4(3):382--H401, July 1982.
[NT90]
Gil Neiger and Sam Toueg. Automatically increasing the fault-tolerance
of distributed algorithms. Journal of Algorithms, I1(3):374?19, Septem-
ber 1990.
[PBS89] L. L. Peterson, N. C. Bucholz, and Richard D. Schlichting. Preserving
and using context information in interprocess communication. In A CM
Transactions on computer systems 7,3, pages 217--H246, August 1989.
Kenneth J. Perry. Early Stopping Protocols for Fault- Tolerant Distributed
Agreement. Ph.D. dissertation, Cornell University, February 1985. De-
partment of Computer Science Technical Report 85-662.
[Per85]
150
[PT84]
[Sch86]
Kenneth J. Perry and Sam Toueg. An authenticated Byzantine generals
algorithm with early stopping. Technical Report 84-620, Department of
Computer Science, Cornell University, June 1984.
Fred B. Schneider. The state machine approach: a tutorial. Technical
Report 86-800, Department of Computer Science, Cornell University, De-
cember 1986. Revised June 1987.
Frank Bernhard Schmuck. The Use ofEfficient Broadcast Protocols in
Asynchronots Distrib?ted Systems. Ph.D. dissertation, Cornell Univer-
sity, August 1988. Department of Computer Science Technical Report
88-928.
[Sch88]
