BIB-VERSION:: CS-TR-v2.0
ID:: CORNELLCS//TR94-1423
ENTRY:: 1994-06-27
ORGANIZATION:: Cornell University, Computer Science Department
LANGUAGE:: English
TITLE:: Verifying Programs That Use Causally-Ordered Message-Passing
AUTHOR:: Stoller, Scott D. 
AUTHOR:: Schneider, Fred B. 
DATE:: May 1994
PAGES:: 26
ABSTRACT:: 
We give an operational model of causally-ordered message-passing primitives. 
Based on this model, we formulate a Hoare-style proof system for 
causally-ordered delivery. To illustrate the use of this proof system and to 
demonstrate the feasibility of applying invariant-based verification 
techniques to algorithms that depend on causally-ordered delivery, we verify 
an asynchronous variant of the distributed termination detection algorithm of 
Dijkstra, Feijen, and van Gasteren.
END:: CORNELLCS//TR94-1423
BODY::
Verifying Programs That Use
Causally-Ordered Message-Passing
Scott D. Stoller*
Fred B. Schneider**
TR 94-1423
May 1994
Department of Computer Science
Cornell University
Ithaca, NY 14853-7501
This author is supported by an IBM Graduate Fellowship.
** This author is supported in part by the Office of Naval Research under contract
NOOOl 4-91 -J-1 219, the National Science Foundation under Grant CCR-9003440,
DARPNNSF Grant No. CCR-9014363, NASNDARPA grant NAG-2-893, and AFOSR
grant F49620-94-1-0198. Any opinions, findings, and conclusions or
recommendations expressed in this publication are those of the author and do not
reflect the views of these agencies.
Veri?ing Programs that use Causally-ordered Message-passing
Scott D. Stoller*			Fred B. Schneidert
Department of Computer Science
Cornell University
Ithaca, New York 14853
May 15, 1994
Abstract
We give an operational model of causally-ordered message-passing primitives. Based on this
model, we formulate a Hoare-style proof system for causally-ordered delivery. To illustrate the
use of this proof system and to demonstrate the feasibility of applying invariant-based verification
techniques to algorithms that depend on causally-ordered delivery, we verify an asynchronous
variant of the distributed termination detection algorithm of Dijkstra, Feijen, and van Gasteren.
This author is supported by an IBM Graduate Fellowship.
tThis author is supported in part by the Office of Naval Research under contract N00014-91-J-l219, the National
Science Foundation under Grant CCR-9oo344o, DARPA/NSF Grant No. CCR-9014363, NASA/DARPA grant NAG-
2-893, and AF0SR grant F49620-94-1-0198. Any opinions, findings, and conclusions or recummenoations expre??d
in this publication are those of the author and do not reflect the views of these agencies.
1 Introduction
Causally-ordered delivery can be understood as a generalization of FIFO ordering [vR93]. In both,
a message is delivered only after all messages on which it may depend. With FIFO ordering, this
guarantee applies only to messages having the same sender; with causal ordering, this guarantee
applies to messages sent by any process. Additional motivation for and examples of the use of
causally-ordered delivery can be found in [Bir93, vR93].
This paper gives a proof system for causally-ordered delivery. Our proof system is similar
in style to the satisfaction-based logics for synchronous message-passing in [LG81], for ordinary
asynchronous message-passing in [SS84], and for flush channels in [CKA93]. We assume familiarity
with the terminology of that literature.
Reasoning about message-passing primitives for causally-ordered delivery involves a global prop-
erty: the system-wide causality relation, which defines what messages are deliverable. This dis-
tinguishes causally-ordered delivery from the types of message passing for which axiomatic se-
mantics have already been given (e.g., [LG8l, SS84, CKA93]). And, our work demonstrates that
substantially new methods are not required when message-delivery semantics depends on global
information.
A program proof in a satisfaction-based logic involves discharging three obligations:
(1) a proof outline characterizes execution of each process in isolation,
(2) a "satisfaction proof" validates postconditions of receive statements, and
(3) an interference-freedom proof establishes that execution of no process invalidates an as-
sertion in another.
Our proof system for causally-ordered message-passing is similar, except step (2) is merged with
step (1). (Such a merging is also possible for other satisfaction-based proof systems that handle
asynchronous communication primitives, like the logics of [5584] and [CKA93].)
The remainder of the paper is organized as follows. Section 2 defines causally-ordered message-
passing. Our proof system is the subject of Section 3. In Section 4, we use the proof system
to verify an asynchronous variant of the distributed termination detection algorithm of Dijkstra,
Feijen, and van Gasteren [DFvG83]. Section 5 contains some conclusions.
2
2 A Model of Causally-ordered Message-passing
We give an operational semantics for causally-ordered message-passing primitives by translating
programs containing these primitives into a generic concurrent programming language that has
shared variables. The shared variables represent the state of the network.
Processes communicate by sending and receiving messages. To encode the restrictions implicit
in causally-ordered delivery, each message sent is modeled in our translation by a triple (d, i, t),
where1
d is the data being sent by the program,
i is the name of the process2 that sent the message, and
t is a timestamp that contains information used to determine whether the message is ready
for delivery.
The following functions are useful in connection with messages represented by triples.
data((d,i,t)) =? d
sender((d,i,t?) =
ts((d,i,t?) A
Two shared variables a? and p? are associated with each process i. Variable Q contains the
(triples modeling) messages sent to process i; p? contains the (triples modeling) messages process z
has received.
There is an obvious and seemingly simpler alternative to using variables Jj and p?. It is to use
a single variable Xi (say), where the value of Xi is the set of messages sent to process i but not
yet received (i.e., Xi equals ? --H pi). The model we use has two advantages over this one-variable
model. First, in our model, proving interference freedom (defined in Section 3) is easier. This is
because no process can falsify m E a? or m E Pi; predicate m C Xi would be invalidated by the
receiver. Second, proofs of some programs (such as the example in Section 4) involve reasoning
about communications history. That history is available in a? and pi but is not available in Xi.
Causally-ordered delivery restricts when a message can be received. This is achieved in our
1An actual implementation of causally-ordered delivery might not require a sender name i or timestamp t. That
information is used here to abstract from the details of all real implementations.
2Processes are named 0,1N --H 1, and hereafter identifiers i,j, k, and p range over process names.
3
translation by defining a well-founded partial order ? on timestamps. Our definition of ? is based
on the theory of [Lam78j. A system execution is represented as a tuple of sequences of events;
each sequence corresponds to the execution of a single process. An event is a send event, a receive
event, or an internal (i.e., non-communication) event. The happens-before (or "potential causality")
relation for a system execution is the smallest transitive binary relation on the events in that
execution such that:
o+ If e and e' are performed by the same process and e occurs before e', then e H
o+ If e is the send event for a message m and e' is the receive event for that message, then e
Causally-ordered delivery is formalized in terms of as follows [BSS9l]. Let send(m) and
receive(m) respectively denote the send event and receive event for a message m.
Causally-orderedDelivery: If m andm' are sent to the same process and send(m)
send(m'), then receive(m) receive(m').3
To implement Causally-ordered Delivery using timestamped messages, the timestamps and ?
are chosen to satisfy
ts(m) ? ts(m') iff send(m)			send(m').			(1)
Causally-ordered Delivery is then equivalent to requiring that a message m1is received by a process
p only after p has received all messages m sent to p for which ts(m) ? ts(m') holds.
One way to achieve (1) is to use vector clocks [Fid88, Mat89]. Here, a vector Vtj of type
array[0..N --H 1] of Nat is associated with process i, where Vtj satisfies:
Vector Clock Property: vt?\j] is the number of send events that are performed by
process j and causally precede the next event to be performed by process i.
Partial order ? is defined in terms of vector clocks, as follows.
vti # vt? A (?i : vti[i] # vt?[i])
vti ? vt? A (Vi : vti[i] < vt?[i]) A vti # vt?
3FIFO delivery can also be formalized in terms of ?+. FIFO delivery ensures that if m and m' are sent by the
same process, to the same process, and send(m) send(m'), then receive(m) receive(m'). The close analogy
between FIFO delivery and causally-ordered delivery should now be evident.
4
Three rules define how the Vtj are updated in order to maintain the Vector Clock Property.
Since only send events and receive events are of interest, vector clocks are updated only when
send and receive statements are executed. Let inc(vt, i) denote vector vt with the jth component
incremented by one. The rules are:
Initialization Rule: Initially, vti[j] = 0 for all i and j.
Send Update Rule: When process i sends a message m, it updates vt1 by executing
Vtj := inc(vt?,i)
and includes updated vector Vtj as the timestamp attached to m.
Receive Update Rule: When a process i receives a message ni, it updates vt1 by
executing
vt1 := max(vt?, ts(m)),
where max(vt, vt') is the component-wise maximum of the vectors vt and vt'.
We now give our translation of send and receive statements into statements that read and write
shared variables a? and p?. The following notation is used to describe the multiple-assignment
[Gri76j of el to xl, e2 to x2, ..., and e? to x?:
el
x2			e2
A send statement send e to i in process j is translated into:
Vtj ?nc(vt?,j)
a? e (e,j, inc(vt?,j)?
(2)
where s ? x =A ? ?
The translation of a receive statement requires a conditional delay. Statement await B then S
delays until B holds and then executes 5 as a single indivisible operation starting from a state
that satisfies B. A receive statement receive x in process i delays until a message is available
5
for receipt and then updates x, p?, and Vtj. In particular, to ensure causally-ordered delivery,
receive x delays until there exists some message m that has been sent to i but not received and
such that all messages m' that have been or will be sent to i for which ts(m') ? ts(m) have been
received.
For a set A of triples modeling messages, choose(A) and minset(A) are assumed to satisfy
choose (A) ? A provided A # ?
minset(A) A c A I (Vm' E A: ?(ts(m') ?
(3)
(4)
A receive statement receive x in process ? is translated as follows, where m? is a fresh variable.
await a? --H p? # ? then ni, := choose(minset(? --H pi))
x := data(m?)
Vtj := max(vti, ts(mi))
?
(5)
To show that code fragments (2) and (5) correctly implement Causally-ordered Delivery, con-
sider some message m that is received by a process t. We must show that no message m' subse-
quently received by process i satisfies send(m') send(m). Suppose such a message m' exists.
By (1), ts(m') ? ts(m). Message m' could not be in (Yj when m is received, since m is selected
from among the elements of ? with minimal timestamps. Thus, m' must be added to a? after m
is received. We show that this is impossible by proving: For all messages m and m', if m' is added
to a? after m has been added, then ?(ts(m') ? ts(m)).
First, observe that the following holds throughout execution of a program.
(Vj,k : vt,[k] < vtk[k] A (Vrn E o? : ts(rn)[k] < vtk[k]))
(6)
Initially, this holds because for all j and k, Vtj [k] 0 and ? = ?. Only send and receive statements
change the values of these variables, so it suffices to show that our translations of these statements
preserve (6), which is easily done.
Finally, we show that ?`(ts(rn') ? ts(rn)). This is implied by (?k : ts(rn')[k] > ts(rn)[k]), which,
in turn, follows from ts(rn')[j] > ts(rn)[i] where j is the sender of rn'. The latter holds because
ts(rn')\j] = vti[j] + 1 > vtj\j] > ts(m)[j], where the equality follows from the translation of send
6
statements, the strict inequality follows from standard arithmetic, and the nonstrict inequality
follows from (6).
3 Axioms for Send and Receive
We can now present Hoare-style axioms [Hoa78] for the send and receive statements described
above.
Given the above translation of send e to ? into a multiple-assignment statement, we use the
multiple-assignment axiom [Gri76] to obtain an axiom for the send statement. The notation e[xi
el, ..., X? e?] denotes the simultaneous substitution of each term e? for the corresponding
variable Xj in a term e. Validity of the following triple follows immediately from the multiple-
assignment axiom:
fP[vtj inc(vtj,j), a?' :--H--H <y? e (e,j, tnc(vt5,j)?]?
Vtj __ inc(vtj,j)
? (e,j, inc(vt>,j)?
Thus, we have
Send Axiom: For a send statement in process j:
fP[vt5 := inc(vt?,j), a? := a? ? (e,j, inc(vtj,j)?]1 send e to i ?P?
(7)
An inference rule for receive statements is obtained using translation (5) of receive x. Using
axiom (3) for choose, the usual rules for assignment and sequential composition, and this inference
rule for await statements [OG76]
Await Rule:
1PABYS?QY
?P? await B then S ?Qy			(8)
we can show that ?P? receive x ?QJ is valid iff the following Predicate Logic formula is valid:
PA m? E minset(a? --H pi)
? Q[x := data(m?), Vtj := max(vti, ts(m?)), p? := Pi e m?].
Thus, the inference rule for receive statements is
7
Receive Rule: For a receive statement in process j:
P A m? E minset(a? --H pi)
=> Q[x data(mi), Vtj max(vti, ts(m?)), p? p.' ?
fP? receive x
(9)
Interference Freedom
The preceding rules for send and receive, together with rules for other statements and the usual
miscellaneous rules of Hoare logics (e.g., the Rule of Consequence), can be used to construct a proof
outline for each process in isolation. A proof onijine is a program annotated with an assertion before
and after every statement. A proof outline characterizes the behavior of a process assuming that
no other process invalidates assertions in that proof outline. The proof outlines for processes that
execute concurrently are combined to obtain a proof outline for the entire system by showing
interference freedom [0G76j--H that no process invalidates assertions in the proof outline of another
process.
In a proof outline PO, the assertion that precedes a statement S is called the precondition of
S and is denoted pre(S), the assertion that follows a statement 5 is called the postcondition of 5
and is denoted post(S), and we write pre(PO) and post(PO) to denote the first and last assertions,
respectively, in PO. We write fP1 Po ?Q? to denote the triple obtained by changing pre(PO) to
P and post(PO) to Q.
An assertion P appearing in a proof outline POj is interference free with respect to proof
outlines ....... , PON if for all assignments, sends, and receives 5 in a different proof outline than
?P A pre(5)? 5 ?P1			(10)
is valid. This is because (10) asserts that execution of 5 does not invalidate P. Assignment to
variables is the only way to invalidate an asseftion.4 Since our translations for send and receive
contaln assignments, the interference freedom obligations require checking (10) for each send and
receive statement, as well as for each assignment to an ordinary program variable.
Proof outlines POi,. . , PON are interference free if all assertions P in the proof outlines are
interference free in Po1... . , PON. This leads to the following inference rule.
4This is actually an assumption about the assertion language. For exarnple, it rules out allowing control predicates
in assertions.
8
Parallel Composition Rule:
POl,..,P0N.......,PONareinterferencefree			(11)
[A?pre(PO?)Y [P01 ... II PON] [A1 post (POj)Y
Note that, in contrast to the logics for asynchronous communication in [5584] and [CKA93],
our parallel composition rule does not have a "satisfaction" obligation. This is not an artifact of
causally-ordered message-passing; the logics of [SS84] and [CKA93] could be similarly formulated.
4 Example: Distributed Termination Detection
To illustrate our proof rules, we give a proof outline for the termination detection algorithm of
[DFvG83]. Validity of this proof outline shows that the algorithm correctly detects quiescence in
systems of processes that communicate using causally-ordered message-passing. Our proof outline
is based on the correctness argument given in [DFvG83j, modified for causally-ordered delivery
instead of the synchronous communication assumed there.5
The algorithm is intended for use in systems where processes behave as follows: At each instant,
a process is either active or q?iescent, where the only action possible by a quiescent process is receipt
of a message. A quiescent process may become active upon receipt of a message; an active process
becomes qulescent spontaneously. Each process i has the form
Ifliti
do
U
od
9ij			send e?? to j
Sij
receive Xj
(12)
where the 9ij are boolean expressions, and Ifliti, Si>, and 14- are statements that do not contain
communication statements. Such a process i is quiescent iff each guard 9ij is false. This is formalized
by:
A ?(V9ii)
5
In the algorithm of [DFvG83] a token circulates among the processes. This introduces a new
kind of message, which we call a token message. To distinguish it from the messages in the original
5In [Apt86j, the partiai-correctness argument of [DFvG83] is formalized and some additional properties of the
algorithm are proven.
9
computation, hereafter called basic messages, we use a predicate istok(data(m)) that holds exactly
when m is a token message. Note that a process of the form (12) cannot send basic messages to
itself.6 Define:
a1?0? =A ?`n E a? istok(data(m))?
pttOk =A fm E Pi I istok(data(m))?
=A ?m E a? --H			?isThk(data(m)) A sender(m) =
The system is quiescent if every process is quiescent and no messages are in transit. Thus, the
system is quiescent iff the following predicate Q holds.
Q =A (Vi: q? A (Vj:
A color, either black or white, is associated with each process. For each process i, we introduce
a boolean variable bi such that bi is trne iff process i is black. The detection algorithm sets bi
to trtte when process i sends a basic message; its sets bi to false when i sends a token message.
Therefore, we can assert that bi holds if process i has a sent a basic message since it last sent a
token message. This is formalized as an assertion in terms of the following state function:7
lxi: The largest timestamp in ?m ? Uj a;?k sender(m) = ii, if such a timestamp exists;
otherwise 0.
The assertion is now formalized as:
Ji =A (Vi : (?j : (?m ? ??,> : lxi ? ts(m))) ? bi)
The algorithm proceeds as a sequence of rounds. One process serves as the initiator for all
rounds; it starts each round by sending a token message. Without loss of generality, assume
process 0 is the initiator. In each round, the token is received by every process exactly once, ending
with the initiator. We define the token to be at position i if it has been sent to process i and not
subsequently sent by process i; we say that the token visits a process when the token has been
received by but not sent from that process. For each process i, we introduce a new variable hi that
6This restriction is not needed for correctness of the algorithm; we adopt it here because simplifies the correctness
proof slightly.
7The name iris a mnemonic for "last transmission" of the token by process ?.
10
is trie iff the token is visiting process ?.
In each round, the token visits the processes in descending order by process name. Thus, the
token visits process N --H 1, N --H 2, ..., 0, and the current token position is given by the state
function:
tp =A i --H 1
N --H 1 otherwise
if (Vj ? i: lxj ? lr?)
Note that all arithmetic on process names is modulo N.
An assertion Jtok says that the N most recent sends of token messages are totally ordered by
causality. This is equivalent to stipulating that the timestamps on these token messages form an
ascending sequence; for example, if tp # N --H 1, then txtp ? txtp--Hi ? ... ? ix? ? txN?1 ? &N--H2 ?
... ? lXtp+i. Formally,
Jt0k =? (V? # tp : 1?i+1 ? lxi)
An assertion relating the timestamps of token messages to the timestamps of basic messages is
also needed. For this, we use an assertion Jbas, whose informal interpretation is as follows.
Let m be a basic message sent from i to k that was sent before the ?th transmission of the
token by the sender. If m was sent in the same direction that the token travels (i.e., if k < ?),
then m must be delivered before the ath transmission of the token by the receiver. If m was sent
in the other direction (i.e., if i < k), then m must be delivered before the (a + 1)st transmission
of the token by the receiver. J6as holds throughout execution of the algorithm because causally-
ordered message-passing is used for all messages--Hthe values of timestamps are consistent with this
ordering. We formalize the assertion using an additional state function.
n&?: The second largest timestamp in fm E Uj a,t.ok sender(m) = i?, if such a timestamp
exists; otherwise 0.
Jbas A (Vi,k : Vm E ?`.,k :			(13)
< tp < i =i nlrj ? ts(rn))
A (k <i A ?i(k < tp <?) ? lxi ? ts(m))
A (i < tp < k => lxi ? ts(m))
A (i < k A ?`(i < tp < k) ? nixi ? ts(m)))
Assertions Ji, Jbas, and Jt0k contain all of the information about message-delivery order needed
11
for correct operation of the algorithm. We encapsulate this information as a single assertion J:
J A ?1 A J6as A i?tok
As with processes, a color, either black or white, is associated with the token. The color of the
token is represented as before-black is encoded as true, and white is encoded as false. While in
transit, this boolean value is included in each token message; while the token is visiting a process
i, a new variable tj is used to store the color of the last token message received by process i.
Given a boolean value c, mktok(c) denotes a token value whose color is c. The color of the
token is extracted using a selector tokval. Thus, istok(mktok(c)) = true and tokval(mktok(c)) =
In each round, the token is initially white. It becomes black (if it isn't already) when it visits a
process i (i.e. hi equals true) that is black (i.e. bj equals true). Thus, the token becomes black
when it visits a process that has sent a basic message since last sending a token message, and the
current token color is given by:
A
tc =
ttp V btp			if htp
tokval(data(m)) if ?htp A rn C attpok A ts(m) = lXtp+i
true			otherwise
We also add to each process i a new variable Ui, which is used for temporary storage of received
values.
When the token returns to the initiator, if either the initiator or the token is black, then the
initiator starts another round. If both are white, then the system is quiescent (i.e., Q holds).8 This
fact is implied in the proof outlines of Figure 1 by the Q in the precondition for the second branch
of the alternation statement RELAY0.
The operation of the algorithm is succinctly characterized by K, where K A K1 v K2 v K3 and:
K1 A (Vi > tp: q? A (Vk: Xi,k =
A(htp? (Vk > tp: Xtpk =?))
K2 A (?i < tp:
K3 A ic
8Here, the initiator does not take any special action when qujescence is detected. A round of communication cou'?
easily be added to notify each process that quiescence has been detected.
12
Informally, K1 says that every process visited by the token in the current round is quiescent and no
basic message sent by one of these processes is in transit. Moreover, if the token is visiting process
tp, then no basic messages sent by process tp are in transit to processes the token has visited in
this round. K2 says that some process not already visited by the token during the current round
is black. Finally, K3 says that the token is black.
Assertions J and K are not quite strong enough to prove correctness of the algorithm. An
assertion I that expresses several simple properties of the algorithm (e.g., that there is always at
most one token message in the system) is also needed. Thus, we define 1 A IA J AK, where
A (Vi: (I?Iatt.ok#p?kyI<1)
A(Vm ? a? : ts(m) ? Vtsender(m))
A(Vm E a?: ts(m) ? Vtj ? m E pi')
A(Vm E p?: ts(m) ? vtj)
A(Ja,tok ?pitOk I<?1)
A((hi ? aitok # p1iok) ? tp =
A(ht' ? (a1tok ? A (Vj : tok
= p;Pkb)
A(a1tok = ? Uja?tOk sender(rn) = i + 11)
A(totai(fm E U5a1 sender(m) =
A(total(Uja?Ok))
A(lXj ? vt1)
A(Xi,i =
and total(s) holds iff ?t (?m ? 5: t?(m) = t)) is totally ordered by ?.
Proof outlines for processes augmented to detect termination appear in Figure 1. The Appendix
contains a detailed justification of the proof outlines.
Angle brackets indicate that the enclosed statement is executed atomically [Lam8O] .9 Also,
communication statements may appear in guards, so we use the following proof rule for iteration
statements:
9Angle brackets are not actually necessary for correctness. They do simplify the proof slightly, so we have elected
to use them.
13
Proof Outline for Process i
?IA?hjAtp>iA(i=O?(Vj: ayk =?)y
INITi fi)
do
U 9ij			fIA9ij)
U
true fIA9ij Abi)
send e?? to j flAyij)
Sij fi)
receive y? fIA(?istok(?) ? K[q? :=false])
A(istok(yi) ? tp = i A ?hj A tc = Thkval(v?)))
if istok(yi) fi A tp = i A ?hj A tc = Thkval(y1))
(hi := true
t2, := tokval(?)? fi)
U mjstok(yj) -4 fiAK[qi :=false])
Xj := y? fiAK[qi :=false])
Ri fi)
q? A hj
od
fi'
fi fi)
fIAq?Ah?)
RELAYj fi)
INIT0 A send mktok(false) to N --H 1 fi A ?h0 A tp > 0)
mit0
RELAY0 A if (t0vb0)
fIAh0)
(send mktok(Jalse) to N --H 1
h0 :=false
b0 :=false? fI)
U ?t0vb0)			fIAQ)
(* quiescent *)
skip fI)
fi
For 0 <j <N:
INITj			A mit5
RELAYj A (send mkThk(t? v bj) to j --H 1
hj :=false
bj :=false?
Figure 1: ProofOutlines
14
Iteration Rule:
For i E [1..N], ?I A g?? Cj ?Pj? POj ?I1
1'Y
do
[J g?;Ci?
iE[1..NJ
fi?'
POj
fly
(14)
od
fI A ?(ViE[1..Nj 9i)Y
Here, 9? is a boolean expression and Ci is a receive or skip statement.10 One might expect there to be
an assertion between 9? and Cj in the rule's conclusion. Expression 9i, contains program variables
of only process i, so g? cannot be invalidated by execution of another process. In particular,
interference cannot occur even if evaluation of 9? and execution of Cj are not performed as a single
indivisible action. Thus, there is no need to make the assertion explicit.
To illustrate reasoning about receive statements, we give a detailed proof for the triple
fi) receive y? fI A (??stok(yi) ? K[qi := false]) A (istok(yi) ? tp = i A ?hj A te = tokval(yi)))
(15)
This triple arises as a hypothesis in the application of the Iteration Rule to the main loop of each
process. The triple expresses a crucial fact about the algorithm that activation of a process (i.e.,
the changing of qi to false) by reception of a basic message does not falsify K. By Receive Rule
(9), we can deduce (15) from
lAmi E a?--Hpi ? (IA(??istok(?) ? K[q? :=false])A(istok(yi) => tp = jA?h?Atc = tokval(yi)))'
(16)
where for any term t,
A t[yi := data(mi), Vtj := max(vti, ts(m?)), Pi := ? mi]
We show in the Appendix that 1 ? I' is valid. Here, we first show that
I A m? E ?i --H p? A ??stok(y1,) ? K[q? := false]'
10The guard "g; skip" is abbreviated "9"; the guard "trae; receive x', is abbreviated "receive x".
15
(17)
is valid. We assume the antecedent and prove the consequent. Note that
Kki false]' = (K1,[q? := false] v K2 v K3)
Thus, if K2 or K3 holds, then so does (17). Suppose neither K2 nor K3 holds. Since I holds
by assumption, K must also hold, so K1 must hold as well. We now show that in this case,
K11 [q? false] holds. First, note that K1, holds; this follows easily from the fact that K1 holds.
The proof proceeds by case analysis on the relative values of i and tp.
case i < tp: K11 does not depend on the q? `s for j < tp. Therefore, since Kj holds, so does
K11[q? :=false].
case i> tp: We show that this case is impossible. Let k =? sender(m?). From the antecedent of
(17) and the definition of Xk,i, we conclude m? ?
case k <tp: Instantiating the universally quantified variables i and k in J?a? with k and z,
respectively, we conclude (using the third conjunct of J6as) that lxk ? ts(m,). Using Ji,
this implies that bk holds, which implies that K2 holds. This contradicts the assumption
above that neither K2 nor K3 hold.
case k > tp: By assumption, K1 holds, so (Vj Xkj = ?), 50 Xk,i = ?. From the antecedent
of (17), we have ??istok(y,'-) (i.e., ?istok(data(m?))) and m? C Q --H p?, so by definition of
we have m? ? Xk,i, a contradiction.
Finally, consider showing that (istok(yj) ? tp = i A ?hj A te = tokval(y?))' holds whenever the
antecedent of (16) holds. This is equivalent to showing
IA m? E a1 --H p? A istok(data(m?)) ? tp = i A ?hj A tc = Thkval(data(m?))
(18)
We assume the antecedent and prove the consequent. From the antecedent, we conclude m? ?
tok			tok
--H p1t.ok Thus, a? ? ptt0k, so by conjunct (Vi (hi V a,tok ? p1tok) ? tp = i)) in 1, tp = i
holds. We next show, by contradiction, that ?hj holds. Suppose not; then hj holds, so (using I),
a1tok --H ptok which contradicts mi E a1tok --H p1t.ok Finally, we show that tc = tokval(data(m?)). From
?, a,tok --H p1tok < 1 thus m? is the only unreceived message in a,tok, ?? m? must have the largest
timestamp in a1tok, ?? ts(mi) = lri+1 This, together with ?hi, implies tc = tokval(data(m?)).
16
Comparison to Related Work
The first correctness argument applicable to this distributed termination detection algorithm in
an asynchronous setting is (to the best of our knowledge) an operational argument due to Raynal
and Helary [RH9O]. Proposition 3.8.1 in [RH90] establishes partial correctness assuming that the
message-delivery order satisfies a property P. Our proof assumes causally-ordered delivery, which
implies our predicate Jbas; Jbas is similar to but slightly stronger than property P of [RH9O].
Another operational (albeit more formal) proof, by Charron-Bost et at., appears in [CBMT92].
It shows correctness of this termination detection algorithm for systems that communicate using
causally-ordered message-passing. The proofs there differ considerably from the invariant-based
analysis of the synchronous case in [DFvG83j. In fact, Charron-Bost et aL claim that correctness
proofs for all algorithms that use causally-ordered delivery "must consider the execution as a whole,
rather than concentrate on assertions that remain invariant in each global state" ([CBMT92], p.
34). The existence of our proof, which is an invariant-based analysis, refutes this claim.
5 Conclusions
We have presented a Hoare-style proof system for causally-ordered delivery. Through an example,
we have demonstrated the feasibility of our approach to reasoning about causally-ordered deliv-
ery. The example, a distributed termination detection algorithm, has been treated using other
approaches, so there is now an opportunity to compare those approaches with the one in this
paper.
The fact that a correctness proof for causally-ordered delivery can be based closely on the
analysis of a synchronous version is a significant benefit of the approach discussed in this paper.
We support a two-step approach to verifying algorithms that use asynchronous message-passing
[Gri90]:
1. Verify a synchronous version of the algorithm (presumably a simpler task).
2. Modify the algorithm and the proof to obtain a correctness proof for the asynchronous version
of the algorithm.
One benefit of this two-step approach is that it leads naturally to a focus on and accurate determi-
nation of the ordering requirements needed by the algorithm. An interesting question is the extent
to which this approach can be made formal and systematic.
17
References
[Apt86] Krzysztof R. Apt. Correctness proofs of distributed termination algorithms. A CM
Transactions on Programming Languages and Systems, pages 388--H405, 1986.
[Bir93] Kenneth P. Birman. The process group approach to reliable distributed computing.
Communications of the ACM, 36(12), December 1993.
(BSS91j Kenneth Birman, Andre' Schiper, and Pat Stephenson. Lightweight causal and atomic
group multicast. ACM Transactions on Computer Systems, 9(3):272--H314, 1991.
[CBMT92] Bernadette Charron-Bost, Priedemaun Mattern, and Gerard Tel. Synchronous and
asynchronous communication in distributed computations. Technical Report LITP 92-
77, Institut Blaise Pascal, University of Paris 7, 1992.
[CKA93] Tracy Camp, Phil Kearns, and Mohan Ahuja. Proof rules for flush channels. IEEE
Transactions on Software Engineering, 19(4) :366--H378, 1993,
[DFvG83j E. W. Dijkstra, W. H. J. Feijen, and A. J. M. van Gasteren. Derivation of a termina-
tion detection algorithm for distributed computations. Information Processing Letters,
16:217--H219, 1983.
[Fid88] C. Fidge. Timestamps in message-passing systems that preserve the partial ordering.
In Proceedings of the 11th Australian Computer Science Conference, pages 56--H66, 1988.
[Gri76j David Gries. An illustration of current ideas on the derivation of correctness proofs and
correct programs. IEEE Transactions on Software Engineering, 2, 1976.
[Gri9O] E. Pascal Gribomont. Prom synchronous to asynchronous communication. In C. Rat-
tray, editor, Specification and Verification of Concurrent Systems: Proceedings of a 1988
BCS-FACS Workshop, volume 1 of FACS Workshop Series. Springer Verlag, 1990.
[Hoa78] C. A. R. Hoare. Communicating sequential processes. Communications of the ACM,
21(8):666--H677, 1978.
[Lam78] Leslie Lamport. Time, clocks, and the ordering of events in a distributed system.
Communications of the ACM, 21(7):558--H564, 1978.
18
[Lam8O] Leslie Lamport. The `Hoare logic' of concurrent programs. Acta Informatica, 14:21--H37
1980.
[LG81] Gary Marc Levin and David Gries. A proof technique for communicating sequential
processes. Acta Informatica, 15:281--H302, 1981.
[Mat89] Friedemann Mattern. Time and global states in distributed systems. In Proceedings of
the International Workshop on Parallel and Distributed Algorithms, Amsterdam, 1989.
North--HHolland.
[OG76]
[RH9O]
[SS84j
[vR93j
Susan Owicki and David Gries. An axiomatic proof technique for parallel programs I.
Acta Informatica, 6:319--H340, 1976.
Michel Raynal and Jean-Michel Helary. Synchronization and Control of Distributed
Systems and Programs. Wiley, 1990.
Richard D. Schlichting and Fred B. Schneider. Using message passing for distributed
programming: proof rules and disciplines. A CM Transactions on Programming Lan-
guages and Systems, 6(3):402--H431, 1984.
Robbert van Renesse. Causal controversy at le Mont St.-Michel. Operating Systems
Review, 27(2):44--H53, 1993.
A Proof of Correctness
We show that the proof outlines in Figure 1 are valid. We discuss only the triples for non-composite
statements. It is easy to prove validity of the proof outlines in Figure 1 using these results and the
inference rules for sequential composition, iteration, and alternation. The triples for non-composite
statements that arise in the proofs for each process in isolation are listed in Figure 2. Proving
invariance of I is stralghtforward, so we omit those detalls. For brevity, we sometimes content
ourselves with giving an informal explanation for why a triple is valid; based on this, the reader
should have little difficulty constructing a formal proof.
19
For 0 < i < N:
T1:
T2:
T3:
T4:
T5:
T6:
T7:
T8:
T9:
T10:
Til
T12
fIA?hjAtp> i)h?ititI)
?IA9ij)bi			true?IAy?? Abi)
fIA9?1Ab?)send e?? tojfIA9ij)
?I A :gij) Si> ?I)
?I) receive y? ?IA(?i5Thk(yi) => K[q?			false])
A(istok(y?) ? tp = i A ?hj A te = tokval(y?)))
?I A tp = i A ?hi A tc = tokval(y?)) (hi := true			ti			tokval(?)? fi)
?I A K[q? := false]) Xj := Yi ?I A K[q?			false])
?I A K[q?			false]) Ri ?I)
?I A qi A hi) (send mktok(t? V bi) to i --H 1			hi			false			bi			false? fi)
?I A ?h0 A tp > 0 A (Vj: %t.ok = ?)) send mktok(false) to N --H 1 ?I A ?h0 A tp > 0)
?I A ?h0 A tp > 0) Inito fI)
?I A h0) (send mktok(false) to N --H 1 h0 := false b0 := false? ?I)
Figure 2: Triples for non-composite statements.
A.1 Proof for Process i > 0 in Isolation
T1: ?IA?hiAtp>i)INITi?)
Since i> 0, INITi is Ifliti. J is unaffected by execution of Ifliti because Ifliti neither sends nor
receives messages. To see that K is also unaffected, note that the only variables that appear in K
and can be assigned by Ifliti are those appearing in q?, and that K is independent of q? for j < tp.
The precondition of T1 implies i < tp, so K is not invalidated by INITi.
T2: ?IAgij)bi:=true?IA9ijAbJ
J is unaffected by execution of this statement. Variable bi occurs only positively in K, so setting
bi to true never falsifies K. Finally, bi does not appear in 9ij, so the assignment to bi does not
falsify Yij.
T3: f1Ag??Ab?)send e?? t0jfIA9i>)
We prove invariance of J as follows. Ji is preserved because bi holds. Jt0k is unaffected because
the message being sent is not a token message. Let m denote the element added to a? by executing
20
this statement. To show that Jbas is preserved, it suffices to show that lxi ? ts(m) and fllxi ? ts(m)
hold, since J6as is' then satisfied regardless of which conjunct applies to this message. By definition
of the send statement, ts(m) = inc(vti,i), so (by definition of ?) Vti ? ts(m). lirom I, we have
lxi ? Vti, so by transitivity of ?, lxi ? ls(rn). It follows from the definitions of lx? and nixi that
nixi ? lxi, so by transitivity of ?, nlxi ? ts(m). Thus, Jbas is preserved.
The proof that K is preserved is by case analysis on the disjunct of K that holds initially.
case K1: In this case, tp> i must also hold, since i > tp and K1 imply qi, contradicting 9ij in the
precondition of T3. Since i <ip and bj hold, K2 also holds, so see that case.
case K2: K2 is unaffected by execution of this statement, so K2 still holds after execution of this
statement.
case K3: K3 is unaffected by execution of this statement, so K3 still holds after execution of this
statement
T4: ?IA9ij1Sij?IY
J is unaffected by execution of Sij because Sij neither sends nor receives messages. The only
variables that appear in K and can be assigned by Sij are those appearing in q?. Since 9ij holds,
q? is false, so execution of Sij either truthifies q? or leaves it unchanged. Variable q? occurs only
positively in K, so truthifying q? never falsifies K.
T5: ?IJ receive ? fi A (?istok(ui) ? K[q? := false]) A (istok(?) ? tp = i A ?hi A te = tokval(?))J
Adding elements to Pi never falsifies J or K, and J and K do not depend on Ui or Vti, 50 J and
K are preserved by execution of this statement. We argued in Section 4 that the other conjuncts
in the postcondition hold after execution of this statement.
T6 : fi A tp = i A ?hi A tc = tokval(?)? (hi := trte ti := tokval(y?)) fI?
J is unaffected by execution of this statement because messages are neither sent nor received.
The proof that K is preserved is by case analysis on the disjunct of K that holds initially. Note
that the only variables or state functions appearing in K that are affected by execution of this
21
statement are tc and htp.
case K1: The first conjunct of K1 is unaffected by execution of this statement. We now consider
the second conjunct. If (Vk > i : Xi,k = ?), then, since tp = i appears in the precondition,
we can conclude that K1 holds after hj is set to true by this statement. If (Vk > i : ?i,k =
does not hold, then there exist k and m such that k > i and m E Xi,k I implies X,,i = ?, 50 it
must be that k > i and m E Xi,k. Erom the precondition of this triple, i = Ip, so i < tp < k.
Thus, by the third conjunct of J6as, lxi ? ts(m), so by J1, bj holds. Since tp = i and bj hold,
K2 must hold, so see that case.
case K2: K2 is unaffected by execution of this statement, so K2 still holds after execution of this
statement.
case K3: In this case, tc holds. Let m be the element of ?ok such that ts(m) = lxi+i. Execution
of this statement changes tc from tokval(?) to tokval(y?) v bj, 50 K3 is not falsified.
T7: ?I A K[q? := false]? Xt := ? (I A K[q? := false]?
J is unaffected by execution of this statement because messages are neither sent nor received.
Note that Xj can appear in K only in q?. Since K[q? := false] holds before execution, and since q?
occurs only positively in K, changing q? can't falsify K. Finally, K[qi false] is unaffected by
execution of this statement.
T8: ?I A Kki := false]? Hi ?I1
J is unaffected by execution of this statement because messages are neither sent nor received.
The only variables that appear in K and can be assigned by Hi are those appearing in q?. Since q?
occurs only positively in K, and since K holds even if q? doesn't (because K[qi := false] appears in
the precondition), execution of this statement cannot falsify K.
T9 : fi A qi A hj? (send mktok(ti V bi) to ? --H 1 hj := false bi := false? (I?
First, we show that execution of this statement changes tp from i to i --H 1. Since hi holds, we
conclude (using I) that tp = i. It follows from the definition of tp that (Vj # j + 1 : lrj ? lxi+i).
22
Since hi holds, I implies a1tok $ ? and a1tok = o1tok Let m be the element of a:ok with the
largest timestamp; thus, lxi+i = ts(m). Since altok = 0:0k, m ? p?, so (using I) is(m) ? Vii, i.e.,
lxi+i ? Vti. Thus, by transitivity of ?, (Vj $ i + 1 : lxi ? Vii). Since this statement does not
alfect lx? for j $ ?, after execution of this statement, (Vj ? ??, i + 1? : lx? ? Vii) holds. After
execution of this statement, lxi = inc(Vti, i). By definition of ?, Vii ? inc(vii, i), so by transitivity,
(Vj ? ??, i + ii: lxi ? vti) holds after execution. Since lxi+i ? Vti ? inc(vii, i), after execution,
lxi+i ? lxi holds. Thus, after execution, (Vj $ i : lxi ? lxi) holds, so by definition of ip, ip = i --H 1.
Ji is preserved because after execution of this statement, lxi is larger than the timestamps
of all messages previously sent by process j. To show that Jt0k is preserved, it suffices to show
lxi+1 ? inc(Vii, i), since lxi = inc(vii, i) after execution. Let m be the member of a1to+ok with the
largest timestamp (this is well-defined since hi and I imply that altok $ ? and that the timestamps
of messages in a,t0k are totally-ordered by ?); thus, lxi+i = ts(m). Since hi holds, we conclude
using I that a:ok = 01t?k, so m E Pi, which implies (using I) that is(m) ? Vii. By definition of ?,
Vii ? inc(Vti,i). Thus, lxi+i ? Vii ? inc(Vii,i).
Next we show that Jbas is preserved. Fix j, k, and m ? Xj,k (we have renamed the bound
variable i in (13) to j). We do a case analysis on the relative values of j, k, and ip.
case k < ip <j: Since Jbas holds, nlxi ? ts(m). ff ip $ k, then k < ip < j is preserved by
execution of this statement, so we must show nlxi ? is (m), which we already know to be
true. Suppose ip = k. After execution of this statement, ?(k < ip < j), so we must
show lx1 ? ts(m). We give a proof by contradiction: we suppose ?(lxi ? is(m)) and
show m E Ok, which contradicts the assumption m ? Xj,k I implies that the timestamps
generated by each process are totally ordered by ?, so is(m) ? lx1. Since ip = i, Jtok implies
lxi ? lxi?i?? ? lxi+i, 50 is(m) ? lxi+i. Let m' be the member of a:0k with the largest
timestamp (this is well-defined since hi and I imply that a1t,ok $ ? and that the timestamps
of messages in altok are totally-ordered by ?); thus, lxi+i = is(m'), so is(m) ? ts(rn'). Since
hi holds, we conclude (using I) altok --H 01tok, ?? (using I) m' E Pi, hence (again using I)
ts(m') ? Vii. Thus, is(rn) ? is(m') ? Vii, 50 (using I) m ? Pi. Since by assumption i =
m E 0k
case k <j and ?(k <?ip<j): SinceJ??? holds, lx? ? is(m). As in the previous case, preservation
of Jbaj is trivial if ip $ j. Suppose ip = j. After execution of this statement, k < tp <j, so
we must show that nlx1 ? is(m) then holds; this follows immediately from ?1 ? is(m) and
23
the fact that the value of fltXj after execution of this statement equals the value of ixj before
execution of this statement.
case j < tp < k: This case is analogous to the previous case.
case j < k and ?`(j < tp < k): This case is analogous to the first case.
Finally, we show that K is preserved by execution of this statement. Recall that execution of
this statement changes Ip from i to i --H 1. Note that execution of this statement leaves tc unchanged.
The proof that K is preserved is by case analysis on the disjunct of K that holds initially.
case K1: We distinguish two subcases.
case (Vk: Xi,k = ?): From the precondition of this triple, q? holds. Since execution of this
statement does not affect q? or Xi,k for all k, K1 continues to hold after execution of this
statement.
case (?k: Xi,k # ?): Since K1 and htp hold, (Vk > i : = ?) does too. This, together
with the assumption (?k; Xi,k # ?), implies there exists k such that k <i and Xi,k # ?
Let m be an element of Xt,k Since k < i and tp = i, J6as implies irt ? ts(m), from
which we conclude using J1 that bj holds. After execution of this statement, tc equals
tj V bj, 50 K3 then holds.
case K2: Since i = tp, K2 = (?k < i : bk) v bj. If the left disjunct holds, then K2 still holds
after execution of this statement. If the right disjunct holds before execution, then so does
K3 (because h1 holds and tp = i), so see that case.
case K3: tc is unchanged by execution of this statement, so K3 still holds after execution of this
statement.
A.2 Proof for Process 0 in Isolation
The verification of process i when i = 0 in isolation involves the following triples, in addition to
those discussed above.
T10: ?I A ?h0 A tp > 0 A (Vj %t.ok = ?)) send mktok(fatse) to N --H 1 fi A ?h0 A tp >
First, we show that after execution of this statement, tp = N --H 1. The precondition implies
(Vj : ?m E Uk a?ftk sender(m) = = ?; it follows from the definition of ixj that lx> = 6 for all
24
j. After execution of this statement, 1xo = inc(vto, 0). From the definition of ?, 0 ? inc(vt, 0) for
all vector times vt. From the definition of ?p, we conclude that after execution of this statement,
(Vj ? 0 : lxj ? ixo) holds, hence tp = N --H 1.
J1 is preserved because after execution of this statement, ixo is larger than the timestamps of
all messages previously sent by process 0. To show that Jt0k holds after execution of this statement,
we need to show that 0 ? 0 and 0 ? inc(vto, 0); both of these facts follow from the definition of
?. To see that J6a5 holds after execution of this statement, note that iXj = 0 and (by the same
reasoning) nlxj = 0 for j # 0. Thus, J6as holds trivially for j # 0. For j = 0, note that there is no
process k such that k <0, and recall that after execution of this statement, tp = N --H 1. Thus, the
only non-vacuous conjunct in Jbas is the bottom one. This conjunct holds because nixo = 0.
The conjunct tp > 0 in the postcondition holds after execution because tp then equals N --H 1,
as shown above. Finally, note that ?h0 is unaffected by execution of this statement.
Til : ?I A ?h0 A tp > 0? Inito ?I?
Validity of this triple follows by the same reasoning as for triple Ti.
T12 : ?I A h0) (send mktok(false) to N --H 1 h0 := false b0 := false') ?IY
J is preserved by the same reasoning as for triple T9. We now show that execution of this
statement truthifies K1. Since h0 holds, we conclude (using I) that tp = 0 holds before execution
of this statement, so ?hN--H1, because otherwise, I implies tp = N --H 1, which contradicts tp = 0.
By the same reasoning as for triple T9, after execution of this statement, tp = N --H 1. Thus, K1
holds vacuously after execution of this statement.
Finally, we discuss one proof obligation that arises when using the foregoing results to verify
the proof outlines given in Figure 1. When proving the second branch of RELAY0, the following
subgoal arises:
I A qo A h0 A ?(t0 v b0) ? Q
We assume the antecedent and prove the consequent. First, we show that K1 must hold, by showing
that K2 and K3 do not. Since h0 holds, we conclude (using I) that tp = 0. From tp = 0 and ?b0,
we conclude that K2 does not hold. From h0 and ?(t0 v b0), we conclude that K3 does not hold.
Thus, assuming the antecedent holds, K1 also holds. It is easy to show that K1 and the antecedent
25
together imply Q.
A.3 Interference Freedom
Most of the intefference freedom obligations can be discharged easily, using derived rules such as
Interference Ereedom for Synchronously Altered Assertions [LG8l]. One non-trivial triple that
arises in the proof of interference freedom is
false] A K A K[q?			false]) Ri			false])
where j # i. By the Assignment Axiom, validity of this triple follows from
K[q?			false] A K A K[q?			false] ? K[q?			false, q?			false]
We assume the antecedent and prove the consequent. If K2 holds, then K2[q? false, q? false]
holds, since q? and q> do not appear in K2. The same reasoning applies to K3. If neither K2 nor
K3 hold, then Ki[q? false] AK1 A K1[q1 false] must hold. We show by contradiction that this
implies i < tp. Suppose i > tp; then
K1 = q1 A (Vk: Xi,k =
A(Vi' > tp: ii # i ? q1, A (Vk: Xi'* =
A(htp? (Vk > tp: =
so Ki[q? false] = false A o+., so K1[q? false] does not hold, which contradicts the assumption
above. Thus, i < tp. Analogous reasoning shows that j < tp. Since i < ip and j < tp, K1 is
independent of q? and q? By assumption, K1 holds, so Ki[q2 false, q? false] also holds.
26
