Time, Clocks, and the Ordering of Events in a Distributed System
Notes by Ralph Benzinger. Previous version by Jia Wang.
Basics
- A process is an ordered sequence of events
- A distributed system is a collection of processes that communicate by
exchanging messages, where message transmission delay is not negligible
compared to the time between events in a single process
- Goal: impose "x happens before y" order on events in the system
Partial Ordering of Events
- An event a happens before an event b (written "a→b") if
- a and b are events in the same process and a occurs before b, or
- a is the event of sending a message m and b is the event of receiving this message m, or
- there exists an event c such that a→c and c→b
- Intuition for a→b: event a may causally affect event b
- Two distinct events a and b are concurrent iff neither a→b nor b→a holds
- Relation → is the smallest relation satisfying the three conditions above; it is an
irreflexive partial ordering on the set of all events in the system
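
A minimal Python sketch of the definition above, assuming a small finite set of events. The names happened_before and concurrent, the event labels, and the two example messages are made up for illustration; the relation is built from the three rules and closed under transitivity.

```python
from itertools import product

def happened_before(process_events, messages):
    """Return the set of pairs (a, b) such that a happens before b.

    process_events: dict mapping a process name to its events in local order
    messages: list of (send_event, receive_event) pairs
    """
    events = [e for evs in process_events.values() for e in evs]
    hb = set()
    # Rule 1: events of one process are ordered as they occur
    # (consecutive pairs suffice, the closure below adds the rest).
    for evs in process_events.values():
        for a, b in zip(evs, evs[1:]):
            hb.add((a, b))
    # Rule 2: sending a message happens before receiving it.
    hb.update(messages)
    # Rule 3: transitive closure (Warshall style, k is the outermost loop).
    for k, i, j in product(events, repeat=3):
        if (i, k) in hb and (k, j) in hb:
            hb.add((i, j))
    return hb

def concurrent(a, b, hb):
    """Two distinct events are concurrent iff neither happens before the other."""
    return a != b and (a, b) not in hb and (b, a) not in hb

# Hypothetical run: two processes, p1 sends to q2, q3 sends to p3.
procs = {"P": ["p1", "p2", "p3"], "Q": ["q1", "q2", "q3"]}
msgs = [("p1", "q2"), ("q3", "p3")]
hb = happened_before(procs, msgs)
```

In this run p1 happens before q3 (via the first message), while p2 and q2 are concurrent because no chain of local steps and messages connects them.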
Clocks
- A logical clock C is a mapping from events to integers
- Clock condition: if a→b, then C(a)<C(b) for all events a, b
- Implementation of clocks that satisfies clock condition:
- Each process P increments its own clock C between any two successive events
- Any message sent contains a time stamp of this event, and any message received sets the
local clock to a value greater than or equal to its present value and greater than the
time stamp in the message
- Total ordering of processes («) induces a total ordering of events: "a⇒b" iff
- C(a)<C(b), or
- C(a)=C(b) and P(a)«P(b)
- a→b implies a⇒b
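
The two implementation rules and the tie-breaking total order can be sketched in Python as follows. This is only an illustration: the class name LamportClock, the integer process ids, and the choice of comparing ids numerically as the ordering of processes are assumptions, not part of the notes.

```python
class LamportClock:
    """Logical clock of one process (sketch)."""

    def __init__(self, pid):
        self.pid = pid      # assumed: integer id, compared numerically
        self.time = 0

    def tick(self):
        """Local event: increment the clock, return the event's time stamp."""
        self.time += 1
        return self.time

    def send(self):
        """Send event: tick, the returned value travels with the message."""
        return self.tick()

    def receive(self, msg_time):
        """Receive event: move past both the local value and the message stamp."""
        self.time = max(self.time, msg_time) + 1
        return self.time

def total_order_key(timestamp, pid):
    """Sort key for the total order: clock value first, process id breaks ties."""
    return (timestamp, pid)

# Hypothetical run with two processes.
p, q = LamportClock(1), LamportClock(2)
a = (p.tick(), p.pid)         # local event in p
m = p.send()                  # p sends a message carrying time stamp m
c = (q.tick(), q.pid)         # local event in q, concurrent with a
r = (q.receive(m), q.pid)     # q receives the message
ordered = sorted([r, c, (m, p.pid), a], key=lambda e: total_order_key(*e))
```

Events a and c carry the same clock value, so only the process ordering decides which comes first; the receive event always sorts after the send event, as the clock condition requires.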
Example: Resource Allocation
- Processes compete for a single resource that
- cannot be shared,
- must be granted in first-come, first-served order, and
- is eventually granted to every requester, provided that every process eventually
releases the resource
- Centralized scheduling fails in the absence of global clocks because of the second
condition (first-come, first-served order)
- Solution: distributed algorithm to establish total ordering of events
- Processes negotiate resource allocation by sending time stamped request/release messages
to each other
- Time stamps give every process the same total ordering of requests, which synchronizes
access to the resource
- Punchline: Don't do anything at global time t unless all other processes have advanced
to times t'>t
- Causal relationships outside the system might result in "anomalous"
behavior
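
A rough Python sketch of the request/release negotiation, under loud assumptions: messages are delivered instantly and in order (modeled as direct method calls), every request is acknowledged (acknowledgements appear in the original paper but not in these notes), and the names Process, may_enter, and network are invented here. A process may use the resource once its own request is first in its queue and it has heard from every other process with a later time stamp.

```python
class Process:
    def __init__(self, pid, network):
        self.pid = pid
        self.network = network    # assumed: shared dict pid -> Process
        self.clock = 0
        self.queue = set()        # pending requests as (time stamp, pid)
        self.last_seen = {}       # latest time stamp received from each peer
        self.my_request = None

    def _peers(self):
        return [p for pid, p in self.network.items() if pid != self.pid]

    def on_message(self, kind, ts, sender, request):
        # Logical clock rule: move past the time stamp of the message.
        self.clock = max(self.clock, ts) + 1
        self.last_seen[sender] = max(self.last_seen.get(sender, 0), ts)
        if kind == "request":
            self.queue.add(request)
            self.clock += 1       # sending the acknowledgement is an event
            self.network[sender].on_message("ack", self.clock, self.pid, None)
        elif kind == "release":
            self.queue.discard(request)

    def request(self):
        self.clock += 1
        self.my_request = (self.clock, self.pid)
        self.queue.add(self.my_request)
        for peer in self._peers():
            peer.on_message("request", self.my_request[0], self.pid, self.my_request)

    def release(self):
        req, self.my_request = self.my_request, None
        self.queue.discard(req)
        self.clock += 1
        ts = self.clock
        for peer in self._peers():
            peer.on_message("release", ts, self.pid, req)

    def may_enter(self):
        if self.my_request is None:
            return False
        # (time stamp, pid) tuples compare exactly like the total order of events.
        first = min(self.queue) == self.my_request
        heard = all(self.last_seen.get(p.pid, 0) > self.my_request[0]
                    for p in self._peers())
        return first and heard

# Hypothetical scenario: p1 requests first, then p2.
network = {}
for pid in (1, 2, 3):
    network[pid] = Process(pid, network)
p1, p2, p3 = network[1], network[2], network[3]
p1.request()
p2.request()
before_release = (p1.may_enter(), p2.may_enter())
p1.release()
after_release = (p1.may_enter(), p2.may_enter())
```

The "heard from everyone" test in may_enter is the punchline in code: a process acts on its request only after all others have advanced past the request's time stamp, so no earlier request can still be in flight.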
Physical Clocks
- Newtonian time is for sissies, real men use relativistic space-time
- Physical clocks run at slightly different rates, |dC(t)/dt - 1| < k with k much smaller
than 1, and need frequent re-synchronization
- Anomalous behavior is impossible if e/(1-k) is less than the shortest transmission time
for interprocess messages, where e bounds the difference between the readings of any two
clocks
- Adapted distributed algorithm synchronizes physical clocks (supposedly used by xntp
daemon)
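
The bound and the receive rule of the adapted algorithm can be checked numerically with a small Python sketch. The function names and the sample numbers are made up; the receive rule (set the local clock to the larger of its current value and the message time stamp plus the known minimum delay) follows the original paper, not any particular daemon.

```python
def anomaly_free(e, k, shortest_delay):
    """True if e/(1-k) is less than the shortest message transmission time.

    e: bound on the difference between any two clock readings (seconds)
    k: bound on the clock rate error, |dC/dt - 1| < k
    shortest_delay: shortest interprocess transmission time (seconds)
    """
    return e / (1 - k) < shortest_delay

def on_receive(local_clock, msg_timestamp, min_delay):
    """Physical-clock receive rule: never set the clock backwards, and never
    leave it behind the send time plus the minimum possible delay."""
    return max(local_clock, msg_timestamp + min_delay)

# Hypothetical numbers: clocks within 1 ms, drift bound 1e-6, 5 ms minimum delay.
ok = anomaly_free(e=0.001, k=1e-6, shortest_delay=0.005)
# Clocks only within 10 ms of each other: the bound is violated.
too_loose = anomaly_free(e=0.010, k=1e-6, shortest_delay=0.005)
# Receiver's clock reads 10.0 when a message stamped 10.5 arrives (min delay 0.25).
adjusted = on_receive(local_clock=10.0, msg_timestamp=10.5, min_delay=0.25)
```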
Brain Teasers
- Why does the centralized scheduling algorithm fail in the resource allocation example?
- How efficient is the distributed synchronization algorithm?
- What happens if message delivery is unreliable?