CS5414 (Fall 2012) Homework 3
More on Causality and Clock Implementations

Due: 11pm, 9/25/2012
Relative weight: 12 units

General Instructions. You are expected to work alone on this assignment.

Submit your solution using CMS. No late assignments will be accepted.

To facilitate grading, format your solutions as follows.

Solutions that do not satisfy the formatting guidelines will be returned, ungraded.


Problem 1

We are given a middleware system MS that implements vector clocks. The middleware implementation has some predefined notion of what are events, which include all sends, receives, and local events.. Assuming no more than N distinct hosts are running, the middleware call
    GET_VecC
made by processor Pi returns in the elements of an array
    V[1:N] : integer
a vector clock value < v1 , v2 , v3 , ..., vN > corresponding to the "time" the last event at processor Pi occurred according to the vector clock being implemented by MS.

Given is a software package SP that requires logical clock values and, therefore, SP cannot directly be run above MS (which "only" provides vector clock values). In particular, SP requires the existence of a routine

    GET_LogC
which returns an integer corresponding to the "time" the last event at that processor occurred according to a logical clock. The events that affect GET_LogC are the same as the events that affect MS,

  1. Give an implementation of GET_LogC that does not involve sending additional messages, modifying messages being sent, or recording local information. Your implementation, however, may invoke GET_VecC.

  2. Argue that your implementation of logical clocks satisfies the requirement of a logical clock:
      E --> F implies LC(E) < LC(F)


Problem 2

Recall that a vector clock VC maps each event E to a timestamp VC(E) and satisfies the following strong clock condition:
    E --> F if and only if VC(E) < VC(F)
Vector clocks are not the only method for implementing a mechanism that satisfies the strong clock condition, though.

A prime clock PC is constructed as follows. Assume that every processor i is assigned and knows a unique prime number Ni. Assume also that each processor has a way to store integers that can grow arbitrarily large.

    1. Each processor i maintains a counter PCi, which is initially set to Ni.

    2. Whenever a local event E occurs at processor i, the following is executed:
        PCi := PCi * Ni
      The timestamp PC(E) of E is the value of PCi after the update.

    3. Whenever a send event E for a message m occurs at processor i, the following is executed:
        PCi := PCi * Ni;
        timestamp TS(m) = PCi is included as part of m
      The timestamp PC(E) of E is the value of PCi after the update.

    4. Whenever a receive event for a message m occurs at processor i, the following is executed
        PCi := LCM ( PCi , TS(m)) * Ni
      where LCM(x,y) is the "least common multiple" of integers x and y. (Here is a formal definition for "least common multiple".) The timestamp PC(E) of E is the value of PCi after the update.

Here is a useful fact: The "Fundamental Theorem of Arithmetic" asserts that every positive integer greater than 1 has a unique factorization in terms of prime numbers. That is, for every number Val, there exist exponents e0, e1, e2, ... such that

    Val = (N0**e0) * (N1**e1) * (N2**e2) *...
where N0, N1, ... Ni is the list of prime numbers.

  1. Give an implementation for a relation "<" on values from prime clocks such that values obtained from prime clocks satisfy the strong clock condition. That is, give a procedure for computing whether PC(E) < PC(F) holds, such that
      E --> F if and only if PC(E) < PC(F)
    is valid.

  2. Prove that prime clocks satisfy the following condition:
      PC(E) < PC(F) implies E --> F .
    where "<" is as defined above in part i.


Problem 3

A cloud-based file system service CBFS is said to support disconnected operation if it has provisions that allow a client to:
  1. using the Internet, connect to CBFS using some computing device D (say),
  2. check-out and download a file (say) F to device D,
  3. disconnect from the Internet (including from CBFS),
  4. read and edit F on device D,
  5. later reconnect device D to CBFS,
  6. attempt to check-in the modified file F.
The "check-out" in step (2) never blocks, so it is possible that multiple users concurrently perform a check-out on the same given file F. And any device that has a copy of F is permitted to perform edit operations on F.

The check-in operation either returns a

    "file F stored"
message or a
    "file F not stored, reconciliation required with users ui, uj, uk, ..."
message. This latter message is produced only if the copy of file F stored in the cloud has changed (due to an earlier check-in); ui, uj, uk, is the list of users to contact because they made edits and did a check-in to cause the copy of F currently stored in the cloud to differ from the (edited) version of F associated with this new check-in.

Assume a fixed set of users, where each user owns and runs a fixed set of devices that can connect to CBFS to download, edit, and later check-in an edited file. Users do not share devices. The real-time clocks on these various mobile devices are not synchronized, and it is unreasonable to imagine that they could be. Describe a scheme that CBFS and clients can use to implement check-out and check-in.