Throughout this course, you will learn several techniques related to data and service replication: virtual synchrony, state machines, agreement protocols, all of which rely on some form of multicast to enable a group of distributed software components to take actions in a consistent manner. To learn about these techniques in class, however, is not the same as being able to use them in practice to build real systems, and we find that many students who excelled in the course found themselves a little bit confused at first when faced with the task of building a distributed application that relies on some form of replication, not because they didn't understand the intricacies of the underlying protocols, but because they lacked the intuition and hands-on experience.
The purpose of this assignment is to help you build such intuition, and give you a practical context for what you are going to learn in the second half of the course. You will use the QuickSilver Live Objects framework we developed to implement a distributed component that has its "state" replicated across multiple machines, presents the component to users on any of these machines in the same, consistent manner, and allows any of these users to perform "actions" against the component.
You can see several examples of such components in the demo. Shared text notes and shared desktops are both examples of what we might call "shared documents". These "documents" display the same "content" to multiple users, and allow anyone to edit them (change the text of the note, paste new objects onto the desktop, move or resize objects on the desktop). Here the "state" that is replicated is the document itself (the text message, the list of objects on the desktop with their sizes, labels, and positions), and the "actions" that users initiate against the document are the edits. Similarly, you see different versions of "shared folders", with a regular filesystem interface, and with a 3-dimensional graphical interface. Here, the "state" of a folder is the list of objects it contains, and the "actions" taken against it are adding objects to the folder, removing them etc.
We will ask you to build a component similar to those above, a shared document with a visual user interface that multiple users can view or edit. While we could consider other types of components, focusing the attention on a shared document has the benefit that it does not require sophisticated GUI, and so you won't waste time implementing it. A shared document is a nice project to start with, in that in this project, we would like you to tackle situations where multiple users simultaneously initiate actions against the component, and with a shared document, it is really easy to arrange for such situations to occur and really easy to see whether the different replicas on the document go out of sync or have their state corrupted, which makes it convenient for debugging.
Internally, your component will use a totally ordered, reliable multicast protocol to coordinate its replicas that reside on different machines. You don't need to implement multicast, we will provide a simple multicast channel for you to use. Your focus in this assignment will be to understand how to use a reliable, totally ordered multicast primitive, not how to build one from scratch. Accordingly, you should invest all your energy and creativeness thinking of the ways you can leverage multicast in your application. It is essential to this assignment that you free your mind from thinking at the level of sockets and packets, hence you should not waste time implementing your own homebrew version of multicast. What matters in this assignment is a smart application logic that uses existing software, not low-level hacking skills.
In order to get you started and help you focus on the cool aspects of the assignment, we will provide a tutorial of how to implement a very simple version of such an object. Rather than simply copy and paste the tutorial with minimal changes, you should think of the ways in which you can extend and build upon what the tutorial describes.
If you look at a shared text note object in the demo on the
main live objects website, it is simplistic in several
ways. When you edit the text note, the note is not "locked" in any way, and if
multiple users try to simultaneously edit it, changes made by some of them
will be overridden by changes made by others. For example, suppose the note
contains text "Alice loves Bob.". Now, Alice opens the note, selects
the word "loves", and types in "hates" instead. But supposed that at the
exact same time, Bob opens the note, and appends the text ", and Bob
loves Alice, too.". Now, the way this simple text note is implemented is
that when the user stops typing, the entire updated text, as it is, is multicast to
other users. When Alice and Bob make their update simultaneously, without
being aware of each other's changes, one of their updates will be delivered
sooner than the other, and will override the other. As a result, we will end
up either with "Alice hates Bob.", or "Alice loves Bob, and
Bob loves Alice.", instead of the expected "Alice hates Bob,
and Bob loves Alice" if both updates were applied.
In reality, these sorts of misunderstandings can cause much trouble, hence
we need a way to prevent them. To this purpose, your document should use
messages that allow users to "lock" portions of the document, and prevent
other users from editing those portions at the same time. For example, when
Alice selects the word "loves", you could imagine everyone seeing the word
"loves" being selected, and when they try to write text over the word, they
would find that it cannot be modified. Only when the Alice stops editing the
word and submits her changes would the word be "unlocked". At the same time,
users could "lock" other portions of the document, thereby gaining, for a
period of time, an exclusive right to modify those portions. You should
consider the situation when users unexpectedly leave and never unlock a
portion of the document they were editing. To deal with those situations,
you might need to make your "locks" be some sort of leases that expire after
a short period of time, perhaps half a minute, and have users that spend a
really lot of time editing a given portion of the document perhaps "refresh"
those leases periodically, and automatically lose the right to edit the
given portion if they fail to refresh their lease in time. Implementing this
sort of functionality might require you to send additional types of messages
and have a bit more fancy logic to determine what the user is allowed to
edit. Another idea would be to allow users to forcefully "unlock" portions
locked by others. Any scheme that works in practice will do, although
obviously, the more sophisticated it is, the better.
The simplest version of the assignment would do just that: implement a
simple shared document of any kind, even with a single paragraph of
unformatted plain text, and a simple practical locking scheme that allows
users to edit different potions of the text and prevent users from
overriding each other's updates.
There are many ways in which you could make this assignment more interesting: the text could be formatted
into multiple paragraphs, offer a few fonts and colors, or even support
images. When a section is locked by some user, it might be somehow grayed
out, and if a mouse is hovering over that section, it might show the
identity of the user editing it (presumably, the user would include his
identity in the message used to lock the text).
A perhaps slightly more challenging, but still not extremely difficult thing
would be, to ensure that if somebody is reading page 10 of the document,
while somebody else pastes or removes 5 pages of text at the beginning, the
person reading won't see the text constantly jumping back and forth in front
of her/his eyes, and can still focus on reading. You might need to give it a
bit more of a thought as to how to intelligently synchronize the user's
local actions (scrolling over the text) with changes performed in a
distributed manner (edits made by other users).
Another extension would be to allow the
document to be a mash-up that contains other visual elements besides just
text. For example, it could allow images, or even live objects to be pasted
into it.
Yet another interesting twist on this assignment would be to make it a
distributed source code editor that allows multiple users to edit different
sections of the same source file. You could even sugar it with syntax
coloring or matching parentheses to make it look cool, though obviously,
this should rather be the last thing you do, if you have spare time on your
hands after solving all issues mentioned above. Finally, if you are some
sort of an ultimate über hacker, and do everything we have just described in no
time, we challenge you to make the document fault-tolerant, by ensuring that
before all users close the document, a replica of it is created on some backup
server. But to do this really well and efficiently would involve a substantial
programming effort, and thus it is far beyond what this simple assignment
expects from you, so don't reach too far ahead and invest your time wisely.
In general, we will appreciate any hard work put into the assignments, and we will value both neat visual design and interesting algorithmic aspects of the solution, though we'll value the latter more. In some sense, a shared document might be the easiest to build, since you would not need to spend a lot of time hacking the GUI, and you would have more time to focus on the internal logic.
You need to deliver three items:
We won't publish your work anywhere, but we are asking you to do your write up as if it were going to be posted on some developer forum, and we strongly encourage you to post your solution online. Besides having something concrete to point to from your resume, the prospect of your work being judged by others will give you a string additional motivation to do it really well, and your grade will probably reflect that.
Please don’t email your submission to us. Instead, upload your files, appropriately named, onto CMS. You need to do so irrespectively of whether you will post your work anywhere else.
We will be grading using roughly the same criteria as were described for Assignment 1. As in that case, we may ask for demos of some assignments, picking victims at random and including additional requests in cases where something puzzled us when we reviewed the write-up and code. You will be graded on a curve. Note that this assignment doesn't have an optional part, and there is no particular requirement to earn an M.Eng. credit.
Within a week or so, we will update the documentation on the main live object's website with a tutorial detailing the process of implementing a simple shared document object. Beyond that, we may schedule additional lectures per request to explain the difficult parts of the assignment. If you find some part of the assignment particularly puzzling, you should let us know.