CS514 Spring 2008

Assignment #2

Due: March 7, 2008

Objective

Throughout this course, you will learn several techniques related to data and service replication: virtual synchrony, state machines, agreement protocols, all of which rely on some form of multicast to enable a group of distributed software components to take actions in a consistent manner. To learn about these techniques in class, however, is not the same as being able to use them in practice to build real systems, and we find that many students who excelled in the course found themselves a little bit confused at first when faced with the task of building a distributed application that relies on some form of replication, not because they didn't understand the intricacies of the underlying protocols, but because they lacked the intuition and hands-on experience.

The purpose of this assignment is to help you build such intuition, and give you a practical context for what you are going to learn in the second half of the course. You will use the QuickSilver Live Objects framework we developed to implement a distributed component that has its "state" replicated across multiple machines, presents the component to users on any of these machines in the same, consistent manner, and allows any of these users to perform "actions" against the component.

You can see several examples of such components in the demo. Shared text notes and shared desktops are both examples of what we might call "shared documents". These "documents" display the same "content" to multiple users, and allow anyone to edit them (change the text of the note, paste new objects onto the desktop, move or resize objects on the desktop). Here the "state" that is replicated is the document itself (the text message, the list of objects on the desktop with their sizes, labels, and positions), and the "actions" that users initiate against the document are the edits. Similarly, you see different versions of "shared folders", with a regular filesystem interface, and with a 3-dimensional graphical interface. Here, the "state" of a folder is the list of objects it contains, and the "actions" taken against it are adding objects to the folder, removing them etc.

We will ask you to build a component similar to those above, a shared document with a visual user interface that multiple users can view or edit. While we could consider other types of components, focusing the attention on a shared document has the benefit that it does not require sophisticated GUI, and so you won't waste time implementing it. A shared document is a nice project to start with, in that in this project, we would like you to tackle situations where multiple users simultaneously initiate actions against the component, and with a shared document, it is really easy to arrange for such situations to occur and really easy to see whether the different replicas on the document go out of sync or have their state corrupted, which makes it convenient for debugging.

Internally, your component will use a totally ordered, reliable multicast protocol to coordinate its replicas that reside on different machines. You don't need to implement multicast, we will provide a simple multicast channel for you to use. Your focus in this assignment will be to understand how to use a reliable, totally ordered multicast primitive, not how to build one from scratch. Accordingly, you should invest all your energy and creativeness thinking of the ways you can leverage multicast in your application. It is essential to this assignment that you free your mind from thinking at the level of sockets and packets, hence you should not waste time implementing your own homebrew version of multicast. What matters in this assignment is a smart application logic that uses existing software, not low-level hacking skills.

In order to get you started and help you focus on the cool aspects of the assignment, we will provide a tutorial of how to implement a very simple version of such an object. Rather than simply copy and paste the tutorial with minimal changes, you should think of the ways in which you can extend and build upon what the tutorial describes.

Details

If you look at a shared text note object in the demo on the main live objects website, it is simplistic in several ways. When you edit the text note, the note is not "locked" in any way, and if multiple users try to simultaneously edit it, changes made by some of them will be overridden by changes made by others. For example, suppose the note contains text "Alice loves Bob.". Now, Alice opens the note, selects the word "loves", and types in "hates" instead. But supposed that at the exact same time, Bob opens the note, and appends the text ", and Bob loves Alice, too.". Now, the way this simple text note is implemented is that when the user stops typing, the entire updated text, as it is, is multicast to other users. When Alice and Bob make their update simultaneously, without being aware of each other's changes, one of their updates will be delivered sooner than the other, and will override the other. As a result, we will end up either with "Alice hates Bob.", or "Alice loves Bob, and Bob loves Alice.", instead of the expected "Alice hates Bob, and Bob loves Alice" if both updates were applied.

In reality, these sorts of misunderstandings can cause much trouble, hence we need a way to prevent them. To this purpose, your document should use messages that allow users to "lock" portions of the document, and prevent other users from editing those portions at the same time. For example, when Alice selects the word "loves", you could imagine everyone seeing the word "loves" being selected, and when they try to write text over the word, they would find that it cannot be modified. Only when the Alice stops editing the word and submits her changes would the word be "unlocked". At the same time, users could "lock" other portions of the document, thereby gaining, for a period of time, an exclusive right to modify those portions. You should consider the situation when users unexpectedly leave and never unlock a portion of the document they were editing. To deal with those situations, you might need to make your "locks" be some sort of leases that expire after a short period of time, perhaps half a minute, and have users that spend a really lot of time editing a given portion of the document perhaps "refresh" those leases periodically, and automatically lose the right to edit the given portion if they fail to refresh their lease in time. Implementing this sort of functionality might require you to send additional types of messages and have a bit more fancy logic to determine what the user is allowed to edit. Another idea would be to allow users to forcefully "unlock" portions locked by others. Any scheme that works in practice will do, although obviously, the more sophisticated it is, the better.

The simplest version of the assignment would do just that: implement a simple shared document of any kind, even with a single paragraph of unformatted plain text, and a simple practical locking scheme that allows users to edit different potions of the text and prevent users from overriding each other's updates.

There are many ways in which you could make this assignment more interesting: the text could be formatted into multiple paragraphs, offer a few fonts and colors, or even support images. When a section is locked by some user, it might be somehow grayed out, and if a mouse is hovering over that section, it might show the identity of the user editing it (presumably, the user would include his identity in the message used to lock the text). A perhaps slightly more challenging, but still not extremely difficult thing would be, to ensure that if somebody is reading page 10 of the document, while somebody else pastes or removes 5 pages of text at the beginning, the person reading won't see the text constantly jumping back and forth in front of her/his eyes, and can still focus on reading. You might need to give it a bit more of a thought as to how to intelligently synchronize the user's local actions (scrolling over the text) with changes performed in a distributed manner (edits made by other users). Another extension would be to allow the document to be a mash-up that contains other visual elements besides just text. For example, it could allow images, or even live objects to be pasted into it. Yet another interesting twist on this assignment would be to make it a distributed source code editor that allows multiple users to edit different sections of the same source file. You could even sugar it with syntax coloring or matching parentheses to make it look cool, though obviously, this should rather be the last thing you do, if you have spare time on your hands after solving all issues mentioned above. Finally, if you are some sort of an ultimate über hacker, and do everything we have just described in no time, we challenge you to make the document fault-tolerant, by ensuring that before all users close the document, a replica of it is created on some backup server. But to do this really well and efficiently would involve a substantial programming effort, and thus it is far beyond what this simple assignment expects from you, so don't reach too far ahead and invest your time wisely.

In general, we will appreciate any hard work put into the assignments, and we will value both neat visual design and interesting algorithmic aspects of the solution, though we'll value the latter more. In some sense, a shared document might be the easiest to build, since you would not need to spend a lot of time hacking the GUI, and you would have more time to focus on the internal logic.

What to hand in

You need to deliver three items:

A zipped folder containing a Visual Studio solution with a working code that produces a working DLL. We will not run every single submission, but we will do a random check that your code compiles, works, and does what you declared it does. If we find a problem, we will give you a chance to resolve it, but it it turns out to be more than a trivial typo or a trivial issue with installation or setup, it could become a problem. You are expected to verify your deliverables yourself, not just on the machine on which you developed it, but also on your mom's laptop that doesn't have Visual Studio and that you haven't spend two hours setting up (needless to say, your code shouldn't require complex setup and everything that needs to be done to set it up needs to be thoroughly documented).
A document in PDF that contains selected important pieces of the code, for easy reading, especially those that do something nontrivial, such as any sort of synchronization logic. The code should be reasonably commented (not every line and not every set/get method, but perhaps every major block of code that does something that isn't immediately obvious). There should be a few lines of explanation that helps the reader understand the structure of the document he's looking at (such which major pieces of code are included). It would be nice if the document has some kind of a structure with different sections rather than being a 20-page novel.
A document in HTML, or a set of documents in the form of a mini-website, that contains an article describing your system. Articles on websites such as CodeProject are a good model of what you should shoot for. It doesn't have to be long, and shouldn't be long. It should look reasonably structured, and it should contain the following key elements:
- A brief high-level description of what your component does, and what interfaces (graphical, or programmatic if it has any) it provides for a person who would like to just use it.
- A checklist of all cool features it has that required a significant programming effort, or anything you invested time in and want to brag about. Think of it as something that would determine your grade to a large extent. If you did something challenging and it isn't listed here, you may not be rewarded for it, because we may end up not reading your code, so make sure you list it!
- A separate section for each feature outlining your solution. If the feature is cool UI or a cute animation, you don't need to write anything more besides just listing it (perhaps including a screenshot, if adequate). On the other hand, if the feature is a lease-based locking mechanism, or some intelligent method of synchronizing actions performed against the object, you should write a few sentences about it: why it was nontrivial, what were the major challenges, what were the cool ideas you had, and, importantly, a short argument or a proof of why they work. For the latter, we are not looking for a formal mathematical proof, just a few sentences of intuition and handwaving, but you do need to convince us that you know what you're talking about.

We won't publish your work anywhere, but we are asking you to do your write up as if it were going to be posted on some developer forum, and we strongly encourage you to post your solution online. Besides having something concrete to point to from your resume, the prospect of your work being judged by others will give you a string additional motivation to do it really well, and your grade will probably reflect that.

How to turn in your assignment

Please don’t email your submission to us. Instead, upload your files, appropriately named, onto CMS. You need to do so irrespectively of whether you will post your work anywhere else.

How we'll grade the assignment

We will be grading using roughly the same criteria as were described for Assignment 1. As in that case, we may ask for demos of some assignments, picking victims at random and including additional requests in cases where something puzzled us when we reviewed the write-up and code. You will be graded on a curve. Note that this assignment doesn't have an optional part, and there is no particular requirement to earn an M.Eng. credit.

Hints

Within a week or so, we will update the documentation on the main live object's website with a tutorial detailing the process of implementing a simple shared document object. Beyond that, we may schedule additional lectures per request to explain the difficult parts of the assignment. If you find some part of the assignment particularly puzzling, you should let us know.