Fall 2009: CS5410 Fault-tolerant Distributed Computer Systems

Phase IV: Availability Through Active Replication

Due: 11:59pm Monday, 11/23/2009

General Instructions. Students are required to work together in teams. You may work in a team you used for an earlier phase or you may form a new team. An assignment submitted on behalf of a "team" having fewer than 2 or more than 5 students will receive a grade of F. All members of the team are responsible for understanding the entire assignment.

This assignment expects you to extend the code for some Phase I implementation. Feel free to use code or ideas from any team's solution to any prior phases in designing those extensions. However, submissions that contain extraneous code (such as code needed for implementing earlier phases but not needed in this phase) will be penalized.

No late assignments will be accepted.

Academic Integrity. Collaboration between groups is prohibited and will be treated as a violation of the University's academic integrity code.

Background: Enhanced Availability and Fault-tolerance

In the distributed banking system you built for Phase III, bank accounts (and the funds they store) are unavailable from the instant that the primary server fails until clients of the service have been informed of the new primary's identity. Moreover, the primary/backup approach cannot tolerate Byzantine failures. Active replication (also known as the "state machine approach"), though a bit more expensive, does not exhibit these limitations.

In this phase of our CS514 project, you will employ active-replication, replacing each branch server by a service with equivalent semantics but exhibiting higher availability. This phase will thus give you an opportunity to get hands-on experience designing a service that implements active replication.

What to Build

For this phase, you are given considerable latitude concerning assumptions you make about the environment and protocols you employ in implementing your system. At a minimum, however, assume a computing environment in which:

Each branch GUI runs on its own host, and this host never fails. A branch GUI:
- may send messages to any individual host or any subset of the system hosts based on user input to the GUI,
- may receive messages from any system host, and
- may embody a voter to accept or reject messages containing information for display by the GUI.
The branch GUI otherwise has extremely limited processing capacity and may not otherwise be programmed as if it were a general-purpose host.
Other software components (servers etc.) run on hosts that can fail. There is one of these hosts per branch. Associated with each of these hosts is a GUI that never fails and that can be used to cause the associated host to simulate unusual behavior (such as failure or restart)---all programs being executed on the associated host are then affected.
Every host appears to be directly connected to every other host. Thus, a correct host that receives a message can determine with certainty which host (correct or faulty) sent that message.

As in previous phases, feel free to employ a single real computer in order to simulate all of the hosts and the network being used by your system.

In addition to the above assumptions, define the other aspects of your computing environment by making choices for the following computing environment characteristics.

Failure model: Choose between Byzantine, send-omissions, crash, and fail-stop.
Fault-tolerance degree: Define the worst case number of host failures that may occur before the functionality of any branch could suffer.
Synchronicity assumption: Choose between asynchronous, synchronous, and synchronous with clocks that are being kept synchronized (so you don't have to bother).
Replica management protocols: Choose among any of the protocols discussed in class or that you find in the literature. Feel free to modify any of these existing protocols to your particular needs, but do not design an agreement protocol or a replica-mangement protocol from scratch.
Recovery/Restart functionality: Implement the functionality to reintegrate failed hosts that have been repaired or elect not to support this functionality.

Comments about State Machine Protocols. A branch server can be replicated by running a replica of it on hosts at other branches. There are two ways to support a replica of branch server for branch B1 (say) on the host for branch B2 (say):

Modify the branch server at branch B2. The modified branch server not only processes the operations (commands) that it used to (e.g. Deposit, Withdraw, Query, Transfer for that branch's accounts) but this modified branch server also handles these operations for branch B1. Effectively, we are merging new branch server replicas with the branch server that already exists at each host.
Instantiate at branch B2 a copy of the branch server for branch B1, and associate this copy with a new socket at the host running branch server B2. Note that you may have to modify your "network wrapper class" to accommodate any additional sockets now in use.

As far as the state machine approach is concerned, each branch GUI should be viewed as a client. This means that the branch GUI will now be receiving multiple responses for each operation it initiates and, therefore, it must convert those responses into a single one (which is then presented to the human user). Because the mechanism for combining responses resides in the branch GUI, it never experiences failures.

Recall that in processing a Transfer operation, one branch server invokes an operation at another. Think carefully about how best to handle this once branch servers are replicated. Although there is only one level of nested call by servers in the system, try to devise a solution that scales up, gracefully supporting multiple levels of nested calls. The naive solution---having each server replica make a separate request to each other server replica---is workable, but more-scalable solutions will receive higher grades.

Submission Procedure. All submissions should be made through CMS. CMS provides a way for you to define your group. Be advised that each group member must take an action in creating a group, and your group cannot submit anything through CMS until the group has been created.

Submit the following files:

TEAM a .txt file that contains the names (and net-ids) for all team members. Also, for each team member give a 1 or 2 paragraph description of the tasks this team member performed and the number of hours this required.

README a .txt file that contains

Instructions for installing, compiling, and running your software on our Windows system.
A tutorial that the grader can follow to start your software and to convince himself that your system implements the required functionality. Expect the grader to spend at most 10 minutes on this task.

LOGIC a .txt file that contains

A description of the computing environment characteristics being assumed. If your choices involve assumptions that must be simulated (e.g., you assume clocks are being kept synchronized or you assume a failure-detection service), explain how this simulation is implemented.
A description of the replica management protocols (including agreement protocols) you implemented. Include citations (or URL's) to publications where you found these protocols.
A description of that protocol, if you elected to have your system support host recovery/restart functionality. Include citations (or URL's) to publications that contain these protocols.

TestPlan a .txt file that describes the process and any tools (i.e. additional programs) you wrote in order to test your system. This file should also explain what tests you ran and why this was a reasonable set of tests to have run.

SourceCode A zip file containing the sources needed to compile and test your system.

Grading. Your grade will be based on the above documentation and the following elements:

Does your system run.
How hostile is the computing environment you target? More hostile environments lead to better grades only if the system runs. Thus, you are strongly encouraged to choose a computing environment initially for which the chances of your getting the system to run are reasonably good; then, if you have more time, consider extending your system with additional functionality (e.g., add recovery/restart functionality) or to run in a more hostile environment.
How easy is it to follow the README file installation and sample-execution script.
How thorough was your testing procedure and how creative were you in building sufficient scaffolding (computing environment simulation, test drivers, etc.) to test your system.
Is the source code easy to understand and does it exhibit good structure?