Experience with Grapevine:
The Growth of a Distributed System
Paper by: Michael Schroeder, Andrew Birrell, and Roger Needham.  Summary by: Jim Curcio
 

Grapevine is a distributed, replicated system used mainly to deliver mail.  Its services are divided among several computers on a network.  For reliability, some information is replicated, that is, copied to two or more places on the network.

Grapevine provides two cooperating types of service, each through its own set of servers.  The message service handles the delivery of messages from clients to individuals and distribution lists.  Delivery paths are replicated, which in the Grapevine context means that at least two message servers accept messages for a given user.  The registration service maintains information about the individuals and resources that make up the system.  It supplies the information on which the message service runs, such as resource location, access control, and authentication data.

The scaling design increases system capacity by adding more servers of fixed power rather than by replacing existing servers with more powerful ones.  Fixed-power servers make the capacity computation easy: given the total load one server can carry and the total load on the network, the number of servers needed follows from a simple division.  A growing community of users is accommodated by adding registries rather than by enlarging existing ones.  Since the number of copies of registration data is independent of the number of servers and users, the amount of data on one registration server is not tied to the size of the system.  Problems include sending messages to distribution lists and the obvious problem that the network becomes more sparsely connected as fixed-power nodes are added.
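The "simple division" in the scaling computation can be sketched as follows.  This is only an illustration: the function name `servers_needed` and the load figures in the usage line are hypothetical and do not come from the paper.  The quotient is rounded up on the assumption that a fraction of a server cannot be deployed.

```python
import math


def servers_needed(total_load: float, per_server_capacity: float) -> int:
    """Return how many fixed-power servers are needed to carry total_load.

    Both arguments must use the same unit (for example, messages per day).
    The names and units are assumptions made for this sketch.
    """
    if per_server_capacity <= 0:
        raise ValueError("per_server_capacity must be positive")
    if total_load < 0:
        raise ValueError("total_load must not be negative")
    # Round up: a partially loaded server is still a whole server.
    return math.ceil(total_load / per_server_capacity)


if __name__ == "__main__":
    # Hypothetical figures, chosen only to show the arithmetic.
    print(servers_needed(1000, 300))
```

The estimate covers raw capacity only; it says nothing about the distribution-list and connectivity problems mentioned in the text.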

On any given message server, a message is stored once and shared among all the recipients of the message.  In the authors' configuration, three inboxes are kept for each user.  The primary inbox is near the user.  To avoid overloading message servers while keeping inboxes close to registries, the secondary inbox is placed on an arbitrary nearby server and the tertiary inbox on an arbitrary distant one.  It is important that a primary inbox and its corresponding registry be on nearby servers because the two communicate often.
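The idea of storing a message once per server and sharing it among recipients can be sketched as below.  This is a minimal illustration, not the Grapevine implementation: the class `MessageServer`, its methods, and the use of in-memory dictionaries are all assumptions made for the example.

```python
class MessageServer:
    """Toy message server: one stored body per message, shared by inboxes."""

    def __init__(self):
        self.bodies = {}    # message id -> message body (stored once)
        self.inboxes = {}   # user name -> list of message ids
        self._next_id = 0

    def accept(self, body, recipients):
        """Store the body once and add a reference to each recipient's inbox."""
        msg_id = self._next_id
        self._next_id += 1
        self.bodies[msg_id] = body
        for user in recipients:
            self.inboxes.setdefault(user, []).append(msg_id)
        return msg_id

    def read_inbox(self, user):
        """Return the bodies referenced by the user's inbox, oldest first."""
        return [self.bodies[m] for m in self.inboxes.get(user, [])]


if __name__ == "__main__":
    server = MessageServer()
    server.accept("meeting at noon", ["alice", "bob"])
    # Two inboxes reference the message, but only one copy is stored.
    print(len(server.bodies), server.read_inbox("alice"), server.read_inbox("bob"))
```

Each inbox holds only references, so a message sent to many recipients on the same server costs one stored body plus one small entry per recipient.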

Transparency means that the distributed, replicated nature of the system is hidden from the users.  In general, transparency is achieved, but it breaks down in two cases: while the system is being made consistent after an update to the registration data, and when messages are duplicated because of the way the system is organized.

To cope with increased load, the authors had to change the heavily used mechanism for adding and deleting users so that it updates incrementally instead of copying entire lists of data.  The access control and authentication algorithms had to be changed as well.
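The difference between copying an entire list and updating it incrementally can be sketched as follows.  The function names, the use of a Python set for a replica, and the cost measure (number of entries that would have to be transmitted) are assumptions made for illustration; the summary does not describe the mechanism the authors actually used.

```python
def full_copy_update(replica: set, new_members: set) -> int:
    """Replace the replica's whole list; cost grows with the list size."""
    replica.clear()
    replica.update(new_members)
    return len(new_members)  # entries transmitted (assumed cost measure)


def incremental_update(replica: set, op: str, member: str) -> int:
    """Apply a single add or delete; cost is one entry regardless of list size."""
    if op == "add":
        replica.add(member)
    elif op == "delete":
        replica.discard(member)
    else:
        raise ValueError("op must be 'add' or 'delete'")
    return 1  # entries transmitted (assumed cost measure)


if __name__ == "__main__":
    replica = {"alice", "bob"}
    cost = incremental_update(replica, "add", "carol")
    print(sorted(replica), cost)
```

With a list of thousands of members, a single change costs one entry under the incremental scheme but the whole list under full copying, which is why the full-copy approach becomes expensive as load grows.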

A major contribution of this paper is that the administrators are treated as part of the system and their roles are outlined.  The set-up of the system is discussed, and the authors suggest various ways of optimizing it.  Assigning registries to organizational constituents rather than to geographical locations is one example of the usage guidelines that the authors provide.  Paying attention to how the users are organized, especially in this system, is just as important as paying attention to the advantages and disadvantages of the implemented system.