Design




Horus implements the group model discussed in the previous section in an extensively layered and highly reconfigurable manner. This design allows applications to pay only for those aspects of the group model they need. Much of the Horus design has been inspired by concepts from modern systems such as microkernel operating systems, the x-kernel, and object-oriented systems.

Microkernel operating systems support a limited number of basic primitives at the kernel level and provide more sophisticated services at higher levels, which allows flexibility. In Horus, a small number of basic primitives have been identified and are provided by its microkernel, called MUTS. We describe MUTS briefly in section 5.

Our overall approach resembles the x-kernel, which is a framework for implementing network protocols. In the x-kernel, each protocol implements a simple feature, and the protocols can be tied together in a graph to support the needs of the application. Horus adopts this idea, but its interface is more suitable to multicast protocols. Horus can be integrated into the x-kernel, if desired, in which case it is best viewed as an extension of the x-kernel specialized for process groups and group multicast. It can also be run inside of, or over, other operating systems.

In object-oriented systems, sophisticated objects are derived from basic objects. In Horus, a simple ``basic group'' can be extended with features such as message ordering or flow control. The basic group does not provide virtually synchronous views, or even reliable message passing. Instead, each basic group has an identifier and a current membership view. Each group member can have its own view of the group, and is responsible for maintaining and installing new membership views, for example when other members join, leave, or fail. In the basic group multicast protocol, messages are delivered on a best-effort basis. This type of basic group is supported transparently over a wide variety of networks, such as Ethernet and ATM, and is optimized for performance.
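The basic group described above can be pictured with a minimal sketch. All names here (`BasicGroupMember`, `install_view`, `best_effort_multicast`, the `registry` dictionary standing in for the network) are hypothetical and are not part of Horus; the sketch only illustrates per-member views and delivery without retransmission.

```python
import random
from dataclasses import dataclass, field


@dataclass
class BasicGroupMember:
    """One member's local state for a basic group (hypothetical names)."""
    name: str
    group_id: str
    view: list = field(default_factory=list)   # this member's own view
    inbox: list = field(default_factory=list)  # messages delivered so far

    def install_view(self, members):
        # Each member installs views itself; nothing forces agreement.
        self.view = sorted(members)


def best_effort_multicast(sender, registry, payload, loss_rate=0.0, rng=random):
    """Send to every other member of the sender's view; losses are not repaired."""
    delivered = []
    for name in sender.view:
        if name == sender.name:
            continue
        if rng.random() < loss_rate:
            continue  # message lost: the basic group does not retransmit
        registry[name].inbox.append((sender.name, payload))
        delivered.append(name)
    return delivered


# Three members of group "g1"; b has not yet learned that c joined.
registry = {n: BasicGroupMember(n, "g1") for n in ("a", "b", "c")}
registry["a"].install_view(["a", "b", "c"])
registry["b"].install_view(["a", "b"])
registry["c"].install_view(["a", "b", "c"])

from_a = best_effort_multicast(registry["a"], registry, "hello")
from_b = best_effort_multicast(registry["b"], registry, "hi")
```

Because views are local, the multicast from `b` reaches only `a`, while the one from `a` reaches both `b` and `c`. Closing such gaps is left to layers added on top of the basic group.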

Over the basic group there are some fifteen features that can be selectively added to change the semantics of group view reporting and group communication. Each feature is coded as a light-weight software layer that can be added dynamically (i.e., at run-time). Normally, one sets up a set of layers for a given group and then leaves them unchanged, and all the group members use the same layers. (To a limited degree, features can be selectively enabled or disabled on a per-multicast basis, but this requires special knowledge of how the layers work.) The intent is that, by stacking a particular set of layers, a group can be tailored to the needs of the application (see figure 1). This flexibility can then be hidden behind simple user-oriented interfaces that require no special knowledge of how Horus is really configured.

  
Figure 1: Layers can be stacked at run-time like Lego™ blocks.
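The stacking idea can be illustrated with a small sketch in which a stack is simply a list assembled at run time. The classes `Layer`, `Fragment`, and `Fifo` and the helpers `send` and `receive` are assumptions made for illustration; they are simplified stand-ins and not the Horus layers themselves.

```python
class Layer:
    """Hypothetical layer: transforms messages going down and coming up."""

    def down(self, msgs):  # toward the network
        return msgs

    def up(self, msgs):    # toward the application
        return msgs


class Fragment(Layer):
    """Splits payloads into fixed-size chunks and reassembles them."""

    def __init__(self, size):
        self.size = size
        self.partial = []

    def down(self, msgs):
        out = []
        for m in msgs:
            chunks = [m[i:i + self.size] for i in range(0, len(m), self.size)] or [b""]
            for i, chunk in enumerate(chunks):
                out.append((i == len(chunks) - 1, chunk))  # (is_last, chunk)
        return out

    def up(self, msgs):
        out = []
        for is_last, chunk in msgs:
            self.partial.append(chunk)
            if is_last:
                out.append(b"".join(self.partial))
                self.partial = []
        return out


class Fifo(Layer):
    """Numbers messages on the way down and restores that order on the way up."""

    def __init__(self):
        self.next_send = 0
        self.next_recv = 0
        self.pending = {}

    def down(self, msgs):
        out = []
        for m in msgs:
            out.append((self.next_send, m))
            self.next_send += 1
        return out

    def up(self, msgs):
        for seq, m in msgs:
            self.pending[seq] = m
        out = []
        while self.next_recv in self.pending:
            out.append(self.pending.pop(self.next_recv))
            self.next_recv += 1
        return out


def send(stack, payload):
    msgs = [payload]
    for layer in stack:            # top to bottom
        msgs = layer.down(msgs)
    return msgs


def receive(stack, packets):
    msgs = packets
    for layer in reversed(stack):  # bottom to top
        msgs = layer.up(msgs)
    return msgs


# The same configuration is built on both sides, as the text recommends.
sender = [Fragment(4), Fifo()]
receiver = [Fragment(4), Fifo()]
packets = send(sender, b"hello world")
restored = receive(receiver, list(reversed(packets)))  # network reorders packets
```

Here the `Fifo` layer sits below `Fragment`, so the chunks are put back in order before reassembly. Dropping `Fifo` from both lists would give a cheaper stack for an application that does not need ordering.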

At the time of this writing, the major Horus layers implement FIFO message passing, fragmentation/reassembly, virtually synchronous membership and communication, flow control, causal order, total order, and primary bit maintenance. All of these, and future layers, support the same interface, called the Uniform Group Interface (UGI). The UGI is well-defined: it consists of a set of downcalls and upcalls, and has some support for extension of the interface. The interface provides for, among other things, multicasting messages, installing views, and reporting error conditions. The UGI is designed for multiprocessing, and is completely asynchronous and reentrant. See tables 1 and 2 for a complete list of downcalls and upcalls. Because every layer presents the same UGI, users have total flexibility in stacking the layers.

  
Table 1: Horus downcalls

  
Table 2: Horus upcalls
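The role of a uniform interface can be sketched as follows. The call names used here (`cast`, `install_view`, `on_cast`, `on_view`, `on_error`) are invented for illustration and are not the entries of tables 1 and 2. The sketch is also synchronous, whereas the UGI is asynchronous and reentrant.

```python
class UgiLayer:
    """Hypothetical uniform interface: every layer offers the same calls."""

    def __init__(self):
        self.above = None
        self.below = None

    # Downcalls, invoked by the layer above. The default is to pass them on.
    def cast(self, msg):
        self.below.cast(msg)

    def install_view(self, members):
        self.below.install_view(members)

    # Upcalls, invoked by the layer below. The default is to pass them on.
    def on_cast(self, msg):
        self.above.on_cast(msg)

    def on_view(self, members):
        self.above.on_view(members)

    def on_error(self, reason):
        self.above.on_error(reason)


class Loopback(UgiLayer):
    """Bottom of the stack: answers each downcall with an upcall."""

    def __init__(self):
        super().__init__()
        self.members = ()

    def cast(self, msg):
        if not self.members:
            self.above.on_error("cast before any view was installed")
        else:
            self.above.on_cast(msg)

    def install_view(self, members):
        self.members = tuple(members)
        self.above.on_view(self.members)


class Tag(UgiLayer):
    """Overrides one downcall and one upcall; inherits the rest unchanged."""

    def __init__(self, tag):
        super().__init__()
        self.tag = tag

    def cast(self, msg):
        self.below.cast((self.tag, msg))   # add a header on the way down

    def on_cast(self, msg):
        _tag, inner = msg                  # strip it on the way up
        self.above.on_cast(inner)


class App(UgiLayer):
    """Top of the stack: records the upcalls it receives."""

    def __init__(self):
        super().__init__()
        self.events = []

    def on_cast(self, msg):
        self.events.append(("cast", msg))

    def on_view(self, members):
        self.events.append(("view", members))

    def on_error(self, reason):
        self.events.append(("error", reason))


def stack(*layers):
    """Link layers top to bottom and return the topmost one."""
    for upper, lower in zip(layers, layers[1:]):
        upper.below, lower.above = lower, upper
    return layers[0]


app = stack(App(), Tag("a"), Tag("b"), Loopback())
app.cast("early")
app.install_view(["p", "q"])
app.cast("hi")
```

Since every layer has the same set of calls, `stack` can link any sequence of layers without knowing what each one does, and a layer such as `Tag` only overrides the calls it cares about.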

The topmost layer will typically offer an application-dependent interface rather than the UGI. (The UGI is the most primitive interface to Horus, and is used primarily by protocol developers.) The usual topmost interface is the standard BSD socket interface, which is well known to most application programmers. To join a group, a user creates a socket in the Horus addressing domain, and binds it to a Horus group address. Messages can then be sent and received using the normal write and read system calls. It is possible to mix UNIX file descriptors, TCP socket descriptors, and Horus socket descriptors in a single BSD select call. We also provide a reliable multi-threaded UNIX system call library with this interface, allowing multiple reads to execute in parallel. The UNIX ioctl system call provides control over group properties, such as message ordering, and over whether and how events such as new group views are reported to the application program.
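The descriptor-mixing pattern described above can be sketched with standard calls only. Python has no Horus addressing domain, so `group_sock` below is a stand-in created with `socket.socketpair()`; a real Horus socket would instead be created in the Horus domain and bound to a group address. The sketch assumes a UNIX system, where `select` accepts pipes as well as sockets.

```python
import os
import select
import socket

# Stand-in for a Horus group socket (assumption: the real one would come
# from the Horus addressing domain, which is not available here).
group_sock, peer = socket.socketpair()

# An ordinary UNIX file descriptor to mix into the same select call.
pipe_r, pipe_w = os.pipe()

# Plain write system calls on both kinds of descriptor.
os.write(peer.fileno(), b"multicast payload")
os.write(pipe_w, b"local event")

# One select call waits on the socket and the pipe together.
ready, _, _ = select.select([group_sock, pipe_r], [], [], 1.0)

received = {}
for item in ready:
    if item is group_sock:
        # Plain read system call on the socket descriptor.
        received["group"] = os.read(group_sock.fileno(), 4096)
    else:
        received["pipe"] = os.read(item, 4096)

for s in (group_sock, peer):
    s.close()
for fd in (pipe_r, pipe_w):
    os.close(fd)
```

The application loop does not need to know which descriptor belongs to a group; it dispatches on whatever `select` reports as readable.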

Other topmost interfaces include the Transis interface and the (ORCA) Panda interface (allowing parallel ORCA programs to run over Horus). We are currently developing an interface to support the Message Passing Interface (MPI) standard, which has been developed as a follow-on to PVM for parallel computing, and a Tcl/Tk interface. Real-time and object-oriented language interfaces are planned for the future.






Robbert VanRenesse
Tue Nov 15 12:09:10 EST 1994