Introduction




In the last several years, we have seen a growing use of group communication primitives in distributed and/or parallel applications. Physically distributed systems (such as stock markets and factories) employ group communications to disseminate information to large numbers of clients. Parallel systems use group communications to allocate jobs among slave worker processes and to exchange intermediate results. Fault-tolerant systems use group communications to propagate updates to replicas and to collect vote quorums for distributed decision making. Standards for group communication are under study in the X/Open and IEEE communities, and a group-oriented parallel communication standard (MPI) was introduced in 1993.

Not surprisingly, many distributed operating systems have offered group communication mechanisms in addition to the more conventional RPC and message-stream mechanisms [13][6][4]. Unfortunately, with the exception of Amoeba [6], these systems provide limited support for reliability and consistency, which are crucial to long-running or life-critical applications. (Amoeba has a different disadvantage: it is not portable to other vendor-developed systems.) At the same time, however, these distributed operating systems have demonstrated the benefits of microkernel software architectures. The principle of a microkernel architecture is to provide a small number of basic primitives at a low (kernel) level and to leave sophisticated services to higher levels (in user space). The microkernel approach results in flexible and extensible system functionality, such as customizable memory management and multiprocessor scheduling.

Although group protocols can be implemented in user space, as was done in Isis [3], such a configuration results in suboptimal performance. There are three reasons for this. First, communication systems implemented at the non-privileged user level cannot always exploit the multicast primitives available at the underlying network level (such as those of Ethernet, FDDI, and some ATM switches). Second, many operating systems implement resource management and communication buffering policies that were not designed for group communication protocols and perform poorly when used by such protocols. Third, running the system in user space results in more context switches and cross-address-space references (see [15]).

The challenge, then, is to develop a group communication system that can run either in user space or in a microkernel. To accomplish this goal, a group communication system must minimize complexity, but still maximize both performance and flexibility. This is not straightforward: protocols to support large numbers of groups, dynamic group membership, message ordering, synchronization and failure handling can be complex.

We met this challenge by designing a portable group communication subsystem called Horus. Horus has few system dependencies, and can be incorporated in modern distributed operating systems as a user-level service, a kernel-level subsystem, or both. Compared to its parent system Isis, Horus is smaller, provides more flexibility, and performs better. It also offers security features, and is able to deal with network partitioning. The system design integrates ideas developed in Isis, Transis [2], and the x-kernel [9].

This paper discusses the architecture and implementation of Horus, reviews the interfaces supported (notably, an interface in which the cost of the protocols supporting a communication group can be varied depending on the properties desired by the user), and presents performance figures for a version of the system running in user space over UNIX.






Robbert VanRenesse
Tue Nov 15 12:09:10 EST 1994