The End-to-End Argument

Several readers of preliminary versions of this paper raised questions about the end-to-end argument and the controversy over causal and total ordering in communication systems (CATOCS), asking whether our work on Horus sheds new light on these issues [5]. Before we address these issues directly, we should point out that Horus supports everything from best-effort delivery to very strong semantics, and users can decide for themselves whether they need causal or total ordering. Moreover, Horus (like several other CATOCS systems) does provide a true end-to-end mechanism in the form of message stability.

A message is called stable if it has been processed by all its surviving destination processes (that is, the processes that are included in the next view). The term "has been processed" is the key here. Horus provides a downcall, horus_ack(m), with which the application process informs Horus that it has processed message m. Eventually, this information propagates back to the sender of the message, and onwards to the other receivers of the message. It is reported using a STABLE upcall. The upcall contains detailed information about the stability of the messages that a process sent or received, in the form of a so-called stability matrix. Depending on the application, a message could be considered stable when it has been displayed to a user, when it has been logged to disk, when it is safe to delete, and so on.

The stability matrix thus reports a property that is completely defined by the application layer. The "semantics" of stability data are exactly the semantics determined by the downcalls issued by the application to Horus. We see this as an illustration of the end-to-end paradigm as it is used within Horus: the stability layer provides a mechanism that, under control of the application, may have widely varying meaning.
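The following toy model illustrates the stability rule described above. It is a sketch, not the Horus implementation: the class name, the method names, and the per-message layout of the matrix are all assumptions made for illustration. Only the idea that an application-issued acknowledgement (the horus_ack(m) downcall) drives stability, and that stability is judged against the surviving members of the view, comes from the text.

```python
class StabilityTracker:
    """Toy model of message stability (hypothetical, not Horus code)."""

    def __init__(self, view):
        self.view = set(view)   # members of the current view
        self.acked = {}         # message id -> set of members that acknowledged it

    def send(self, msg_id):
        # Start tracking a newly multicast message.
        self.acked[msg_id] = set()

    def ack(self, member, msg_id):
        # Models the application at `member` issuing horus_ack(m).
        # What "processed" means (displayed, logged, ...) is up to the application.
        self.acked[msg_id].add(member)

    def install_view(self, new_view):
        # A membership change: only surviving members count from now on.
        self.view = set(new_view)

    def is_stable(self, msg_id):
        # Stable once every surviving destination has acknowledged the message.
        return self.view <= self.acked[msg_id]

    def matrix(self):
        # Assumed layout: one row per message, one column per surviving member.
        members = sorted(self.view)
        return {m: {p: p in acks for p in members}
                for m, acks in self.acked.items()}


tracker = StabilityTracker(["a", "b", "c"])
tracker.send("m1")
tracker.ack("a", "m1")
tracker.ack("b", "m1")
before = tracker.is_stable("m1")    # c has not acknowledged yet
tracker.install_view(["a", "b"])    # c is excluded from the next view
after = tracker.is_stable("m1")     # all surviving members have acknowledged
```

In this model the tracker never inspects why a member acknowledged a message, which mirrors the point that the meaning of stability is set entirely by the application.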

We now return to the concerns that were raised in [5]. Briefly, the use of the end-to-end argument in [5] has come under scrutiny from researchers, including ourselves, who favor communication systems that guarantee properties such as virtual synchrony [3] or ordering. The argument favoring "properties" is that the complexity of implementing them in the application itself can be daunting, and that, unless properties are standardized throughout a communication framework, it will be impractical to extend a system over time with new applications that depend upon communication properties.

One example is an application that is designed to communicate synchronously with a service, but in which replies to the messages being sent are not needed. An application that updates a display maintained by a remote display server matches this model. Provided that the message delivery order and reliability properties are maintained, such an application could gain improved performance by using an asynchronous communication stream. Given an application consisting of a single process, one could simply use a reliable, FIFO protocol such as TCP to communicate with the server. Now, suppose that the application is composed of multiple processes that communicate among themselves, an increasingly common architecture. The FIFO ordering property now generalizes, becoming a requirement for reliable causally ordered message delivery [14]. Given a communication subsystem that supports causal order, the benefit of asynchronous communication can be exploited; lacking it, this performance benefit is not available.
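To make the generalization from FIFO to causal order concrete, the sketch below uses the standard vector-timestamp delivery rule for causal multicast. It is an illustration only and does not claim to be how Horus implements causal ordering; the class and method names are hypothetical. In the example, two application processes (0 and 1) update a display server (2), and the server receives the second update before the first.

```python
class CausalEndpoint:
    """Causal delivery with vector timestamps (illustrative sketch)."""

    def __init__(self, pid, n):
        self.pid = pid
        self.clock = [0] * n    # clock[k] = messages from process k delivered here
        self.pending = []       # received but not yet deliverable
        self.delivered = []     # payloads in delivery order

    def send(self, payload):
        # Stamp the message with the sender's view of causal history.
        self.clock[self.pid] += 1
        return (self.pid, list(self.clock), payload)

    def _deliverable(self, sender, stamp):
        # Next message from the sender, and nothing it depends on is missing.
        return (stamp[sender] == self.clock[sender] + 1 and
                all(stamp[k] <= self.clock[k]
                    for k in range(len(stamp)) if k != sender))

    def receive(self, msg):
        self.pending.append(msg)
        progress = True
        while progress:         # delivering one message may unblock others
            progress = False
            for m in list(self.pending):
                sender, stamp, payload = m
                if self._deliverable(sender, stamp):
                    self.clock[sender] = stamp[sender]
                    self.delivered.append(payload)
                    self.pending.remove(m)
                    progress = True


app_a, app_b, server = (CausalEndpoint(i, 3) for i in range(3))
m1 = app_a.send("draw window")
app_b.receive(m1)               # app_b sees the window before drawing into it
m2 = app_b.send("draw text")
server.receive(m2)              # arrives early: held back, depends on m1
held_back = list(server.pending)
server.receive(m1)              # now both can be delivered, in causal order
```

Neither sender waits for a reply, yet the server still applies the updates in an order consistent with causality, which is the performance benefit the paragraph above refers to.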

In a superficial sense, Horus could be considered as a contribution to either side of the fence. Because Horus is often used as a library, it will often be linked directly to the application. Configured in this manner, one could argue that Horus is consistent with a philosophy in which the end application implements its own properties, as illustrated by the stability example, above.

However, Horus also employs system-wide services, and provides ordering properties and reliability. Viewed as a runtime environment or a sort of distributed operating system for robust application development, Horus takes on the role of a communication layer with associated services that guarantee a variety of properties.

In this deeper sense, it could be argued that a system like Horus could not be implemented using an approach fully consistent with the end-to-end philosophy. Although the present paper has not focused on protocols, our previous work has discussed the Horus virtual synchrony implementation in considerable detail. One can view systems such as this as having a three-tier structure. The lowest tier simulates a fail-stop environment (consistent membership tracking with accurate notifications when membership changes occur). The second tier closely resembles a state machine, and implements higher level programming abstractions. In the case of Horus, the abstraction of choice is the virtually synchronous process group, with ordered and failure-atomic multicast (although, as we have stressed, one can easily configure Horus to have other properties, and can selectively enable or disable any of these basic properties). Finally, at the third tier, one finds applications that depend on the consistency properties of the underlying structure.

There are at least three different implementations of the first tier that would be suitable for use in Horus. The Isis system employed a group membership protocol that provides consistent reporting of system membership changes within a primary partition [8][12]. The Transis and Totem systems implement an extended virtually synchronous addressing model, corresponding to a partitioning model in which the primary partition is distinguished but which also allows progress in non-primary partitions [10]. The Relacs system implements a "quasi-partial" view synchrony model. In this approach, concurrent membership views will either be identical or non-overlapping [1]. Currently, Horus can be configured with an Isis-style primary-partition progress restriction, or to support the extended virtual synchrony model. A new membership layer that uses the view synchrony scheme of Relacs can easily be added.
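The condition that concurrent views are either identical or non-overlapping can be written as a small predicate. This is only a restatement of that one sentence in code, with a hypothetical function name; it says nothing about how Relacs or Horus compute views.

```python
def views_identical_or_disjoint(views):
    """True if every pair of concurrent views is identical or shares no member."""
    sets = [frozenset(v) for v in views]
    return all(a == b or a.isdisjoint(b) for a in sets for b in sets)


# Two partitions, each internally agreed on its membership: allowed.
partitioned = views_identical_or_disjoint([["a", "b"], ["a", "b"], ["c", "d"]])

# Views that overlap without being equal: not allowed.
overlapping = views_identical_or_disjoint([["a", "b", "c"], ["b", "c"]])
```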

Elimination of the membership agreement mechanism, on the other hand, introduces the risk of potentially serious inconsistencies. For example, we pointed out in Section 7 that liveness of the TOTAL ordering layer is dependent upon the membership service and that the uniqueness of the ordering token is guaranteed by exploiting consistency in the views supplied by MBRSHIP to that layer. Given inconsistent views, TOTAL might not be live, or it might give different message orderings to different endpoints. Horus is thus flexible about the specific partitioning model used, but inflexible about its need for a close approximation to fail-stop behavior.
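The dependence of token uniqueness on consistent views can be illustrated with a deliberately simplified model. The rule used here, that the lowest-ranked member of a view holds the ordering token, is an assumption made for the example and is not taken from the TOTAL layer; the function names are hypothetical as well.

```python
def token_holder(view):
    # Assumed rule for illustration: the lowest-ranked member holds the token.
    return min(view)


def self_declared_holders(views_by_member):
    # Each member applies the rule to its own view; return those that
    # conclude they hold the token themselves.
    return {member for member, view in views_by_member.items()
            if token_holder(view) == member}


# Consistent views: every member agrees, so exactly one token exists.
consistent = self_declared_holders({
    "a": ["a", "b", "c"],
    "b": ["a", "b", "c"],
    "c": ["a", "b", "c"],
})

# Inconsistent views: b wrongly believes a has failed, so both a and b
# consider themselves the token holder and may assign conflicting orders.
inconsistent = self_declared_holders({
    "a": ["a", "b", "c"],
    "b": ["b", "c"],
    "c": ["a", "b", "c"],
})
```

With two self-declared holders, each could assign the same sequence number to a different message, which is one way different endpoints could end up with different orderings.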

This leads us back to the end-to-end dispute. Proponents of the end-to-end argument maintain that each application program, or each client-server pair, should cooperate to maintain the properties needed for their particular purpose. In an end-to-end mindset, none of the partitioning and membership options cited above would be acceptable. Each requires a system-wide consensus mechanism for maintaining membership views, closely integrated into all levels of the communication hierarchy. Yet, in the absence of such consensus, it appears to be impossible to provide consistent behavior at the upper tiers of the hierarchy!

We would argue that the onus falls on the end-to-end community to demonstrate meaningful ways to achieve consistency within their paradigm. For example, it is straightforward to implement replicated data, fault-tolerant synchronization, or high availability of critical servers in Horus. Horus achieves the necessary consistency guarantees through ordering and atomicity properties provided by its process group and communication protocols. These, in turn, depend upon the most basic membership agreement mechanisms. We conjecture that such a dependency structure is necessary, and that in its absence, non-trivial consistency guarantees cannot be provided. If we are correct, this would support the conclusion that end-to-end architectures are inherently less powerful than architectures based on a rigorous system membership service.


Robbert VanRenesse
Mon May 15 12:16:43 EDT 1995