Next: Protocol layers that work Up: JavaGroups User's Guide Previous: Using deadlock detection with

   
Implementing view changes

This chapter discusses in detail the interactions (events exchanged and actions taken) between protocol layers that take place when a new view needs to be installed in all members. A new view is created in the RpcGMS layer when (1) a new member joins the group, (2) a member leaves the group, or (3) a faulty member has been detected (e.g. by a suspicion service). The last case is handled in the same way as a member leaving: the same actions are taken.
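
The three triggers can be illustrated with a minimal sketch. It is not the RpcGMS implementation: the class SketchView, its fields, and its method names are hypothetical, and the assumption that a view is an increasing number plus an ordered member list is made only for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical immutable view: a view number plus an ordered member list. */
final class SketchView {
    final long id;
    final List<String> members;

    SketchView(long id, List<String> members) {
        this.id = id;
        this.members = Collections.unmodifiableList(new ArrayList<>(members));
    }

    /** Case (1): a new member joins and is appended to the member list. */
    SketchView afterJoin(String member) {
        List<String> next = new ArrayList<>(members);
        if (!next.contains(member)) {
            next.add(member);
        }
        return new SketchView(id + 1, next);
    }

    /** Case (2): a member leaves the group. */
    SketchView afterLeave(String member) {
        List<String> next = new ArrayList<>(members);
        next.remove(member);
        return new SketchView(id + 1, next);
    }

    /** Case (3): a suspected member is treated exactly like a leaving one. */
    SketchView afterSuspect(String member) {
        return afterLeave(member);
    }
}
```

In this sketch every change yields a new view object with the next number, so the old view stays unchanged while its messages are still being flushed.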

According to virtual synchrony [Bir96], all messages sent in the current view have to be seen by all members before a new view can be installed. This is also called flushing all messages out of the system, and it is implemented by a flush protocol.

The protocol essentially works as follows. When a new view is to be installed, the coordinator sends a FLUSH message to the group. Each member sends its pending messages and then stops sending messages until the new view has been received. It also returns to the coordinator all messages that are not known to be stable. (Stable messages are those known to have been received by every member.) If a member crashes during the FLUSH protocol, a new round of the FLUSH protocol is started immediately. When the coordinator has received all responses, it merges the unstable messages, removes duplicates, and re-sends them, making sure that every member receives them (see footnote 6.1).

This is necessary to ensure atomicity, i.e. that all members receive all messages. Atomicity could, for example, be violated if a member P wants to send a message m1 to all group members using multiple unicasts but crashes after sending the message only to member Q. The FLUSH protocol ensures that Q returns m1, so m1 is broadcast to all members in the current view (Q simply discards the duplicate).

Finally, the new view is broadcast to every member. Messages sent from now on are delivered in the new view. Messages from the previous view that arrive after that are discarded.
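
The coordinator's merge step can be sketched as follows. This is an illustration under stated assumptions, not the code of the FLUSH layer: the classes Msg and FlushMerge are hypothetical, and identifying a message by its original sender plus a per-sender sequence number is an assumption made so that duplicates can be recognised.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical message: identified by original sender and per-sender sequence number. */
final class Msg {
    final String sender;
    final long seqno;
    final String payload;

    Msg(String sender, long seqno, String payload) {
        this.sender = sender;
        this.seqno = seqno;
        this.payload = payload;
    }

    /** Identity used for duplicate detection (assumed, not taken from the guide). */
    String key() {
        return sender + "#" + seqno;
    }
}

final class FlushMerge {
    /**
     * Merges the unstable messages returned by the members in response to FLUSH.
     * The map key is the responding member; one copy of each message is kept,
     * in the order in which it was first seen.
     */
    static List<Msg> merge(Map<String, List<Msg>> responses) {
        Map<String, Msg> unique = new LinkedHashMap<>();
        for (List<Msg> unstable : responses.values()) {
            for (Msg m : unstable) {
                unique.putIfAbsent(m.key(), m);
            }
        }
        return new ArrayList<>(unique.values());
    }
}
```

Applied to the example above, Q's response would contain m1 (sender P), so m1 appears once in the merged list even though P itself has crashed and no other member has seen it. The merged list is what the coordinator would re-send before broadcasting the new view.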

Four protocol layers implement the above scheme: RpcGMS, FLUSH, NAKACK and STABLE. Each layer implements only one specific piece of functionality and 'invokes' other functions by sending events to other layers. RpcGMS is the protocol that starts the view change mechanism: it takes care of triggering the FLUSH protocol and broadcasting the new view. FLUSH implements the FLUSH protocol. NAKACK ensures that all messages within a view are delivered in the correct (FIFO) order. It does this using a NAK-based scheme, but can be switched to an ACK-based scheme when necessary: for example, view changes are delivered reliably using ACKs, whereas regular messages use NAKs. Finally, STABLE uses a gossiping mechanism to determine which messages have been received by all members, so that they can be garbage-collected. The following sections describe the four protocols and the interactions that take place when a view is installed.
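
The ideas behind NAKACK and STABLE can be illustrated with a short sketch. It does not reproduce either layer: NakReceiver and Stability are hypothetical names, sequence numbers are assumed to start at 1 for each sender, and stability is assumed to be the minimum of the highest sequence numbers reported by the members.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.TreeMap;

/** Hypothetical receiver-side state for one sender in a NAK-based scheme. */
final class NakReceiver {
    private long nextToDeliver = 1;  // lowest seqno not yet delivered (assumed to start at 1)
    private final TreeMap<Long, String> buffer = new TreeMap<>();  // out-of-order messages

    /** Accepts a message and returns the payloads that can now be delivered in FIFO order. */
    List<String> receive(long seqno, String payload) {
        List<String> deliverable = new ArrayList<>();
        if (seqno < nextToDeliver) {
            return deliverable;  // duplicate of an already delivered message
        }
        buffer.putIfAbsent(seqno, payload);
        while (buffer.containsKey(nextToDeliver)) {
            deliverable.add(buffer.remove(nextToDeliver));
            nextToDeliver++;
        }
        return deliverable;
    }

    /** Seqnos missing below the highest buffered one; a receiver would NAK these. */
    List<Long> missing() {
        List<Long> gaps = new ArrayList<>();
        if (buffer.isEmpty()) {
            return gaps;
        }
        for (long s = nextToDeliver; s < buffer.lastKey(); s++) {
            if (!buffer.containsKey(s)) {
                gaps.add(s);
            }
        }
        return gaps;
    }
}

final class Stability {
    /** Highest seqno of one sender that every member reports having received. */
    static long stableUpTo(Collection<Long> highestReceivedPerMember) {
        if (highestReceivedPerMember.isEmpty()) {
            return 0;
        }
        long min = Long.MAX_VALUE;
        for (long h : highestReceivedPerMember) {
            min = Math.min(min, h);
        }
        return min;
    }
}
```

With a NAK-based scheme the receiver stays silent as long as messages arrive in order and asks for retransmission only when missing() reports a gap. Messages up to the value returned by stableUpTo() could be garbage-collected; anything above it is what a member would return to the coordinator during a flush.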



 

1999-12-13