State Transfer in Maestro
The state-transfer functionality is provided in Maestro by two classes: Maestro_ClSv implements a state-transfer protocol (which is discussed below) and offers a low-level interface to it. The Maestro_CSX class offers a higher-level interface to state transfer; users of Maestro will want to use it in most cases.
The implementation of state transfer in Maestro follows the "pull" paradigm, in which a joining server asks the old server(s) for pieces of the global state and decides by itself when state transfer is complete. In the "push" paradigm (used in Isis), it is the responsibility of the old servers to decide how to transfer their state to the joining server and how to handle failures during state transfer. The "pull" approach appears to be simpler to implement and more flexible.
In the sections below, we show how the state is transferred in two important situations (a new server joining the group, and two group partitions merging), and discuss the state-transfer protocol.
When X becomes an xfer-server, it starts state transfer. The state is transferred in a series of read requests to normal servers (Stage 4). When all the state has been received, X sends an xferDone message to the coordinator (Stage 5). The coordinator flushes the group and installs a new view, in which X is included in the list of servers. At that point, X becomes a normal server.
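The join sequence above can be sketched as a small simulation. This is only an illustration of the "pull" idea, not Maestro code: the class names `NormalServer` and `Coordinator`, the function `pull_state`, and the dictionary-based state are all hypothetical, and the group flush is not modeled.

```python
from dataclasses import dataclass, field


@dataclass
class NormalServer:
    """Hypothetical up-to-date server that answers read requests."""
    state: dict

    def keys(self):
        return sorted(self.state)

    def read(self, key):
        return self.state[key]


@dataclass
class Coordinator:
    """Hypothetical coordinator holding the servers and xfer-servers lists."""
    servers: list = field(default_factory=list)
    xfer_servers: list = field(default_factory=list)
    view_id: int = 0

    def xfer_done(self, member):
        # On xferDone, install a new view that lists the member as a server.
        self.xfer_servers.remove(member)
        self.servers.append(member)
        self.view_id += 1


def pull_state(joiner, source, coordinator):
    """The joiner drives the transfer and decides when it is complete."""
    local = {}
    for key in source.keys():            # series of read requests (Stage 4)
        local[key] = source.read(key)
    coordinator.xfer_done(joiner)        # xferDone to the coordinator (Stage 5)
    return local
```

In this sketch the old server only answers reads; all decisions about what to request and when to stop stay with the joining member, which is the property that distinguishes "pull" from "push".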
When two partitions merge, the servers in one of them formally lose their state and are "degraded" to client status. At that point, all degraded servers restart state transfer from the other partition. It is the responsibility of the application to actually transfer the state and to notify Maestro when state transfer is complete. The role of Maestro is to notify the application when state transfer is deemed necessary, and to specify the direction of the transfer. The state-transfer model of Maestro thus assumes that a view merge involves a transfer of state from one partition to the others, rather than a merge of the states of the individual partitions. This model is justified for an important class of applications (namely, those requiring primary-view execution). The state-transfer protocol of Ensemble is more general in that it allows states from several partitions to be merged. We plan to eventually extend the Maestro interface to support both the state-merge and state-transfer models.
Here is what happens when a group merge occurs. Initially, there are two partitions with possibly different global states (Stage 0). Eventually, the two partitions merge. When that happens, all members of one of the partitions (servers, xfer-servers, and clients) suddenly find themselves in the clients list (Stage 1). That triggers a restart of state transfer by all degraded servers (Stage 2). When state transfer completes, the servers list of the merged group includes servers from both partitions (Stage 3).
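The merge stages can be illustrated with a short sketch. The `Partition` type and both functions are hypothetical names introduced here; which partition keeps its state is taken as an input, and the sketch assumes that degraded xfer-servers restart their transfer along with degraded servers.

```python
from dataclasses import dataclass, field


@dataclass
class Partition:
    """Hypothetical view of one partition (or of the merged group)."""
    servers: list = field(default_factory=list)
    xfer_servers: list = field(default_factory=list)
    clients: list = field(default_factory=list)


def merge(surviving, degraded):
    """Stage 1: every member of the degraded partition lands in the clients list."""
    # Assumption: former servers and xfer-servers both restart state transfer.
    restart = degraded.servers + degraded.xfer_servers
    merged = Partition(
        servers=list(surviving.servers),
        xfer_servers=list(surviving.xfer_servers),
        clients=surviving.clients + restart + degraded.clients,
    )
    return merged, restart


def complete_transfer(view, member):
    """Stage 3: a member that finished its transfer is listed as a server again."""
    view.clients.remove(member)
    view.servers.append(member)
```

Applying `complete_transfer` to each member returned in `restart` yields a servers list that contains servers from both partitions, matching Stage 3.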
The implementation of state transfer in HOT attempts to deal with these problems in the following way. Each group member has a membership type, which is either Client or Server. After the join downcall is invoked, a group member is always in one of the following states:
JOINING, CLIENT_NORMAL, BECOMING_SERVER, SERVER_XFER, SERVER_XFER_DONE, SERVER_NORMAL
Then it is guaranteed that the state of the member will eventually become SERVER_NORMAL. After that, whenever the state of the member becomes different from SERVER_NORMAL, it will eventually become SERVER_NORMAL again.
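The six states can be written down as an enumeration. The state names come from the list above; the transition order in `NEXT` is an assumption made for illustration (the text does not spell out the transitions), and `steps_to_server_normal` is a hypothetical helper that applies to a member whose membership type is Server.

```python
from enum import Enum, auto


class MemberState(Enum):
    JOINING = auto()
    CLIENT_NORMAL = auto()
    BECOMING_SERVER = auto()
    SERVER_XFER = auto()
    SERVER_XFER_DONE = auto()
    SERVER_NORMAL = auto()


# Assumed forward path for a member that wants to be a server.
NEXT = {
    MemberState.JOINING: MemberState.CLIENT_NORMAL,
    MemberState.CLIENT_NORMAL: MemberState.BECOMING_SERVER,
    MemberState.BECOMING_SERVER: MemberState.SERVER_XFER,
    MemberState.SERVER_XFER: MemberState.SERVER_XFER_DONE,
    MemberState.SERVER_XFER_DONE: MemberState.SERVER_NORMAL,
}


def steps_to_server_normal(state):
    """Count assumed transitions until the member is in SERVER_NORMAL."""
    steps = 0
    while state is not MemberState.SERVER_NORMAL:
        state = NEXT[state]
        steps += 1
    return steps
```

Under this assumed ordering, every state has a finite path to SERVER_NORMAL, which is one way to picture the "eventually becomes SERVER_NORMAL" guarantee.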