Maestro Open Toolkit: Clients/Servers + State Transfer Interface
The state-transfer protocol of Maestro is implemented in the Maestro_ClSv class. The state-transfer interface that Maestro_ClSv provides is low-level, however, and leaves several details to the application. In particular, a joining server has to choose an old server from which to request the state, and it has to repeat the request if the chosen server crashes during state transfer. A joining server may also need to terminate state transfer and possibly restart it in some scenarios, which makes it necessary to distinguish between different state-transfer transactions. All of this would have to be implemented at the application level when using Maestro_ClSv directly. Alternatively, the application can be built above the Maestro_CSX class, which provides additional state-transfer functionality behind a higher-level interface.
It may happen (as a result of group partitioning and merging) that state transfer will be (re)started more than once at a given server. On the other hand, a state transfer may be terminated before completion (if all old servers crash or partition away during state transfer). Since it is possible that a new state transfer will be started before the completion of a previous one, Maestro_CSX assigns a unique ID to every state transfer transaction so that the application can distinguish between them.
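One way an application might use these IDs is to remember which transaction is current and ignore events that carry any other ID. The sketch below is illustrative only: `XferID` is a plain integer standing in for Maestro_XferID (whose real definition and comparison operators are not shown here), and `XferTracker` is a hypothetical helper, not part of Maestro.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for Maestro_XferID; the real type is not shown here.
using XferID = std::uint64_t;

// Hypothetical helper that tracks the current state-transfer transaction
// at a joining server.
class XferTracker {
public:
    // A (re)started transfer was announced with a fresh ID.
    void begin(XferID id) { current_ = id; active_ = true; }

    // The transfer with this ID completed or was terminated.
    // An ID that is not current is ignored.
    void end(XferID id) { if (isCurrent(id)) active_ = false; }

    // True only for events that belong to the transfer in progress.
    bool isCurrent(XferID id) const { return active_ && id == current_; }

private:
    XferID current_ = 0;
    bool active_ = false;
};
```

With a helper like this, a late reply or cancellation that belongs to a superseded transaction can be dropped by checking `isCurrent` first.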
When state transfer needs to be (re)started, the stateTransfer_Callback method is invoked. If Maestro is run in the multi-threaded mode, stateTransfer_Callback is called in a separate thread. However, in the single-threaded mode stateTransfer_Callback is invoked in the same (Ensemble) thread as all other callbacks.
A joining server can request (a portion of) the state from an old server with a getState downcall. There are two versions of getState, a blocking and a non-blocking one. In the multi-threaded mode, both versions of getState can be used. However, the non-blocking version of getState is the only choice when running Maestro in the single-threaded mode.
A call to getState made at a joining server results in an invocation of the askState_Callback method at a normal ("old") server, which should eventually respond by sending a state message to the joining server with a call to the sendState function. When the joining server receives the state message, the corresponding call to getState returns (if the synchronous/blocking version of getState was called) or the gotState_Callback is invoked (in the asynchronous/non-blocking case).
If a state-transfer transaction is terminated while the joining server is still waiting for a state message from an old server, the call to getState will return with the abnormal-termination status (in case of a blocking call), or else the xferCanceled_Callback method will be invoked (in case of a non-blocking call).
Once the joining server has completed state transfer, it should invoke the xferDone method. Following that, a new view will eventually be installed, where the joining server will be included in the list of "normal" servers.
The interface to the state transfer functionality provided by Maestro_CSX is described in the sections below.
void resetState();
void stateTransfer_Callback(Maestro_XferID &xferID);
Upon completion of state transfer, the application must notify Maestro by calling xferDone with the same value of xferID as the one passed to the corresponding invocation of stateTransfer_Callback.
void getState(Maestro_XferID &xferID, Maestro_Message &requestMsg, /*OUT*/ Maestro_Message &stateMsg, /*OUT*/ Maestro_XferStatus &xferStatus);
When a call to getState returns, the value of the xferStatus argument will be equal to MAESTRO_XFER_OK if the state request has succeeded, and MAESTRO_XFER_TERMINATED if the transfer has been prematurely terminated (usually because of a group merge or a total failure/partitioning away of all normal servers). If state transfer has been terminated, stateTransfer_Callback should return without further attempts to get the state.
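The status check described above might look as follows inside a stateTransfer_Callback body. This is a minimal sketch under stated assumptions: `MockCSX`, `XferStatus`, `XferID`, and `Message` are simplified stand-ins rather than the real Maestro declarations, and the request format (a chunk index written as a decimal string) is invented for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Simplified stand-ins for the Maestro types; not the real declarations.
enum XferStatus { XFER_OK, XFER_TERMINATED };
using XferID = int;
using Message = std::string;

// Hypothetical mock of the blocking getState. It serves numbered chunks and
// reports termination once `failAt` requests have been served.
struct MockCSX {
    MockCSX(std::vector<Message> c, std::size_t f)
        : chunks(std::move(c)), failAt(f) {}

    std::vector<Message> chunks;
    std::size_t failAt;      // number of requests served before termination
    std::size_t served = 0;
    bool done = false;

    void getState(XferID &, Message &request, Message &state, XferStatus &status) {
        if (served == failAt) { status = XFER_TERMINATED; return; }
        ++served;
        state = chunks.at(std::stoul(request));  // assumed request format
        status = XFER_OK;
    }
    void xferDone(XferID &) { done = true; }
};

// Sketch of a stateTransfer_Callback body: fetch the chunks one by one and
// stop at the first sign of termination.
// Returns the assembled state, or an empty string if the transfer was terminated.
std::string runTransfer(MockCSX &csx, XferID id, std::size_t nChunks) {
    std::string assembled;
    for (std::size_t i = 0; i < nChunks; ++i) {
        Message request = std::to_string(i);
        Message state;
        XferStatus status;
        csx.getState(id, request, state, status);
        if (status == XFER_TERMINATED) return "";  // no further attempts
        assembled += state;
    }
    csx.xferDone(id);  // same ID that was passed to the callback
    return assembled;
}
```

Note that the terminated path returns without calling `xferDone`; in this sketch, completion is only reported when every requested portion has arrived.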
The blocking version of getState can only be used when Maestro is running in the multi-threaded mode. In the single-threaded mode, the non-blocking version of getState must be used.
void getState(Maestro_XferID &xferID, Maestro_Message &requestMsg, /*OUT*/ Maestro_XferStatus &xferStatus);
The application can make as many calls to getState as necessary. However, the getState function is not reentrant. Furthermore, after a call to (non-blocking) getState has been made, a subsequent invocation of the function can only be made after the portion of the state requested in the previous call to getState has been received (with a matching invocation of the gotState_Callback method).
Also note that in all invocations of getState, the value of the xferID argument must be the ID of the current state-transfer transaction (which is the value of the xferID argument passed to the corresponding invocation of stateTransfer_Callback).
The non-blocking (asynchronous) version of getState can be used when Maestro is running in either multi-threaded mode or single-threaded mode, and is the only option in the latter case. However, in the multi-threaded mode, the blocking (synchronous) version of getState can also be used.
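In single-threaded mode, the one-request-at-a-time rule can be respected by issuing the next request only from within gotState_Callback. The following sketch assumes simplified stand-in types and a hypothetical `Joiner` class; its `requestNext` merely records the request where real code would call the non-blocking getState, and the chunk-index request format is invented for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Simplified stand-ins for the Maestro types; not the real declarations.
using XferID = int;
using Message = std::string;

// Hypothetical joining server driven purely by callbacks.
// Assumes the state consists of at least one chunk.
class Joiner {
public:
    explicit Joiner(std::size_t nChunks) : total_(nChunks) {}

    std::vector<std::string> sent;  // requests issued so far, in order
    std::string state;              // state assembled from the replies
    bool done = false;
    bool canceled = false;

    void stateTransfer_Callback(XferID &id) {
        // A (re)start discards any progress made under an earlier ID.
        current_ = id;
        next_ = 0;
        pending_ = false;
        state.clear();
        done = canceled = false;
        requestNext();
    }

    void gotState_Callback(XferID &id, Message &msg) {
        if (id != current_) return;  // reply for a superseded transaction
        pending_ = false;
        state += msg;
        if (next_ < total_) requestNext();
        else done = true;            // real code would call xferDone(id) here
    }

    void xferCanceled_Callback(XferID &id) {
        if (id != current_) return;
        pending_ = false;
        canceled = true;
    }

private:
    void requestNext() {
        assert(!pending_);           // at most one outstanding request
        pending_ = true;
        sent.push_back(std::to_string(next_++));  // stands in for getState
    }

    XferID current_ = -1;
    std::size_t total_;
    std::size_t next_ = 0;
    bool pending_ = false;
};
```

Chaining requests through the callback keeps the application inside the constraint that a new getState call is made only after the previous portion has been received.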
void askState_Callback(Maestro_EndpID &origin, Maestro_XferID &xferID, Maestro_Message &requestMsg);
Each invocation of askState_Callback must eventually be followed by a call to the sendState function, which sends (the requested portion of) the state to the joining server.
void sendState(Maestro_EndpID &dest, Maestro_XferID &xferID, Maestro_Message &stateMsg);
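On the old-server side, the pairing of askState_Callback with sendState could be sketched as below. The types are simplified stand-ins, `OldServer` is a hypothetical class, `sendState` here only records what would be sent, and the request format (a chunk index as a decimal string) is an assumption made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Simplified stand-ins for the Maestro types; not the real declarations.
using EndpID = std::string;
using XferID = int;
using Message = std::string;

// Hypothetical old server that holds its state as numbered chunks.
class OldServer {
public:
    explicit OldServer(std::vector<std::string> chunks)
        : chunks_(std::move(chunks)) {}

    struct Sent { EndpID dest; XferID id; Message state; };
    std::vector<Sent> outbox;  // what sendState would have put on the wire

    void askState_Callback(EndpID &origin, XferID &id, Message &request) {
        // Assumed request format: a chunk index as a decimal string.
        std::size_t idx = std::stoul(request);
        // Out-of-range requests get an empty reply rather than no reply.
        Message reply = idx < chunks_.size() ? chunks_[idx] : Message();
        // Reply to the requester, echoing the transaction ID it supplied.
        sendState(origin, id, reply);
    }

private:
    // Mock of sendState: records the message instead of transmitting it.
    void sendState(EndpID &dest, XferID &id, Message &state) {
        outbox.push_back(Sent{dest, id, state});
    }

    std::vector<std::string> chunks_;
};
```

Here the reply is sent from inside the callback; the text only requires that a reply eventually follows, so a real server could also defer it.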
void gotState_Callback(Maestro_XferID &xferID, Maestro_Message &stateMsg);
void xferCanceled_Callback(Maestro_XferID &xferID);
void xferDone(Maestro_XferID &xferID);