7 Native C Ensemble Application Interface (CE)
The C application interface is very similar in design to the ML
interface. It is located in directory ce. It has been
modified from the original ML interface, so as to fit better into
the C language (type-system and native data structures).
There are seven callbacks a C application needs to define in order
to work with Ensemble. These are:
- install(env, ls, vs) : called whenever a new view is installed.
- exit(env) : called when the member leaves.
- receive_cast(env, origin, num, iovl) :
called with the origin, an iovec array (and its length)
whenever a multicast message arrives.
- receive_send(env, origin, num, iovl) :
called with the origin, an iovec array (and its length)
whenever a point-to-point message arrives.
- flow_block(env, origin, onoff) :
called whenever there are flow-control problems, and
the application should refrain from sending messages until further
notice.
- block(env) :
called whenever a view change is forthcoming. All
applications are blocked, the old view is stabilized,
cleaned, and way is made for the new view.
- heartbeat(env, time) :
called every timeout. The timeout is specified in the jops
structure (the hrtbt_rate field, see Section 7.9). Timers are not
exact: this callback may be called at inaccurate times, or more
often than necessary. If accuracy is required, the application
should check the time argument.
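Since heartbeats may arrive early or more often than requested, a handler that needs a fixed period can compare the time argument with the time it last acted. The following is a minimal sketch of that check, not part of CE: periodic_t and periodic_due are hypothetical names, and ce_time_t is redeclared locally as the double given in the typedefs of this section. It assumes the time argument is expressed in seconds.

```c
#include <assert.h>
#include <stddef.h>

typedef double ce_time_t;   /* same definition as the CE typedef */

/* Hypothetical helper state; not part of the CE library. */
typedef struct periodic_t {
    ce_time_t period;   /* desired interval between actions */
    ce_time_t last;     /* time of the last accepted heartbeat */
    int       started;  /* 0 until the first heartbeat is seen */
} periodic_t;

/* Returns 1 if at least `period` has elapsed since the last accepted
 * heartbeat (or if this is the first one) and records `now`;
 * returns 0 for heartbeats that arrive too early. */
int periodic_due(periodic_t *p, ce_time_t now)
{
    assert(p != NULL && p->period > 0.0);
    if (!p->started || now - p->last >= p->period) {
        p->started = 1;
        p->last = now;
        return 1;
    }
    return 0;
}
```

A heartbeat callback would call periodic_due(&state->timer, time) and return immediately when it yields 0.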
The environment argument which is the first argument in all seven
callbacks is registered when a C-application interface is created.
The types of the callbacks are as follows:
typedef int ce_rank_t ;
typedef int ce_len_t ;
typedef void *ce_env_t ;
typedef double ce_time_t ;
typedef void (*ce_appl_install_t)(ce_env_t, ce_local_state_t*, ce_view_state_t*);
typedef void (*ce_appl_exit_t)(ce_env_t) ;
typedef void (*ce_appl_receive_cast_t)(ce_env_t, ce_rank_t, int, ce_iovec_array_t) ;
typedef void (*ce_appl_receive_send_t)(ce_env_t, ce_rank_t, int, ce_iovec_array_t) ;
typedef void (*ce_appl_flow_block_t)(ce_env_t, ce_rank_t, ce_bool_t) ;
typedef void (*ce_appl_block_t)(ce_env_t) ;
typedef void (*ce_appl_heartbeat_t)(ce_env_t, ce_time_t) ;
A ce_appl_intf_t is the type of a C application interface
(cappl). It can be created by the constructor
ce_create_intf. There is no need for a destructor because Ensemble
frees the interface-structure and all related memory after the exit
callback is invoked. An application interface is opaque: it can be
used to create an endpoint and join a group. It cannot be used to
join more than a single group.
typedef struct ce_appl_intf_t ce_appl_intf_t ;
The constructor takes the above handlers as parameters, as well as
an environment variable.
ce_appl_intf_t*
ce_create_intf(
ce_env_t env,
ce_appl_exit_t exit,
ce_appl_install_t install,
ce_appl_flow_block_t flow_block,
ce_appl_block_t block,
ce_appl_receive_cast_t cast,
ce_appl_receive_send_t send,
ce_appl_heartbeat_t heartbeat
);
The initial operation used to initiate a CE application is
ce_Init. It initializes the internal Ensemble data structures, and
processes command line arguments.
void ce_Init(int argc, char **argv) ;
After a C application completes initialization it should pass control
to the Ensemble main loop via ce_Main_loop.
void ce_Main_loop ();
In order to join a group, the ce_Join operation should be used.
void ce_Join(ce_jops_t *ops, ce_appl_intf_t *c_appl) ;
7.1 Group operations
Similarly to the ML interface, the set of supported operations is:
Leave, Cast, Send, Send1, Prompt, Suspect, XferDone, Rekey,
ChangeProtocol, and ChangeProperties. Messages are arrays of
IO-vectors (iovecs), or C memory chunks. The application can
send and receive iovec-arrays.
Multicast an iovec-array to the group.
void ce_Cast(
ce_appl_intf_t *c_appl,
int num,
ce_iovec_array_t iovl
) ;
Send a point-to-point message to a set of group members.
void ce_Send(
ce_appl_intf_t *c_appl,
int num_dests,
ce_rank_array_t dests,
int num,
ce_iovec_array_t iovl
) ;
Send a point-to-point message to the specified group member.
void ce_Send1(
ce_appl_intf_t *c_appl,
ce_rank_t dest,
int num,
ce_iovec_array_t iovl
) ;
The control actions are the same as the ML actions.
Leave a group. Following this downcall, exit will be called,
freeing the cappl.
void ce_Leave(ce_appl_intf_t *c_appl) ;
Ask for a new View.
void ce_Prompt(
ce_appl_intf_t *c_appl
);
Report specified group members as failure-suspected.
void ce_Suspect(
ce_appl_intf_t *c_appl,
int num,
ce_rank_array_t suspects
);
Inform Ensemble that the state-transfer is complete.
void ce_XferDone(
ce_appl_intf_t *c_appl
) ;
Ask the system to rekey.
void ce_Rekey(
ce_appl_intf_t *c_appl
) ;
Request a protocol change. The protocol_name is a string
specifying the exact set of layers to use. The string is a colon
separated list of layers, for example:
Top:Heal:Switch:Leave:Inter:Intra:Elect:Merge:Sync:Suspect:Stable:
Vsync:Frag_Abv:Top_appl:Frag:Pt2ptw:Mflow:Pt2pt:Mnak:Bottom
void ce_ChangeProtocol(
ce_appl_intf_t *c_appl,
char *protocol_name
) ;
Request a protocol change, specifying properties.
properties is a string containing a colon separated list of
properties. For example:
"Gmp:Sync:Heal:Switch:Frag:Suspect:Flow:Xfer".
The system deduces a protocol stack that abides by these properties.
void ce_ChangeProperties(
ce_appl_intf_t *c_appl,
char *properties
) ;
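Protocol and properties strings are plain colon-separated lists, so they can be assembled from individual names as long as the result fits the fixed maximum size. The sketch below shows one way to do that with a bounds check; join_with_colons is a hypothetical helper, not a CE function, and the caller supplies the capacity (for example CE_PROPERTIES_MAX_SIZE).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: writes names[0..n-1] into `out`, separated by ':'.
 * Returns 0 on success, or -1 if the result (including the terminating
 * zero) would not fit in `cap` bytes. On failure `out` holds a valid,
 * possibly partial, C string. */
int join_with_colons(char *out, size_t cap, const char *const *names, int n)
{
    size_t used = 0;
    int i;

    assert(out != NULL && cap > 0);
    out[0] = '\0';
    for (i = 0; i < n; i++) {
        size_t len = strlen(names[i]);
        size_t sep = (i > 0) ? 1 : 0;   /* ':' before all but the first */

        if (used + sep + len + 1 > cap)
            return -1;
        if (sep)
            out[used++] = ':';
        memcpy(out + used, names[i], len);
        used += len;
        out[used] = '\0';
    }
    return 0;
}
```

The resulting buffer can then be passed as the properties or protocol_name argument.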
All arguments to the group-action calls are copied into CE, hence the
application can use the run-time stack to create the arguments. There
is no need to allocate nor free the arguments. There are also
limitations on the sizes of arguments:
- The maximal size of an iovector array is MAX_SIZE_IOVL.
- The maximal number of destinations for send/send1/suspect
is MAX_NUM_DESTS.
- The maximal size of a protocol string is CE_PROTOCOL_MAX_SIZE.
- The maximal size of a properties string is CE_PROPERTIES_MAX_SIZE.
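As an illustration, an iovec array can be declared on the run-time stack and filled just before a Cast or Send call. The sketch below is an assumption-laden example: iov_fill and iovl_total_len are hypothetical helpers, and the buffers are allocated with malloc on the assumption that the application has configured malloc and free as its allocator (Section 7.3 explains that message bodies, unlike the array itself, are taken over by Ensemble).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <sys/uio.h>

/* Hypothetical helper: point one iovec entry at a fresh heap copy of
 * `data`. Returns 0 on success, -1 if allocation fails. */
int iov_fill(struct iovec *iov, const void *data, size_t len)
{
    void *copy;

    assert(iov != NULL && data != NULL && len > 0);
    copy = malloc(len);
    if (copy == NULL)
        return -1;
    memcpy(copy, data, len);
    iov->iov_base = copy;
    iov->iov_len  = len;
    return 0;
}

/* Hypothetical helper: total payload size of an iovec array. */
size_t iovl_total_len(const struct iovec *iovl, int num)
{
    size_t total = 0;
    int i;

    for (i = 0; i < num; i++)
        total += iovl[i].iov_len;
    return total;
}
```

With a two-entry array iovl filled this way, the call would be ce_Cast(c_appl, 2, iovl), provided 2 does not exceed MAX_SIZE_IOVL.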
7.2 Integration of other sockets into the main loop
Ensemble works in an event driven fashion, where events can either
come from the network or the user. The system runs a loop that
alternates between (1) waiting for input on incoming sockets using a
select system call, and (2) processing local
application send/recv and internal events.
The application hands over control to Ensemble after initialization.
The application may wish to wait on its own sockets, e.g., stdin (on
Unix). To this end, we also support adding, removing, and putting
handlers on sockets.
ce_handler_t is the type of handler called when there is input
to process on a socket.
typedef void (*ce_handler_t)(void*);
ce_AddSockRecv adds a socket to the list Ensemble listens to.
When input on the socket occurs, the handler will be invoked
with the specified environment variable.
void ce_AddSockRecv(
CE_SOCKET socket,
ce_handler_t handler,
ce_env_t env
);
ce_RmvSockRecv is called to remove a socket from the list
Ensemble listens to.
void ce_RmvSockRecv(
CE_SOCKET socket
);
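To make the mechanism concrete, here is a small stand-alone model of a select-based loop with registered socket handlers. It is a sketch of the general technique only and says nothing about how CE implements it internally; sock_table_t and the table_* functions are hypothetical names. Removing a socket from inside a handler is not handled by this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <unistd.h>

#define MAX_SOCKS 16

typedef void (*handler_t)(void *);

/* Hypothetical registry of (socket, handler, environment) triples. */
typedef struct sock_table_t {
    int       fd[MAX_SOCKS];
    handler_t handler[MAX_SOCKS];
    void     *env[MAX_SOCKS];
    int       n;
} sock_table_t;

void table_init(sock_table_t *t) { t->n = 0; }

int table_add(sock_table_t *t, int fd, handler_t h, void *env)
{
    assert(t != NULL && h != NULL);
    if (t->n == MAX_SOCKS)
        return -1;
    t->fd[t->n] = fd; t->handler[t->n] = h; t->env[t->n] = env;
    t->n++;
    return 0;
}

int table_remove(sock_table_t *t, int fd)
{
    int i;
    for (i = 0; i < t->n; i++) {
        if (t->fd[i] == fd) {
            t->n--;                      /* move the last entry into the gap */
            t->fd[i] = t->fd[t->n];
            t->handler[i] = t->handler[t->n];
            t->env[i] = t->env[t->n];
            return 0;
        }
    }
    return -1;
}

/* One loop iteration: wait up to timeout_ms for input, then invoke the
 * handler of every readable socket. Returns the number of handlers
 * called, 0 on timeout, -1 on select error. */
int table_poll_once(sock_table_t *t, int timeout_ms)
{
    fd_set rd;
    struct timeval tv;
    int i, maxfd = -1, ready, called = 0;

    FD_ZERO(&rd);
    for (i = 0; i < t->n; i++) {
        FD_SET(t->fd[i], &rd);
        if (t->fd[i] > maxfd) maxfd = t->fd[i];
    }
    tv.tv_sec  = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;
    ready = select(maxfd + 1, &rd, NULL, NULL, &tv);
    if (ready <= 0)
        return ready;
    for (i = 0; i < t->n; i++) {
        if (FD_ISSET(t->fd[i], &rd)) {
            t->handler[i](t->env[i]);
            called++;
        }
    }
    return called;
}

/* Example handler: counts invocations through its environment pointer. */
void count_handler(void *env) { ++*(int *)env; }
```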
7.3 Memory management
The convention used throughout is that all data-structures passed from
the application to CE are copied by CE, and all data-structures passed
from CE to the application are owned by the CE side (hence must not be
freed nor cached). This rule holds for all structures and data apart
from the iovec-arrays.
Ensemble does not copy messages from C to the ML heap, rather, it
separates C-memory and ML memory completely. Messages are received
from the network and read directly into C-buffers. Sent iovecs are
fragmented and sent directly on the network. Messages must be buffered
until all group members reliably receive them. To this end, a
reference counting scheme is used to track iovec liveness. When an
iovec's reference count reaches zero, it is freed. In other words,
iovecs are owned by Ensemble. They are received either from the
user, or the network.
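The idea of reference-counted buffers can be sketched in a few lines. This is a generic illustration of the technique, not Ensemble's actual data structure; rc_buf_t and its functions are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical reference-counted message buffer. */
typedef struct rc_buf_t {
    int    refs;   /* number of holders still needing the data */
    size_t len;
    char  *data;
} rc_buf_t;

/* Create a buffer holding a copy of `src`, with one reference. */
rc_buf_t *rc_buf_new(const void *src, size_t len)
{
    rc_buf_t *b;

    assert(src != NULL && len > 0);
    b = malloc(sizeof *b);
    if (b == NULL)
        return NULL;
    b->data = malloc(len);
    if (b->data == NULL) {
        free(b);
        return NULL;
    }
    memcpy(b->data, src, len);
    b->len  = len;
    b->refs = 1;
    return b;
}

/* A new holder (e.g. a retransmission buffer) keeps the data alive. */
void rc_buf_ref(rc_buf_t *b)
{
    assert(b != NULL && b->refs > 0);
    b->refs++;
}

/* Drop one reference. Returns 1 if this was the last one and the
 * buffer was freed, 0 otherwise. */
int rc_buf_unref(rc_buf_t *b)
{
    assert(b != NULL && b->refs > 0);
    if (--b->refs > 0)
        return 0;
    free(b->data);
    free(b);
    return 1;
}
```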
On Linux, the type of an iovec is:
typedef struct iovec ce_iovec_t ;
typedef ce_iovec_t *ce_iovec_array_t;
To get better control of the iovec memory system, the alloc and
free functions can be set by the user. The definitions are in
lib/mm.h and lib/mm_basic.h.
These define the types of alloc and free functions.
typedef void* (*mm_alloc_t)(int);
typedef void (*mm_free_t)(char*);
The actual functions called to free and allocate iovecs.
mm_alloc_t mm_alloc_fun;
mm_free_t mm_free_fun;
Use these functions to set alloc and free. Be careful to
do this exactly once at application initialization, before
starting Ensemble.
void set_alloc_fun(mm_alloc_t f);
void set_free_fun(mm_free_t f);
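A custom allocator pair only has to match the two function types above. The sketch below wraps malloc and free with a live-block counter, which can help when hunting for leaks; counting_alloc, counting_free and counting_live_blocks are hypothetical names, and the two typedefs are repeated locally so the example stands alone.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Repeated from the definitions above so the sketch is self-contained. */
typedef void* (*mm_alloc_t)(int);
typedef void (*mm_free_t)(char*);

static long live_blocks = 0;   /* blocks allocated and not yet freed */

/* Allocation function with the mm_alloc_t signature. */
void *counting_alloc(int size)
{
    void *p;

    assert(size > 0);
    p = malloc((size_t)size);
    if (p != NULL)
        live_blocks++;
    return p;
}

/* Release function with the mm_free_t signature. */
void counting_free(char *p)
{
    if (p != NULL) {
        live_blocks--;
        free(p);
    }
}

long counting_live_blocks(void) { return live_blocks; }

/* These assignments compile only if the signatures match exactly. */
mm_alloc_t sketch_alloc_fun = counting_alloc;
mm_free_t  sketch_free_fun  = counting_free;
```

Such a pair would be registered once, before Ensemble is started, through the set functions described above.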
The upshot of this is that when a user sends or casts a message,
Ensemble takes over the message body. When a message is
delivered to the application, the user may copy it, or perform any
read-only operation while in the receive callback. The application may
not modify a received iovec, or assume it owns it.
To use the CE library as a win32 DLL the user must set the alloc
and free functions. This makes the CE library use the application's
allocation and deallocation functions. The simplest choices here are
the standard LIBC malloc and free functions.
7.4 The flat interface
Using iovecs is a little complex for simple applications,
therefore, a simplified ``flat'' interface is also provided.
The flat_receive callbacks take a C memory chunk, with its length, as
arguments. This releases the application from merging together the
set of buffers that make up an iovec-array, as well as from releasing
that array.
typedef void (*ce_appl_flat_receive_cast_t)(ce_env_t, ce_rank_t, ce_len_t, ce_data_t) ;
typedef void (*ce_appl_flat_receive_send_t)(ce_env_t, ce_rank_t, ce_len_t, ce_data_t) ;
Create a standard application interface using flat receive callbacks.
ce_appl_intf_t*
ce_create_flat_intf(
ce_env_t env,
ce_appl_exit_t exit,
ce_appl_install_t install,
ce_appl_flow_block_t flow_block,
ce_appl_block_t block,
ce_appl_flat_receive_cast_t cast,
ce_appl_flat_receive_send_t send,
ce_appl_heartbeat_t heartbeat
);
Cast and Send operations that work with buffers instead of iovec-arrays.
void ce_flat_Cast(
ce_appl_intf_t *c_appl,
ce_len_t len,
ce_data_t buf
) ;
void ce_flat_Send(
ce_appl_intf_t *c_appl,
int num_dests,
ce_rank_array_t dests,
ce_len_t len,
ce_data_t buf
) ;
void ce_flat_Send1(
ce_appl_intf_t *c_appl,
ce_rank_t dest,
ce_len_t len,
ce_data_t buf
) ;
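For comparison, this is roughly the merging work that the flat interface spares the application: copying every buffer of an iovec array into one contiguous chunk. The sketch is illustrative and flatten_iovl is a hypothetical helper, not a CE function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <sys/uio.h>

/* Hypothetical helper: copy all buffers of an iovec array into a single
 * malloc'ed chunk. Stores the total length in *out_len and returns the
 * chunk (to be released with free), or NULL if allocation fails. */
char *flatten_iovl(const struct iovec *iovl, int num, size_t *out_len)
{
    size_t total = 0, off = 0;
    char *buf;
    int i;

    assert(iovl != NULL && num >= 0 && out_len != NULL);
    for (i = 0; i < num; i++)
        total += iovl[i].iov_len;

    buf = malloc(total > 0 ? total : 1);   /* never request zero bytes */
    if (buf == NULL)
        return NULL;

    for (i = 0; i < num; i++) {
        memcpy(buf + off, iovl[i].iov_base, iovl[i].iov_len);
        off += iovl[i].iov_len;
    }
    *out_len = total;
    return buf;
}
```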
7.5 An example
This section shows how to use the CE interface to write applications.
We walk through the ce/ce_mtalk.c demo program.
ce/ce_mtalk.c, similarly to demo/mtalk.ml,
is a multi-person talk program. Messages are read from the user via
stdin, and multicast to the network.
state_t is the state structure used by the program. It is the
environment variable registered in the C-interface. The state contains
the current view information, a pointer to its cappl, and a flag
indicating if we are blocked.
typedef struct state_t {
int rank;
int nmembers;
ce_endpt_t endpt;
ce_appl_intf_t *intf ;
int blocked;
} state_t;
A helper function to multicast a message if we are not blocked.
We use the flat interface, to save the messy handling of iovecs.
void cast(state_t *s, char *msg){
if (s->blocked == 0)
ce_flat_Cast(s->intf, strlen(msg)+1, msg);
}
A handler for stdin. This callback is called whenever there is input
on the socket. The handler multicasts any message the user types on the
screen. Be careful not to send messages if we are blocked.
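void
stdin_handler(void *env)
{
state_t *s = (state_t*)env;
char buf[100], *msg;
int len ;
TRACE("stdin_handler");
/* On end-of-file or a read error there is nothing to send. */
if (fgets(buf, 100, stdin) == NULL)
return;
len = strlen(buf);
if (len>=100)
/* string too long, dumping it.
*/
return;
/* Copy the line to the heap: Ensemble takes over the message body. */
msg = (char*) malloc(len+1);
if (msg == NULL)
return;
memcpy(msg, buf, len);
msg[len] = 0;
cast(s, msg);
}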
There is nothing special to do if we leave the group; the application
essentially halts.
void main_exit(void *env) {}
When a new view arrives, update the environment structure. The view
structures are owned by the CE library and may not be freed nor taken. To
retain knowledge of the view, the rank, number of members, and
endpoint name are copied locally.
void main_install(void *env, ce_local_state_t *ls, ce_view_state_t *vs)
{
state_t *s = (state_t*) env;
s->rank = ls->rank;
s->nmembers = ls->nmembers;
s->blocked =0;
memcpy(s->endpt.name, ls->endpt.name, CE_ENDPT_MAX_SIZE);
printf("%s nmembers=%d\n", ls->endpt.name, ls->nmembers);
}
Ignore flow control problems. We are not supposed to have any of
these, since we are very low bandwidth.
void main_flow_block(void *env, ce_rank_t rank, ce_bool_t onoff) {}
Mark our blocked flag.
void main_block(void *env) {
state_t *s = (state_t*) env;
s->blocked=1;
}
Print out any message that we receive. Be careful not to free the
received message.
void main_recv_cast(void *env, int rank, ce_len_t len, char *msg) {
state_t *s = (state_t*) env;
printf("recv_cast <- %d msg=%s\n", rank, msg);
}
Ignore point-to-point messages; we are not supposed to get any of these.
void main_recv_send(void *env, int rank, ce_len_t len, char *msg) {
}
Ignore heartbeats.
void main_heartbeat(void *env, double time) { }
Create a join options structure, and join the group ``ce_mtalk''.
Use a regular virtually-synchronous stack. Put a handler on
stdin such that whenever there is input, it will be called.
The join options structure can be allocated on the stack since it is
copied internally by CE.
There is no need to set the transport in the join-options structure,
the system uses the environment variable ENS_MODES in this case.
void join() {
ce_jops_t jops;
ce_appl_intf_t *main_intf;
state_t *s;
/* The rest of the fields should be zero. The
* conversion code should be able to handle this.
*/
memset(&jops, 0, sizeof(ce_jops_t));
jops.hrtbt_rate=10.0;
strcpy(jops.group_name, "ce_mtalk");
strcpy(jops.properties, CE_DEFAULT_PROPERTIES);
jops.use_properties = 1;
s = (state_t*) malloc(sizeof(state_t));
memset(s, 0, sizeof(state_t));
main_intf = ce_create_flat_intf(s,
main_exit, main_install, main_flow_block,
main_block, main_recv_cast, main_recv_send,
main_heartbeat);
s->intf= main_intf;
ce_Join (&jops, main_intf);
ce_AddSockRecv(0, stdin_handler, s);
}
The main entry point: initialize the ML side, process command line
arguments, join the ce_mtalk group, and turn control over
to the Ensemble event loop.
int main(int argc, char **argv) {
ce_set_alloc_fun((mm_alloc_t)malloc);
ce_set_free_fun((mm_free_t)free);
ce_Init(argc, argv); /* Call Arge.parse, and appl_process_args */
join();
ce_Main_loop ();
return 0;
}
7.6 Outboard mode
It is possible to run any CE application through a remote Ensemble
server. Such a configuration is called an ``outboard'' configuration.
The idea is to run a daemon on the local host that listens to
TCP connections on a specific port; the daemon provides Ensemble
services to connected clients. Such services include joining/leaving groups,
and sending/receiving multicast and point-to-point
messages on these groups.
A CE application can be configured to run in outboard mode by linking
with the libceo library (suffix .a on Unix, .lib
on WIN32). The user must then make sure that the Ensemble daemon is
running; to start it, simply run the ce_outboard executable.
Using a daemon configuration has several benefits as well as some
drawbacks. The advantages are:
- The library to link with is orders
of magnitude smaller than the full (inboard) Ensemble library.
- The user-process is completely separated from the Ensemble
server. This allows better debugging, and also facilitates writing simple
interfaces to other languages (e.g., Java, Ada, ...).
The disadvantage is performance loss. Each message now has to travel
through a socket and another process before being sent on the network;
vice-versa for received messages. This may outweigh the benefits of
simple client code, and a minimal sized library.
The current port used by the outboard mode is 5002. This is
configurable by running ce_outboard with the command line
argument -tcp_port <port_num>, and modifying the
OUTBOARD_TCP_PORT parameter in ce/ce_outboard_comm.h.
Care was taken to optimize memory consumption. Messages are sent
zero-copy from the client, and they are copied once only into the
server's buffers. A sent io-vector is consumed by the send
function. Received messages are allocated at the client's buffers and
handed to the application. After the application's receive callback,
io-vectors are released. It would have been possible at this point to
allow the application to take control of the io-vector, yet we chose to
conform with the memory conventions of the inboard mode.
7.7 Thread-safety
A thread-safe version of the library is also provided; it exports the
exact same interface as the basic library. To use it, link with
libce_mt.so, or libceo_mt.so. On WIN32 systems, link with the
corresponding .lib files instead. The thread-safe library requires the application
to synchronize its threads so they will not perform actions (send,
cast, prompt, etc.) on a group while it is stabilizing. There are
several thread-safe applications under the ce directory:
ce_rand_mt.c, ce_perf_mt.c, and ce_mtalk_mt.c. These applications
use a lock to ensure that sensitive group-state is accessed safely.
Threads atomically check group-state before performing an Ensemble
action.
The thread-safe library is designed as a wrapper around the basic
library. A single thread runs both Ensemble main-loop and application
callback handlers; this thread is known as the Ensemble
thread. Other threads are referred to as user-threads. When a
user-thread performs an action outside of a handler, the action is
stored in a pending queue. A byte is sent through a socket to the
Ensemble thread, notifying it that there is pending work to do.
Asynchronously, the Ensemble thread ``wakes up'', consumes the queue,
and performs all pending actions. Any actions invoked in the interim
will also be stored in the pending queue; to be consumed along with
the rest.
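The queue-plus-wake-up-byte pattern described above can be modelled in isolation. The sketch below is a generic illustration and not the CE implementation: pending_t and its functions are hypothetical, a pipe stands in for the socket, and a POSIX mutex protects the queue. The wake-up byte is written while the lock is held, so the number of unread bytes always equals the number of queued actions.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <unistd.h>

#define QUEUE_CAP 64

typedef void (*action_fn)(void *arg);
typedef struct action_t { action_fn fn; void *arg; } action_t;

/* Hypothetical pending-action queue shared by user threads and the
 * main-loop thread. */
typedef struct pending_t {
    pthread_mutex_t lock;
    action_t items[QUEUE_CAP];
    int n;
    int wake_rd, wake_wr;   /* the main loop would select() on wake_rd */
} pending_t;

int pending_init(pending_t *q)
{
    int fds[2];

    assert(q != NULL);
    if (pipe(fds) != 0)
        return -1;
    q->wake_rd = fds[0];
    q->wake_wr = fds[1];
    q->n = 0;
    return pthread_mutex_init(&q->lock, NULL) == 0 ? 0 : -1;
}

/* Called by a user thread: enqueue the action and send one wake-up byte. */
int pending_post(pending_t *q, action_fn fn, void *arg)
{
    int rc = -1;
    char byte = 1;

    pthread_mutex_lock(&q->lock);
    if (q->n < QUEUE_CAP && write(q->wake_wr, &byte, 1) == 1) {
        q->items[q->n].fn = fn;
        q->items[q->n].arg = arg;
        q->n++;
        rc = 0;
    }
    pthread_mutex_unlock(&q->lock);
    return rc;
}

/* Called by the main-loop thread when wake_rd is readable: take the whole
 * queue, consume the matching wake-up bytes, then run the actions outside
 * the lock. Returns the number of actions performed, or -1 on error. */
int pending_drain(pending_t *q)
{
    action_t batch[QUEUE_CAP];
    char bytes[QUEUE_CAP];
    int i, n, ok = 1;

    pthread_mutex_lock(&q->lock);
    n = q->n;
    for (i = 0; i < n; i++)
        batch[i] = q->items[i];
    q->n = 0;
    if (n > 0 && read(q->wake_rd, bytes, (size_t)n) != n)
        ok = 0;
    pthread_mutex_unlock(&q->lock);
    if (!ok)
        return -1;

    for (i = 0; i < n; i++)
        batch[i].fn(batch[i].arg);
    return n;
}

/* Example action: increments the integer its argument points to. */
void bump(void *arg) { ++*(int *)arg; }
```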
Any action invoked from within a callback is performed directly when
the callback is completed and control returns to Ensemble.
Since a single thread performs the Ensemble main-loop as well as all user
callbacks, callbacks must be short. Long-term computations should
not be performed in the context of a callback.
There are three sensitive periods in which issuing Ensemble actions is
not allowed: while joining, leaving, and
blocking. A group is in:
- joining state: between ce_Join and
the first install callback.
- leaving state: between ce_Leave and the exit
callback.
- blocking state: between the block callback and the
succeeding install callback.
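These three states can be tracked with three flags, updated at the points listed above. The following is a minimal sketch with hypothetical names (group_status_t and the status_* functions); in a multi-threaded program every call would additionally be made with the application's lock held.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record of the three sensitive states. */
typedef struct group_status_t {
    int joining;   /* set at ce_Join, cleared by the first install */
    int leaving;   /* set at ce_Leave, stays set until exit */
    int blocked;   /* set by block, cleared by the next install */
} group_status_t;

/* Call just before ce_Join. */
void status_on_join(group_status_t *g)
{
    assert(g != NULL);
    g->joining = 1;
    g->leaving = 0;
    g->blocked = 0;
}

/* Call from the install callback. */
void status_on_install(group_status_t *g)
{
    g->joining = 0;
    g->blocked = 0;
}

/* Call from the block callback. */
void status_on_block(group_status_t *g) { g->blocked = 1; }

/* Call just before ce_Leave. */
void status_on_leave(group_status_t *g) { g->leaving = 1; }

/* Returns 1 if Ensemble actions may be issued right now. */
int status_may_act(const group_status_t *g)
{
    return !(g->joining || g->leaving || g->blocked);
}
```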
An example of a simple multi-threaded application is provided in
ce/ce_mtalk_mt.c.
The overhead of adding thread-safety is 10% in the worst case, and
normally much less than that. This should be acceptable for most
applications.
7.8 A multi-threaded multi-person chat program
This program is a multi-threaded version of ce_mtalk.c.
Here, we walk through it and explain the interface and how to
use it.
Include the system-independent thread header file, so we'll be
able to use locks.
#include "ce_trace.h"
#include "ce.h"
#include "ce_threads.h"
#include <stdio.h>
#include <memory.h>
#include <malloc.h>
The NAME variable is used for internal tracing purposes of
CE. There is no need to set it for standard user programs.
#define NAME "CE_MTALK_MT"
Apart from the standard view state, the state structure keeps track
of the current status of the group: blocked, joining, or leaving.
typedef struct state_t {
int rank;
int nmembers;
ce_endpt_t endpt;
ce_appl_intf_t *intf ;
int blocked;
int joining;
int leaving;
ce_lck_t *mutex;
} state_t;
Although we must define these callbacks, they do nothing in this
program.
void main_exit(void *env)
{}
void
main_flow_block(void *env, ce_rank_t rank, ce_bool_t onoff)
{}
void
main_recv_send(void *env, int rank, ce_len_t len, char *msg)
{}
void
main_heartbeat(void *env, double time)
{}
main_install updates the view state. A lock must be taken to
protect view state, as other threads may concurrently read the state.
void
main_install(void *env, ce_local_state_t *ls, ce_view_state_t *vs)
{
state_t *s = (state_t*) env;
ce_lck_Lock(s->mutex); {
s->rank = ls->rank;
s->nmembers = ls->nmembers;
s->blocked =0;
memcpy(s->endpt.name, ls->endpt.name, CE_ENDPT_MAX_SIZE);
s->joining =0;
printf("%s nmembers=%d\n", ls->endpt.name, ls->nmembers);
TRACE2("main_install",ls->endpt.name);
} ce_lck_Unlock(s->mutex);
}
The group is blocked, lock the state structure, and update the blocked
flag. This notifies other threads not to attempt sending messages
until the upcoming install callback. A lock must be taken to protect view
state, as other threads may read it.
void
main_block(void *env)
{
state_t *s = (state_t*) env;
ce_lck_Lock(s->mutex); {
s->blocked=1;
} ce_lck_Unlock(s->mutex);
}
Received a message, print who sent it and its content.
void
main_recv_cast(void *env, int rank, ce_len_t len, char *msg)
{
printf("%d -> msg=%s\n", rank, msg); fflush(stdout);
}
get_input is a non-terminating function performed by the user-thread of
this program. In an infinite loop, read a line from stdin,
and multicast it to the group. Prior to sending, check that the group is not
blocked/joining/leaving. Status flags are shared information, and
may be updated concurrently by an install or block
callback. Hence, a lock is taken to protect access to the flags.
void
get_input(void *env)
{
state_t *s = (state_t*)env;
char buf[100], *msg;
int len ;
while (1) {
TRACE("stdin_handler");
if (fgets(buf, 100, stdin) == NULL)
/* End-of-file or read error: stop reading input. */
return;
len = strlen(buf);
if (len>=100)
/* string too long, dumping it.
*/
return;
msg = ce_copy_string(buf);
TRACE2("Read: ", msg);
ce_lck_Lock(s->mutex); {
if (s->joining || s->leaving || s->blocked)
printf("Cannot send while group is joining/leaving/blocked\n");
else {
ce_flat_Cast(s->intf, strlen(msg)+1, msg);
}
} ce_lck_Unlock(s->mutex);
}
}
Initialize the state structure, and join the ``ce_mtalk'' Ensemble group.
Take care to initialize the lock, and set the joining flag. The flag
will be unset, allowing sending messages, in the first install callback.
state_t *
join(void)
{
ce_jops_t jops;
ce_appl_intf_t *main_intf;
state_t *s;
/* The rest of the fields should be zero. The
* conversion code should be able to handle this.
*/
memset(&jops, 0, sizeof(ce_jops_t));
jops.hrtbt_rate=3.0;
strcpy(jops.group_name, "ce_mtalk_mt");
strcpy(jops.properties, CE_DEFAULT_PROPERTIES);
jops.use_properties = 1;
s = (state_t*) malloc(sizeof(state_t));
memset(s, 0, sizeof(state_t));
main_intf = ce_create_flat_intf(s,
main_exit, main_install, main_flow_block,
main_block, main_recv_cast, main_recv_send,
main_heartbeat);
s->intf= main_intf;
s->mutex = ce_lck_Create();
s->joining = 1;
ce_Join (&jops, main_intf);
return s;
}
Initialize Ensemble, start the reader thread, and go to sleep.
int
main(int argc, char **argv)
{
state_t *s;
ce_set_alloc_fun((mm_alloc_t)malloc);
ce_set_free_fun((mm_free_t)free);
ce_Init(argc, argv); /* Call Arge.parse, and appl_process_args */
/* Join the group
*/
s = join();
/* Create a thread to read input from the user.
*/
ce_thread_Create(get_input, s, 10000);
ce_Main_loop ();
return 0;
}
7.9 The Join Options structure
The ce_jops_t structure contains all the options an application
joining Ensemble can request for the created endpoint. To simplify C
memory management, all string arguments use fixed-size char arrays.
Each should look like a C-string: an initial run of readable ASCII
characters followed by zeros. There has to be a terminating NUL
character. Boolean arguments use zero for false and one for true.
typedef struct ce_jops_t {
ce_time_t hrtbt_rate ;
char transports[CE_TRANSPORT_MAX_SIZE] ;
char protocol[CE_PROTOCOL_MAX_SIZE] ;
char group_name[CE_GROUP_NAME_MAX_SIZE] ;
char properties[CE_PROPERTIES_MAX_SIZE] ;
ce_bool_t use_properties ;
ce_bool_t groupd ;
char params[CE_PARAMS_MAX_SIZE] ;
ce_bool_t client;
ce_bool_t debug ;
ce_endpt_t endpt;
char princ[CE_PRINCIPAL_MAX_SIZE] ;
char key[CE_KEY_SIZE] ;
ce_bool_t secure ;
} ce_jops_t ;
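Because every string field is a fixed-size array, a plain strcpy can overrun it when the source is too long. One possible guard is sketched below; set_field is a hypothetical helper, not part of CE. It also zero-fills the remainder of the field, matching the "characters followed by zeros" layout described above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy `src` into the fixed-size field `dst` of
 * `cap` bytes. Returns 0 on success, or -1 (leaving `dst` untouched) if
 * `src` plus its terminating zero does not fit. */
int set_field(char *dst, size_t cap, const char *src)
{
    size_t len;

    assert(dst != NULL && src != NULL && cap > 0);
    len = strlen(src);
    if (len + 1 > cap)
        return -1;
    memset(dst, 0, cap);      /* zero-fill, so the tail is all zeros */
    memcpy(dst, src, len);
    return 0;
}
```

A call might look like set_field(jops.group_name, sizeof jops.group_name, "ce_mtalk").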
- hrtbt_rate: The rate of heartbeat callbacks. A reasonable
setting would be seconds, or hundreds of milliseconds.
- transports: Which transports the endpoint should
use. For example: ``DEERING'', or ``UDP:TCP''.
- protocol: Which protocol to use. This allows setting by hand
the stack (set of layers) used by the endpoint. Not for casual use.
- group_name: What is the name of the group to join.
- properties: What is the set of properties the endpoint stack
should adhere to. This is the preferred way of creating endpoints;
set the use_properties flag to 1 in this case. The default
setting is "Gmp:Sync:Heal:Switch:Frag:Suspect:Flow:Slander". This
includes Group membership (Gmp:Heal).
- groupd: Should the endpoint use an external groupd membership
manager?
- debug: Set this flag to use a debugging stack. This
considerably degrades performance. This is only useful if there is a
possibility that the stack itself has a bug.
- endpt:
Normally, Ensemble generates a unique endpoint name for each
group an application joins (this is what happens if you leave
'endpt' unmodified). The application can optionally provide
its own endpoint name. It can, for instance, reuse an
endpoint name generated by Ensemble for another group (the
same endpoint name can be used to join any number of groups).
The application can even generate an endpoint on its own.
Such names should be unique. It is best if they contain only
printable characters and do not contain spaces because
Ensemble may print them out in debugging or error messages.
(The names generated by Ensemble fit these characteristics.)
See ensemble/type/endpt.mli for more information.
- princ: The principal name of this endpoint. Used by the
security code.
- key: The encryption and MAC keys used by Ensemble. If the stack
is secure (authenticated and encrypted), then it uses two symmetric
keys: one for encryptions and one for MACing messages. The key
field allows the user to set these initial keys. The key is of size
(exactly) 32 bytes: the first 16 bytes are used as the initial encryption
key, and the last 16 bytes are used as the initial MAC key.
- secure: Should we use a secure stack. A secure stack is one
that authenticates endpoints and MACs and encrypts messages. The
default encryption mechanism is RC4, the default MAC algorithm is
keyed MD5.
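The layout of the key field can be illustrated with a short sketch that splits a 32-byte key into its two halves. The names split_key, SKETCH_KEY_SIZE and SKETCH_HALF_SIZE are hypothetical; in real code the total size would be CE_KEY_SIZE, assumed here to be 32 as stated above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_KEY_SIZE  32   /* assumed equal to CE_KEY_SIZE */
#define SKETCH_HALF_SIZE 16

/* Hypothetical helper: split a 32-byte key into the initial encryption
 * key (first 16 bytes) and the initial MAC key (last 16 bytes). */
void split_key(const unsigned char *key,
               unsigned char *enc_key,
               unsigned char *mac_key)
{
    assert(key != NULL && enc_key != NULL && mac_key != NULL);
    memcpy(enc_key, key, SKETCH_HALF_SIZE);
    memcpy(mac_key, key + SKETCH_HALF_SIZE, SKETCH_HALF_SIZE);
}
```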
7.10 The view structure
The view is split into pieces: the local view, describing the local
state of the endpoint, and the group view, describing the global state
of the group.
The local view is defined as:
typedef struct ce_local_state_t {
ce_endpt_t endpt ;
ce_addr_t addr ;
ce_rank_t rank ;
char name[CE_NAME_MAX_SIZE];
int nmembers ;
ce_view_id_t view_id ;
ce_bool_t am_coord ;
} ce_local_state_t ;
- endpt: The endpoint name.
- addr: The address of this endpoint. This is of form ``Udp(128.25.4.98)''.
- rank: The rank of the member in the group. Member ranks are
between 0 and the (number of group members)-1. Ranks are
decided internally by Ensemble, and each member has a unique rank within
a group.
- name: The full name of the endpoint: includes the rank,
endpoint name and the logical time (internal to Ensemble).
- nmembers: The number of members in the group.
- view_id: A unique id for the group, provided by Ensemble.
- am_coord: Is this endpoint the group coordinator?
The group view is defined as:
typedef struct ce_view_state_t {
char version[CE_VERSION_MAX_SIZE] ;
char group[CE_GROUP_NAME_MAX_SIZE] ;
char proto[CE_PROTOCOL_MAX_SIZE] ;
ce_rank_t coord ;
int ltime ;
ce_bool_t primary ;
ce_bool_t groupd ;
ce_bool_t xfer_view ;
char key[CE_KEY_SIZE] ;
int num_ids ;
ce_view_id_t *prev_ids ;
char params[CE_PARAMS_MAX_SIZE];
ce_time_t uptime ;
ce_endpt_t *view ;
ce_addr_t *address ;
} ce_view_state_t ;
- version: The distribution version of this Ensemble library.
- group: The group name.
- proto: The protocol stack used.
- coord: What is the rank of the group coordinator. Currently,
this is always member 0. We do not expect this to change in the
future.
- ltime: The logical time of the view. Ensemble maintains this
counter for virtual-synchrony purposes.
- primary: Is this a primary partition? This flag matters only if
the stack includes a primary partition layer.
- groupd: Are we using the group-daemon?
- xfer_view: Is this view a state-transfer view? This flag
matters only if the stack includes a state-transfer layer (Xfer).
- key: What is the security key. This field is non-zero only
when the stack is a secure stack.
- num_ids: The number of previous views that merged together to
create the current view.
- prev_ids: An array of view ids, one for each of the views that
merged to create the current view.
- params: The parameters provided for this endpoint.
- uptime: The amount of time this endpoint has been running.
- view: An array of size nmembers of the endpoint ids in the group.
- address: An array of size nmembers of the addresses of
endpoints in the group.
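A common use of the view array is to look up a member by its endpoint name. The sketch below uses a stand-in type, sketch_endpt_t, modelled on the name field that the example programs access through endpt.name; the size constant is a placeholder for CE_ENDPT_MAX_SIZE. It also assumes that the index of an entry in the view array is that member's rank, which this section does not state explicitly.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_ENDPT_MAX_SIZE 64   /* placeholder for CE_ENDPT_MAX_SIZE */

/* Stand-in for ce_endpt_t: only the name field is modelled. */
typedef struct sketch_endpt_t {
    char name[SKETCH_ENDPT_MAX_SIZE];
} sketch_endpt_t;

/* Hypothetical helper: return the index of the endpoint called `name`
 * in a view of `nmembers` entries, or -1 if it is not a member. */
int find_member(const sketch_endpt_t *view, int nmembers, const char *name)
{
    int i;

    assert(view != NULL && name != NULL);
    for (i = 0; i < nmembers; i++) {
        if (strncmp(view[i].name, name, SKETCH_ENDPT_MAX_SIZE) == 0)
            return i;
    }
    return -1;
}
```

Since the view structures are owned by CE, any result needed after the install callback returns should be copied rather than kept as a pointer into the view.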
7.11 Notes
Of the four transports supported by Ensemble (NETSIM, UDP, TCP, and
DEERING), NETSIM is not supported by the thread-safe library: a socket
is used internally, and NETSIM does not allow any
external communication.
To maintain compatibility with the non-thread-safe version,
ce_Main_loop in the thread-safe library sends the calling thread to
sleep forever: it creates a semaphore and sleeps on it. The actual
Ensemble thread is spawned by the ce_Init call. An application that
wants to run the Ensemble main loop in a separate thread needs only to
call ce_Init, without calling ce_Main_loop.