Performance assessment

This page describes the performance of Ensemble, including its security subsystem. Testing was carried out in 2000; since then, hardware (CPU, memory) and software (Linux, Solaris, and win32 kernels) have improved substantially. Nevertheless, these measurements provide some insight into the system's capabilities.

Our test bed is a set of 20 Pentium III 500MHz machines running Linux 2.2, connected by a switched 10Mbit/sec Ethernet. Machines were lightly loaded during testing, and the network was, for the most part, clear of other traffic. We measured throughput and latency. Each measurement was taken three times: first with a standard, insecure stack (REG), second with an authenticated stack (AUTH), and third with an authenticated, encrypted stack (SECURE). The encryption used was the default, RC4.
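To make the three configurations concrete, the following is a minimal Python sketch of what each one does to an outgoing message. It is an illustration only and not Ensemble's code: the function names (send, receive), the key parameters, the use of HMAC-MD5 as the MAC, and the encrypt-then-MAC ordering are all assumptions made for this example. The text above states only that RC4 is the default cipher.

```python
import hashlib
import hmac

STACKS = ("REG", "AUTH", "SECURE")
TAG_LEN = 16  # size of an MD5-based tag in bytes


def rc4(key: bytes, data: bytes) -> bytes:
    """Textbook RC4; the same call encrypts and decrypts."""
    state = list(range(256))
    j = 0
    for i in range(256):  # key scheduling
        j = (j + state[i] + key[i % len(key)]) % 256
        state[i], state[j] = state[j], state[i]
    i = j = 0
    out = bytearray()
    for byte in data:  # keystream generation and XOR
        i = (i + 1) % 256
        j = (j + state[i]) % 256
        state[i], state[j] = state[j], state[i]
        out.append(byte ^ state[(state[i] + state[j]) % 256])
    return bytes(out)


def send(stack: str, payload: bytes, mac_key: bytes, enc_key: bytes) -> bytes:
    """Hypothetical helper: build the wire form for the chosen stack."""
    if stack not in STACKS:
        raise ValueError(f"unknown stack: {stack}")
    if stack == "REG":
        return payload  # no protection at all
    body = rc4(enc_key, payload) if stack == "SECURE" else payload
    tag = hmac.new(mac_key, body, hashlib.md5).digest()  # assumed MAC
    return body + tag


def receive(stack: str, wire: bytes, mac_key: bytes, enc_key: bytes) -> bytes:
    """Hypothetical helper: verify and unwrap a message built by send()."""
    if stack not in STACKS:
        raise ValueError(f"unknown stack: {stack}")
    if stack == "REG":
        return wire
    body, tag = wire[:-TAG_LEN], wire[-TAG_LEN:]
    expected = hmac.new(mac_key, body, hashlib.md5).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("MAC check failed")
    return rc4(enc_key, body) if stack == "SECURE" else body
```

In this sketch AUTH adds one hash pass over the message, while SECURE adds both a cipher pass and a hash pass, so its cost grows faster with message size.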



Figure 1: A standard stack is denoted REG. An authenticated stack is denoted AUTH. An authenticated encrypted stack is denoted SECURE. (1) send-recv latency in the Ensemble stack. (2) Total Latency for point-to-point send/recv. (3) Throughput.


Figure 1(1) shows the latency of a send/recv operation inside the stack. This is a "ping" test in which a message is received and an immediate response is sent back to the origin. We measure the amount of time spent inside the Ensemble stack. As we can see, the regular and authenticated stacks are quite close, meaning that the computational overhead of an MD5 hash over a message is not significant. The encrypted stack, on the other hand, is relatively expensive. As message size grows, so does the computation required; for a 900-byte message the latency is 140 microseconds. Note that the baseline is a low and constant 24 microseconds. Hence, the basic overhead imposed by the system is very low. Furthermore, the cost difference between the stacks is almost entirely due to the MAC and encryption algorithms, not to the layering structure.
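The two numbers quoted above allow a rough estimate of the per-byte cost of the security processing. The sketch below assumes that the 140-microsecond figure belongs to the encrypted stack, that the 24-microsecond baseline is what an unprotected message of the same size would cost, and that cost grows linearly with message size. None of these assumptions is stated explicitly in the text, and the function names are hypothetical.

```python
def per_byte_overhead_us(total_us: float, baseline_us: float, size_bytes: int) -> float:
    """Extra microseconds per byte attributable to MAC and encryption."""
    return (total_us - baseline_us) / size_bytes


def implied_rate_mbyte_per_sec(overhead_us_per_byte: float) -> float:
    """Bytes per microsecond equals Mbyte/sec (decimal units)."""
    return 1.0 / overhead_us_per_byte


# Figures quoted in the text for a 900-byte message.
overhead = per_byte_overhead_us(total_us=140.0, baseline_us=24.0, size_bytes=900)
rate = implied_rate_mbyte_per_sec(overhead)
print(f"{overhead:.3f} us/byte, about {rate:.1f} Mbyte/sec of security processing")
```

Under these assumptions the security processing alone runs several times faster than a 10Mbit/sec link can deliver data.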

Figure 1(2) shows the latency of a "request/response" scenario between two machines running Ensemble. The initiating machine sends a point-to-point message to the second machine, which sends back an immediate response. This scenario was repeated 1000 times for each of several message sizes. A comparison with the Unix ping utility is also included. As we can see, the difference between the three stacks is not very significant. Furthermore, the standard stack is fairly close to ping. We conclude that the latency is mostly due to the network and operating system.
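The shape of such a request/response measurement can be sketched in Python as follows. This is not the harness used for the figures and does not use Ensemble: it substitutes a local socket pair and an echo thread for the two machines, so it measures only local operating-system overhead. The function names are hypothetical.

```python
import socket
import threading
import time


def recv_exact(sock: socket.socket, size: int) -> bytes:
    """Read exactly `size` bytes from a stream socket."""
    buf = bytearray()
    while len(buf) < size:
        chunk = sock.recv(size - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the connection")
        buf.extend(chunk)
    return bytes(buf)


def echo_peer(sock: socket.socket, size: int, rounds: int) -> None:
    """Responder: send every request straight back."""
    for _ in range(rounds):
        sock.sendall(recv_exact(sock, size))


def measure_round_trips(size: int, rounds: int = 1000) -> list:
    """Return one round-trip time in seconds per request/response."""
    initiator, responder = socket.socketpair()
    worker = threading.Thread(
        target=echo_peer, args=(responder, size, rounds), daemon=True
    )
    worker.start()
    payload = bytes(size)
    samples = []
    try:
        for _ in range(rounds):
            start = time.perf_counter()
            initiator.sendall(payload)
            recv_exact(initiator, size)
            samples.append(time.perf_counter() - start)
    finally:
        initiator.close()
        worker.join()
        responder.close()
    return samples


if __name__ == "__main__":
    for size in (100, 500, 900):
        times = measure_round_trips(size)
        mean_us = 1e6 * sum(times) / len(times)
        print(f"{size:4d} bytes: mean round trip {mean_us:.1f} us")
```

Repeating the exchange many times per message size, as in the test described above, averages out scheduling noise in individual samples.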

Figure 1(3) shows the throughput achievable in a lightly used network. The maximal bandwidth of a 10Mbit/sec Ethernet is 1.2Mbyte/sec. Of this maximum, a single sender can achieve a throughput of 750Kbyte/sec in a 21-member group. We attribute the lost bandwidth to the strong guarantees provided: reliability requires retransmissions, sender-ordered multicast requires numbering messages, flow control requires a back-off protocol, and so on.
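The fraction of the link that the measured throughput represents follows from simple arithmetic. The sketch below assumes decimal units (1 Kbyte = 1000 bytes, 1 Mbyte = 1,000,000 bytes); the constant and function names are hypothetical.

```python
LINK_BITS_PER_SEC = 10_000_000          # 10Mbit/sec Ethernet
QUOTED_MAX_BYTES_PER_SEC = 1_200_000    # 1.2Mbyte/sec, as quoted in the text
MEASURED_BYTES_PER_SEC = 750_000        # 750Kbyte/sec, single sender


def utilization(measured: float, maximum: float) -> float:
    """Fraction of the maximum bandwidth actually achieved."""
    return measured / maximum


raw_max = LINK_BITS_PER_SEC / 8  # raw bit rate in bytes, before framing overhead
print(f"raw link rate: {raw_max:.0f} bytes/sec")
print(f"vs quoted maximum: {utilization(MEASURED_BYTES_PER_SEC, QUOTED_MAX_BYTES_PER_SEC):.1%}")
print(f"vs raw link rate:  {utilization(MEASURED_BYTES_PER_SEC, raw_max):.1%}")
```

The quoted 1.2Mbyte/sec is slightly below the raw bit rate divided by eight, which is consistent with allowing for Ethernet framing overhead.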

1   Rekeying performance

Ensemble has a substantial security sub-system, which leverages off-the-shelf security packages to create a secure environment for group members. Part of the security infrastructure is the rekeying facility that allows the application to ask Ensemble to switch encryption and MAC keys. Here we depict the performance of two of the algorithms employed by the system. For more details, see the papers section.

Figures 2 and 3 describe a test performed on our set of 20 machines. To create groups larger than 20, several processes were run on the same machine. A large number of member join/leave operations were performed, and rekey times were clocked. To simulate real conditions, we flushed the cache once every 30 rekey operations and discarded "cold-start" results.




Figure 2: Performance of the dLKH algorithm.





Figure 3: Performance of the Diamond algorithm.


These figures show rekey times, in milliseconds, as a function of group size. As we can see, the Diamond algorithm, which is specifically geared toward single-process failures, is much faster than the more general dLKH. However, dLKH will outperform Diamond in the event of multiple failures.

