The extensive use of layering raises important performance issues in Horus. On the one hand, layering improves performance, since applications can choose the minimal stack for their requirements. For example, an application can decide whether or not it needs end-to-end guarantees, and, if so, whether STABLE or PINWHEEL will be optimal. Also, because each layer is small and simple, it can easily and effectively be optimized individually. On the other hand, layering introduces overhead of its own: although the performance of Horus currently compares very favorably to other systems (see [15]), it could still be improved. The performance of the current system suffers for the following reasons:
We have no detailed overhead measurements, but can report that on a Sparc 10 the fragmentation/reassembly layer FRAG (which needs only one bit of header space) adds about 50 μsecs to the one-way latency, which is considerable. We believe we could bring this down somewhat by more careful coding, but we are working on more rigorous solutions to each of these problems.
For the first problem, we will avoid unnecessary invocations of a layer, skipping layers that take no action on the way down or up. We also envision that it will be possible to take common substacks of protocols, and (from the reference implementation) create one single production layer. Ideally, a compiler might implement optimizations such as these.
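The idea of skipping inactive layers can be illustrated with a minimal sketch. It is written in Python for brevity and does not reflect the actual Horus interfaces; the `Layer` and `Stack` classes, the handler convention (a handler of `None` means the layer takes no action in that direction), and the layer names TOP and STATS are all assumptions made for illustration. The stack computes the list of active handlers once, when it is built, so that sending or delivering a message never invokes a layer that would do nothing.

```python
class Layer:
    """Hypothetical layer: a handler of None means 'no action in this direction'."""

    def __init__(self, name, down=None, up=None):
        self.name = name
        self.down = down  # called on the way down (send path), or None
        self.up = up      # called on the way up (delivery path), or None


class Stack:
    """Precomputes, per direction, the handlers that actually do work."""

    def __init__(self, layers):
        # `layers` is ordered top first. Inactive layers are filtered out
        # once here, not tested for on every message.
        self.down_path = [l.down for l in layers if l.down is not None]
        self.up_path = [l.up for l in reversed(layers) if l.up is not None]

    def send(self, msg):
        for handler in self.down_path:
            msg = handler(msg)
        return msg

    def deliver(self, msg):
        for handler in self.up_path:
            msg = handler(msg)
        return msg


calls = []  # records which handlers ran, in order


def tag(label):
    """Build a handler that records its invocation and passes the message on."""
    def handler(msg):
        calls.append(label)
        return msg
    return handler


stack = Stack([
    Layer("TOP", down=tag("TOP.down"), up=tag("TOP.up")),
    Layer("FRAG", down=tag("FRAG.down"), up=tag("FRAG.up")),
    Layer("STATS", up=tag("STATS.up")),  # takes no action on the way down
])

stack.send(b"hello")
sent_calls = list(calls)
calls.clear()
stack.deliver(b"hello")
delivered_calls = list(calls)
```

In this sketch the STATS layer is never entered on the send path, while the delivery path still visits all three layers from the bottom up. Fusing a common substack into a single production layer would go one step further, replacing several entries of `down_path` with one handler.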
To address the second problem, we are eliminating intra-stack threading, having discovered that concurrency within a stack does not lead to significant gains. This way we can reduce the use of locks and the frequency of thread creation, except when entering a stack from the top or bottom. Since synchronization between stacks is seldom necessary, we can still run each stack within its own thread.
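The following sketch illustrates this threading model under simplifying assumptions. It is not Horus code: the `StackThread` class, its queue-based entry point, and the counter layer are hypothetical. Each stack owns exactly one thread; the only synchronization happens where a message enters the stack (here, a thread-safe queue), and the layers themselves keep their state without locks because no other thread ever runs them.

```python
import queue
import threading


class StackThread:
    """One thread per stack; layers run to completion without internal locks."""

    def __init__(self, layers):
        self.layers = layers        # plain callables, in processing order
        self.inbox = queue.Queue()  # the only synchronized entry point
        self.delivered = []         # touched only by the stack's own thread
        self.thread = threading.Thread(target=self._run, daemon=True)
        self.thread.start()

    def _run(self):
        while True:
            msg = self.inbox.get()
            if msg is None:         # sentinel: shut the stack down
                break
            for layer in self.layers:
                msg = layer(msg)    # no thread creation, no locking here
            self.delivered.append(msg)

    def submit(self, msg):
        """Enter the stack from outside; safe to call from any thread."""
        self.inbox.put(msg)

    def close(self):
        self.inbox.put(None)
        self.thread.join()


def make_counter():
    """A layer with unprotected state: safe because one thread runs the stack."""
    state = {"n": 0}

    def layer(msg):
        state["n"] += 1
        return (state["n"], msg)

    return layer


stack = StackThread([make_counter()])
for word in ["a", "b", "c"]:
    stack.submit(word)
stack.close()
result = stack.delivered
```

Because messages are taken from the queue in order and processed one at a time, the counter layer numbers them consecutively even though it uses no lock. Several such stacks can run side by side, each in its own thread.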
For the last problem, we are changing the protocol implementations. Instead of specifying the layout of its header, a protocol will specify the fields that it needs (in terms of size and alignment, both given in bits). When building a stack, Horus will precompute a single header in which the necessary fields are compacted. This should reduce wasted space in a message to a minimum, and eliminate the header push and pop operations currently used by most layers.
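A minimal sketch of such a header precomputation is given below. It is an illustration only, not the Horus algorithm: the `Field` record, the `layout_header` function, and the greedy placement strategy (most strictly aligned fields first) are assumptions, as are all field names and the NAK and MBRSHIP entries. Only the one-bit requirement of FRAG is taken from the text above. Each field is described by its size and alignment in bits, and the function returns a bit offset for every field together with the total header length.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Field:
    layer: str  # layer requesting the field (names here are illustrative)
    name: str   # hypothetical field name
    size: int   # size in bits
    align: int  # required alignment in bits (offset must be a multiple)


def layout_header(fields):
    """Assign a bit offset to every field in one shared, compacted header.

    Greedy strategy (an assumption of this sketch): place the most strictly
    aligned fields first so that small fields fill the tail without padding.
    """
    offsets = {}
    cursor = 0
    for f in sorted(fields, key=lambda f: (-f.align, -f.size)):
        start = (cursor + f.align - 1) // f.align * f.align  # round up
        offsets[(f.layer, f.name)] = start
        cursor = start + f.size
    return offsets, cursor


fields = [
    Field("FRAG", "more_fragments", size=1, align=1),
    Field("NAK", "seqno", size=32, align=32),
    Field("MBRSHIP", "view_id", size=16, align=8),
    Field("STABLE", "ack_flag", size=1, align=1),
]

offsets, total_bits = layout_header(fields)
total_bytes = (total_bits + 7) // 8
```

With these example fields the header occupies 50 bits, which rounds up to 7 bytes. If each of the four layers instead pushed its own header padded to a 32-bit word (an assumed baseline, chosen only for comparison), the same information would take 16 bytes. Since every offset is fixed when the stack is built, a layer can read and write its fields in place, with no push or pop on the message.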