Andre Allevena, University of Waterloo

The need for efficient computation of functions of global state
lies at the heart of a wide range of problems in distributed
systems. Examples include load balancing, leader election and barrier
synchronisation, distributed file system maintenance, sensor fusion,
coordinated intrusion detection, and top-K queries in stream-oriented
databases. More precisely, we wish to compute a function $g$ on the
values held by the nodes. In practice, computing such functions
involves in-network computation of partial functions, to avoid the
overhead of exchanging Ω(*n*) sized messages.

Epidemic (or gossip) algorithms are known as the most robust and efficient way to disseminate information in a network. However, they suffer from the "double counting" problem. Solutions to this problem either make gossip brittle or depend on variants of the Flajolet-Martin algorithm, which results in errors of up to 50%.

In this paper, we present the first efficient, accurate and robust
algorithm for in-network computation of functions of global state. We
provide a simple abstraction of a B-tree on which arbitrary functions
can be evaluated. The scheme is fast and efficient: it terminates in
O(ln *n*) communication rounds and every node sends most of its
messages to nearby nodes. Most useful functions like sum or approximate
histogram can be embedded in the tree, resulting in messages of size
O(ln *n*). For functions that need to be distorted to be embedded
into the tree, the price is not loss of precision but larger messages,
of size proportional to the degree of distortion. Our analysis includes
bounds on degree of load imbalance and tolerance of node failures.

Joint work with S. Keshav.