Sam Toueg

1998 - 1999 CS Annual Report Faculty

Professor
sam@cs.cornell.edu
http://www.cs.cornell.edu/home/sam/sam.html

PhD Princeton University, 1979

My research is in distributed computing. In particular, I work on methodologies, paradigms, and algorithms for highly available and secure distributed systems. My long-term goal is to help bridge the gap between theoretical results and the need for efficient and practical solutions.

My recent research, in collaboration with M. Aguilera and W. Chen, concerns the use of unreliable failure detectors in designing reliable distributed systems. We studied the problems of failure detection and consensus in asynchronous systems in which processes may crash and recover, and links may lose messages. We first proposed new failure detectors that are particularly well suited to the crash-recovery model. We next determined under what conditions stable storage is necessary to solve consensus in this model. Using the new failure detectors, we gave two consensus algorithms that match these conditions: one requires stable storage and the other does not. Both algorithms tolerate link failures and are particularly efficient in the runs that are most likely in practice, namely those with no failures and no failure detector mistakes.

University Activities

Director: Master of Engineering Program, Computer Science Department

Publications

Randomization and failure detection: A hybrid approach to solve Consensus. SIAM Journal on Computing 28, 3 (June 1999), 890-903 (with M. Aguilera).

Using the Heartbeat failure detector for quiescent reliable communication and consensus in partitionable networks. Invited paper in Theoretical Computer Science 220, 1 (June 1999), 3-30, special issue on Distributed Algorithms (with M. Aguilera and W. Chen).