Robbert van Renesse's Home Page

433 Gates Hall

Ithaca, NY 14853

Tel +1-607-255-1021

rvr@cs.cornell.edu

I am a Professor in the Department of Computer Science at Cornell University in Ithaca, NY. I'm a member of the Systems and Networking group. I'm interested in distributed systems, particularly in their fault tolerance and scalability aspects. I'm co-Editor-in-Chief for ACM Transactions on Computing Systems. I also play jazz guitar and banjo in various bands, including the Ageless Jazz Band, the The JazzHappensBand, and Soda Ash Six. I'm also the designer and webmaster of GigKeeper.com, a site that helps with the management side of playing in bands.

My Current Research

Fast and Scalable Replica Coordination

State machine replication (SMR) is an elegant and powerful abstraction to build reliable services. Unfortunately, it is expensive, an observation that has led to using specialized hardware or to avoiding SMR altogether, performing replica coordination instead at the application layer. We are developing a new SMR variant that pre-assigns sequence numbers to operations based on the observed rates at which they are submitted by different clients; thus, total order is achieved without explicit coordination in the client's critical path. Using a RDMA-based implementation, we are developing fast logs and distributed transactional key-value stores.

Joint work with Yu-Ju Huang, Shubham Chaudhary, Rafael Soares, Shir Cohen Gahtan, Lorenzo Alvisi, and Luis Rodrigues.

Metastable Failures

Metastable failures are a newly identified failure model in complex distributed systems. Sometimes, after some temporary event, the system as a whole performs poorly and does not recover to its original performance. A good example are retry storms: the overhead of recovery is so large that the system cannot recover. We are developing models that can describe metastable failures and hope to use those to build systems that are immune to them.

Joint work with Ali Farakbakhsh, Shir Cohen Gahtan, Lorenzo Alvisi, and Andreas Haeberlen.

Ordered Consensus with Equal Opportunity

The specification of state machine replication (SMR) has no requirement on the final total order of commands. In blockchains based on SMR, however, order matters, since different orders could provide their clients with different financial rewards. Ordered consensus augments the specification of SMR to include specific guarantees on such order, with a focus on limiting the influence of Byzantine nodes. Real-world ordering manipulations, however, can and do happen even without Byzantine replicas, typically because of factors, such as faster networks or closer proximity to the blockchain infrastructure, that give some clients an unfair advantage. To address this challenge, we extend ordered consensus by requiring it to also support equal opportunity, a concrete notion of fairness, widely adopted in social sciences. Informally, equal opportunity requires that two candidates who, according to a set of criteria deemed to be relevant, are equally qualified for a position (in our case, a specific slot in the SMR total order), should have an equal chance of landing it.

Joint work with Yunhao Zhang, Haobin Ni, Soumya Basu, Shir Cohen Gahtan, and Lorenzo Alvisi.

Programmable Memory Segments

The virtual memory of a process, originally intended to abstract physical memory on the machine where the process is running, has been used to abstract a wide variety of resources, including files, distributed shared memory, remote memory resources, persistent memory resources, and more. It is often a convenient interface, as processes can access the resources with CPU-supported load and store machine instructions, allowing applications to run unchanged on new resource types. Moreover, virtual memory can be leveraged to implement transparent checkpointing and recovery, process replication, speculative execution, and debugging tools. Unfortunately, as an abstraction, it is hard for system developers to offer it, as the abstraction is implemented in the kernel of an operating system and therefore hard to customize. The Linux virtual memory interfaces provide only limited customization. We are developing programmable memory segments, a virtual memory abstraction that can be efficiently offered by non-privileged services running in user space. Each programmable memory segment is managed by a segment process. As a result, each application runs as a collection of processes, one running the main logic of the application, and the others managing the various virtual memory segments of the application, each of which can be an important part of the application logic as well.

Joint work with Yacqub Mohamed.

Functional Reasoning for Distributed Systems

We are developing a framework for formal verification of asynchronous and Byzantine fault-tolerant distributed systems consisting of (1) a high-level, synchronous, data-parallel language, Sync, for describing protocols, (2) a denotational semantics that translates systems described in Sync into functions with non-deterministic outputs which can be used for compositional reasoning about safety properties, (3) a standard low-level, asynchronous language, Async, with a compiler that maps Sync programs to Async systems, and (4) an adequacy proof that shows safety properties proved for Sync programs are preserved when the program is compiled to Async. In effect, we alleviate the complexity of asynchronous interleaving by enforcing the statically known control flow of the Sync program. The remaining non-determinism, due to network effects such as message re-ordering, dropped messages, or Byzantine behaviors, is captured by the semantics of the Sync communication primitives.

Joint work with Haobin Ni and Greg Morrisett.

Chain Replication

Fred Schneider and I published the Chain Replication protocol in OSDI 2004. Since then it has been used at many companies and in many research projects. One downside of chain replication is that it it cannot stand on its own and requires a configuration service. It turns out that this is not fundamental: a chain replicated service can reconfigure itself, for example after a failure. The new protocol also addresses certain inefficiencies in the original protocol, such as latency increasing with chain length.

Joint work with Suraaj Kanniwadi Sureshkannan and Fred Schneider.

Educational Projects

Harmony: A Language and Model Checker for Concurrent and Distributed Algorithms

Harmony is a high-performance, easy-to-use model checker for concurrent and distributed algorithms. Harmony enables systems programmers to build and verify a prototype of their systems and transfer the results to production without expertise in specialized logic. It supports both exhaustive model checking of Harmony code and black-box fuzzing of native code written (or generated) in conventional languages. Harmony has a low learning curve -- not only is its surface syntax close to Python, but its virtual machine semantics resembles that of system programming languages. Model-checking is done at the level of virtual machine instructions with built-in specifications for common system properties such as deadlock freedom and data-race freedom. More complex properties can be handled through Harmony's built-in support for behavior checking. Harmony itself is concurrent and introduces novel optimizations, including anonymous threads, dead state elimination, and memoization of both execution and effects. It also supports modular verification that allows different components of a system to be model-checked independently.

Joint work with Haobin Ni.

EmbryOS Operating System

EmbryOS is a new educational operating RISC-V operating system. It is actually a sequence of 12 operating systems, starting from a simple operating system that can essentially just prints "Hello World" to a full-fledged demand-paged multicore OS. Each step in the sequence adds some new functionality in about 200 lines of C code.

EGOS: Earth and Grass Operating System

The Earth and Grass Operating System (EGOS) is a user mode operating system in that it runs inside an ordinary (Linux or Mac OS X) process. Instead of approx. 17M lines of code in an OS like Linux, EGOS proper has approx. 3,000 lines of code, making it relatively easy to wrap your brain around the whole thing. EGOS-2000 is a new operating system intended to run on RISC-V hardware. The entire code base is only 2,000 lines of code, intended to help every student read all the code of a teaching operating system.

Joint work with Yunhao Zhang and Yacqub Mohamed.

Robbert van Renesse