The Systems group at Cornell examines the design and
implementation of the fundamental software systems that form
our computing infrastructure.
Below we give a small sample of the varied systems work
going on here, and invite you to visit the project and faculty web pages,
as well as read the papers.
Also check out our weekly
Systems Research Seminar.
Networking
Ken Birman, Robbert van Renesse, and Hakim Weatherspoon are working with
industry to develop new fault-tolerance and scalability options for the
world's fastest network routers. This effort is exploring ways of
migrating some of their prior work on distributed computing into the
router itself, so that applications such as BGP can scale better and
handle component failures gracefully, and new kinds of functionality
like overlay networks or VOIP can be executed right on the router
fault-tolerantly. A goal is to compete with telephony levels of
availability: the router must be “continuously on”
even when handling a crashed node.
Weatherspoon is also looking at packet loss dynamics on the
National LambdaRail,
an optical network spanning the country.
Andrew Myers and Emin Gun Sirer developed Trickles
(NSDI 2005, ACM TOCS 2008),
a stateless network protocol stack.
Traditional operating system interfaces and network protocol
implementations force system state to be kept on
both sides of a connection. Such state ties the connection
to an endpoint, impedes transparent failover, permits
denial-of-service attacks, and limits scalability. Trickles
is a novel TCP-like transport protocol
and a new interface to replace sockets that together enable
all state to be kept on one endpoint, allowing the
other endpoint, typically the server, to operate without
any per-connection state.
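As a rough illustration of the idea of keeping all state on one endpoint, the following minimal sketch has a server hand its per-connection state back to the client as an authenticated blob, which the client returns with its next request. This is not the Trickles protocol or its API: the names `SERVER_KEY`, `make_continuation`, and `handle_request`, the JSON encoding, and the 4-byte chunk size are all hypothetical choices made for the example.

```python
import hashlib
import hmac
import json
from typing import Dict, Tuple

# Hypothetical server-side secret; it is the only thing the server keeps,
# and it is shared by all connections rather than stored per connection.
SERVER_KEY = b"demo-secret"


def _mac(payload: bytes) -> str:
    """Authenticate serialized state so the client cannot forge it."""
    return hmac.new(SERVER_KEY, payload, hashlib.sha256).hexdigest()


def make_continuation(state: Dict[str, int]) -> Dict[str, object]:
    """Package connection state for the client to hold and send back."""
    payload = json.dumps(state, sort_keys=True).encode()
    return {"state": state, "mac": _mac(payload)}


def handle_request(continuation: Dict[str, object], data: bytes) -> Tuple[bytes, Dict[str, object]]:
    """One stateless server step: verify the returned state, serve the next
    chunk, and emit a fresh continuation. Nothing is remembered afterwards."""
    payload = json.dumps(continuation["state"], sort_keys=True).encode()
    if not hmac.compare_digest(_mac(payload), continuation["mac"]):
        raise ValueError("continuation was tampered with")
    offset = continuation["state"]["offset"]
    chunk = data[offset:offset + 4]  # assumed chunk size, for illustration only
    return chunk, make_continuation({"offset": offset + len(chunk)})
```

A client would start from `make_continuation({"offset": 0})` and call `handle_request` repeatedly with the continuation it last received. Because each request carries everything needed to serve it, any replica holding the same key could answer it, which is the property that makes transparent failover plausible in a design like this.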
Peer-to-Peer Systems and Collaborative Tools
Cornell faculty have done extensive work in peer-to-peer networking,
ranging from file sharing to media streaming to network monitoring.
Van Renesse and Birman developed the highly scalable Astrolabe
network monitoring system (IPTPS 2002, ACM TOCS 2003), now used at a
major e-retailer. Emin Gun Sirer has created a large number of P2P
systems, including
Antfarm (NSDI 2009), a content distribution system based on managed swarms,
Octant (NSDI 2007), a system for geolocation of
Internet hosts, and Corona, an Internet-scale Publish-Subscribe
system (NSDI 2006). Van Renesse developed Fireflies, a Byzantine-tolerant
P2P overlay network (Eurosys 2006). Weatherspoon
designed and implemented the Antiquity system, a secure P2P storage
facility (Eurosys 2007).
Birman's group is working on Web 2.0 collaboration.
This effort is looking at the challenges of using
Web 2.0 technologies (mashups) in support of demanding collaboration
applications, such as one sees in the military or in hospitals. All
sorts of hard security, privacy and storage issues arise, and they're
studying how best to solve them and how to scale the solutions up for
really wide adoption. As part of this, they've built a mashup technology
of their own, Live Objects
(ECOOP 2008, Middleware 2008),
but the hope is to end up with
technology that would also apply to Google's Wave, Microsoft's
Silverlight or other mashup solutions. Have a look at the
demo.
Distributed Systems and Fault Tolerance
Cornell is particularly well-known for its foundational and practical
work on fault-tolerant
distributed systems. Fred Schneider's oft-referenced State Machine
Replication tutorial is standard fare in systems courses around the
world (ACM Computing Surveys 1990). Van Renesse and Schneider
formalized and analyzed the Chain Replication paradigm (OSDI 2004).
Ken Birman's ISIS system (SOSP 1985, SOSP 1987) has been extensively
used in industry for building fault-tolerant systems.
Birman and Van Renesse subsequently built further fault-tolerant middleware,
including Horus (Comm. ACM 1996) and
Ensemble (SOSP 1999).
Currently,
Van Renesse is investigating various aspects of tolerating
Byzantine failures.
For example, Bosco (DISC 2008) is a Byzantine consensus protocol that decides in one round
under favorable conditions.
Van Renesse and Schneider are investigating building robust
distributed systems based on stepwise refinement.
Nysiad (NSDI 2008) is a system that
implements a new technique for transforming, through stepwise refinement,
a scalable distributed system or network protocol tolerant only of
crash failures into one that tolerates arbitrary failures,
including such failures as freeloading and malicious attacks.
Operating Systems
Research on operating system kernels is less common than it used to be,
but Cornell has long been, and remains, highly active in this area.
Hakim Weatherspoon is currently working on
multi-core extensions to the Linux operating system, as well as
file system mirroring across high bandwidth, high latency links
(FAST 2009).
Fred Schneider built a replicated UNIX system
using virtual machine technology (SOSP 1995, ACM TOCS 1996).
Emin Gun Sirer was an active participant in the design and
implementation of the SPIN extensible operating system (SOSP 1995).
Currently, Schneider and Sirer have joined forces on
Nexus
(OSDI 2008),
a new operating system that exploits secure hardware to enable
novel features not found in existing operating systems.
Device drivers typically execute in supervisor mode and
thus must be fully trusted. Nexus moves drivers
out of the trusted computing base by running
them without supervisor privileges and constraining
their interactions with hardware devices.
Energy-Aware Computing
A relatively new effort at Cornell is Energy Aware Systems.
In the area of low-power sensor networking, Van Renesse
has looked at power-aware epidemic protocols (SRDS 2002).
Hakim Weatherspoon and his students are looking
at designing datacenter storage systems that are frugal with energy use.
The KyotoFS file system (HotOS 2007) is a log-structured file system.
When the log spans multiple disks, only the disk that stores the head of the log
has to be spinning most of the time, leading to significant
energy savings.
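The energy argument can be sketched with a toy model of a log spread over several disks. This is not KyotoFS code: the class `MultiDiskLog`, its fixed per-disk capacity, and the `spin_ups` counter are assumptions made for illustration, and the sketch omits log cleaning and never spins a disk back down after a read.

```python
class MultiDiskLog:
    """Toy append-only log striped sequentially over several disks."""

    def __init__(self, num_disks, blocks_per_disk):
        self.disks = [[] for _ in range(num_disks)]
        self.blocks_per_disk = blocks_per_disk
        self.head_disk = 0      # disk currently holding the head of the log
        self.spinning = {0}     # indices of disks that are powered up
        self.spin_ups = 0       # extra spin-ups caused by reads of old data

    def append(self, block):
        """Write at the head of the log; only the head disk must spin."""
        if len(self.disks[self.head_disk]) == self.blocks_per_disk:
            if self.head_disk + 1 == len(self.disks):
                raise RuntimeError("log is full; a real system would clean or wrap")
            # The head moves on, so the full disk can be spun down.
            self.spinning.discard(self.head_disk)
            self.head_disk += 1
            self.spinning.add(self.head_disk)
        disk = self.disks[self.head_disk]
        disk.append(block)
        return self.head_disk, len(disk) - 1

    def read(self, disk_index, offset):
        """Reading older data may force a spun-down disk to spin up."""
        if disk_index not in self.spinning:
            self.spinning.add(disk_index)
            self.spin_ups += 1
        return self.disks[disk_index][offset]
```

In this model every write touches only the head disk, so a write-dominated workload keeps a single disk powered. Reads of data behind the head are what account for the "most of the time" qualifier, since they are the only operations that wake another disk.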
Cross-Cutting Research Areas
Besides the topics mentioned above, the systems faculty is also
actively involved with cross-cutting technology such as
Security,
Programming Languages,
Computer Architecture,
and even
Theory.
Click on these links and explore further.
Researchers
Ken Birman
Distributed computing, fault-tolerant network systems, distributed systems security, large-scale network applications.
Andrew Myers
Programming languages, security, mobile code, persistent and distributed objects.
Fred B. Schneider
Distributed systems security and fault-tolerance, mobile code, concurrent programming, secure OS.
Emin Gun Sirer
Operating system support for ad hoc networks, peer-to-peer systems, self-organizing overlays, networked services and extensible systems, secure OS.
Robbert van Renesse
Distributed computing, peer-to-peer networking, scalability, fault tolerance, adaptive networking.
Hakim Weatherspoon
Distributed computing, large scale storage systems, energy-aware computing, operating systems.
Related Links
Architecture
Programming Languages
Security
Systems Lunch Seminar
CSL