Cornell Systems Lunch
CS 754 Fall 2004
Sponsored by the Information Assurance Institute (IAI)
The Systems Lunch is a seminar for discussing recent, interesting papers in the systems area, broadly defined to span operating systems, distributed systems, networking, architecture, databases, and programming languages. The goal is to foster technical discussions among the Cornell systems research community. This fall, the Systems Lunch will focus on interesting papers from the upcoming OSDI, SIGCOMM and recent Oakland conferences. We will meet once a week on Fridays at noon in Rhodes 655.
The Systems Lunch is open to all Cornell students interested in systems. First-year graduate students are especially welcome. Student participants are expected to sign up for CS 754, Systems Research Seminar, for one credit.
GeoPeer: A Location-Aware Peer-to-Peer System
Araujo and Rodrigues
U of Lisbon Tech report
PALM: Predicting Internet Network Distances Using Peer-to-Peer Measurements
Lehman and Lerman
MIT Tech Report
Vivaldi: A Decentralized Network Coordinate System
Frank Dabek, Russ Cox, Frans Kaashoek, Robert Morris
Araneola: A Scalable Reliable Multicast System for Dynamic Environments
Roie Melamed and Idit Keidar
Presenter: Roie Melamed (Technion)
Sep 10
An Indexing Framework for Structured P2P Systems
Presenter: Prakash Linga
On-the-Fly Verification of Rateless Erasure Codes for Efficient Content Distribution
Maxwell N. Krohn, Michael J. Freedman, David Mazieres
Accessing Multiple Mirror Sites in Parallel: Using Tornado Codes to Speed Up Downloads
John W. Byers, Michael Luby, and Michael Mitzenmacher
Securing OLAP Data Cubes Against Privacy Breaches
Lingyu Wang, Sushil Jajodia, Duminda Wijesekera
Boxwood: Abstractions as the Foundation for Storage Infrastructure
Lidong Zhou, Microsoft Research Silicon Valley
Writers of complex storage applications like distributed file systems and databases face the challenge of building complex abstractions over simple storage devices like disks. This challenge is exacerbated by the additional requirements for fault tolerance and scaling. Our research explores the premise that high-level, fault-tolerant abstractions supported directly by the storage infrastructure can ameliorate these problems. We have built a system called Boxwood to explore the feasibility and utility of providing high-level abstractions or data structures as the fundamental storage infrastructure. Boxwood currently runs on a small cluster of eight machines. The Boxwood abstractions perform very close to the limits imposed by the processor, disk, and the native networking subsystem. Using these abstractions directly, we have implemented an NFS v2 file service that demonstrates the promise of our approach.
Presenter: Lidong Zhou (MSR, Silicon Valley)
Application-level Checkpointing for Shared Memory Programs
Greg Bronevetsky, Martin Schulz, Peter Szwed, Daniel Marques, Keshav Pingali
Large-Scale IP Traceback in High-Speed Internet: Practical Techniques and Theoretical Foundation
Jun Li, Minho Sung, Jun (Jim) Xu, Li (Erran) Li
MPAT: Aggregate TCP Congestion Management as a Building Block for Internet QoS
Manpreet Singh, Prashant Pradhan and Paul Francis
IEEE International Conference on Network Protocols (ICNP 2004)
The Digital Distributed System Security Architecture
Gasser, M., Goldstein, A., Kaufman, C., and Lampson, B. (1989).
National Computer Security Conference 1989.
SWATT: SoftWare-based ATTestation for Embedded Devices
Arvind Seshadri, Adrian Perrig, Leendert van Doorn, Pradeep Khosla
Recovery as Rapid Adaptation: Combining Fast Microrecovery with Statistical Monitoring
We began the Recovery-Oriented Computing (ROC) project with the goal of increasing Internet server availability by reducing time to recovery. Building on the observation that rebooting or restarting is a well-known and simple form of recovery that returns systems or subsystems to a "clean slate", we proposed to design systems specifically so that the only shutdown method is crashing and the only recovery method is fast reboot; we called this approach crash-only software. Having designed three crash-only systems, we find that cheap recovery, while indeed good for its own sake in improving availability, also enables "micro-recovery" as a first line of defense: rather than performing complex error unwinding, coerce any observed error to a (micro-)crash, then (micro-)recover. If micro-recovery is sufficiently cheap in performance and does not impact correctness, there's no reason to avoid trying it first, even if it does not always solve the problem. This in turn enables the use of automated, aggressive detection techniques that have nontrivial false positive rates or, equivalently, the deployment of multiple overlapping detectors/alarms in order to be conservative. Fast, cheap micro-recovery also allows more liberal use of rejuvenation, such as so-called "rolling reboots", without worrying about when the "best" time to do it is. We have also found that cheap recovery allows some maintenance operations, such as incremental scaling of storage, to be recast as failure plus recovery, exploiting the same mechanisms as recovery to achieve online scaling without service interruption.
In this talk I'll describe highlights and design lessons from three crash-only systems we've built, including experiments using statistical anomaly detection techniques (with nontrivial false positive rates) as a complementary monitoring strategy. I'll also discuss how this approach might provide a scientific basis for designing tolerant applications in the face of imperfect detection and localization techniques.
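As a rough illustration of the "coerce any observed error to a (micro-)crash, then (micro-)recover" idea in the abstract above, here is a minimal Python sketch. It is not taken from the ROC systems: the `Component` class, its `store`/`cache` split, and the `call_with_microrecovery` helper are all hypothetical names chosen for this example, and the sketch assumes that durable state lives outside the component, so discarding volatile state is always safe.

```python
class Component:
    """Hypothetical crash-only component: durable state lives outside it."""

    def __init__(self, store):
        self.store = store   # durable state; survives a micro-reboot
        self.cache = {}      # volatile state; discarded on every crash
        self.restarts = 0

    def crash_and_recover(self):
        # Crash-only: stopping means dropping volatile state, and starting
        # always goes through the same recovery path.
        self.cache = {}
        self.restarts += 1

    def handle(self, key):
        if key not in self.cache:
            self.cache[key] = self.store[key]
        value = self.cache[key]
        if value is None:    # stand-in for any detected anomaly
            raise RuntimeError("corrupt cache entry")
        return value


def call_with_microrecovery(component, key, max_restarts=1):
    """Treat any error as a micro-crash: recover, then retry the request."""
    for attempt in range(max_restarts + 1):
        try:
            return component.handle(key)
        except Exception:
            if attempt == max_restarts:
                raise        # micro-recovery did not help; escalate
            component.crash_and_recover()


component = Component({"a": 1})
component.cache["a"] = None  # inject a fault into volatile state
result = call_with_microrecovery(component, "a")
```

Because the retry is cheap and does not touch durable state, a false alarm from the detector costs only one extra restart. This is the property the abstract relies on when it argues for aggressive detectors with nontrivial false positive rates.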
Presenter: Armando Fox (Stanford)
Diagnosing Network-Wide Traffic Anomalies
Lakhina, Crovella, Diot
Nov 19: No meeting, ACSU Luncheon.
Nov 26: Happy Thanksgiving.
The Design and Implementation of a Next Generation Name Service for the Internet
Venugopalan Ramasubramanian, Emin Gun Sirer
The Systems Lunch will be held in the Systems Lab, not Rhodes 655!