Cornell Systems Lunch

CS 754 Fall 2004
Friday 12PM, Rhodes 655

E. Gün Sirer and Andrew Myers


Sponsored by the Information Assurance Institute (IAI),
Computing and Information Science, Cornell

The Systems Lunch is a seminar for discussing recent, interesting papers in the systems area, broadly defined to span operating systems, distributed systems, networking, architecture, databases, and programming languages. The goal is to foster technical discussions among the Cornell systems research community. This fall, the Systems Lunch will focus on interesting papers from the upcoming OSDI and SIGCOMM conferences and the recent Oakland conference. We will meet once a week on Fridays at noon in Rhodes 655.

The Systems Lunch is open to all Cornell students interested in systems. First-year graduate students are especially welcome. Student participants are expected to sign up for CS 754, Systems Research Seminar, for one credit.

Past semesters:

Spring 04
Fall 03
Spring 03
Fall 02
Spring 02
Fall 01

Date Paper Presenter
Aug 27 GeoPeer: A Location-Aware Peer-to-Peer System
Araujo and Rodrigues
U of Lisbon Tech report
PALM: Predicting Internet Network Distances Using Peer-to-Peer Measurements
Lehman and Lerman
MIT Tech Report
Background Reading:
Vivaldi: A Decentralized Network Coordinate System
Frank Dabek, Russ Cox, Frans Kaashoek, Robert Morris.
SIGCOMM 2004
Bernard Wong
Sep 3 Araneola: A Scalable Reliable Multicast System for Dynamic Environments
Roie Melamed and Idit Keidar
NCA 2004
Roie Melamed (Technion)
Sep 10 An Indexing Framework for Structured P2P Systems Prakash Linga
Sep 17 On-the-Fly Verification of Rateless Erasure Codes for Efficient Content Distribution
Maxwell N. Krohn, Michael J. Freedman, David Mazieres
Oakland 2004

Accessing Multiple Mirror Sites in Parallel: Using Tornado Codes to Speed Up Downloads
John W. Byers, Michael Luby, and Michael Mitzenmacher
Kevin Walsh
Sep 24 Securing OLAP Data Cubes Against Privacy Breaches
Lingyu Wang, Sushil Jajodia, Duminda Wijesekera
Oakland 2004
Nazrul Alam
Oct 1 Boxwood: Abstractions as the Foundation for Storage Infrastructure
Lidong Zhou, Microsoft Research Silicon Valley
Writers of complex storage applications like distributed file systems and databases are faced with the challenges of building complex abstractions over simple storage devices like disks. These challenges are exacerbated due to the additional requirements for fault-tolerance and scaling. Our research explores the premise that high-level, fault-tolerant abstractions supported directly by the storage infrastructure can ameliorate these problems. We have built a system called Boxwood to explore the feasibility and utility of providing high-level abstractions or data structures as the fundamental storage infrastructure. Boxwood currently runs on a small cluster of eight machines. The Boxwood abstractions perform very close to the limits imposed by the processor, disk, and the native networking subsystem. Using these abstractions directly, we have implemented an NFS v2 file service that demonstrates the promise of our approach.
Lidong Zhou (MSR, Silicon Valley)
Oct 8 Application-level Checkpointing for Shared Memory Programs
Greg Bronevetsky, Martin Schulz, Peter Szwed, Daniel Marques, Keshav Pingali
ASPLOS 2004
Greg Bronevetsky
Oct 15 Large-Scale IP Traceback in High-Speed Internet: Practical Techniques and Theoretical Foundation
Jun Li, Minho Sung, Jun (Jim) Xu, Li (Erran) Li
Oakland 2004
Hitesh Ballani
Oct 22 MPAT: Aggregate TCP Congestion Management as a Building Block for Internet QoS
Manpreet Singh, Prashant Pradhan and Paul Francis
IEEE International Conference on Network Protocols (ICNP 2004)
Manpreet Singh
Oct 29 The Digital Distributed System Security Architecture
Gasser, M., Goldstein, A., Kaufman, C., and Lampson, B. (1989).
National Computer Security Conference 1989.

SWATT: SoftWare-based ATTestation for Embedded Devices
Arvind Seshadri, Adrian Perrig, Leendert van Doorn, Pradeep Khosla
Oakland 2004
Dan Williams
Nov 5 Recovery as Rapid Adaptation: Combining Fast Microrecovery with Statistical Monitoring
We began the Recovery-Oriented Computing (ROC) project with the goal of increasing Internet server availability by reducing time to recovery. Building on the observation that rebooting or restarting is a well-known and simple form of recovery that returns systems or subsystems to a "clean slate", we proposed to design systems specifically so that the only shutdown method is crashing and the only recovery method is fast reboot; we called this approach crash-only software. Having designed three crash-only systems, we find that cheap recovery, while indeed good for its own sake in improving availability, also enables "micro-recovery" as a first line of defense: rather than complex error unwinding, coerce any observed error to a (micro-)crash, then (micro-)recover. If micro-recovery is sufficiently cheap in performance and does not impact correctness, there's no reason to avoid trying it first, even if it does not always solve the problem. This in turn enables the use of automated aggressive detection techniques that have nontrivial false positive rates or, equivalently, the deployment of multiple overlapping detectors/alarms in order to be conservative. Fast cheap micro-recovery also allows more liberal use of rejuvenation, such as so-called "rolling reboots", without worrying about when the "best" time to do it is. We have also found that cheap recovery allows some maintenance operations such as incremental scaling of storage to be recast as failure plus recovery, exploiting the same mechanisms as recovery to achieve online scaling without service interruption.
In this talk I'll describe highlights and design lessons from three crash-only systems we've built, including experiments using statistical anomaly detection techniques (with nontrivial false positive rates) as a complementary monitoring strategy. I'll also discuss how this approach might provide a scientific basis for designing tolerant applications in the face of imperfect detection and localization techniques.
Armando Fox (Stanford)
Nov 12 Diagnosing Network-Wide Traffic Anomalies
Lakhina, Crovella, Diot
SIGCOMM 2004
Joy Zhang
Nov 19 No meeting, ACSU Luncheon.
Nov 26 Happy Thanksgiving.
Dec 3 The Design and Implementation of a Next Generation Name Service for the Internet
Venugopalan Ramasubramanian, Emin Gün Sirer
SIGCOMM 2004
The Systems Lunch will be held in the Systems Lab, not Rhodes 655!
Rama