Cornell Systems Lunch

CS 754 Fall 2004
Friday 12PM, Rhodes 655

E. Gün Sirer and Andrew Myers

Sponsored by the Information Assurance Institute (IAI),
Computing and Information Science, Cornell

The Systems Lunch is a seminar for discussing recent, interesting papers in the systems area, broadly defined to span operating systems, distributed systems, networking, architecture, databases, and programming languages. The goal is to foster technical discussions among the Cornell systems research community. This fall, the Systems Lunch will focus on interesting papers from the upcoming OSDI, SIGCOMM and recent Oakland conferences. We will meet once a week on Fridays at noon in Rhodes 655.

The Systems Lunch is open to all Cornell students interested in systems. First-year graduate students are especially welcome. Student participants are expected to sign up for CS 754, Systems Research Seminar, for one credit.

Past semesters:

Spring 04
Fall 03
Spring 03
Fall 02
Spring 02
Fall 01

Date	Paper	Presenter
Aug 27	GeoPeer: A Location-Aware Peer-to-Peer System. Araujo and Rodrigues. U of Lisbon Tech Report; PALM: Predicting Internet Network Distances Using Peer-to-Peer Measurements. Lehman and Lerman. MIT Tech Report; Background Reading: Vivaldi: A Decentralized Network Coordinate System. Frank Dabek, Russ Cox, Frans Kaashoek, Robert Morris. SIGCOMM 2004	Bernard Wong
Sep 3	Araneola: A Scalable Reliable Multicast System for Dynamic Environments. Roie Melamed and Idit Keidar. NCA 2004	Roie Melamed (Technion)
Sep 10	An Indexing Framework for Structured P2P Systems	Prakash Linga
Sep 17	On-the-Fly Verification of Rateless Erasure Codes for Efficient Content Distribution. Maxwell N. Krohn, Michael J. Freedman, David Mazieres. Oakland 2004; Accessing Multiple Mirror Sites in Parallel: Using Tornado Codes to Speed Up Downloads. John W. Byers, Michael Luby, and Michael Mitzenmacher	Kevin Walsh
Sep 24	Securing OLAP Data Cubes Against Privacy Breaches. Lingyu Wang, Sushil Jajodia, Duminda Wijesekera. Oakland 2004	Nazrul Alam
Oct 1	Boxwood: Abstractions as the Foundation for Storage Infrastructure. Lidong Zhou, Microsoft Research Silicon Valley. Writers of complex storage applications like distributed file systems and databases are faced with the challenges of building complex abstractions over simple storage devices like disks. These challenges are exacerbated due to the additional requirements for fault-tolerance and scaling. Our research explores the premise that high-level, fault-tolerant abstractions supported directly by the storage infrastructure can ameliorate these problems. We have built a system called Boxwood to explore the feasibility and utility of providing high-level abstractions or data structures as the fundamental storage infrastructure. Boxwood currently runs on a small cluster of eight machines. The Boxwood abstractions perform very close to the limits imposed by the processor, disk, and the native networking subsystem. Using these abstractions directly, we have implemented an NFS v2 file service that demonstrates the promise of our approach.	Lidong Zhou (MSR, Silicon Valley)
Oct 8	Application-level Checkpointing for Shared Memory Programs. Greg Bronevetsky, Martin Schulz, Peter Szwed, Daniel Marques, Keshav Pingali. ASPLOS 2004	Greg Bronevetsky
Oct 15	Large-Scale IP Traceback in High-Speed Internet: Practical Techniques and Theoretical Foundation. Jun Li, Minho Sung, Jun (Jim) Xu, Li (Erran) Li. Oakland 2004	Hitesh Ballani
Oct 22	MPAT: Aggregate TCP Congestion Management as a Building Block for Internet QoS. Manpreet Singh, Prashant Pradhan and Paul Francis. IEEE International Conference on Network Protocols (ICNP 2004)	Manpreet Singh
Oct 29	The Digital Distributed System Security Architecture. Gasser, M., Goldstein, A., Kaufman, C., and Lampson, B. National Computer Security Conference 1989; SWATT: SoftWare-based ATTestation for Embedded Devices. Arvind Seshadri, Adrian Perrig, Leendert van Doorn, Pradeep Khosla. Oakland 2004	Dan Williams
Nov 5	Recovery as Rapid Adaptation: Combining Fast Microrecovery with Statistical Monitoring. We began the Recovery-Oriented Computing (ROC) project with the goal of increasing Internet server availability by reducing time to recovery. Building on the observation that rebooting or restarting is a well-known and simple form of recovery that returns systems or subsystems to a "clean slate", we proposed to design systems specifically so that the only shutdown method is crashing and the only recovery method is fast reboot; we called this approach crash-only software. Having designed three crash-only systems, we find that cheap recovery, while indeed good for its own sake in improving availability, also enables "micro-recovery" as a first line of defense: rather than complex error unwinding, coerce any observed error to a (micro-)crash, then (micro-)recover. If micro-recovery is sufficiently cheap in performance and does not impact correctness, there's no reason to avoid trying it first, even if it does not always solve the problem. This in turn enables the use of automated aggressive detection techniques that have nontrivial false positive rates, or equivalently, to deploy multiple overlapping detectors/alarms in order to be conservative. Fast cheap micro-recovery also allows more liberal use of rejuvenation, such as so-called "rolling reboots", without worrying about when is the "best" time to do it. We have also found that cheap recovery allows some maintenance operations such as incremental scaling of storage to be recast as failure plus recovery, exploiting the same mechanisms as recovery to achieve online scaling without service interruption. In this talk I'll describe highlights and design lessons from three crash-only systems we've built, including experiments using statistical anomaly detection techniques (with nontrivial false positive rates) as a complementary monitoring strategy. I'll also discuss how this approach might provide a scientific basis for designing tolerant applications in the face of imperfect detection and localization techniques.	Armando Fox (Stanford)
Nov 12	Diagnosing Network-Wide Traffic Anomalies. Lakhina, Crovella, Diot. SIGCOMM 2004	Joy Zhang
Nov 19	No meeting, ACSU Luncheon.
Nov 26	Happy Thanksgiving.
Dec 3	The Design and Implementation of a Next Generation Name Service for the Internet. Venugopalan Ramasubramanian, Emin Gün Sirer. SIGCOMM 2004. Note: The Systems Lunch will be held in the Systems Lab, not Rhodes 655!	Rama