Cornell Systems Lunch

CS 754 Fall 2005
Friday 12PM, Rhodes 655

E. Gün Sirer and Andrew Myers


Sponsored by the Information Assurance Institute (IAI),
Computing and Information Science, Cornell

The Systems Lunch is a seminar for discussing recent, interesting papers in the systems area, broadly defined to span operating systems, distributed systems, networking, architecture, databases, and programming languages. The goal is to foster technical discussions among the Cornell systems research community. We meet once a week on Fridays at noon in Rhodes 655.

The Systems Lunch is open to all Cornell students interested in systems. First-year graduate students are especially welcome. Student participants are expected to sign up for CS 754, Systems Research Seminar, for one credit.

Date Paper Presenter
August 26 Meridian: A Lightweight Network Location Service without Virtual Coordinates
Bernard Wong, Aleksandrs Slivkins, Emin Gun Sirer
SIGCOMM 2005
Bernard Wong
September 02 Towards a Global IP Anycast Service
Hitesh Ballani, Paul Francis
SIGCOMM 2005
Hitesh Ballani
September 09 FS2: Dynamic Data Replication in Free Disk Space for Improving Disk Performance and Energy-Consumption
Hai Huang, Wanda Hung, Kang Shin (University of Michigan)
SOSP 2005
Hibernator: Helping Disks Sleep Through The Winter
Qingbo Zhu, Zhifeng Chen, Lin Tan, Yuanyuan Zhou (University of Illinois at Urbana-Champaign), Kimberly Keeton, John Wilkes (Hewlett-Packard Laboratory)
SOSP 2005
Kevin Walsh
September 16 Pioneer: Verifying Integrity and Guaranteeing Execution of Code on Legacy Platforms
Arvind Seshadri, Mark Luk, Elaine Shi, Adrian Perrig (CMU), Leendert van Doorn (IBM), Pradeep Khosla (CMU)
SOSP 2005
Labeling Virtual Memory in the Asbestos Operating System
Petros Efstathopoulos (UCLA), Maxwell Krohn (MIT), Steve VanDeBogart (UCLA), Cliff Frey (MIT), David Ziegler (MIT), Eddie Kohler (UCLA), David Mazieres (NYU), M. Frans Kaashoek (MIT CSAIL), Robert T. Morris (MIT CSAIL)
SOSP 2005
Alan Shieh & Dan Williams
September 23 Vigilante: End-to-End Containment of Internet Worms
Manuel Costa (Microsoft Research), Jon Crowcroft (Cambridge University), Miguel Castro, Antony Rowstron, Lidong Zhou, Lintao Zhang and Paul Barham (Microsoft Research)
SOSP 2005
Mahesh Balakrishnan
September 30 Capturing, Indexing, Clustering, and Retrieving System History
Ira Cohen (HP Labs), Moises Goldszmidt (HP Labs), Steve Zhang (Stanford University), Terence Kelly (HP Labs), Armando Fox (Stanford), Julie Symons (HP Labs)
SOSP 2005
Maya Haridasan
October 07 Rx: Treating Bugs As Allergies -- A Safe Method for Surviving Software Failures
Feng Qin, Joseph Tucek, Jagadeesan Sundaresan, Yuanyuan Zhou (University of Illinois at Urbana-Champaign)
SOSP 2005
Lunch will be held in the Systems Lab
Krzys Ostrowski
October 14 A Novel Approach to E-mail Worm/Virus Containment
In this talk, I will present our research into adaptive techniques for identifying and containing the spread of novel e-mail-borne worms and viruses. Traditional techniques either suffer from a window of vulnerability in detecting novel e-mail worms/viruses or have high false positive rates, reducing their usefulness.

Our approach combines a statistical learning technique, using greedy e-mail feature selection, with semi-supervised learning to yield a system that can detect previously unseen types of e-mail worms and viruses while offering both low false positive and low false negative rates. I will show preliminary results using both a locally collected dataset and a publicly available dataset. I will also discuss the potential benefits that would result from even a partial wide-scale deployment by the top US service providers. Finally, I will present the DETER testbed, a unique national-scale testbed that we have designed and built for open cybersecurity research.

Anthony Joseph (UC Berkeley)
October 21 Checkpointed Early Load Retirement
Nevin Kırman, Meyrem Kırman, Mainak Chaudhuri, José Martínez
HPCA 2005
Nevin Kırman
October 28 Future Execution: A Hardware Prefetching Technique for Chip Multiprocessors
Ilya Ganusov, Martin Burtscher
PACT 2005
Ilya Ganusov
November 04 Responsive Yet Stable Traffic Engineering
Current intra-domain Traffic Engineering (TE) relies on offline methods, which use long-term average traffic demands. It cannot react to realtime traffic changes caused by BGP reroutes, diurnal traffic variations, attacks, or flash crowds. Further, current TE deals with network failures by pre-computing alternative routings for a limited set of failures. It may fail to prevent congestion when unanticipated or combination failures occur, even though the network has enough capacity to handle the failure.

This paper presents TeXCP, an online distributed TE protocol that balances load in realtime, responding to actual traffic demands and failures. TeXCP uses multiple paths to deliver demands from an ingress to an egress router, adaptively moving traffic from over-utilized to under-utilized paths. These adaptations are carefully designed such that, though done independently by each edge router based on local information, they balance load in the whole network without oscillations. We model TeXCP, prove the stability of the model, and show that it is easy to implement. Our extensive simulations show that, for the same traffic demands, a network using TeXCP supports the same utilization and failure resilience as a network that uses traditional offline TE, but with half or a third of the capacity.
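
The idea of moving traffic from over-utilized to under-utilized paths can be illustrated with a minimal sketch. This is not the TeXCP algorithm from the paper; the function name `rebalance`, the mean-utilization rule, and the `step` parameter are assumptions chosen only to show one damped adjustment round at a single ingress router.

```python
def rebalance(shares, utilizations, step=0.1):
    """Shift traffic shares toward less-utilized paths (illustrative rule only).

    shares:       fraction of the ingress-egress demand sent on each path (sums to 1)
    utilizations: observed utilization of each path, in the same order
    step:         hypothetical damping factor; smaller values move traffic more slowly
    """
    if len(shares) != len(utilizations) or not shares:
        raise ValueError("need one utilization per path")
    mean_util = sum(utilizations) / len(utilizations)
    # Paths above the mean utilization lose share, paths below it gain share.
    adjusted = [max(0.0, s + step * (mean_util - u))
                for s, u in zip(shares, utilizations)]
    total = sum(adjusted)
    if total == 0:
        return list(shares)
    # Renormalize so the shares still sum to 1 after clamping at zero.
    return [a / total for a in adjusted]


# One adjustment round: the path at 90% utilization sheds a little traffic.
new_shares = rebalance([0.5, 0.5], [0.9, 0.3])
```

Repeating the call with freshly measured utilizations would move the shares gradually; the small step is meant to limit the kind of oscillation the abstract says the real protocol is designed to avoid.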

Srikanth Kandula (MIT)
November 11 Building and using Hardware-based Trusted Third Parties

Many security protocols hypothesize the existence of a trusted third party (TTP) to ease handling of computation and data too sensitive for the other parties involved. However, using a TTP to solve real-world security problems generates the same reaction as using fairies or magic: TTPs do not really exist.

This talk will present my research and development work building TTPs based on hardware techniques that (to various degrees of assurance) can help ensure that devices can carry out computation unmolested, and using such hardware-based TTPs to solve real-world problems.

Bio: Sean's current research at Dartmouth College focuses on how to build trustworthy systems in the real world. He previously worked as a scientist at IBM T.J. Watson Research Center, doing secure coprocessor design, implementation and validation, and at Los Alamos National Laboratory, doing security designs and analyses for a wide range of primarily public-sector clients. His book _Trusted Computing Platforms: Design and Applications_ (Springer, 2005) provides a deeper presentation of this research journey.

Sean was educated at Princeton (B.A., Math) and CMU (M.S., Ph.D., Computer Science).

Sean Smith (Dartmouth)
November 18 Characterization and Measurement of TCP Traversal through NATs and Firewalls
Saikat Guha and Paul Francis
IMC 2005
Saikat Guha
November 25 Happy Thanksgiving, no meeting.
December 02 Perils of Transitive Trust in the Domain Name System
Venugopalan Ramasubramanian and Emin Gun Sirer
IMC 2005
Lunch will be held in the Systems Lab
Venugopalan Ramasubramanian
December 07 Program-Counter-Based Prediction Techniques in Operating Systems

Program instructions uniquely identified by their program counters (PCs) provide a convenient and accurate means of recording the context of program execution, and PC-based prediction techniques have been widely used for performance optimizations at the architectural level. Operating systems, on the other hand, have not fully explored the benefits of PC-based prediction for resource management. This work explores the potential benefits provided by PC-based prediction in operating systems (PCOS). In particular, we investigate the potential of using PC-based prediction techniques for managing I/O devices in operating systems.

As a first demonstration of PCOS, we developed a PC-based access pattern classification technique (PCC) for buffer cache management. PCC allows the operating system to correlate the I/O operations with the program context in which they are issued via the PCs of the call instructions that trigger the I/O requests. This correlation allows the operating system to classify I/O access patterns on a per-call-site basis, which achieves significantly better accuracy than previous per-file or per-application classification techniques.
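
Per-call-site classification can be sketched as a table keyed by PC. The sketch below is an assumption-laden toy, not PCC itself: the class name `CallSiteClassifier`, the two labels, and the 0.8 threshold are all hypothetical, and the abstract does not say which pattern categories PCC uses.

```python
from collections import defaultdict


class CallSiteClassifier:
    """Toy per-PC access pattern classifier (hypothetical, for illustration)."""

    def __init__(self, threshold=0.8):
        self.threshold = threshold           # assumed cutoff for "sequential"
        self._last_block = {}                # pc -> last block number seen
        self._sequential = defaultdict(int)  # pc -> count of block+1 accesses
        self._total = defaultdict(int)       # pc -> count of observed transitions

    def record(self, pc, block):
        """Record that the call site at `pc` issued I/O for `block`."""
        last = self._last_block.get(pc)
        if last is not None:
            self._total[pc] += 1
            if block == last + 1:
                self._sequential[pc] += 1
        self._last_block[pc] = block

    def classify(self, pc):
        """Return 'sequential', 'other', or 'unknown' for a call site."""
        total = self._total.get(pc, 0)
        if total == 0:
            return "unknown"
        ratio = self._sequential.get(pc, 0) / total
        return "sequential" if ratio >= self.threshold else "other"


clf = CallSiteClassifier()
for block in range(10):
    clf.record(0x400, block)          # one call site scans a file in order
for block in (7, 2, 9, 4):
    clf.record(0x500, block)          # another call site jumps around
```

Keying the history by PC rather than by file or process is what lets two call sites in the same application receive different labels.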

We have also developed a PC-based technique (PCAP) for power management that dynamically learns the application I/O access patterns and associated disk idle times to predict when an I/O device can be shut down to save energy. PCAP uses path-based correlation to observe a particular sequence of I/O-triggering instructions leading to each idle period, and accurately predicts future occurrences of that idle period.
The talk will take place in Upson 5130, 2-3pm

Charlie Hu (Purdue)
December 09 Antiquity: Efficiently Binding Data to Owners in Distributed Content-Addressable Storage Systems
In content-addressable storage (CAS) systems, data is addressed not by its physical location but by a name that is derived from the content of that data. In recent years, the CAS interface, similar to the hashtable's put/get interface, has proven to be a solid foundation upon which to build wide-area distributed storage systems (e.g. CFS, PAST, Pond, Venti).

Distributed storage systems derive several favorable properties from the CAS interface. First, a CAS interface helps ensure data integrity. By carefully selecting the method used to derive names from data, clients can validate the integrity of data retrieved from the system against the name by which it was accessed. With self-verifying data, clients can detect data altered by faulty or compromised components and re-fetch from alternate sources. Second, a CAS interface promotes system scalability. Because the interface does not expose physical addresses to applications, the system can replicate and transfer data freely to add hardware resources or upgrade internal protocols.
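
The self-verifying property described above can be shown with a minimal sketch. It assumes SHA-256 as the naming function and uses a hypothetical in-memory `ToyCAS` class in place of a distributed store; none of these names come from the systems cited.

```python
import hashlib


class IntegrityError(Exception):
    """Raised when retrieved data does not match the name it was fetched by."""


def content_name(data: bytes) -> str:
    # Assumption: SHA-256 hex digest as the name; real systems may use another hash.
    return hashlib.sha256(data).hexdigest()


class ToyCAS:
    """Hypothetical in-memory stand-in for a content-addressable store."""

    def __init__(self):
        self._blocks = {}

    def put(self, data: bytes) -> str:
        """Store data and return the content-derived name."""
        name = content_name(data)
        self._blocks[name] = data
        return name

    def get(self, name: str) -> bytes:
        """Fetch data and verify it against the name used to request it."""
        data = self._blocks[name]
        if content_name(data) != name:
            raise IntegrityError(name)
        return data


store = ToyCAS()
name = store.put(b"hello, systems lunch")
data = store.get(name)  # verified against its own name before being returned
```

Because the check needs only the name and the returned bytes, a client could run it on data fetched from any replica and fall back to another source when `IntegrityError` is raised.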

One feature commonly missing from distributed content-addressable storage systems, however, is the ability to determine the owner of data stored in the system. Identifying the owner of a piece of data is critical for any system that wishes to monitor per-user storage consumption or compute usage-based fees. In this presentation, we consider how to efficiently implement this feature (the ability to identify the owner of each piece of data) in distributed CAS systems.

While a solution that associates a certificate with each block of data is conceptually simple, it has been traditionally claimed that the cost of creating and maintaining certificates is too great. In this presentation, we demonstrate that systems can, in fact, efficiently map data to its owner in a secure and non-repudiable fashion. To reduce the cost of creating and maintaining certificates, we extend the traditional content-addressable interface to allow the aggregation of many small data blocks into larger containers. The aggregation is performed in a way that also supports self-verifying data at the granularity of the block and container, fine-granularity access, and incremental updates. We describe a prototype implementation called Antiquity and present performance results from deployments on PlanetLab and a local cluster.

Hakim Weatherspoon (UC Berkeley)