Department of Computer Science 

CS 6410: Advanced Systems

Fall 2012


Note that papers are subject to change


Who (click for slides)



Required Reading

Suggested Reading

1 Ken
(pdf pptx)
8/23 Course overview
2 Ken
(pdf pptx)
8/28 Building Large, Principled Systems End-to-end arguments in system design, J.H. Saltzer, D.P. Reed, D.D. Clark. ACM Transactions on Computer Systems Volume 2, Issue 4 (November 1984), pages 277--288.

Hints for computer system design, B. Lampson. Proceedings of the Ninth ACM Symposium on Operating Systems Principles (Bretton Woods, New Hampshire, United States) 1983, pages 33--48.

The Impact of Architectural Trends on Operating System Performance, Rosenblum et al. 15th SOSP, 1995.

Interposition Agents: Transparently Interposing User Code at the System Interface, Michael Jones. 14th SOSP, 1993, pages 80--93.

3 Ken
(pdf pptx)
8/30 Classic Systems The UNIX time-sharing system, Dennis M. Ritchie and Ken Thompson. Communications of the ACM Volume 17, Issue 7 (July 1974), pages 365--375.

The Duality of Memory and Communication in the Implementation of a Multiprocessor Operating System, M. Young, A. Tevanian, R. Rashid, D. Golub, and J. Eppinger. Proceedings of the Eleventh ACM Symposium on Operating Systems Principles (Austin, Texas, United States), ACM, 1987, pages 63--76.
Using continuations to implement thread management and communication in operating systems, Richard P. Draves, Brian N. Bershad, Richard F. Rashid, and Randall W. Dean. Proceedings of the Thirteenth ACM Symposium on Operating Systems Principles (Pacific Grove, California, 1991), pages 122--136.

The nucleus of a multiprogramming system, P. Brinch Hansen. Communications of the ACM Volume 13, Issue 4 (April 1970), pages 238--241.
4 Colin
9/4 Modern Systems: Multicore issues The Multikernel: A new OS architecture for scalable multicore systems. Andrew Baumann, Paul Barham, Pierre-Evariste Dagand, Tim Harris, Rebecca Isaacs, Simon Peter, Tim Roscoe, Adrian Schüpbach, and Akhilesh Singhania. Proceedings of the Twenty-Second ACM Symposium on Operating Systems Principles (Big Sky, Montana, United States), ACM, 2009.

Tornado: maximizing locality and concurrency in a shared memory multiprocessor operating system, Ben Gamsa, Orran Krieger, Jonathan Appavoo, and Michael Stumm. 3rd USENIX symposium on Operating systems design and implementation (OSDI), February 1999, pages 87-100.
Thousand core chips: a technology perspective. S. Borkar. In Proceedings of the 44th Annual Design Automation Conference, pages 746–749, 2007.

Corey: An operating system for many cores. S. Boyd-Wickizer, H. Chen, R. Chen, Y. Mao, F. Kaashoek, R. Morris, A. Pesterev, L. Stein, M. Wu, Y. Dai, Y. Zhang, and Z. Zhang. In Proceedings of the 8th USENIX Symposium on Operating Systems Design and Implementation, pages 43–57, Dec. 2008.
5 Ken
(pptx, pdf)
9/6 Modern Systems: Virtualization Disco: Running Commodity Operating Systems on Scalable Multiprocessors, Edouard Bugnion, Scott Devine, and Mendel Rosenblum. 16th ACM symposium on Operating systems principles (SOSP), October 1997, pages 143--156.

Xen and the Art of Virtualization, Paul Barham, Boris Dragovic, Keir Fraser, Steven Hand, Tim Harris, Alex Ho, Rolf Neugebauer, Ian Pratt, Andrew Warfield. 19th ACM symposium on Operating systems principles (SOSP), October 2003, pages 164--177.
Are Virtual Machine Monitors Microkernels Done Right?, Steven Hand, Andrew Warfield, Keir Fraser, Evangelos Kotsovinos, Dan Magenheimer. Proceedings of the Tenth Workshop on Hot Topics in Operating Systems (HotOS), Santa Fe, NM, June 2005.

Are Virtual Machine Monitors Microkernels Done Right?, Gernot Heiser, Volkmar Uhlig, Joshua LeVasseur. ACM SIGOPS Operating Systems Review (OSR), Volume 40, Issue 1, January 2006, pages 95--99.

Memory resource management in VMware ESX server, C. A. Waldspurger. OSDI 2002.

Virtual Machine Monitors: Current Technology and Future Trends, Mendel Rosenblum, Tal Garfinkel. Computer, vol. 38, no. 5, pp. 39-47, May 2005.

Operating System Support for Virtual Machines, S. T. King, G. W. Dunlap, and P. M. Chen. 2003 USENIX Technical Conference.

(Potemkin, Vigilante, Ken)
9/11 Modern Systems: Containment Scalability, Fidelity, and Containment in the Potemkin Virtual Honeyfarm. Michael Vrable, Justin Ma, Jay Chen, David Moore, Erik Vandekieft, Alex C. Snoeren, Geoffrey M. Voelker, and Stefan Savage. Proceedings of the Twentieth ACM Symposium on Operating Systems Principles (SOSP), Brighton, UK. 2005.


Vigilante: End-to-End Containment of Internet Worm Epidemics. Manuel Costa, Jon Crowcroft, Miguel Castro, Antony Rowstron, Lidong Zhou, Lintao Zhang, and Paul Barham, in ACM Transactions on Computer Systems, December 2008

Labels and Event Processes in the Asbestos Operating System, Petros Efstathopoulos, Maxwell Krohn, Steve VanDeBogart, Cliff Frey, David Ziegler, Eddie Kohler, David Mazières, Frans Kaashoek, and Robert Morris. 20th ACM symposium on Operating systems principles (SOSP), October 2005, pages 17--30.

Intrusion Recovery Using Selective Re-execution.
Taesoo Kim, Xi Wang, Nickolai Zeldovich, and M. Frans Kaashoek.  OSDI 2010.

Symantec Elderwood project (analysis of current malware attacks)
7 Ken
(pdf, pptx)
9/13 Modern Systems: Rollback Time warp operating system. D. Jefferson, B. Beckman, F. Wieland, L. Blume, and M. Diloreto. 1987.  SIGOPS Oper. Syst. Rev. 21, 5 (November 1987), 77-93.  

Message Logging: Pessimistic, Optimistic, Causal and Optimal. Lorenzo Alvisi, Keith Marzullo. IEEE Transactions on Software Engineering. 24:2, February 1998, pp. 149-159.
 Checkpointing and Rollback-Recovery for Distributed Systems. Richard Koo, Sam Toueg. FJCC 1986: 1150-1158
8 Ken
(pdf, pptx)
9/18 Modern Systems: Obfuscation The Monoculture Risk Put into Context. Fred B. Schneider and Ken Birman. IEEE Security & Privacy. Volume 7, Number 1. Pages 14-17. January/February 2009.

Why Do Internet Services Fail, and What Can Be Done About It? D. Oppenheimer, A. Ganapathi, and D.A. Patterson, Proc. 4th Usenix Symp. Internet Technologies and Systems, Usenix Assoc., 2003, pp. 1–16.

On the Effectiveness of Address-Space Randomization. Shacham, H. and Page, M. and Pfaff, B. and Goh, E.J. and Modadugu, N. and Boneh, D., Proceedings of the 11th ACM conference on Computer and communications security, pp. 298–307, 2004.
Address Obfuscation: An Efficient Approach to Combat a Broad Range of Memory Error Exploits. S. Bhatkar, D.C. DuVarney, and R. Sekar, Proc. 12th Usenix Security Symp., Usenix Assoc., 2003, pp. 105–120.

Randomized Instruction Set
E.G. Barrantes et al., ACM Trans. Information and System Security, vol. 8, no. 1, 2005, pp. 3–40.
9 Ken
(pdf, pptx)
9/20 Modern Systems: Security Controlling Dynamic Guests in a Virtual Computing Utility. J. Chase, I. Constandache, A. Demberel, L. Grit, V. Marupadi, M. Sayler, A. Yumerefendi. International Conference on the Virtual Computing Initiative (ICVCI 2008), May 2008.

Managing Identity and Authorization for Community Clouds.
Jeff Chase, Prateek Jaipuria, Steve Schwab and Ted Faber. August 17, 2012.
Design of a Role-Based Trust Management Framework. Ninghui Li, John C. Mitchell, and William H. Winsborough. Proceedings of the 2002 IEEE Symposium on Security and Privacy, IEEE Computer Society Press, May 2002, pages 114-130.
10 Tom 9/25 Modern Systems: Security

Note: Ken out of town

Fabric: A Platform for Secure Distributed Computation and Storage. Jed Liu, Michael D. George, K. Vikram, Xin Qi, Lucas Waye, and Andrew C. Myers. In ACM Symposium on Operating Systems Principles (SOSP), 2009. Pages 321-334.


Logical Attestation: An Authorization Architecture For Trustworthy Computing. Emin Gün Sirer, Willem de Bruijn, Patrick Reynolds, Alan Shieh, Kevin Walsh, Dan Williams, and Fred B. Schneider. In Proceedings of the Symposium on Operating Systems Principles, Cascais, Portugal, October 2011.

Wiki: Trusted Platform Module

Wiki: Public Key Certificates

Wiki: X509 Standard
11 Paul/Qi 9/27 Cloud-Scale Storage The Google file system, Sanjay Ghemawat, Howard Gobioff, Shun-Tak Leung. 19th ACM symposium on Operating systems principles (SOSP), October 2003, 29--43.

Finding a Needle in Haystack: Facebook's Photo Storage.
Doug Beaver, Sanjeev Kumar, Harry C. Li, Jason Sobel, and Peter Vajgel.  OSDI 2010.
Rethink the Sync, Edmund B. Nightingale, Kaushik Veeraraghavan, Peter M. Chen, and Jason Flinn. Proceedings of the 7th USENIX Symposium on Operating Systems Design and Implementation (OSDI), November 2006.

Speculative Execution in a Distributed File System, Edmund B. Nightingale, Peter M Chen, Jason Flinn. Proceedings of the 20th ACM Symposium on Operating Systems Principles (SOSP), October 2005, pages 191--205.
12 Qin 10/2 An O/S perspective on networks Congestion Avoidance and Control, Van Jacobson. Appears in Proceedings of ACM SIGCOMM, Volume 18, Number 4, (August 1988).

TCP Congestion Control with a Misbehaving Receiver, Stefan Savage, Neal Cardwell, David Wetherall and Tom Anderson. Appears in ACM SIGCOMM Computer Communication Review, Volume 29, Issue 5 (October 1999), pages 71--78.
RouteBricks: Exploiting Parallelism To Scale Software Routers
Mihai Dobrescu and Norbert Egi, Katerina Argyraki, Byung-Gon Chun, Kevin Fall,
Gianluca Iannaccone, Allan Knies, Maziar Manesh, Sylvia Ratnasamy
22nd ACM Symposium on Operating Systems Principles (SOSP), October 2009

Routers for the Cloud. Can the Internet Achieve 5-Nines Availability? Andrei Agapi, Ken Birman, Robert Broberg, Chase Cotton, Thilo Kielmann, Martin Millnert, Rick Payne, Robert Surton, and Robbert van Renesse. IEEE Internet Computing. Volume 15. Issue 5. pp. 72-77. September/October 2011.
13 Efe
10/4 An O/S perspective on networks

Note: Ken out of town
U-Net: A User-Level Network Interface for Parallel and Distributed Computing, Von Eicken, Basu, Buch and Werner Vogels. 15th SOSP, December 1995.

Active Messages: A Mechanism for Integrated Communication and Control, Thorsten von Eicken, David E. Culler, Seth Copen Goldstein, and Klaus Erik Schauser. In Proceedings of the 19th Annual International Symposium on Computer Architecture, 1992.

Evaluation of the Virtual Interface Architecture (VIA).

Towards an active network architecture, David L. Tennenhouse and David J. Wetherall. Appears in ACM SIGCOMM Computer Communication Review (CCR), Volume 37, Issue 5 (October 2007), pages 81--94.

A Survey of Active Network Research, David L. Tennenhouse, Jonathan M. Smith, W. David Sincoskie, David J. Wetherall, and Gary J. Minden. Appears in IEEE Communications Magazine, Volume 35, Issue 5 (October 1997), pages 80--86.

Fall Break: Oct 6-9
14 Qiming 10/11 An O/S perspective on networks OpenFlow: Enabling innovation in campus networks. Nick McKeown et al. (2008-04).  ACM Communications Review.

Frenetic: A High-Level Language for OpenFlow Networks. Nate Foster, Rob Harrison, Matthew L. Meola, Michael J. Freedman, Jennifer Rexford, and David Walker. In ACM Workshop on Programmable Routers for Extensible Services of Tomorrow (PRESTO), Philadelphia, PA, November 2010.
 Abstractions for Network Update. Mark Reitblatt, Nate Foster, Jennifer Rexford, Cole Schlesinger, and David Walker. In ACM SIGCOMM Conference, Helsinki, Finland, August 2012.
15 Isaac (pdf) 10/16

Ordering and Consistent Cuts

Time, Clocks, and the Ordering of Events in a Distributed System, Lamport. CACM 21(7). July 1978.

Distributed snapshots: Determining global states of distributed systems, Chandy, Lamport. ACM TOCS 3(1), 1985, 63-75.
How Processes Learn. K. Mani Chandy and Jay Misra. In Proceedings of the fourth annual ACM symposium on Principles of distributed computing (PODC '85). ACM, 204-214
16 Bailu (pdf) 10/18 Atomicity Implementing fault-tolerant services using the state machine approach: A tutorial, Fred Schneider. ACM Computing Surveys Volume 22, Issue 4 (December 1990), 299--319.

Sinfonia: A new paradigm for building scalable distributed systems. Marcos K. Aguilera, Arif Merchant, Mehul Shah, Alistair Veitch, Christos Karamanolis. November 2009 Transactions on Computer Systems (TOCS), Volume 27 Issue 3
Using Time Instead of Timeout for Fault-Tolerant Distributed Systems, Lamport. ACM TOPLAS 6:2, 1984.

Dangers of Replication and a Solution, Gray et al. ACM SIGMOD, Jun 1996.
17 Ken
(pdf, pptx)
10/23 Group Communication The Process Group Approach to Reliable Distributed Computing, Birman. CACM, Dec 1993, 36(12):37-53.

Overcoming CAP with Consistent Soft-State Replication. Kenneth P. Birman, D. Freedman, Q. Huang and Patrick Dowell. IEEE Computer Magazine (special issue on “The Growing Impact of the CAP Theorem”). Volume 12. pp. 50-58. February 2012.
CAP Twelve Years Later. Eric Brewer. IEEE Computer Magazine (special issue on “The Growing Impact of the CAP Theorem”). Volume 12. pp. 23-30. February 2012.
Isis2, A modern, open-source, group communication library for cloud computing applications. Ken Birman, 2011.
18 Theo
10/25 Paxos Paxos Made Moderately Complex.  Robbert van Renesse.  Cornell University. March 25, 2011

Wiki: Paxos_Protocol

Paxos Made Simple, Lamport. ACM SIGACT NEWS 32(4). Dec. 2001.
Virtually Synchronous Methodology for Dynamic Service Replication. Ken Birman, Dahlia Malkhi, Robbert van Renesse. Submitted for publication. November 18, 2010. Also available as Microsoft Research TechReport MSR-2010-151.

Wiki: Gbcast_Protocol
19 Paul 10/30 Byzantine Agreement The Byzantine Generals Problem, Lamport et al. ACM TOPLAS 4, 1982.

Easy Impossibility Proofs for Distributed Consensus Problems. Michael J. Fischer, Nancy A. Lynch, Michael Merritt. ACM PODC 1986.

Practical Byzantine Fault Tolerance, Castro and Liskov. 3rd OSDI, Feb 1999.

Randomized Byzantine Generals, Rabin. FOCS, 1983.

Byzantine Quorum Systems, Malkhi and Reiter.

Fault-Scalable Byzantine Fault-Tolerant Services, Michael Abd-El-Malek et al. SOSP 2005.

Zyzzyva: Speculative Byzantine Fault Tolerance. Ramakrishna Kotla, Lorenzo Alvisi, Mike Dahlin, Allen Clement, and Edmund Wong. 2007. In Proceedings of twenty-first ACM SIGOPS symposium on Operating systems principles (SOSP '07). ACM, New York, NY, USA, 45-58.

The Next 700 BFT Protocols. Rachid Guerraoui, Nikola Knezevic, Vivien Quéma and Marko Vukolić. In Proceedings of EuroSys, Paris, France, pp. 363-376, April 2010

20 Dave 11/1 Byzantine Agreement Atomic broadcast: from simple message diffusion to Byzantine agreement.  Flaviu Cristian, Houtan Aghili, Ray Strong, Danny Dolev. Information and Computation 118 (1): 158–179, 1995.  
21 Ken
(pdf, pptx)
11/6 FLP Impossibility of Distributed Consensus with One Faulty Process, Fischer et al. JACM 32(2), Apr 1985.

The weakest failure detector for solving consensus, Chandra et al. J. ACM 43, 4, Jul. 1996.
The Building Blocks of Consensus. Yee Jiun Song, Robbert van Renesse, Fred B. Schneider, Danny Dolev.  The 9th International Conference on Distributed Computing and Networking (ICDCN 08), January, 2008. LNCS, Vol. 4904, pp. 54-72.
22 Ken
(Ymir's pdf, pptx)
11/8 Leveraging Social Network Structures Adaptive Hierarchical clustering of message flows in a multicast data dissemination. Y. Tock, N. Naaman, A. Harpaz, and G. Gershinsky. In IASTED PDCS, 2005.

Dr. Multicast: Rx for Data Center Communication Scalability. Ymir Vigfusson, Hussam Abu-Libdeh, Mahesh Balakrishnan, Ken Birman, Robert Burgess, Haoyuan Li, Gregory Chockler, Yoav Tock. EuroSys, April 2010 (Paris, France). ACM SIGOPS 2010, pp. 349-362.
The structure and function of complex networks. M. E. J. Newman. SIAM Review, 45:167, 2003.

The web as a graph: Measurements, models, and methods. Jon M. Kleinberg, Ravi Kumar, Prabhakar Raghavan, Sridhar Rajagopalan,
and Andrew S. Tomkins. 1999.

Hierarchical clustering of message flows in a multicast data dissemination system.
Yoav Tock, Nir Naaman, Avi Harpaz, Gidon Gershinsky.  17th IASTED ICPDC, Nov. 2005.
23 Qi 11/13 Gossip for Reliable Multicast

Note: Ken out of town
Epidemic algorithms for replicated database maintenance, Alan Demers, Dan Greene, Carl Hauser, Wes Irish, John Larson, Scott Shenker, Howard Sturgis, Dan Swinehart, Doug Terry. Appears in 6th ACM Symposium on Principles of distributed computing (PODC), August 1987, pages 1--12.

Bimodal Multicast, Birman et al. ACM TOCS 17(2), May 1999.
Managing update conflicts in Bayou, a weakly connected replicated storage system, Doug B. Terry, Marvin M. Theimer, Karin Petersen, Alan J. Demers, Mike J. Spreitzer, and Carl H. Hauser. In Proceedings of the 15th ACM Symposium on Operating Systems Principles (SOSP), December 1995, pages 172--182.

T-Man: Fast Gossip-based Construction of Large-Scale Overlay Topologies. Mark Jelasity and Ozalp Babaoglu. Technical Report UBLCS-2004-7. May 2004.
24 Erluo 11/15 Gossip for Scalable Data Mining

Note: Ken out of town

Astrolabe: A Robust and Scalable Technology for Distributed System Monitoring, Management, and Data Mining, Robbert Van Renesse, Kenneth P. Birman, Werner Vogels. Appears in ACM Transactions on Computer Systems (TOCS), Volume 21, Issue 2, May 2003, pages 164--206.



Slicing Distributed Systems. Vincent Gramoli, Ymir Vigfusson, Ken Birman, Anne-Marie Kermarrec, Robbert van Renesse. IEEE Transactions on Computers, Special Issue on Autonomic Network Computing, June 2009.

25 Ayush 11/20

Peer to peer/DHT

Chord: A scalable peer-to-peer lookup service for internet applications, Ion Stoica, Robert Morris, David Karger, M. Frans Kaashoek, and Hari Balakrishnan. Proceedings of the ACM SIGCOMM, August 2001, 149--160. San Diego, California, United States.


Dynamo: Amazon's Highly Available Key-Value Store. Giuseppe DeCandia, Deniz Hastorun, Madan Jampani, Gunavardhan Kakulapati, Avinash Lakshman, Alex Pilchin, Swaminathan Sivasubramanian, Peter Vosshall, and Werner Vogels. 2007. In Proceedings of twenty-first ACM SIGOPS symposium on Operating systems principles (SOSP '07). ACM, New York, NY, USA, 205-220.

The Impact of DHT Routing Geometry on Resilience and Proximity, Krishna Gummadi, Ramakrishna Gummadi, Steve Gribble, Sylvia Ratnasamy, Scott Shenker, Ion Stoica. Appears in Proceedings of ACM SIGCOMM, August 2003, pages 381--3

SplitStream: high-bandwidth multicast in cooperative environments. Castro et al. SOSP 2003.

Overcast: Reliable Multicasting with an Overlay Network, Jannotti et al. 4th OSDI, Dec 2000.

Kelips: Building an Efficient and Stable P2P DHT Through Increased Memory and Background Overhead, Indranil Gupta, Ken Birman, Prakash Linga, Al Demers and Robbert van Renesse. 2nd International Workshop on Peer-to-Peer Systems (IPTPS '03); February 20-21, 2003. Claremont Hotel, Berkeley, CA, USA.

Thanksgiving Holiday: Nov 21-25
26 Ken 11/27 Self Stabilization Self-stabilizing systems in spite of distributed control.  Edsger Dijkstra, Communications of the ACM 17 (11): 643–644.

Fast Self-Stabilizing Byzantine Tolerant Digital Clock Synchronization. Michael Ben-Or, Danny Dolev, Ezra N. Hoch. ACM Principles of Distributed Computing (PODC), July 2008.

Self-Stabilization.  Dolev, Shlomi (2000), MIT Press, ISBN 0-262-04178-2

Self-stabilizing Byzantine Digital Clock Synchronization.  
Ezra N. Hoch, Danny Dolev and Ariel Daliot.  2008.  In Ajoy Kumar Datta and Maria Gradinariu, editors, SSS, volume 4280 of Lecture Notes in Computer Science.




Final Presentations: Demo Day!

Questions or comments? email

Policy on academic integrity

Ken Birman
Last modified: Fri Dec 9 14:34:25 EST 2011