Cornell Systems Lunch, Fall 2017

The Systems Lunch is a seminar for discussing recent, interesting papers in the systems area, broadly defined to span operating systems, distributed systems, networking, architecture, databases, and programming languages. The goal is to foster technical discussions among the Cornell systems research community. We meet once a week on Fridays at noon in Gates 114.

The Systems Lunch is open to all Cornell Ph.D. students interested in systems. First-year graduate students are especially welcome. Non-Ph.D. students must obtain permission from the instructor. Student participants are expected to sign up for CS 7490, Systems Research Seminar, for one credit.

To join the Systems Lunch mailing list, please send an empty message to cs-systems-lunch-l-request@cornell.edu with the subject line "join". More detailed instructions can be found here.

Links to papers and abstracts below are unlikely to work outside the Cornell CS firewall. If you have trouble viewing them, this is the likely cause.

Date	Paper	Presenter
August 25	The Stellar Consensus Protocol: A Federated Model for Internet-level Consensus David Mazieres	Isaac Sheff
September 1	REM: Resource-Efficient Mining for Blockchains Fan Zhang, Ittay Eyal, Robert Escriva, Ari Juels, and Robbert van Renesse Usenix Security 2017	Fan Zhang
September 8	Quickr: Lazily Approximating Complex Ad-Hoc Queries in Big Data Clusters Srikanth Kandula, Anil Shanbhag, Aleksandar Vitorovic, Matthaios Olma, Robert Grandl, Surajit Chaudhuri, Bolin Ding SIGMOD 2016	Ayush Dubey
September 15	What (New) Bugs Live in the Cloud? As more data and computation move from local to cloud environments, datacenter distributed systems have become a dominant backbone for many modern applications. However, the complexity of cloud-scale hardware and software ecosystems has outpaced existing testing, debugging, and verification tools. I will describe three new classes of bugs that often appear in large-scale datacenter distributed systems: (1) distributed concurrency bugs, caused by non-deterministic timings of distributed events such as message arrivals as well as multiple crashes and reboots; (2) limpware-induced performance bugs, design bugs that surface in the presence of "limping" hardware and cause cascades of performance failures; and (3) scalability bugs, latent scale-dependent bugs that typically surface only in large-scale deployments (100+ nodes) and not necessarily in small- or medium-scale deployments. The findings above are based on our long, large-scale cloud bug study (3000+ bugs) and cloud outage study (500+ outages). I will present some of our work in understanding and combating distributed concurrency bugs, mainly focusing on our semantic-aware implementation-level model checking (SAMC) and taxonomy of distributed concurrency bugs (TaxDC). If time permits, I will also briefly discuss limpware and scalability bugs. Haryadi Gunawi is a Neubauer Family Assistant Professor in the Department of Computer Science at the University of Chicago where he leads the UCARE research group (UChicago systems research on Availability, Reliability, and Efficiency). He received his Ph.D. in Computer Science from the University of Wisconsin, Madison in 2009. He was a postdoctoral fellow at the University of California, Berkeley from 2010 to 2012. His current research focuses on cloud computing reliability and new storage technology. He has won numerous awards, including an NSF CAREER Award, an NSF Computing Innovation Fellowship, a Google Faculty Research Award, NetApp Faculty Fellowships, and an Honorable Mention for the 2009 ACM Doctoral Dissertation Award.	Haryadi Gunawi (University of Chicago)
September 22	vCorfu: A Cloud-Scale Object Store on a Shared Log Michael Wei, University of California, San Diego, and VMware Research Group; Amy Tai, Princeton University and VMware Research Group; Christopher J. Rossbach, The University of Texas at Austin and VMware Research Group; Ittai Abraham, VMware Research Group; Maithem Munshed, Medhavi Dhawan, and Jim Stabile, VMware; Udi Wieder and Scott Fritchie, VMware Research Group; Steven Swanson, University of California, San Diego; Michael J. Freedman, Princeton University; Dahlia Malkhi, VMware Research Group NSDI 2017	Youer Pu
September 29	ViewMap: Sharing Private In-Vehicle Dashcam Videos Minho Kim, Jaemin Lim, Hyunwoo Yu, Kiyeon Kim, Younghoon Kim, and Suk-Bok Lee, Hanyang University NSDI 2017	Edward Tremel
October 6	TensorFlow: A System for Large-Scale Machine Learning Martín Abadi et al. OSDI 2016	Matthew Milano
October 13	Sub-millisecond Stateful Stream Querying over Fast-evolving Linked Data Yunhao Zhang, Rong Chen, Haibo Chen (Shanghai Jiao Tong University) SOSP 2017	Yunhao Zhang
October 20	Building Automation Systems for the Enterprise Despite advances in computer science, your typical large-scale enterprise company still runs primarily on "carbon" -- large numbers of human workers running the core business functions. This carbon workforce consists of millions of people worldwide performing manual, repetitive, and (nearly) deterministic tasks on a daily basis that in many cases a computer is much better equipped to perform. In this talk, we will first discuss why enterprise companies (even "tech giants" in the Fortune 50) are still so far behind the technology curve. With this understanding, we will then discuss how advances in computer science can help companies catch up by shifting this work to "silicon." Next, we will describe technical and systems challenges that arise when building complex automation systems that are deployed in client environments.	George Nychis (Soroco)
October 27	Programmable Topologies Fiber optic cables are the workhorses of today’s Internet services. Operators spend millions of dollars to purchase, lease and maintain their optical backbone, making the efficiency of fiber essential to their business. In this talk, I will make a case for programmable topologies. ProjecToR [SIGCOMM’16] is a programmable data center interconnect that uses free-space optics between racks. Our design enables all rack-pairs to communicate via direct links. We use a digital micromirror device (DMD) and mirror assembly combination as a transmitter and a photodetector on top of the rack as a receiver. We built a prototype that points to the feasibility of our approach. Simulations and analysis show that, for realistic data center workloads, it can improve mean flow completion time by 30-95%, while reducing cost by 25-40%. Next, I will present the results of the first-ever large-scale study of the performance of optical links in a backbone network carrying live traffic [HotNets’17]. Our data-driven analysis coupled with simulations showed that existing fiber deployment can be driven towards much greater efficiency by enabling programmable modulations. For example, 99% of Microsoft’s 100 Gbps channels can be augmented to 150 Gbps, by simply changing the modulation formats at the two ends without touching the fiber or intermediate amplifiers. Even better, 43% can double their capacity and carry up to 200 Gbps. This way, using the same fiber paths, we get more bits, less space, and less power. This project has moved the industry into adopting bandwidth variable transponders in the WAN. Monia Ghobadi is a researcher at Microsoft Research, Redmond, WA. Her research interests include all aspects of networked systems. Currently, she leads the optical networking research at the Redmond lab. Her past work spans data center congestion control, RDMA, software-defined networks, and network measurement. This year, she was recognized as one of the N2Women rising stars in networking and communications. She received her Ph.D. from the University of Toronto and worked at Google’s data center team before joining Microsoft Research. Many of the technologies that she has helped develop are part of real-world systems at Microsoft and Google. Her papers have won best dataset award (IMC 2016), Google research excellent paper award (USENIX ATC 2012), and best paper award (IMC 2008).	Monia Ghobadi
November 3	Why Your Encrypted Database Is Not Secure Paul Grubbs, Thomas Ristenpart, Vitaly Shmatikov HotOS 2017	Paul Grubbs
November 10	Timely, Reliable, and Cost-Effective Internet Transport Service using Dissemination Graphs Amy Babay, Emily Wagner, Michael Dinitz, and Yair Amir ICDCS 2017	Amy Babay (JHU)
November 17	ACSU Luncheon, no meeting.
November 24	Thanksgiving Break, no meeting.
December 1	Programming with People Humans can perform many tasks with ease that remain difficult or impossible for computers. Crowdsourcing platforms like Amazon's Mechanical Turk make it possible to harness human-based computational power on an unprecedented scale. However, their utility as a general-purpose computational platform remains limited. The lack of complete automation makes it difficult to orchestrate complex or interrelated tasks. Scheduling human workers to reduce latency costs real money, and jobs must be monitored and rescheduled when workers fail to complete their tasks. Furthermore, it is often difficult to predict the length of time and payment that should be budgeted for a given task. Finally, the results of human-based computations are not necessarily reliable, both because human skills and accuracy vary widely, and because workers have a financial incentive to minimize their effort. This talk presents AutoMan, the first fully automatic crowdprogramming system. AutoMan integrates human-based computations into a standard programming language as ordinary function calls, which can be intermixed freely with traditional functions. This abstraction allows AutoMan programmers to focus on their programming logic. An AutoMan program specifies a confidence level for the overall computation and a budget. The AutoMan runtime system then transparently manages all details necessary for scheduling, pricing, and quality control. AutoMan automatically schedules human tasks for each computation until it achieves the desired confidence level; monitors, reprices, and restarts human tasks as necessary; and maximizes parallelism across human workers while staying under budget.	Emery Berger (UMass)

Cornell Systems Lunch

CS 7490 Fall 2017
Friday 12PM, Gates 114

Emin Gun Sirer and Robbert van Renesse
