Cornell Systems Lunch
CS 7490 Fall 2021
The Systems Lunch is a seminar for discussing recent, interesting papers in the systems area, broadly defined to span operating systems, distributed systems, networking, architecture, databases, and programming languages. The goal is to foster technical discussions among the Cornell systems research community. We meet once a week on Fridays at 11:40 AM in Gates 114.
The Systems Lunch is open to all Cornell Ph.D. students interested in systems. First-year graduate students are especially welcome. Non-Ph.D. students have to obtain permission from the instructor. Student participants are expected to sign up for CS 7490, Systems Research Seminar, for one credit.
Links to papers and abstracts below are unlikely to work outside the Cornell CS firewall. If you have trouble viewing them, this is the likely cause.
|August 27||i10 and blk-switch: Rearchitecting Linux Storage Stack for Low Latency and High Throughput. (video)
There is a widespread belief in the community that it is not possible to achieve µs-scale tail latency when using the Linux kernel stack. The two most frequently cited arguments are (1) Linux has high CPU overheads; and (2) the resource multiplexing principle is so firmly entrenched in Linux that its performance stumbles when multiple applications compete for host resources. I will demonstrate that the above belief may be misplaced, at least for storage-based applications. In particular, I will present a new Linux storage stack architecture that achieves µs-scale latency, even when tens of applications compete for host resources while performing operations at throughput close to hardware capacity. Such performance can be achieved without any modification to applications, network hardware, kernel CPU schedulers and/or kernel network stack. More details in my NSDI '20 and OSDI '21 papers.
|Jaehyun Hwang (Cornell)|
|September 3||Avenir: Managing Data Plane Diversity with Control Plane Synthesis
The classical conception of software-defined networking (SDN) is based on an attractive myth: a logically centralized controller manages a collection of homogeneous data planes. In reality, however, SDN control planes must deal with significant diversity in hardware, drivers, interfaces, and protocols, all of which contribute to idiosyncratic differences in forwarding behavior that must be dealt with by hand. To manage this heterogeneity, we propose Avenir, a synthesis tool that automatically generates control-plane operations to ensure uniform behavior across a variety of data planes. Our approach uses counter-example guided inductive synthesis and sketching, adding network-specific optimizations that exploit domain insights to accelerate the search. We prove that Avenir’s synthesis algorithm generates correct solutions and always finds a solution, if one exists. We have built a prototype implementation of Avenir using OCaml and Z3 and evaluated its performance on realistic scenarios for the ONOS SDN controller and on a collection of benchmarks that illustrate the cost of retargeting a control plane from one pipeline to another. Our evaluation demonstrates that Avenir can manage data plane heterogeneity with modest overheads.
|Eric Hayden Campbell (Cornell)|
|September 17||Rethinking Our Future Systems in the New Era of Computer Architecture (video)
In this current data-centric era, data generated by social media, video sharing applications, swarms of sensors, and autonomous cars is growing exponentially. Unfortunately, as technology scaling slows down, the semiconductor industry has been facing a major challenge in providing better performance while processing such large datasets. As a result, we need to innovate how we design our systems to sustain the demand for computing over exponentially growing datasets. However, the fundamental model of computing has not changed over many decades. In our current von Neumann model, data sits in the slower persistent storage and is moved back and forth to faster memory for computation by the processor. In this talk, I will present my vision to rethink our current computing model with the advancement of new hardware. I will focus on redefining the hardware and system stack to unify memory and storage with persistent memory.
Samira Khan is an Assistant Professor at the University of Virginia (UVa). Currently, she is also visiting Google Brain. The goal of her research group, ShiftLab, is to introduce a paradigm shift in the current computing system by fundamentally rethinking our current processor-centric computing model. Her group puts a significant effort into building new tools, artifacts, and frameworks for emerging technologies. She hosts “Happy Hour with Architects”, where prominent people from academia and industry discuss and debate research trends and directions in computer architecture and systems.
|Samira Khan (University of Virginia)|
|September 24||Over-provisioned WANs for next-generation services (video)
The last decade has seen a large-scale commercialization of cloud computing and the emergence of global cloud providers. Cloud providers have rapidly expanded their datacenter deployments, network equipment and backbone capacity, preparing their infrastructure to meet the growing client demands. In this talk, I will re-examine the design and operation choices made by cloud providers in this phase of exponential growth using a cross-layer empirical analysis of the wide-area network (WAN) of a large commercial cloud provider. First, I will demonstrate how the knowledge of optical signal quality can enable traffic engineering systems to harness 75% more capacity from 80% of the optical wavelengths in the cloud backbone. Second, I will present the opportunity to minimize the hardware costs of provisioning long-haul WAN capacity by optically bypassing network hops where conversion of signals from optical to electrical domain is unnecessary and uneconomical. Identifying and fixing these inefficiencies in the operation of today's cloud networks is crucial for enabling next-generation cloud services.
Rachee Singh (http://www.racheesingh.com/) is a senior researcher in the office of the CTO at Azure for Operators. Before this, she was a researcher in the Mobility and Networking group of Microsoft Research, Redmond. Her research interests are in computer networking with a focus on wide area network performance and monitoring. She has a PhD in Computer Science from the University of Massachusetts, Amherst and is a recipient of the Google PhD fellowship in Systems and Networking. Recently, she was named a rising star in computer networking by N2Women and a rising star in EECS by UC Berkeley. In a previous life, she developed routing protocol features for Ethernet switches at Arista Networks.
|Rachee Singh (MSR)|
|October 1||Towards CPU-free Remote Procedure Calls in Datacenters (video)
With the increasing shift of cloud applications towards microservices and interactive workloads, efficient and fast networking has become one of the key requirements in modern data center systems. Recent proposals of high-efficiency and low-latency networking using optimized user-space stacks and specialized adapters (e.g. RDMA, FPGA, SmartNICs) have already been deployed by the major cloud providers. While these advances offer dramatic reduction of communication overheads in cloud applications and improve their overall efficiency, they still encounter certain performance penalties and often suffer from poor programmability/flexibility and lack of abstractions. In this work, we present Dagger -- a further extension of specialized programmable networking adapters designed specifically to offload end-to-end cloud RPC stacks to reconfigurable hardware. In contrast to previous proposals, our programmable FPGA-based NIC features full networking offload up to the application layer, reconfigurability, and close coupling with the host processor over a memory interconnect instead of the conventional PCIe bus. We show that the combination of these three principles improves end-to-end latency, throughput, and CPU efficiency of cloud RPC stacks under the networking footprints of today's workloads while providing the same level of flexibility and abstraction as existing mainstream RPC systems based on software-only implementations.
Nikita Lazarev is a third-year PhD student in the School of Electrical and Computer Engineering at Cornell University, under the supervision of Profs. Christina Delimitrou and Zhiru Zhang. His research interests lie at the intersection of computer hardware and systems with applications in distributed systems and networking. His recent research focuses on efficient datacenter networking enabled by reconfigurable in-network hardware. Nikita obtained his undergraduate degree in Electrical Engineering from Bauman Moscow State Technical University and his Master’s degree in Computer Science from EPFL. In the past, he interned at Microsoft Research India, Microsoft Research Cambridge, and Microsoft Research Redmond, where he worked on FPGA-enabled low-precision ML, CPU-free datastore systems, and cloud native 5G networks, respectively.
|Nikita Lazarev (Cornell)|
|October 8||Formal Support for the POSIX Shell (video)
The POSIX shell is a widely deployed, powerful tool for managing computer systems. The shell is the expert’s control panel, a necessary tool for configuring, compiling, installing, maintaining, and deploying systems. Even though it is powerful, critical infrastructure, the POSIX shell is maligned and misunderstood. Its power and its subtlety are a dangerous combination. How can we support the POSIX shell? I'll describe two recent lines of work: Smoosh, a formal, mechanized, executable small-step semantics for the POSIX shell, and ffs, a tool for helping users manipulate semi-structured data (like JSON and YAML) in the shell. I'll also discuss ongoing work on PaSh with Konstantinos Kallas, Nikos Vasilakis, and others.
Michael Greenberg is an assistant professor at the Stevens Institute of Technology, having recently moved from Pomona College. He received BAs in Computer Science and Egyptology from Brown University (2007) and his PhD in Computer Science from the University of Pennsylvania (2013). He was born in Ithaca.
|Michael Greenberg (Stevens Institute of Technology)|
|October 15||CockroachDB's Query Optimizer (video)
We live in an increasingly interconnected world, with many organizations operating across countries or even continents. To serve their global user base, organizations are replacing their legacy DBMSs with cloud-based systems capable of scaling OLTP workloads to millions of users. CockroachDB is a scalable SQL DBMS that was built from the ground up to support these global OLTP workloads while maintaining high availability and strong consistency. Just like its namesake, CockroachDB is resilient to disasters through replication and automatic recovery mechanisms. In this talk, I'll give a brief introduction to the architecture of CockroachDB followed by a deep dive into the design and implementation of CockroachDB's query optimizer. CockroachDB has a Cascades-style query optimizer that uses over 200 transformation rules to explore the space of possible query execution plans. In this talk, I'll describe the domain-specific language, Optgen, that we use to define these transformation rules, and demonstrate how the rules work in action. I'll explain how we use statistics to choose the best plan from the search space, and how we automatically collect stats without disrupting production workloads or requiring coordination between nodes. I'll also describe some of the unique challenges we face when optimizing queries for a geo-distributed environment, and how CockroachDB handles them.
Becca is a Staff Engineer at Cockroach Labs where she is the Tech Lead of the SQL Queries team. Prior to joining Cockroach Labs, she was a graduate student at MIT, where she worked with Professor Michael Stonebraker researching distributed database elasticity and multi-tenancy. Becca holds a B.S. in Physics from Yale University and an M.S. and Ph.D. in Computer Science from MIT. In her free time, she enjoys rowing on the Chicago River and spending time in the great outdoors.
|Rebecca Taft (MIT, Cockroach Labs)|
|October 22||Toward Intrusion-Tolerant Critical Infrastructure (video)
As critical infrastructure systems are becoming increasingly exposed to malicious attacks, it is crucial to ensure that they can withstand sophisticated attacks while continuing to operate correctly and at their expected level of performance. In this talk, I will present our work toward making intrusion-tolerant critical infrastructure systems possible and practical. I will start by discussing our Spire system, the first Supervisory Control and Data Acquisition (SCADA) system for the power grid that is resilient to both system-level compromises and sophisticated network-level attacks. Spire uses Byzantine-fault-tolerant replication with performance guarantees under attack, proactive recovery, and diversity to overcome system-level compromises of the SCADA master, and employs an intrusion-tolerant network service combined with a novel multi-site deployment framework to overcome network-level attacks. Then, I will present our recent work developing a practical deployment path for Spire and similar BFT-based systems through a new model for "intrusion tolerance as a service". The intrusion-tolerance-as-a-service model enables critical infrastructure operators to gain the resilience benefits of intrusion tolerance, while offloading significant parts of the system management to a service provider. Critically for practical acceptance, our work shows how these benefits can be achieved without requiring critical infrastructure operators to expose confidential or proprietary data and algorithms to the service provider.
|Amy Babay (University of Pittsburgh)|
|October 29||No Lecture -- SOSP Session
|November 5||In-network Resource Management for Disaggregated Datacenters (video)
Over the last few years, significant improvements in inter-server network performance, coupled with stagnating intra-server interconnect performance, have driven advances in data center resource disaggregation. Disaggregation promises better resource utilization, support for hardware heterogeneity, and complete resource elasticity for applications, but actualizing these benefits while ensuring application performance requires operating system (OS) support. Unfortunately, existing approaches expose a hard tradeoff between application performance on one hand and resource elasticity on the other. Our driving vision is a fundamentally new network-centric design for the disaggregated OS — one that places resource management and access functionality in the data center network fabric to break the above tradeoff. In this talk, I will present our approach to realize this vision, with a focus on the first step: MIND, an in-network memory management unit (MMU) for rack-scale disaggregated architectures. MIND demonstrates that emerging programmable network switches can enable an efficient shared memory abstraction for disaggregated architectures by placing memory management logic in the network fabric, achieving transparent resource elasticity while matching the performance of prior memory disaggregation proposals for real-world workloads. I will also talk about what lies ahead in realizing our network-centric OS for disaggregated architectures.
Anurag Khandelwal is an Assistant Professor of Computer Science at Yale University, where his group focuses broadly on problems in computer systems and networks. Prior to joining Yale, he spent a semester as a post-doc at Cornell, working with Tom Ristenpart and Rachit Agarwal. He received his PhD from UC Berkeley, where he was advised by Ion Stoica. He is the recipient of the NSF CAREER award, NetApp faculty fellowship and a distinguished paper award at USENIX Security’20.
|Anurag Khandelwal (Yale University)|
|November 12||Bao: Making Learned Query Optimization Practical (video)
Recent efforts applying machine learning techniques to query optimization have shown few practical gains due to substantive training overhead, inability to adapt to changes, and poor tail performance. Motivated by these difficulties, we introduce Bao (the Bandit optimizer). Bao takes advantage of the wisdom built into existing query optimizers by providing per-query optimization hints. Bao combines modern tree convolutional neural networks with Thompson sampling, a well-studied reinforcement learning algorithm. As a result, Bao automatically learns from its mistakes and adapts to changes in query workloads, data, and schema. Experimentally, we demonstrate that Bao can quickly learn strategies that improve end-to-end query execution performance, including tail latency, for several workloads containing long-running queries. In cloud environments, we show that Bao can offer both reduced costs and better performance compared with a commercial system.
Victor is a first-year PhD student working on query optimization for polystores.
|Victor Giannakouris (Cornell University)|
|November 19||Syrup: User-Defined Scheduling across the Stack (video)
Suboptimal scheduling decisions in operating systems, networking stacks, and application runtimes are often responsible for poor application performance, including higher latency and lower throughput. These poor decisions stem from a lack of insight into the applications and requests the scheduler is handling and a lack of coherence and coordination between the various layers of the stack, including NICs, kernels, and applications. We propose Syrup, a framework for user-defined scheduling. Syrup enables untrusted application developers to express application-specific scheduling policies across these system layers without being burdened with the low-level system mechanisms that implement them. Application developers write a scheduling policy with Syrup as a set of matching functions between inputs (threads, network packets, network connections) and executors (cores, network sockets, NIC queues) and then deploy it across system layers without modifying their code. Syrup supports multi-tenancy, as multiple co-located applications can each safely and securely specify a custom policy. We present several examples of uses of Syrup to define application- and workload-specific scheduling policies in a few lines of code, deploy them across the stack, and improve performance by up to 8x compared with default policies.
Kostis Kaffes is a final-year Ph.D. candidate in Electrical Engineering at Stanford University, advised by Christos Kozyrakis. He is broadly interested in computer systems, cloud computing, and scheduling. His thesis focuses on end-host, rack-scale, and cluster-scale scheduling for microsecond-scale tail latency. Recently, he has been looking for ways to make it easier to implement and deploy custom scheduling policies across different layers of the stack. Kostis's research has been supported by a Facebook Research Award and various scholarships and fellowships from Stanford, A.G. Leventis Foundation, and Gerondelis Foundation. Prior to Stanford, he received his undergraduate degree in Electrical and Computer Engineering from the National Technical University of Athens in Greece.
|Kostis Kaffes (Stanford)|
|November 26||No Lecture -- Thanksgiving
|December 3||Compiler Infrastructure for Accelerator Generators
Specialized, application-specific hardware accelerators are chipping away at the dominance of traditional, general-purpose CPUs. We need to make it possible for domain experts—not just hardware experts—to harness the efficiency of hardware specialization for the computations they care about. The tools we have to design custom accelerators operate at the level of gates, wires, and clock cycles, and are preventing acceleration from going mainstream. Domain-specific languages (DSLs) for building hardware accelerators offer a way to raise the level of abstraction in hardware design. Unfortunately, building a new hardware DSL is a gargantuan task requiring not only the design of new abstractions, but also supporting tools such as an optimizing compiler, testing and debugging infrastructure, etc. Our solution to these problems is Calyx, an intermediate language and a compiler infrastructure that can represent, optimize, and lower accelerators to synthesizable hardware designs. By targeting Calyx instead of a traditional hardware design language, designers can build new DSLs and generate a custom hardware accelerator in a matter of hours.
Rachit Nigam is a PhD candidate at Cornell University interested in programming languages, computer architectures, and compilers that turn programs into architectures.
|Rachit Nigam (Cornell)|