Networks that never drop packets
Need for Lossless Networks
- High Performance w/ Low CPU Overhead: At today's bandwidths (40-100 Gbps), the OS has become a bottleneck. Lossless networks need a much simpler transport and make kernel-bypass hardware offload easier, providing high throughput and ultra-low latency at minimal CPU overhead.
- Eliminating Large & Unexpected (Tail) Latencies: Packet drops and large, unbounded queueing hurt latency and especially degrade interactive applications, with tail latencies that can be orders of magnitude larger than the median.
- Enabling Next-Gen Infrastructure: Reliable low latency and high throughput, enabled by lossless fabrics, are essential to realizing ongoing datacenter trends such as high-speed remote I/O, remote memory, and resource disaggregation.
Existing Lossless Mechanisms Are Insufficient
- Distributed Schemes – Credit-based flow control (InfiniBand, QuickPath, PCIe, etc.) and PFC
  - Scalable to datacenter topologies
  - No throughput guarantees, can lead to congestion collapse
  - Other known associated problems (HOL blocking, deadlocks, congestion spreading, etc.)
- Centralized Scheme – Fastpass
  - Worst-case throughput guarantees
  - Not scalable
The goal of this project is to design network fabrics (and end-host stacks) for datacenter topologies that guarantee, for arbitrary input workloads:
- Zero packet drops in the network;
- Near-optimal network utilization; and
- A scalable, decentralized design.
Current Problems
Designing network fabrics with the following properties:
1. Bounded Queueing w/ Throughput Guarantees
- Loco logically decomposes a tree topology into multiple single switches, each scheduled independently
- Scheduling is performed at each logical switch by computing graph matchings, which provides near-optimal utilization
- Clean slate design, implemented using FPGAs
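
To make the matching-based scheduling idea concrete, here is a minimal sketch of scheduling one time slot at a single logical switch. It is not Loco's algorithm, which this page does not specify: it assumes a plain greedy maximal matching, and the names `greedy_matching` and `demand` are hypothetical. Each input port and each output port is used at most once per slot, so no output is oversubscribed.

```python
from typing import Dict, List, Tuple

Port = int
Pair = Tuple[Port, Port]  # (input port, output port)


def greedy_matching(demand: Dict[Pair, int]) -> List[Pair]:
    """Pick a maximal set of (input, output) pairs that share no port.

    `demand` maps (input, output) to the number of queued packets.
    This is a greedy heuristic (assumption), not an optimal matching.
    """
    used_in, used_out = set(), set()
    matching: List[Pair] = []
    # Serve the largest backlog first; break ties by port ids so the
    # result is deterministic.
    ordered = sorted(demand.items(), key=lambda kv: (-kv[1], kv[0]))
    for (inp, out), backlog in ordered:
        if backlog > 0 and inp not in used_in and out not in used_out:
            matching.append((inp, out))
            used_in.add(inp)
            used_out.add(out)
    return matching


# Hypothetical backlog at a 3-input, 2-output logical switch.
demand = {(0, 0): 3, (0, 1): 1, (1, 0): 2, (1, 1): 0, (2, 1): 4}
schedule = greedy_matching(demand)
print(schedule)
```

A greedy maximal matching is cheap to compute but can be smaller than a maximum matching; a real scheduler would choose the matching algorithm to suit its utilization target.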

2. Zero Queueing w/ Throughput Guarantees
- DZQ provides even stronger guarantees – deterministic zero-queueing in network switches
- Admission control performed using techniques from graph matching and edge-coloring
- Online, fully distributed mechanism, implementable using available programmable switches
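
The link between edge-coloring and zero queueing can be illustrated with a small sketch. It is not DZQ's mechanism, which this page does not detail: it assumes a centralized greedy edge coloring of the bipartite sender/receiver graph, and the names `assign_slots` and `requests` are hypothetical. Each color is a time slot, and because no two requests that share a sender or a receiver get the same slot, no port ever sees two packets at once.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def assign_slots(requests: List[Tuple[str, str]]) -> Dict[int, int]:
    """Greedy edge coloring: map each request index to a time slot.

    Each request is a (sender, receiver) pair. A request gets the
    smallest slot in which both its sender and receiver are idle.
    Greedy coloring (assumption) may use more slots than the optimum.
    """
    busy_src = defaultdict(set)  # sender -> slots already taken
    busy_dst = defaultdict(set)  # receiver -> slots already taken
    slots: Dict[int, int] = {}
    for idx, (src, dst) in enumerate(requests):
        slot = 0
        while slot in busy_src[src] or slot in busy_dst[dst]:
            slot += 1
        slots[idx] = slot
        busy_src[src].add(slot)
        busy_dst[dst].add(slot)
    return slots


# Hypothetical workload: two senders, two receivers, all-to-all.
requests = [("A", "X"), ("A", "Y"), ("B", "X"), ("B", "Y")]
slots = assign_slots(requests)
print(slots)
```

Admitting a packet only in its assigned slot is what keeps switch queues empty in this toy model; doing the same online and without a central coordinator is the harder problem the bullets above describe.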

3. Zero Queueing w/ Throughput Guarantees w/ Commodity Hardware
We are currently working on ensuring deterministic zero-queueing without any support from the network.
Download
git clone https://github.com/sakshamagarwals/lossless
(Will be available here soon.)