Next-generation Storage I/O Stack
Two recent trends in remote storage access:
- High-performance storage devices (NVMe SSDs) and high-speed networks (40Gbps and beyond)
- Disaggregated storage
Implications?
- Performance bottlenecks move from the hardware to the software stack.
- Increasing overlap between the storage and network data planes.
Our goal: Designing a new storage I/O stack
- Achieving high throughput comparable to state-of-the-art NVMe-over-RDMA.
- Supporting multiple tenants that have different requirements.
Problems
1. Low remote block I/O throughput
What is the current status?
- Traditional iSCSI protocol requires:
- 14 CPU cores to saturate a 1M IOPS SSD.
- 56 CPU cores to saturate a 100Gbps link.
- User-level approaches require:
- Changes to applications and/or the network.
- NVMe-over-Fabrics:
- (RDMA) Requires changes to the network infrastructure.
- (TCP) Suffers from low performance.
Technical challenge: “Can we push the performance bottlenecks back to the hardware without modifying applications or networks?”
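As a back-of-the-envelope view of the first number: 1M IOPS spread over 14 cores is roughly 71K IOPS per core, i.e., about 14 µs of CPU time per request spent in protocol and block-layer processing. In a software transport such as NVMe/TCP, much of that per-request work is building and parsing protocol data units (PDUs), plus digests and data copies. The minimal C sketch below shows the 8-byte common header that prefixes every NVMe/TCP PDU (the field layout follows the transport's common header: PDU type, flags, header length, data offset, PDU length); the parsing helper around it is a hypothetical illustration, not code from this project.

    /* Sketch: the 8-byte common header that prefixes every NVMe/TCP PDU.
     * Each remote I/O involves building and parsing several such PDUs
     * (command capsule, data, response), plus optional digests and data
     * copies; this is where the per-core CPU time goes.
     * Hypothetical illustration only. */
    #include <stdint.h>
    #include <stddef.h>
    #include <string.h>

    struct nvme_tcp_common_hdr {
        uint8_t  type;   /* PDU type: ICReq, CapsuleCmd, C2HData, R2T, ... */
        uint8_t  flags;  /* e.g., header/data digest enabled */
        uint8_t  hlen;   /* header length */
        uint8_t  pdo;    /* PDU data offset */
        uint32_t plen;   /* total PDU length, little-endian on the wire */
    };

    /* Hypothetical helper: report how many bytes the next PDU occupies
     * in a received byte stream, or 0 if the header has not arrived yet. */
    static size_t next_pdu_len(const uint8_t *buf, size_t avail)
    {
        struct nvme_tcp_common_hdr hdr;

        if (avail < sizeof(hdr))
            return 0;
        memcpy(&hdr, buf, sizeof(hdr));
        return hdr.plen;  /* assumes a little-endian host for brevity */
    }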
2. Multi-tenancy support
How to support various tenants’ requirements?
Each host/target server can have multiple tenants with different target devices, requirements, resources, and so on.
- (As-Is) The current storage I/O stack creates per-core queues, and each I/O request takes a static data path to the target device on each core.
- (To-Be) Each request should be able to change its data path dynamically at low cost, in order to satisfy all tenants’ requirements (e.g., throughput, latency); see the sketch below.
Technical challenge: “What would be the best abstraction to support multi-tenancy at the block device layer?”
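To make the To-Be concrete, one can imagine each request carrying its tenant’s policy and the block layer choosing a data path (a local NVMe queue, an RDMA queue pair, a TCP connection, ...) at submission time, instead of inheriting the fixed per-core path. The C sketch below is purely hypothetical; all names (tenant_policy, data_path, pick_path) are invented for illustration and do not describe this project’s actual design.

    /* Hypothetical sketch of a multi-tenant block-layer abstraction:
     * every request carries its tenant's policy, and the data path is
     * chosen per request instead of being fixed per core.
     * All names are invented for illustration. */
    #include <stdint.h>
    #include <stddef.h>

    enum slo_class { SLO_LATENCY, SLO_THROUGHPUT, SLO_BEST_EFFORT };

    struct data_path {
        const char *name;   /* e.g., "local-nvme", "rdma-qp0", "tcp-conn0" */
        int  (*submit)(struct data_path *dp, const void *req);
        uint32_t inflight;  /* requests currently outstanding on this path */
    };

    struct tenant_policy {
        enum slo_class slo;     /* what this tenant cares about most */
        uint32_t max_inflight;  /* crude per-path throttle for this tenant */
    };

    struct io_request {
        uint64_t lba;
        uint32_t len;
        const struct tenant_policy *tenant;
    };

    /* Per-request path selection: latency-sensitive tenants get the
     * least-loaded path; everyone else takes the first path with
     * spare capacity under the tenant's throttle. */
    static struct data_path *pick_path(struct data_path **paths, size_t n,
                                       const struct io_request *req)
    {
        struct data_path *best = NULL;

        for (size_t i = 0; i < n; i++) {
            if (paths[i]->inflight >= req->tenant->max_inflight)
                continue;
            if (req->tenant->slo != SLO_LATENCY)
                return paths[i];
            if (!best || paths[i]->inflight < best->inflight)
                best = paths[i];
        }
        return best;
    }

The point of the sketch is only that path selection becomes a per-request, policy-driven decision; making that decision cheap enough for million-IOPS workloads is exactly the challenge posed above.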
Download
(To be updated..)