Next-generation Storage I/O Stack

Two recent trends in remote storage access:

  • High-performance storage devices (NVMe SSDs) and high-speed networks (40Gbps and beyond)
  • Disaggregated storage

Implications?

  • Performance bottlenecks are pushed back to the software stack.
  • A growing overlap between the storage and network data planes.

Our goal: Designing a new storage I/O stack

  • Achieving high throughput comparable to state-of-the-art NVMe-over-RDMA.
  • Supporting multiple tenants with different requirements.

Problems

1. Low remote block I/O throughput

What is the current status?

  • The traditional iSCSI protocol requires (see the back-of-the-envelope calculation after this list):
    - 14 CPU cores to saturate a 1M IOPS SSD.
    - 56 CPU cores to saturate a 100Gbps link.
  • User-level approaches require:
    - Changes to applications and/or networks.
  • NVMe-over-Fabrics:
    - (RDMA) Requires changes to the network infrastructure.
    - (TCP) Suffers from low performance.
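
A rough back-of-the-envelope calculation shows why software becomes the bottleneck. The sketch below is a standalone C program that derives the per-core IOPS implied by the iSCSI figures above; the 4KiB request size is an assumption, not stated in the text.

    /* Per-core throughput implied by the iSCSI figures above.
     * Assumption: 4KiB block I/O requests (not stated in the text). */
    #include <stdio.h>

    int main(void)
    {
        const double ssd_iops   = 1e6;    /* 1M IOPS SSD (from the text)       */
        const double ssd_cores  = 14.0;   /* cores needed to saturate the SSD  */
        const double link_gbps  = 100.0;  /* 100Gbps link (from the text)      */
        const double link_cores = 56.0;   /* cores needed to saturate the link */
        const double req_bytes  = 4096.0; /* assumed 4KiB request size         */

        double iops_per_core_ssd  = ssd_iops / ssd_cores;
        double link_iops          = link_gbps * 1e9 / 8.0 / req_bytes;
        double iops_per_core_link = link_iops / link_cores;

        printf("iSCSI per-core IOPS (SSD case):  ~%.0f\n", iops_per_core_ssd);
        printf("100Gbps link at 4KiB:            ~%.0f IOPS\n", link_iops);
        printf("iSCSI per-core IOPS (link case): ~%.0f\n", iops_per_core_link);
        return 0;
    }

On this estimate, a single core sustains only roughly 50-70K IOPS over iSCSI, far below what the SSD (1M IOPS) and the 100Gbps link (~3M IOPS at 4KiB) can deliver.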

Technical challenge: “Can we push the performance bottlenecks back to the hardware, without modifying applications or networks?”

2. Multi-tenancy support

How to support various tenants’ requirements?

Each host/target server can have multiple tenants with different target devices, requirements, resources, and so on.

  • (As-Is) The current storage I/O stack creates per-core queues, and each I/O request takes a static, per-core data path to the target device.
  • (To-Be) A request should be able to change its data path dynamically and at low cost, so that every tenant's requirements (e.g., throughput, latency) can be satisfied.

Technical challenge: “What would be the best abstraction to support multi-tenancy at the block device layer?”
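
One way to picture such an abstraction is sketched below: each I/O request carries a reference to its tenant's policy, and the data path is selected per request at submission time rather than being fixed per core. This is a minimal, illustrative sketch only; all identifiers (tenant_policy, io_request, pick_path) are hypothetical, and a real implementation would hook into the kernel's block-layer submission path.

    /* Hypothetical per-request, per-tenant path selection at the block layer.
     * All names here are illustrative, not an existing API. */
    #include <stdio.h>

    enum io_path { PATH_PER_CORE_QUEUE, PATH_DEDICATED_QUEUE, PATH_POLLED };

    struct tenant_policy {
        const char *name;
        int latency_sensitive;    /* prefers a low-latency (polled) path   */
        int throughput_oriented;  /* prefers batching on a dedicated queue */
    };

    struct io_request {
        const struct tenant_policy *tenant;
        unsigned int length;      /* request size in bytes */
    };

    /* To-Be behavior: the data path is chosen per request, at submission
     * time, instead of being statically bound to the submitting core. */
    static enum io_path pick_path(const struct io_request *rq)
    {
        if (rq->tenant->latency_sensitive && rq->length <= 4096)
            return PATH_POLLED;
        if (rq->tenant->throughput_oriented)
            return PATH_DEDICATED_QUEUE;
        return PATH_PER_CORE_QUEUE;  /* As-Is default: static per-core queue */
    }

    int main(void)
    {
        struct tenant_policy kv  = { "kv-store", 1, 0 };
        struct tenant_policy bkp = { "backup",   0, 1 };
        struct io_request reqs[] = {
            { &kv,  4096 },      /* small, latency-critical read  */
            { &bkp, 1 << 20 },   /* large, throughput-bound write */
        };
        const char *paths[] = { "per-core queue", "dedicated queue", "polled" };

        for (unsigned i = 0; i < sizeof(reqs) / sizeof(reqs[0]); i++)
            printf("%-8s -> %s\n", reqs[i].tenant->name, paths[pick_path(&reqs[i])]);
        return 0;
    }

The open design question is then how to make such per-request path switching cheap enough to sustain million-IOPS workloads.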

Papers

Download

(To be updated.)

Members

References

User-level stacks:

OS-level approaches:

RDMA-based solutions:

NVM Express: