Fbufs: A High-Bandwidth Cross-Domain Transfer Facility


Notes by Yu Zhang, April 7, 1998



Motivation
1. Modular OS design requires an efficient cross-domain invocation facility.
2. A cross-domain invocation facility is measured by:
    - control transfer latency
    - data transfer throughput
    For I/O-intensive applications (e.g. real-time video, digital image retrieval), the second matters more.
3. For network I/O, data crosses multiple domains: device drivers, network protocols, application software.
   On high-bandwidth networks, cross-domain data transfer becomes bounded by CPU/memory bandwidth.

High-level Idea
cross-domain transfer + buffer management
combine two techniques: page remapping + shared memory

Requirements on the buffer management/transfer facility
(Premise: a traditional network subsystem, i.e. a sequence of software layers)
- support both single, contiguous buffers (the sender's ADU) and non-contiguous aggregates of buffers (the receiver's ADU),
  due to sender-side fragmentation and receiver-side aggregation
- a data-path-specific allocator (see the interface sketch after this list).
  At allocation time, the I/O data path a buffer will traverse is often known (determined by the two endpoints).
  This also implies that locality in network communication can be exploited.
- Use only immutable buffers, so copy semantics can be provided efficiently.
- Either eagerly or lazily raise the protection on a buffer, to guard against asynchronous writes by the originator domain
- pageable buffers, so a malicious domain that holds a buffer forever cannot pin physical memory
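
A minimal C sketch of the path-specific allocation interface described above; all names (fbuf_alloc, path_id_t, etc.) are illustrative assumptions, not the paper's actual API:

#include <stddef.h>

typedef int path_id_t;            /* identifies an I/O data path (endpoint pair) */

typedef struct fbuf {
    void        *base;            /* page-aligned start of the buffer data */
    size_t       len;             /* buffer length, a multiple of the page size */
    path_id_t    path;            /* the data path this fbuf is dedicated to */
    struct fbuf *next;            /* link for per-path free lists (see caching below) */
} fbuf_t;

/* The caller names the data path at allocation time, so the allocator can
   set up mappings suited to every domain the buffer will traverse. */
fbuf_t *fbuf_alloc(path_id_t path, size_t len);

/* Called by the last receiver when it is done with the buffer. */
void fbuf_free(fbuf_t *f);

Because buffers are immutable after transfer, the originator writes data only between fbuf_alloc and the first send.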

Problems of traditional approaches
page remapping
- move semantics: too limited (the sender loses access to the data)
- copy semantics: requires two context switches, acquiring locks on VM data structures, changing VM mappings, and maintaining TLB/cache consistency
shared memory
- compromises protection and security
- only reduces the number of copies; it does not eliminate copying

Fbufs Design
Basic Mechanism
1. fbufs: 1 or more contiguous VM pages. (So not for small messages!)
2. aggregate objects: hierarchical structures of fbufs that provide logical ops on fbufs (join, split, clip, etc.); see the sketch after this list
3. conventional page remapping, but with copy semantics
4. transfer steps (see Section 3.1)
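
A sketch of the aggregate-object ops in C, reusing fbuf_t from the allocator sketch above. The paper builds aggregates as a DAG of buffers; a flat segment list is shown here only to illustrate how clip/join edit data logically without copying:

typedef struct seg {
    fbuf_t     *buf;              /* underlying immutable fbuf */
    size_t      off, len;         /* window into the fbuf's data */
    struct seg *next;
} seg_t;

typedef struct aggregate {
    seg_t  *head;
    size_t  total_len;
} agg_t;

/* clip: logically drop n bytes from the front (e.g. a protocol header)
   by shrinking windows -- the immutable fbuf data is never modified. */
void agg_clip(agg_t *a, size_t n)
{
    while (n > 0 && a->head != NULL) {
        seg_t *s = a->head;
        size_t k = (n < s->len) ? n : s->len;
        s->off += k;  s->len -= k;
        a->total_len -= k;  n -= k;
        if (s->len == 0)
            a->head = s->next;    /* segment consumed (node freeing elided) */
    }
}

/* join: append b's segments to a, e.g. receiver-side reassembly of
   fragments into the receiver's ADU. */
void agg_join(agg_t *a, agg_t *b)
{
    seg_t **tail = &a->head;
    while (*tail != NULL)
        tail = &(*tail)->next;
    *tail = b->head;
    a->total_len += b->total_len;
    b->head = NULL;  b->total_len = 0;
}

split would dually cut a segment list in two at a byte offset; headers added by lower layers become new segments prepended to the list.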
Optimizations
Restricted Dynamic Read Sharing
- fbuf region: fbufs are allocated from a globally shared VM region, so an fbuf is mapped at
  the same virtual address in the originator and all receivers (this eliminates finding a mapping
  in the receiver; see the address check sketched after this list)
- read sharing on fbufs eliminates the need for copy-on-write.
  This rests on two typical types of data manipulation: applied to the entire data, or localized
  to the header/trailer, so that logical editing functions can be used instead of writes.
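
The fixed-address property can be made concrete with a small check (the region bounds below are made-up constants for illustration, not values from the paper):

#include <stdint.h>
#include <stdbool.h>

#define FBUF_BASE ((uintptr_t)0x40000000u)   /* hypothetical region start */
#define FBUF_SIZE ((uintptr_t)0x10000000u)   /* hypothetical region size  */

static bool addr_in_fbuf_region(const void *p)
{
    uintptr_t a = (uintptr_t)p;
    return a >= FBUF_BASE && a < FBUF_BASE + FBUF_SIZE;
}

Since an fbuf occupies the same virtual addresses in every domain, pointers inside an aggregate remain valid across the transfer; the kernel only needs to verify that they fall within the region.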

Fbuf Caching
- put an fbuf on a free list associated with its I/O data path for reuse, instead of unmapping and
  clearing it; this increases locality of reference at the level of the TLB, cache, and main memory
  (see the sketch below)
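
A sketch of the caching fast path, reusing the types above; MAX_PATHS and the array-based table are assumptions made for this sketch:

#define MAX_PATHS 64                      /* arbitrary bound for the sketch */

static fbuf_t *free_list[MAX_PATHS];      /* one free list per I/O data path */

fbuf_t *fbuf_alloc_cached(path_id_t path, size_t len)
{
    /* assumes 0 <= path < MAX_PATHS */
    fbuf_t *f = free_list[path];
    if (f != NULL && f->len >= len) {     /* fast path: mappings still intact */
        free_list[path] = f->next;
        return f;
    }
    return fbuf_alloc(path, len);         /* slow path: map into all domains */
}

void fbuf_free_cached(fbuf_t *f)          /* called by the last receiver */
{
    f->next = free_list[f->path];
    free_list[f->path] = f;
}

A cached fbuf comes back with its mappings still in place, so reuse avoids the map/unmap work; combined with volatile fbufs (below), no page table updates remain on the common path.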

Integrated Buffer Manager/Transfer
- place the entire aggregate object into fbufs, so no translation to/from a list of fbufs is needed in the sender or receiver

Volatile fbufs
- no write protection against the originator, eliminating one page table update
- with the integrated buffer manager/transfer, receivers must cope with potential damage to the
  integrity of the aggregate's DAG (see the validation sketch below)
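
A defensive check a receiver of a volatile aggregate might perform, sketched here as an illustration of the concern (the paper's actual safeguard may differ); it reuses the types and region check from the sketches above:

#define MAX_SEGS 1024                     /* guard against cyclic lists */

bool agg_validate(const agg_t *a)
{
    size_t sum = 0, hops = 0;
    for (const seg_t *s = a->head; s != NULL; s = s->next) {
        if (++hops > MAX_SEGS)
            return false;                 /* cycle or absurdly long chain */
        if (!addr_in_fbuf_region(s->buf->base))
            return false;                 /* data pointer escapes fbuf region */
        if (s->off + s->len > s->buf->len)
            return false;                 /* window exceeds the buffer */
        sum += s->len;
    }
    return sum == a->total_len;           /* lengths must be consistent */
}

This assumes the seg/fbuf metadata read here is itself stable; guarding against concurrent modification of the metadata needs more care, e.g. copying the small DAG nodes before validating. Either way, a misbehaving originator can only corrupt the data a receiver reads, not the receiver's control structures.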

Result: in the common case, no kernel involvement in cross-domain data transfer. Speed up the common case!

Performance
Micro-experiments show that fbufs offer an order of magnitude better throughput than page remapping for a
single domain crossing. Macro-experiments with UDP/IP show that when cached/volatile fbufs are used, domain
crossings have virtually no impact on end-to-end throughput for large messages.

Discussion Points
1. Basically, Fbufs is designed for large messages (>256kB) in the traditional layered network subsystem,
while U-Net is designed for small messages. Is there a way to get the best of both?
2. Problem with fbuf reclamation: as with rights revocation, a malicious domain may fail to deallocate fbufs.
Limiting each data path's fbuf quota may not be a decent way around this. Can we do better?