Notes on "Fbufs: A High-bandwidth Cross-Domain Transfer Facility"
by Linda Wu

Summary

The authors present a new buffer management method that eliminates the extra copies that occur in cross-domain transfers. The technique combines two existing techniques: page remapping and shared memory.

Motivation

As we move toward the microkernel model, components such as protocols, drivers, and applications run in different domains. Network speeds keep increasing, while memory speed stays relatively the same, so network I/O may soon become CPU-memory bound. Providing fast buffer transfer across domains therefore becomes a challenge.

Buffer Management in Network Subsystem

- Support both single, contiguous buffers and non-contiguous aggregates of buffers.
- At allocation time, the I/O data path that a buffer will traverse is often known, hence a data-path-specific allocator.
- Use only immutable buffers; consequently, only copy semantics are provided.
- Two mechanisms to protect against asynchronous access to a buffer:
  1. eagerly enforce immutability by raising the protection on a buffer when the originator transfers it
  2. lazily raise the protection upon request by a receiver
- Pageable buffers.

Problems in Using Page Remapping and Shared Memory

Page remapping
- Usable only on systems that support virtual memory.
- There is some overhead: the time it takes to switch to supervisor mode, acquire the necessary locks on VM data structures, change VM mappings (possibly at several levels) for each page, perform TLB/cache consistency actions, and return to user mode.

Shared memory
- Globally shared memory compromises security.
- Pairwise shared memory requires copying when data is either not immediately consumed or is forwarded to a third domain.
- Group-wise shared memory requires that the data path of a buffer is always known at the time of allocation.

Key Design

Restricted Dynamic Read Sharing
- Sharing is limited to the fbuf region, which implies that the originator and receivers map an fbuf at the same virtual address.
  This eliminates the need to find a free virtual address in the receiver.
- A strict rule on write access eliminates the need for a COW (copy-on-write) mechanism.

Caching
- Put an fbuf on a free list after use instead of unmapping and clearing it. This reduces the number of page table updates to two and increases locality of reference at the level of the TLB, cache, and main memory.

Integrated Buffer Manager/Transfer

Volatile fbufs
- The originator keeps write permission (protection is not raised on transfer), hence one less page table update.
- Caution: combined with the integrated buffer manager and transfer, there is a potential problem with the DAG, since the originator can still modify it while a receiver reads it.

Performance

Measured on DECstation 5000/200 workstations with a prototype ATM board (Osiris), connected by a null modem supporting a link speed of 622 Mbps.

Micro experiments show that fbufs offer an order of magnitude better throughput than page remapping for a single domain crossing. Macro experiments with UDP/IP show that when cached/volatile fbufs are used, domain crossings have virtually no impact on end-to-end throughput for large messages.

Comments

Things to take away: analyze the memory access behavior of a network subsystem, derive the requirements from it, and design the system so that locality is preserved.

Personal rating: I did not find this paper easy to read.
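
Sketch of the caching idea

To make the Caching item under Key Design more concrete for myself, here is a toy Python model. It is only a sketch under my own assumptions and not the paper's implementation: the class `FbufCache`, the counter `mappings_built`, and the example data path are all hypothetical names, and "building mappings" is reduced to incrementing a counter. It shows only that a per-data-path free list lets later messages reuse a buffer whose mappings already exist.

```python
from collections import defaultdict


class FbufCache:
    """Toy model of per-data-path fbuf caching (hypothetical, not the paper's code)."""

    def __init__(self):
        self.free_lists = defaultdict(list)  # data path -> cached buffer ids
        self.mappings_built = 0              # times a buffer had to be mapped afresh
        self._next_id = 0

    def allocate(self, path):
        """Return a buffer id for `path`, reusing a cached buffer if one exists."""
        if self.free_lists[path]:
            # Cached buffer: its mappings along the path are assumed to be in place.
            return self.free_lists[path].pop()
        # No cached buffer: model the cost of mapping a new one as a counter bump.
        self.mappings_built += 1
        buf = self._next_id
        self._next_id += 1
        return buf

    def free(self, path, buf):
        """Keep the buffer mapped and park it on the free list of its path."""
        self.free_lists[path].append(buf)


def send_messages(cache, path, count):
    """Allocate, 'transfer', and free one buffer per message; return the ids used."""
    used = []
    for _ in range(count):
        buf = cache.allocate(path)
        used.append(buf)
        cache.free(path, buf)
    return used


cache = FbufCache()
path = ("driver", "protocol", "application")  # hypothetical data path
buffers = send_messages(cache, path, 5)
print(buffers, cache.mappings_built)
```

In this model the five messages on the same path all reuse one buffer, so mappings are built only once. A buffer requested for a different path misses the cache and is mapped afresh, which matches the note above that the allocator is data-path-specific.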