U-Net: A User-Level Network Interface for Parallel and Distributed Computing
Notes by Xun Wilson Huang
01/01/02
This paper is motivated by the observation that, with the advances in high-speed LANs,
data processing in the end host has become the bottleneck for latency. It aims to reduce
end-host latency and to give applications flexibility in building protocols by introducing
a new architecture, U-Net. The paper argues that the kernel should be removed entirely
from the communication path.
In the simplest sense, all U-Net provides to the user is an interface to the device
driver and a buffer shared between user space and the kernel. The user is expected to
manage the buffer on their own when building protocol stacks.
Latency of a network stack consists of the following:
- context switch - this cannot be avoided entirely. U-Net still crosses into the kernel
for channel creation and teardown, to inform the kernel that data is available to send in
the shared buffer, and for upcalls when data arrives.
- data copy between user space and kernel (the copyin() and copyout() functions). U-Net
avoids this by providing a shared buffer allocated by the kernel and restricting the user
to that buffer.
- processing of data by the different protocol layers. This overhead comes with the
functionality these protocols provide; whether the protocol runs in user space or kernel
space, it is not avoidable.
How does U-Net work?
- The user application makes a system call to create an endpoint, which serves as a handle
to the network.
- The application sets up channels to demultiplex packets destined for the same endpoint.
- Along with the endpoint, the application gets a buffer, which is shared between user
space and the kernel.
- The application composes the data it wishes to send in that buffer area, composes a
descriptor for that data segment, and pushes the descriptor onto the tx queue.
- The application traps into the kernel to signal that something is in the tx queue.
- On the receive side, data can only arrive in the buffer that came with the endpoint. The
application can either poll the rx queue or register an upcall with the module in the
kernel for asynchronous notification.
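
The steps above can be sketched as a toy model. This is only an illustration of the data
flow, not the U-Net API: the names Endpoint, Descriptor, open_channel, post_send, poll and
kick are hypothetical, the descriptor layout is an assumption, and a plain function call
stands in for the trap into the kernel.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Descriptor:
    """Describes one data segment in the shared buffer (hypothetical layout)."""
    tag: int      # channel tag used for demultiplexing
    offset: int   # start of the segment in the shared buffer
    length: int   # segment length in bytes


class Endpoint:
    """Toy stand-in for an endpoint: one shared buffer plus tx and rx queues."""

    def __init__(self, buf_size=4096):
        self.buffer = bytearray(buf_size)  # stands in for the user/kernel shared buffer
        self.tx_queue = deque()
        self.rx_queue = deque()
        self.channels = set()              # tags registered on this endpoint

    def open_channel(self, tag):
        self.channels.add(tag)

    def post_send(self, tag, offset, payload):
        """Compose data in the shared buffer, then push a descriptor on the tx queue."""
        if tag not in self.channels:
            raise ValueError("unknown channel tag")
        if offset < 0 or offset + len(payload) > len(self.buffer):
            raise ValueError("segment does not fit in the shared buffer")
        self.buffer[offset:offset + len(payload)] = payload
        self.tx_queue.append(Descriptor(tag, offset, len(payload)))

    def poll(self):
        """Return (tag, data) for the next received segment, or None if rx is empty."""
        if not self.rx_queue:
            return None
        d = self.rx_queue.popleft()
        return d.tag, bytes(self.buffer[d.offset:d.offset + d.length])


def kick(sender, receiver, rx_offset=2048):
    """Stand-in for the kernel trap: drain the sender's tx queue into the receiver."""
    delivered = 0
    while sender.tx_queue:
        d = sender.tx_queue.popleft()
        fits = rx_offset + d.length <= len(receiver.buffer)
        if d.tag not in receiver.channels or not fits:
            continue  # no matching channel or no room: drop the segment
        data = sender.buffer[d.offset:d.offset + d.length]
        receiver.buffer[rx_offset:rx_offset + d.length] = data
        receiver.rx_queue.append(Descriptor(d.tag, rx_offset, d.length))
        rx_offset += d.length
        delivered += 1
    return delivered
```

A sender would call open_channel() and post_send(), then kick(); the receiver then calls
poll() in a loop. The upcall path is left out of this sketch.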
Zero-copy
Normally, a conventional operating system makes two copies when doing a send():
- Copying from user space to kernel space.
- Copying from kernel to the device's buffer.
The second copy cannot be avoided without hardware support, such as the SBA-200's
firmware, so this trick from direct-access U-Net is not universally applicable.
U-Net avoids the first copy by restricting the user to the buffer that U-Net provides at
endpoint creation. This restriction is very inconvenient for the user application, and it
leads to a later paper, "U-Net with buffer management", which pins user memory down as a
send request comes in. However, what does send() in TCP really mean in this zero-copy
architecture? In a traditional OS, "TCP send(buf..) returns" means that the content of
the buffer has been copied into the OS's buffer: the buffer can be reused, and the content
of buf will be sent eventually (assuming no abnormality occurs). In U-Net's zero-copy
structure, the buffer cannot be reused until an ack is received, so the application
either has to get another buffer through its own buffer management or simply block for
the ack. For UDP or RPC this works quite well, but for TCP a copy is still required.
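
The buffer-ownership problem described above can be made concrete with a small model. This
is a sketch under stated assumptions, not code from the paper: ZeroCopySender and its
methods are hypothetical names, and acks are delivered by calling on_ack() by hand.

```python
class ZeroCopySender:
    """Toy model: a buffer passed to send() stays in flight until it is acked."""

    def __init__(self, num_buffers=2, buf_size=1024):
        self.free = [bytearray(buf_size) for _ in range(num_buffers)]
        self.in_flight = {}   # sequence number -> buffer awaiting an ack
        self.next_seq = 0

    def acquire(self):
        """Return a free buffer, or None when the caller would have to wait for an ack."""
        return self.free.pop() if self.free else None

    def send(self, buf):
        """Queue buf without copying it; the caller must not reuse it until acked."""
        seq = self.next_seq
        self.next_seq += 1
        self.in_flight[seq] = buf
        return seq

    def on_ack(self, seq):
        """The peer acknowledged seq, so the buffer can go back on the free list."""
        self.free.append(self.in_flight.pop(seq))
```

With a single buffer, a second acquire() after send() returns None until on_ack() runs,
which is the "block for the ack" case; a larger pool corresponds to the application doing
its own buffer management.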
Critiques and questions:
- "remove kernel completely from the critical path"? What about system-wide resources
that need to be shared among different applications: the TCP/UDP port number space, the
ARP table, the routing table? Where should these things be kept?
- Flexibility. To build new communication protocols, one can use a raw socket together
with a mechanism for asking the kernel to allocate buffers. To customize protocols, I
think it's better to have the OS provide a protocol layering mechanism for easy insertion
and bypass, rather than having the entire (customized) protocol stack appear in user
space. For mortals like me, moving the protocol stack up to user space is not easy.
- For easier development and debugging of network protocols, one can instead consider
SurReal, the new network simulator currently being developed at Cornell.
- Tagging the packet. U-Net uses tags to demultiplex incoming packets to different
endpoints. This makes it impossible for U-Net to communicate with a regular stack, which
motivates the later work on packet filtering.
- This paper does not address how to identify machines with U-Net. In the TCP
implementation, it says "module not in the critical performance path such as ARP are
not ported to U-Net". 1. Looking up an ARP entry is in the critical path. 2. Without
ARP, is U-Net identifying machines by hardware address?
- This paper also argues that having the stack in user space allows easy querying of
network parameters for feedback. This type of query can already be done in a traditional
OS with getsockopt() or ioctl().
- Having the stack in user space also allows the user application to corrupt the stack. No
queuing policy is addressed in the paper. If FCFS is used, the corruption of one stack
can affect everyone else.
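
As an illustration of the getsockopt() point above, the following minimal sketch reads two
socket-level parameters from the kernel through Python's standard socket module. The
helper name query_socket_buffers is made up for this example, and these two options are
only a small sample of what a traditional stack exposes; they may not match the kind of
feedback the paper has in mind.

```python
import socket


def query_socket_buffers():
    """Read the kernel's send and receive buffer sizes for a fresh UDP socket."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        return {
            "SO_SNDBUF": s.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF),
            "SO_RCVBUF": s.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF),
        }


print(query_socket_buffers())
```

The values printed depend on the operating system and its configuration.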