U-Net: A User-Level Network Interface for Parallel and Distributed Computing

Notes by Xun Wilson Huang
01/01/02


This paper is motivated by the observation that, with the advances in high-speed LANs, data processing in the end host has become the latency bottleneck. The paper aims to reduce end-host latency and to give applications flexibility in building their own protocols by introducing a new architecture, U-Net. It argues that the kernel should be removed from the communication path entirely.

In the simplest sense, all U-Net provides to the user is an interface to the device driver and a buffer shared between user space and the kernel. The user is expected to manage this buffer on their own when building protocol stacks.

Latency of a network stack consists of the following:

  1. Context switches - these cannot be avoided entirely. U-Net still makes system calls for channel creation and teardown, and to inform the kernel that data is available for sending in the shared buffer; it receives upcalls when data arrives.
  2. Data copies between user space and the kernel (the copyin() and copyout() functions). U-Net avoids these by providing a kernel-allocated shared buffer and restricting the user to that buffer.
  3. Processing of data by the different protocol layers. This overhead comes with the functionality these protocols provide; whether a protocol runs in user space or kernel space, it is unavoidable.

How does U-Net work?

  1. The user application makes a system call to create an endpoint; this serves as a handle to the network.
  2. The application sets up channels to demultiplex packets destined for the same endpoint.
  3. Along with the endpoint, the application gets a buffer, which is shared between user space and the kernel.
  4. The application composes the data it wishes to send in that buffer area, composes a descriptor for that data segment, and pushes the descriptor onto the tx queue.
  5. The application traps into the kernel to signal that something is in the tx queue.
  6. On the receive side, data can only arrive in the buffer that came with the endpoint. The application can either poll the rx queue or register an upcall with the kernel module for asynchronous notification.

Zero-copy

Normally, a send() in a conventional operating system involves two copies:

  1. Copying from user space to kernel space.
  2. Copying from kernel to the device's buffer.

The second copy cannot be avoided without hardware support, such as the SBA-200's programmable firmware. Therefore this trick from direct-access U-Net is not universally applicable.

U-Net gets rid of the first copy by restricting the user to the buffer provided at endpoint creation. This restriction is quite inconvenient for the application, which led to a later paper, "U-Net with buffer management", which pins user memory down as send requests come in.

However, what does a TCP send() really mean in this zero-copy architecture? In a traditional OS, "TCP send(buf, ...) returns" means the content of buf has been copied into the OS's buffers; returning from send() means the buffer can be reused, and the content of buf will eventually be sent (assuming nothing abnormal occurs). In U-Net's zero-copy structure, the buffer cannot be reused until an ack is received, so the application must either obtain another buffer through its own buffer management or simply block waiting for the ack. For UDP or RPC this works quite well, but for TCP a copy is still required.

Critiques and questions: