An Analysis of TCP Processing Overhead

Notes for CS614

Craig Michael Nathan

Index

  • Heuristics Employed

    I. Background from Keshav

    Here is the E-Mail sent out by Prof. Keshav on this paper:

    The second paper is a detailed analysis of TCP overhead. Until this
    paper was published, people "knew" that TCP was expensive, and
    we needed lightweight transport layer protocols. However, Clark
    et al showed that, in the common case, both TCP and IP are pretty
    cheap to implement. This paper basically killed research into
    lightweight transport. As you read the paper, see how they carefully
    separate the work required to perform protocol actions from work
    done to support the protocol (i.e. buffer management, timers etc).
    The first part is the intrinsic work in TCP, and the latter is
    system dependent. The goal is to show that TCP is pretty efficient,
    if we ignore system overheads. Of course, we do need to worry about
    the implementation environment in a real system. Also, note the
    many tricks, mostly having to do with making the common case more
    efficient. 
    


    II. Quick Overview of the Paper

    I found this paper to be extremely well written and very informative. Because the paper is straightforward, I will spend little time on summary and more on the pieces I found interesting -- the underlying heuristics used to analyze the TCP protocol.

    IIa. Summary

    As Prof. Keshav states in his E-Mail, the purpose of this paper is to examine the "known truth" that TCP, the Transmission Control Protocol used in the TCP/IP system of the Internet, is "a likely source of processing overhead" which could explain the low throughput observed in TCP/IP.

    The conclusion of the researchers was the exact opposite -- they found that TCP was in fact not the source of the overhead observed in packet processing, and other aspects of the network were the true culprit.

    IIb. Brief Overview of TCP

    (More detailed information on the TCP/IP protocol can be found on this page I found at Yale, or here at Yahoo.)

    As mentioned above, TCP is the transport-layer protocol of TCP/IP, the protocol suite of the Internet. To understand what role TCP plays in network communication, it is helpful to view TCP/IP in terms of a protocol reference model similar to the OSI model.


    Figure 1 -- Communication in TCP/IP


    Note that TCP is only invoked at the end nodes; packet routing through the network is handled only by IP.

    The main jobs of TCP are:

    IIc. Conclusions

    Here is a list of the conclusions, along with page references:


    III. Heuristics Employed

    Although the paper was on the analysis of TCP overhead, I found several excellent general heuristics used by the authors, and would like to spend the rest of this summary discussing them. References to examples in the paper are included in the form [page, column (a=left-hand, b=right-hand), paragraph number (including the top)].

    Don't take anything for granted.
    Here we have 4 very intelligent guys analyzing something which they thought was a sure thing, only to find out that they were completely wrong, but coming up with some very powerful conclusions in the process. [23, a, 2-3]
    Find the common path.
    When analyzing code, find the path through the code which will be executed the majority of the time -- the "common path." Begin your analysis and improvements there. Optimization will go much more quickly, and with better results, than if you try a front-to-back approach. [24, a, 4+]
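    As a rough illustration of this heuristic, here is a minimal sketch of a receive routine that tests for the common case first and hands everything else to a general routine. It is not the code analyzed in the paper. The names Segment, Connection, receive and slow_path are hypothetical, and so is the assumption that the common case is an in-order data segment on an established connection carrying only the ACK (and possibly PSH) flag.

```python
from dataclasses import dataclass

ACK = 0x10  # TCP ACK flag bit
PSH = 0x08  # TCP PSH flag bit


@dataclass
class Segment:
    """Hypothetical, stripped-down view of an incoming TCP segment."""
    seq: int
    flags: int
    payload: bytes


@dataclass
class Connection:
    """Hypothetical per-connection state (only what this sketch needs)."""
    established: bool
    rcv_next: int          # next sequence number we expect
    received: bytearray    # data delivered so far


def slow_path(conn: Connection, seg: Segment) -> str:
    # Placeholder for the general case: handshake, out-of-order data,
    # resets and so on. Deliberately left empty in this sketch.
    return "slow"


def receive(conn: Connection, seg: Segment) -> str:
    # Common path (assumed): established connection, segment arrives in
    # order, and no flags other than ACK and PSH are set.
    if (conn.established
            and seg.seq == conn.rcv_next
            and (seg.flags & ~PSH) == ACK):
        conn.received += seg.payload
        conn.rcv_next = (conn.rcv_next + len(seg.payload)) % 2**32
        return "fast"
    # Everything else goes to the general routine.
    return slow_path(conn, seg)
```

    The point of the structure is that the handful of comparisons at the top is all the common case ever pays for; the rarely executed cases can be as elaborate as they need to be without affecting it.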
    It's possible to go too far in trying to optimize or simplify, often with detrimental effects.
    The example here is the buffer layer -- in the hopes of providing good service both to application clients who want to deal with data one byte at a time, and to those who want to deal in large blocks, the buffer layer became extremely complex. In one extreme example, 60 of the 68 pages of C code for the protocol were devoted to the buffer layer and the interfaces to the other layers. That's almost 90%! [25, b, 7]
    Ignore system-specific features when analyzing something machine independent, such as a communication protocol.
    Systems will vary, and improvements you might find on one may not be available on others. [26, a, 1]
    When designing an O/S, assuming a feature will be rarely used and therefore not worth spending time on can have serious consequences.
    The example of this is in the discussion of O/S timers -- O/S designers assumed timers would not be used in a context as demanding as TCP (which requires a timer for each packet) and so ignored their cost. As a result, users paid dearly in overhead spent waiting for the time computation. [26, a, 4]
    Combining caching with common path analysis can make things VERY cheap.
    This builds upon the heuristic that caching is a good idea, but caching when you have a good understanding of what's going on and where it can be most powerful is a great idea. Here they recognized that approximately 90% of the packets coming in were destined for the same TCP connection as the previously received packet. Caching a pointer to the TCB used by the previous packet thus achieved hit rates of 90% or more at very low cost. [26, b, 2]
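    The caching idea can be sketched as a one-entry cache in front of the full connection lookup. This is only an illustration under stated assumptions: TCB and TCBTable are hypothetical names, the connection identifier is assumed to be a simple tuple, and a Python dictionary stands in for whatever search structure a real implementation uses.

```python
from dataclasses import dataclass


@dataclass
class TCB:
    """Hypothetical, minimal stand-in for a TCP control block."""
    conn_id: tuple


class TCBTable:
    """Connection table with a one-entry cache of the last TCB used."""

    def __init__(self):
        self._table = {}    # full table: conn_id -> TCB
        self._last = None   # the one-entry cache
        self.hits = 0
        self.misses = 0

    def add(self, conn_id: tuple) -> TCB:
        tcb = TCB(conn_id)
        self._table[conn_id] = tcb
        return tcb

    def lookup(self, conn_id: tuple) -> TCB:
        # Cheap check first: is this packet for the same connection
        # as the previous one?
        last = self._last
        if last is not None and last.conn_id == conn_id:
            self.hits += 1
            return last
        # Otherwise fall back to the full lookup and remember the result.
        self.misses += 1
        tcb = self._table[conn_id]
        self._last = tcb
        return tcb
```

    In Python the dictionary lookup is already cheap, so the sketch shows only the shape of the technique; the benefit described in the paper comes from skipping a more expensive search of the connection list. Feeding the table a run of packets for one connection followed by a packet for another shows the hit and miss counters behaving as expected.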
    If you have to make an estimate, use conservative estimates (worst-case).
    This can only make you look good when you get real data and it comes in under your estimates. Actual numbers smaller than the estimates are MUCH better than ones that are larger. [Compare the estimated numbers on pages 24-27 with the actual numbers on pages 27-28. One explicit example: 28, a, 2]
    Touching information in memory is VERY expensive compared to CPU cycles. Combine as many CPU operations as possible on the data while you have it in hand, to minimize the need to re-touch the data later.
    By combining the checksum loop with the memory-memory copy of a packet, they shaved off 25% of the total wait time. [28, b, 3]
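    A minimal sketch of the combined loop follows, assuming the standard 16-bit ones'-complement Internet checksum over big-endian 16-bit words. The function name copy_with_checksum is hypothetical, and Python is used only to show the structure: each byte is read from the source once and used for both the copy and the sum. The actual saving described in the paper arises in low-level code, where memory accesses dominate the cost.

```python
def copy_with_checksum(src: bytes, dst: bytearray) -> int:
    """Copy src into dst and return the Internet checksum of src,
    reading each source byte only once."""
    if len(dst) < len(src):
        raise ValueError("destination buffer too small")

    total = 0
    n = len(src)
    i = 0
    # Process the data as big-endian 16-bit words.
    while i + 1 < n:
        hi, lo = src[i], src[i + 1]
        dst[i] = hi                  # the copy ...
        dst[i + 1] = lo
        total += (hi << 8) | lo      # ... and the sum, in the same pass
        i += 2
    # An odd trailing byte is treated as if padded with a zero byte.
    if i < n:
        dst[i] = src[i]
        total += src[i] << 8

    # Fold the carries back into the low 16 bits (ones'-complement add).
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF
```

    A separate copy loop followed by a separate checksum loop would produce the same result but would pass over the packet twice.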
    Come back later and test your estimates on a real implementation; include these results in the paper to add credibility.
    After speculating and estimating total instruction counts and times, they actually measured the total overhead times on an operational system to test their hypotheses. The credibility of ALL of the researchers' estimates was greatly increased when the actual data supported the estimates for these few cases. [27, b, "A Direct Measure of Protocol Overhead"]
    Don't dismiss something just because it "feels" wrong -- it might actually provide a tremendous reduction in overhead.
    The example here is in their discussion of removing the DMA controller from the network controller and using the CPU instead to move the data. It feels completely wrong, but once you realize that you can then bypass kernel buffers altogether, the overhead decrease is significant. [28, b, 2]
    It is NOT a general rule that placing software, such as a protocol, into hardware will automatically make it faster.
    While the actual computation will see a speed up when implemented in hardware, the extra overhead required to move the data from memory to hardware, or vice-versa, may dramatically outweigh any speed-up the hardware would provide. [28, b, 6+]