An Analysis of TCP Processing Overhead
Notes for CS614
I. Background from Keshav
Here is the E-Mail sent out by Prof. Keshav on this paper:
The second paper is a detailed analysis of TCP overhead. Until this
paper was published, people "knew" that TCP was expensive, and
we needed lightweight transport layer protocols. However, Clark
et al showed that, in the common case, both TCP and IP are pretty
cheap to implement. This paper basically killed research into
lightweight transport. As you read the paper, see how they carefully
separate the work required to perform protocol actions from work
done to support the protocol (i.e. buffer management, timers etc).
The first part is the intrinsic work in TCP, and the latter is
system dependent. The goal is to show that TCP is pretty efficient,
if we ignore system overheads. Of course, we do need to worry about
the implementation environment in a real system. Also, note the
many tricks, mostly having to do with making the common case more
efficient.
II. Quick Overview of the Paper
I found this paper to be extremely well written and very informative.
Because the paper is straightforward, I will spend little time on
summary and more on the pieces I found most interesting -- the underlying
heuristics used to analyze the TCP protocol.
IIa. Summary
As Prof. Keshav states in his E-Mail, the purpose of this paper is to
examine the "known truth" that TCP, the Transmission Control Protocol
used in the TCP/IP suite of the Internet, is "a likely source of
processing overhead" which could explain the low throughput observed
in TCP/IP.
The conclusion of the researchers was the exact opposite -- they found
that TCP was in fact not the source of the overhead observed in packet
processing, and that other parts of the system, chiefly the operating
system and memory accesses, were the true culprits.
IIb. Brief Overview of TCP
(More detailed information on the TCP/IP protocol can be found on this
page I found at Yale, or here at Yahoo.)
As mentioned above, TCP is the transport-layer protocol of TCP/IP, the
protocol suite of the Internet. To understand what role TCP plays in network
communication, it is helpful to view TCP/IP in terms of a protocol reference
model similar to the OSI model.
Figure 1 -- Communication in TCP/IP
Note that TCP is only invoked at the end nodes; packet routing through
the network is handled only by IP.
The main jobs of TCP are:
- detect and recover lost packets
- perform flow control
- multiplex packets
IIc. Conclusions
Here is a list of the conclusions, along with page references:
- TCP in its current form, if implemented correctly, can perform very
well and offer high throughput. [p. 23]
- Any changes to the TCP protocol might provide some speedup, but any
improvement would be fractional. [p. 25]
- The major culprit of overhead is the operating system itself, whose
overhead has been observed to be on the order of three times that of the
TCP protocol. [p. 27]
- Processing overhead can be substantially reduced through implementation
of header prediction. [p. 25]
- Overhead can be substantially reduced if careful thought is put into
the implementation of the buffer layer. [p. 26]
- The odds that an incoming packet is destined for the same TCP connection
as the previous packet are about 90%. Using this information, you can cache
a pointer to the TCB (Transmission Control Block) used by an incoming packet,
and simply compare that information with the next incoming packet. If the
compare fails, then you perform a search to find the appropriate TCB. This
will save you tremendous overhead. [p. 26]
- O/S timers, if not carefully designed, can actually use as many CPU
cycles as TCP. Since a timer is used for every TCP packet, this can be
a major source of overhead. [p. 27]
- Memory accesses are a huge source of overhead relative to the TCP
computation itself. Combining tasks such as copying the data and computing
the checksum, even if it saves only one memory read, can save several times
as much time as the actual TCP computation takes. [p. 27]
- Although it seems like a good idea, moving protocols such as TCP onto
silicon to create hardware protocol processors could have detrimental
effects on overhead and add unnecessary complexity. Software implementations
of TCP will work just fine if they are implemented correctly. [p. 28-29]
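To illustrate the TCB caching conclusion above, here is a minimal Python sketch of a one-entry cache placed in front of a connection table. The names `TCB`, `TCBTable`, and `ConnKey`, and the choice of a four-tuple key, are assumptions made for illustration and do not come from the paper. A Python dict lookup is already cheap, so the sketch shows only the structure of the technique, not a measured gain.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

# Hypothetical connection key: (src address, src port, dst address, dst port).
ConnKey = Tuple[str, int, str, int]


@dataclass
class TCB:
    """Stand-in for a Transmission Control Block; a real TCB holds far more state."""
    key: ConnKey
    packets_seen: int = 0


class TCBTable:
    """Connection table with a one-entry cache in front of the full lookup."""

    def __init__(self) -> None:
        self._table: Dict[ConnKey, TCB] = {}
        self._last: Optional[TCB] = None  # TCB used by the previous packet
        self.hits = 0
        self.misses = 0

    def add(self, key: ConnKey) -> TCB:
        tcb = TCB(key)
        self._table[key] = tcb
        return tcb

    def lookup(self, key: ConnKey) -> Optional[TCB]:
        # Fast path: a single comparison against the cached TCB.
        if self._last is not None and self._last.key == key:
            self.hits += 1
            return self._last
        # Slow path: full search, then remember the result for the next packet.
        self.misses += 1
        tcb = self._table.get(key)
        if tcb is not None:
            self._last = tcb
        return tcb
```

When most packets belong to the same connection as their predecessor, most calls to `lookup` return after the single comparison on the fast path; the `hits` and `misses` counters make that ratio easy to check on a sample packet stream.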
III. Heuristics Employed
Although the paper was on the analysis of TCP overhead, I found several
excellent general heuristics used by the authors, and would like to spend
the rest of this summary discussing them. References to examples in the
paper are included in the form [page, column (a=left-hand, b=right-hand),
paragraph number (including the top)].
- Don't take anything for granted.
- Here we have four very intelligent guys analyzing something which they
thought was a sure thing, only to find out that they were completely wrong,
and coming up with some very powerful conclusions in the process. [23,
a, 2-3]
- Find the common path.
- When analyzing code, find the path through the code which will be executed
the majority of the time -- the "common path." Begin your analysis
and improvements there. Optimization will go much more quickly and with better
results than if you try a front-to-back approach. [24, a, 4+]
- It's possible to go too far trying to over-optimize or over-simplify,
often with detrimental effects.
- The example here is the buffer layer -- in the hopes of providing good
service both to application clients who want to deal with data one byte
at a time, and to those who want to deal in large blocks, the buffer layer
became extremely complex. In one extreme example, of the 68 pages of C code
for the protocol, 60 were devoted to the buffer layer and the interfaces
to the other layers. That's almost 90%! [25, b, 7]
- Ignore system-specific features when analyzing something machine
independent, such as a communication protocol.
- Systems will vary, and improvements you might find on one may not be
available on others. [26, a, 1]
- When designing an O/S, assuming a feature will be rarely used and
therefore not worth spending time on can have serious consequences.
- The example of this is in the discussion of O/S timers -- O/S designers
assumed timers would not be used in a context as demanding as TCP (which
requires a timer for each packet) and ignored their cost, so users paid
dearly in overhead spent waiting on timer operations. [26, a, 4]
- Combining caching with common path analysis can make things VERY
cheap.
- This builds upon the heuristic that caching is a good idea; caching
when you have a good understanding of what's going on and where it can
be most powerful is a great idea. Here they recognized that approx. 90%
of the packets coming in were destined for the same TCP connection as the
previously received packet. Caching a pointer to the TCB thus achieved
hit rates of 90% or more at very low cost. [26, b, 2]
- If you have to make an estimate, use conservative estimates (worst-case).
- This can only make you look good when you get real data and it comes
in under your estimates. Actual numbers smaller than the estimates are MUCH
better than ones that are larger. [Compare the estimated numbers on pages
24-27 with the actual numbers on pages 27-28. One explicit example: 28,
a, 2]
- Touching information in memory is VERY expensive compared to CPU
cycles. Perform as many CPU operations as possible on the data while you
have it, to minimize the need to re-touch the data later.
- By combining the checksum loop with the memory-memory copy of a packet,
they shaved off 25% of the total wait time. [28, b, 3]
- Come back later and test your estimates on a real implementation, and
include these results in the paper to add credibility.
- After speculating and estimating total instruction counts and times,
they actually measured the total overhead times on an operational system
to test their hypotheses. The credibility of ALL of the researchers'
estimates was greatly increased when the actual data supported the estimates
for these few cases. [27, b, "A Direct Measure of Protocol Overhead"]
- Don't dismiss something just because it "feels" wrong
-- it might actually provide a tremendous reduction in overhead.
- The example here is in their discussion of removing the DMA controller
from the network controller and using the CPU instead to move the data.
It felt completely wrong, but once you realized that you could then bypass
kernel buffers altogether, the overhead decrease was significant. [28,
b, 2]
- It is NOT a general rule that placing software, such as a protocol,
into hardware will automatically make it faster.
- While the actual computation will see a speedup when implemented in
hardware, the extra overhead required to move the data from memory to hardware,
or vice-versa, may dramatically outweigh any speedup the hardware would
provide. [28, b, 6+]
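The heuristic above about combining the checksum loop with the memory-to-memory copy can be sketched as follows. This is a minimal Python illustration, assuming the standard 16-bit one's-complement Internet checksum; the function names are hypothetical, and a production version would be written in C or assembly, where saving a pass over memory actually matters. `copy_and_checksum` reads each source word once and uses it for both the copy and the sum, whereas calling `internet_checksum` after a plain copy would touch the data a second time.

```python
def _fold(total: int) -> int:
    """Fold carries back into the low 16 bits (one's-complement addition)."""
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def internet_checksum(data: bytes) -> int:
    """Separate pass: 16-bit one's-complement checksum over data."""
    total = 0
    for i in range(0, len(data) - 1, 2):
        total += (data[i] << 8) | data[i + 1]
    if len(data) % 2:
        total += data[-1] << 8  # pad the odd trailing byte with zero
    return ~_fold(total) & 0xFFFF


def copy_and_checksum(src: bytes, dst: bytearray) -> int:
    """Single pass: copy src into dst and checksum each word as it is moved."""
    total = 0
    n = len(src)
    for i in range(0, n - 1, 2):
        hi, lo = src[i], src[i + 1]  # one read of the source word...
        dst[i], dst[i + 1] = hi, lo  # ...feeds the copy
        total += (hi << 8) | lo      # ...and the checksum
    if n % 2:
        dst[n - 1] = src[n - 1]
        total += src[n - 1] << 8     # pad the odd trailing byte with zero
    return ~_fold(total) & 0xFFFF
```

For the eight bytes `00 01 f2 03 f4 f5 f6 f7`, both functions return `0x220d`, and `dst` ends up holding an exact copy of the source.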