A Static Buffer Overrun Detector for C


Motivation

Robert Morris, a former Cornell graduate student, wrote and released the original Internet worm in 1988. One of the worm's methods of transmission was a buffer overrun in fingerd. More recent worms that spread by buffer overruns include Blaster, Nimda, Slammer, and Code Red. Buffer overrun exploits typically enable an attacker to execute arbitrary code (i.e., do anything they want if the compromised process runs as root). Overruns continue to show up in both open and closed source software.

Objective

To create a static buffer overrun detection system for C source code with three goals:

  1. Safety. Be conservative within reason. We want a low false negative rate, where a false negative is a real buffer overrun that we did not flag as a possible buffer overrun. As far as we know, the buffer overrun detector is already safe on the benchmarks we have examined.
  2. Low False Positive Rate. We want to reduce the time developers spend checking their code. A false positive is something that we flag as a possible buffer overrun but that is actually safe. False positives are our biggest problem at present.
  3. Automation. The user does not need to annotate their code in order to achieve useful results. The tool should also assist the user in determining whether a warning is a genuine buffer overrun or a false positive.

Static vs Dynamic Methods

Static methods analyze the source code without running it, whereas dynamic methods watch the code as it runs or insert additional instructions into the code. Each strategy has its advantages and drawbacks.

Statically detecting where buffer overruns can occur in C code, with no possibility of false positives or false negatives, is computationally intractable. Therefore, most static methods attempt a safe and/or useful approximation. A safe approximation produces very few false negatives (ideally zero, though that is difficult to guarantee for a language like C). False positives are common with static methods that are safe, and more precise static methods are usually more computationally expensive.
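
To see why some approximation is unavoidable, consider the following invented fragment (the names are ours; it comes from no particular benchmark). Whether the write overruns depends entirely on input that is unknown until run time, so a safe static analysis has to flag it:

  #include <string.h>

  /* Invented fragment: whether the strcpy overruns buf depends entirely
   * on the contents of argv[1], which is unknown until run time.  A safe
   * static approximation must flag this write even though many
   * executions never overrun the buffer. */
  int main(int argc, char *argv[])
  {
      char buf[16];
      if (argc > 1)
          strcpy(buf, argv[1]);   /* overruns iff strlen(argv[1]) >= 16 */
      return 0;
  }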

Among the more popular static methods are lightweight tools like (s)lint and even grep. These tools can catch some problems with only a lexical analysis--for instance, calls to gets, as in the sketch below. However, lint and grep are far from safe. Many more sophisticated algorithms exist, but they are beyond the scope of this document.
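
The following invented fragment shows the kind of problem a purely lexical tool can catch: any call to gets is dangerous, because gets provides no way to bound the number of bytes written.

  #include <stdio.h>

  /* gets() writes an unbounded amount of input into line, so even a plain
   * grep for "gets(" finds a genuine problem here.  The usual fix is
   * fgets(line, sizeof line, stdin), which bounds the write. */
  void read_line(void)
  {
      char line[64];
      gets(line);           /* any input line longer than 63 bytes overruns */
  }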

Dynamic methods use extra checks at run-time either to prevent overruns from occurring or to detect them after the fact. Usually, the offending program is terminated (and the incident may be logged) for lack of anything better to do; sometimes an exception is thrown instead. The primary weakness of dynamic methods is their inability to check all possible executions of the program for problems: they only detect problems in the current execution. Java and StackGuard both qualify as dynamic methods for detecting buffer overruns (only StackGuard will help with C code, of course). Languages like Java are safe (in theory), but tools like StackGuard are not foolproof.
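
As a rough sketch of the idea (this is not how StackGuard in particular works, and the helper name is ours), a dynamic checker conceptually turns an unchecked store into a guarded one:

  #include <stdio.h>
  #include <stdlib.h>

  /* Sketch only: a dynamic checker conceptually rewrites buf[i] = c into a
   * guarded store like this one.  The guard runs only for the indices that
   * the current execution actually exercises, which is why dynamic methods
   * cannot rule out overruns on other executions. */
  static void checked_store(char *buf, size_t size, size_t i, char c)
  {
      if (i >= size) {
          fprintf(stderr, "buffer overrun detected at index %zu\n", i);
          abort();          /* terminate (and perhaps log the incident) */
      }
      buf[i] = c;
  }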


The Tool

The tool is the result of a series of incremental improvements over David Wagner's approach to detecting buffer overruns [3]. The two largest differences are:

  1. We make use of pointer analysis results.
  2. We are (partially) flow sensitive.

Architecture

  1. Build the project using CodeSurfer [5]. CodeSurfer provides an intermediate representation, including flow- and context-insensitive pointer analysis results and an interprocedural mod analysis.
  2. Create a linear program (LP) by generating constraints from the intermediate representation of the entire program. Variables in the LP describe, at various program points, quantities such as the number of bytes allocated for each buffer and the number of bytes that may be read from or written to it (a rough sketch of these constraints follows the screenshot discussion below).
  3. Use an LP solver to solve for the tightest solution to the LP. The LP is constructed using the language accepted by David Wagner's LP solver [3]. We use a modified version of this solver to find the solution.
  4. Examine the results from the solver and classify reads and/or writes as dangerous where it may be possible to read or write past the end of an array or heap variable.
  5. Present the results in a GUI, prioritized and grouped using various heuristics. Double clicking on a warning navigates to the offending source code. From there, the user can use CodeSurfer [5] to help them understand whether the code can in fact generate a buffer overrun. Suggestions for improving the functionality of the interface are welcome.

The screenshot above shows the tool's analysis of wu-ftpd, a popular open-source FTP server. The top pane shows a list of potential overruns grouped by sets of buffers that may be overrun. For example, the line above the highlighted line indicates that the buffer configdir may overflow because it is allocated exactly 4095 bytes but between 0 and 8192 bytes may be written to it. (Note that accesspath is an alias for configdir, resolved using pointer analysis.) This is a real buffer overrun vulnerability in wu-ftpd that was first identified by our buffer overrun detector.
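
To give a concrete feel for step 2 above, here is a rough, invented sketch of the constraints generated for a tiny fragment. The names alloc and len and the notation are ours, not the actual input language of the LP solver:

  #include <string.h>

  /* Rough sketch of constraint generation (step 2).  Each buffer b gets
   * integer range variables such as alloc(b), the number of bytes
   * allocated for b, and len(b), the number of bytes that may be written
   * to b. */
  void example(char *s)
  {
      char buf[128];            /* alloc(buf) = 128    */
      strcpy(buf, s);           /* len(buf) >= len(s)  */
  }
  /* Step 4 flags the strcpy as dangerous if the solved ranges permit
   * len(buf) > alloc(buf), i.e. more bytes may be written to buf than
   * were allocated for it. */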

Sources of Precision and Imprecision

Pointer Analysis

A pointer analysis determines where pointers may point; the result is an asymmetric binary relation between pointer variables and the objects they may refer to. At present, the tool uses a flow-insensitive and context-insensitive pointer analysis. This means that the pointer analysis treats all possible orderings of statements equally and essentially does not distinguish between different invocations of the same procedure (this is an oversimplification). Why not use something better? We're thinking about it--but it can be computationally expensive.
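
The following invented fragment shows what flow insensitivity in the pointer analysis can cost the overrun analysis:

  #include <string.h>

  /* Invented example.  A flow- and context-insensitive pointer analysis
   * computes a single may-point-to set for p over the whole program:
   *
   *     p -> { small, big }
   *
   * At the strcpy, p in fact points to big, but the overrun analysis must
   * also consider the possibility that the copy targets small, so small
   * may be reported as overrun: a false positive caused by the pointer
   * analysis. */
  void example(void)
  {
      char small[16];
      char big[4096];
      char *p = small;

      p = big;
      strcpy(p, "a path that is longer than sixteen bytes but fits in big");
  }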

Flow Sensitivity in the Buffer Overrun Analysis
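
The overrun analysis is (partially) flow sensitive, and the Results table below compares runs with and without flow sensitivity and with and without modeling of some conditionals. The following invented fragment illustrates the kind of precision at stake; it is an illustration of the general idea, not necessarily a case the tool handles today.

  #include <string.h>

  /* Invented illustration.  An analysis that tracks statement order and
   * this branch condition can see that the strcpy executes only when s
   * fits in buf, so the write is safe.  An analysis that ignores them
   * must assume the copy may receive a string of any length and will
   * warn about it. */
  void copy_if_it_fits(const char *s)
  {
      char buf[32];
      if (strlen(s) < sizeof buf)
          strcpy(buf, s);       /* safe: at most 31 characters plus the NUL */
  }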

Context Sensitivity in the Buffer Overrun Analysis
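
As noted above for the pointer analysis, a context-insensitive treatment merges all invocations of a procedure. The following invented fragment shows what that merging can cost an overrun analysis, whichever phase the insensitivity comes from:

  #include <string.h>

  /* Invented example.  copy_name is called with a 16-byte buffer and with
   * a 256-byte buffer.  A context-insensitive analysis merges the two
   * calls, so the long name passed at the second call site appears to
   * flow into the 16-byte buffer as well, producing a warning there even
   * though that call is safe. */
  static void copy_name(char *dst, const char *src)
  {
      strcpy(dst, src);
  }

  void callers(void)
  {
      char shortbuf[16];
      char longbuf[256];

      copy_name(shortbuf, "tiny");
      copy_name(longbuf, "a much longer name that only fits in longbuf");
  }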

Some other sources of imprecision include dead code, infeasible paths, and non-linear integer computations.
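
For instance, here is an invented example of the last of these: the number of bytes written is a product of two variables, which a linear program cannot express exactly, so the analysis must approximate it.

  #include <stddef.h>

  /* Invented example of a non-linear integer computation.  The number of
   * bytes written is rows * cols, a product of two variables; a linear
   * program cannot represent the product exactly, so the analysis must
   * over-approximate the bound. */
  void fill(char *buf, size_t rows, size_t cols)
  {
      for (size_t i = 0; i < rows * cols; i++)
          buf[i] = ' ';
  }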


Results

The tool (in various stages of development) has been used to detect about 20 low-risk buffer overruns in wu-ftpd. We have also confirmed that the tool finds known buffer overruns in applications including sendmail, talkd, CLIPS, and strace.

Note: As far as we know, there are no false negatives in any of the benchmarks. However, it is possible to engineer a false negative in C, so there may be false negatives we do not know about.

  Benchmark      Flow Sensitivity Enabled?   False Neg.  True Neg.  False Pos.  True Pos.  True Positive Rate
                                                                                           (True Pos. / Positives)
  wu-ftpd-2.6.2  Yes, some conditionals      0?          1642       322         23         6.67%
  wu-ftpd-2.6.2  Yes, but no conditionals    0?          1629       335         23         6.42%
  wu-ftpd-2.6.2  No                          0?          1604       360         23         6.01%

References


Parties Presently Involved

Dave Vitek, Radu Rugina, Grammatech Inc.