The CS 6120 Course Blog

Dynamic Null Pointer Checks Using LLVM

by Christopher Roman

Preface

This blog post is meant for those with limited knowledge of how to use LLVM to write non-trivial compiler passes. Hopefully this can be a useful resource to those new to LLVM since, in my experience, information on how to use LLVM is relatively sparse. I am using LLVM 9. All of the code can be found at https://github.com/chrisroman/llvm-pass-skeleton.

Overview

The goal of this project is to add null pointer checks before pointer dereferences to avoid segfaults. However, it would be wasteful to add a null pointer check to a pointer dereference when we know for sure the pointer is non-null. For example:

int *p = new int(4120);
// No null check necessary here
*p = 6120;

Thus, we will do a null pointer analysis that determines, at each instruction, which pointers may be null. Using this, we can limit the number of extraneous checks.

To accomplish this, we will create a new compiler pass using LLVM.

Design

Adding the null pointer checks themselves was pretty straightforward. To keep things simple, I decided to just print a message telling the user that there was an attempt to dereference a null pointer and then call exit(1). Fancier implementations could print more useful information so the user doesn't have to break out gdb to see what went wrong.

For the null pointer analysis (NPA henceforth), we aim for soundness rather than completeness. That is, we mark a pointer as DefinitelyNonNull only if we are guaranteed that the pointer is not null. This way, a null check is removed only when it is really safe to do so.

Implementation

Adding the null checks

For null pointer checks, we first created a compiler pass called AddNullCheckFuncPass, which adds to the module a new function equivalent to:

void nullcheck(void *p) {
  if (p == nullptr) {
    printf("Found a null pointer. Exiting...\n");
    exit(1);
  }
}

Then we created a compiler pass called AddNullCheckPass. This is a FunctionPass which looks for a LoadInst or StoreInst, as these are the instructions which actually dereference a pointer. Let's look at how a dereference gets translated to LLVM IR:

int *p = ...
*p = 6120

is represented in the IR as:

%p = alloca i32*, align 8
...
%1 = load i32*, i32** %p, align 8
store i32 6120, i32* %1, align 4

We can see that we're storing 6120 into %1, so %1 is what we need to perform the null check on. To do so, we can simply call the nullcheck function that was created in the previous AddNullCheckFuncPass.

Implementing the Null Pointer Analysis

Now, we want to elide null checks provided we can determine that the pointer is non-null at the time of dereferencing. Initially, I figured we could use the existing alias analyses and check if the pointers alias nullptr. However, when I tried this, it would always show as being NoAlias. There is the AAResults::pointsToConstantMemory function, but according to the documentation,

"The pointsToConstantMemory method returns true if and only if the analysis can prove that the pointer only points to unchanging memory locations (functions, constant global variables, and the null pointer)."

This is an issue for two reasons. First, we can't differentiate between pointing to functions, constant global variables, or the null pointer. Second, it only returns true when the pointer points exclusively to these constant locations, so if a pointer may or may not point to the null pointer, the function returns false. This is too restrictive for our use case, since we are really looking for pointers that are definitely not null rather than those that are definitely null.

Therefore, to perform this null pointer analysis, we do a dataflow analysis similar to CCP (conditional constant propagation). Our lattice elements are mappings from pointers in the program to PossiblyNull or DefinitelyNonNull. (We assume PossiblyNull and DefinitelyNonNull themselves form a two-element lattice, with the former as Top and the latter as Bottom.) Our Top value is a map where everything maps to PossiblyNull, and Bottom is a map where everything maps to DefinitelyNonNull. The analysis is a forward analysis. The meet operator is simply an elementwise meet on the entries of the maps. For example, if we have the two lattice elements X = {p: DefinitelyNonNull, q: PossiblyNull} and Y = {p: DefinitelyNonNull, q: DefinitelyNonNull}, then meet(X, Y) = {p: DefinitelyNonNull, q: PossiblyNull}.

Our transfer function is tricky to get right. Consider:

store i32* %p, i32** %q, align 8

Let's call the memory that %q points to %deref_q. Naively, we would just say that if %p was PossiblyNull, then %deref_q would also be PossiblyNull. While this is necessary, it is not sufficient. Consider the following program:

int *a = new int(6120);
int **p = &a;
*p = nullptr;
*a = 100;

Observe that the line *a = 100 is a null pointer dereference, because p aliases &a. That is, the memory location that p points to and the location &a refers to are the same. Thus, when we do *p = nullptr, we are also setting a to nullptr. This means that whenever we store a value %p to %deref_q (the location pointed to by %q), we must update everything that aliases %deref_q.

One strange thing about this implementation is how to represent %deref_p. Given store i32* null, i32** %p, align 8, we need to make sure that subsequent loads from %p are PossiblyNull. So, we keep a map deref_map from pointers like %p to an arbitrary new pointer. Therefore, when we see %val = load i32*, i32** %p, align 8, we set lattice[val] = lattice[deref_map[p]], where lattice[x] tells us if x is PossiblyNull or DefinitelyNonNull.

Evaluation

For correctness, I wrote some tests by hand to account for some of the programs shown above. For the most part, the correctness of adding the nullchecks was easy to check. However, the correctness of the NPA was trickier to determine, as shown by the programs that require an alias analysis.

We also want to see how much of an impact these null checks have on the performance of programs. Unfortunately, I wasn't able to get the PARSEC benchmark tests running. Instead, I found a suite of benchmarks online with implementations for various languages, and I chose a small portion of these tests to run with and without my compiler passes.

I chose to run: binary-trees, fannkuch-redux, fasta, mandelbrot, and n-body.

Here we simply show the mean time taken to run these programs and the standard deviation. There are three columns, representing the three different ways we compiled the program.

Runtime Means and Standard Deviations

Benchmark        Baseline           No NPA             With NPA
binary-trees     208 ms ± 4.78 ms   217 ms ± 12.8 ms   213 ms ± 6.78 ms
fannkuch-redux   222 ms ± 5.02 ms   536 ms ± 6.34 ms   563 ms ± 54.9 ms
fasta            200 ms ± 6.01 ms   365 ms ± 7.77 ms   375 ms ± 28.8 ms
mandelbrot       319 ms ± 5.79 ms   294 ms ± 2.52 ms   298 ms ± 9.76 ms
n-body           327 ms ± 16.7 ms   733 ms ± 12.4 ms   734 ms ± 17.8 ms

Here we can see that in the majority of cases, the baseline performs significantly better than the programs with null checks. This surprised me a little, because I figured the branch predictor would quickly learn that the null checks never do anything except return from the null checking function.

It is also interesting to note that the NPA didn't cause any improvement! This may be because not that many null pointer checks were removed, perhaps because the analysis is too conservative; this would require further investigation. This is a little disappointing because I spent a lot of time implementing the NPA to reduce the overhead of null checks.

Hardest Parts to Get Right

One of the hardest parts of this project was the learning curve of LLVM. The documentation is fairly good, but there's just not much information overall on how to do specific things. For example, I spent a lot of time just figuring out how to make a call to printf in the IR. For some reason, it wouldn't work if printf wasn't already used in the file being compiled.

The other hardest thing was doing the null pointer analysis. It was frustrating to know that I couldn't check if a pointer aliased nullptr. I wasted a lot of time on an incorrect solution that looks as follows:

Create a global pointer that is always nullptr, and replace all instances of nullptr with this global pointer. Then we can use the alias analysis to check what aliases this pointer to see what is potentially null at an instruction. However, by doing this, all writes to any pointer now end up writing to the location pointed to by the global variable, which is incorrect.

This project gives me a newfound appreciation for compiler writers.

Extras

When trying to debug certain programs, I found that certain variables were being optimized away, which was quite annoying. In searching for a way to prevent this, I found this great talk by Chandler Carruth who discusses microbenchmarking of C++ code. He showed two special functions that can force side effects on variables without actually emitting assembly. See this part of the talk to learn more about it.