Lecture 15: Verification

We will use the term verification to refer to a process that generates high assurance that code works on all inputs and in all environments. Testing is a good, cost-effective way of getting assurance, but it is not a verification process in this sense because there is no guarantee that the coverage of the tests is sufficient for all uses of the code. Verification generates a proof (sometimes only implicitly) that all inputs will result in outputs that conform to the specification. In this lecture, we look at verification based on explicitly but informally proving correctness of the code.

Verification tends to be expensive and to require thinking carefully about and deeply understanding the code to be verified. In practice, it tends to be applied to code that is important and relatively short. Verification is particularly valuable for critical systems where testing is less effective. Because their execution is not deterministic, concurrent programs are hard to test, and sometimes subtle bugs can only be found by attempting to verify the code formally. In fact, tools to help prove programs correct have been getting increasingly effective and some large systems have been fully verified, including compilers, processors and processor emulators, and key pieces of operating systems.

Another benefit to studying verification is that understanding what it takes to prove code correct helps you reason about your own code (or others'), and to write code that is more likely to be correct, based on specs that are more precise and useful.

In recent years, techniques have been developed that combine ideas from verification and testing and can sometimes give the best of both worlds. These ideas, model checking and abstract interpretation, can give the same level of assurance as formal verification at lower cost, or more assurance than testing at similar cost. However, in this lecture, we'll look at verification in the classic sense.

A Simple Example

Let's prove a short piece of code correct in a slightly informal but hopefully convincing way. Here is a slightly odd implementation of the max function on integers, using abs, whose spec is also given:

(* Returns: max x y is the maximum of x and y.  That is,
 * ((max x y = x) or (max x y = y)) and (max x y >= x) and (max x y >= y)
 *
 * Requires: Both x and y are between min_int/2 and max_int/2
 *)
let max x y = (x + y + abs(y-x))/2

(* Returns: abs x  is x if x >= 0, -x otherwise. *)
val abs : int -> int

Because this implementation doesn't use if (assuming abs doesn't), it's conceivable that this could be faster than the obvious implementation!
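For comparison, the obvious implementation would use a conditional. Here is a sketch (the name max_if is ours, not part of the notes):

(* Returns: max_if x y is the maximum of x and y. *)
let max_if x y = if x >= y then x else y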

To verify a function like this one, we'll use two ideas. First, we'll consider the possible executions of the code starting from an arbitrary call satisfying the precondition. As execution proceeds, we'll accumulate facts we know about the current execution. Second, at some points during execution we won't know enough about what is supposed to happen, and we'll have to consider some possible cases. We have to argue that the postcondition of the function is satisfied no matter which case is taken.

We start by considering an evaluation of max x y, where x and y are integers satisfying the precondition. We record the precondition as an assumption at each step. As a shorthand, we'll use PRE to represent the precondition, (min_int/2 ≤ x ≤ max_int/2 and min_int/2 ≤ y ≤ max_int/2). One thing we need to know is that the OCaml operators + and − act like mathematical addition and subtraction as long as the mathematical result is within the range min_int to max_int:

   (* (+) x y is x + y.
      Requires: min_int ≤ x + y ≤ max_int *)
   (* (−) x y is x − y.
      Requires: min_int ≤ x − y ≤ max_int *)

Now we proceed with the proof, which is organized as a sequence of steps. Each step records the expression being evaluated, the assumptions currently in force, and a justification for the step.

Expression: max x y
Assumptions: PRE
Justification: We consider an arbitrary legal application of max.

Expression: (x+y+abs(y-x))/2
Assumptions: PRE
Justification: Substitute the actuals for the formals in the function body.

Expression: (n1+abs(y-x))/2
Assumptions: PRE; n1 = x+y
Justification: x+y evaluates to some number n1. PRE says that x and y are both between min_int/2 and max_int/2, so their sum must be a valid integer; therefore there can't be overflow when they are added.

Expression: (n1+abs(n2))/2
Assumptions: PRE; n1 = x+y; n2 = y−x
Justification: y-x evaluates to some number n2. Again, we can't have an overflow doing this subtraction. But now we don't know the result of abs. We use the spec of abs, but separately consider the cases where y ≥ x and where y < x.

Case: y ≥ x

  Expression: (n1+n2)/2
  Assumptions: PRE; n1 = x+y; n2 = y−x; y ≥ x
  Justification: Since y ≥ x, we have n2 ≥ 0, so abs(n2) = n2.

  Expression: n3/2
  Assumptions: PRE; n1 = x+y; n2 = y−x; y ≥ x; n3 = n1+n2 = 2y
  Justification: n1+n2 = x+y + (y−x) = 2y. Because y ≤ max_int/2, we can't get an overflow here either.

  Expression: y
  Assumptions: PRE; y ≥ x
  Justification: n3/2 = 2y/2 = y. Since y ≥ x, the answer we got (y) is the correct one according to the spec. Now we consider the other case, which is symmetrical.

Case: y < x

  Expression: (n1−n2)/2
  Assumptions: PRE; n1 = x+y; n2 = y−x; y < x
  Justification: Since y < x, we have n2 < 0, so abs(n2) = −n2 = x−y.

  Expression: n3/2
  Assumptions: PRE; n1 = x+y; n2 = y−x; y < x; n3 = n1−n2 = 2x
  Justification: n1−n2 = x+y − (y−x) = 2x. Since x ≤ max_int/2, we can't get an overflow here either.

  Expression: x
  Assumptions: PRE; y < x
  Justification: n3/2 = 2x/2 = x. Because y < x, the answer we got (x) is the correct one according to the spec.

QED
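As a quick sanity check (no substitute for the proof, but cheap reassurance), we can run max on a few inputs satisfying the precondition:

let () =
  assert (max 3 5 = 5);        (* y > x: the result is y *)
  assert (max 5 3 = 5);        (* y < x: the result is x *)
  assert (max (-4) (-4) = -4)  (* y = x: either disjunct of the spec holds *)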

The rules of the game here are as follows. We can take an evaluation step on the current expression as long as we have enough information in our assumptions to know which evaluation step is taken, and we can introduce fresh variables like n1, n2, n3 to represent unknown values that are computed. We can apply functions, such as abs, if their preconditions are satisfied. If we don't have enough information in the current set of assumptions, we can break our analysis into a set of cases, as long as the cases are exhaustive. An analysis by cases acts like a stack: each case introduces assumptions that create a hypothetical world we must reason about, and the indenting in the proof above indicates when we are in such a hypothetical world. Once we arrive at a result, we must show that it satisfies the postcondition, using the information available in the current set of assumptions.

Modular Verification

In our proof that max met its spec, we assumed that abs met its spec, which meant we didn't have to look at the code of abs. This was an example of modular verification, in which we verify one small unit of code at a time. Function specifications, abstraction functions, and rep invariants make it possible to verify modules one function at a time, without looking at the code of other functions. If modules have no cyclic dependencies, which OCaml enforces, then the proof of each module can be constructed assuming that every other module satisfies the specs in its interface. The proof of the entire software system is then built from the bottom up.
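For instance, a hypothetical interface like the following would let us verify client code using only the specs, never the implementation:

(* stack.mli -- a hypothetical example interface *)
type 'a t

(* Returns: empty is the stack with no elements. *)
val empty : 'a t

(* Returns: push x s is s with x added as the top element. *)
val push : 'a -> 'a t -> 'a t

(* Returns: peek s is the top element of s.
 * Checks: s is nonempty. *)
val peek : 'a t -> 'a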

Recursive Functions

How do we handle recursive functions? We'll use the approach of assuming that a recursive function satisfies its spec whenever it is called recursively. A proof of correctness will then ensure the partial correctness of the function's implementation, which we can distinguish from total correctness:

Partial correctness: whenever the function is applied with the precondition satisfied, and it terminates, it produces results satisfying the postcondition.

Total correctness: the function is partially correct and, in addition, it terminates whenever it is applied with the precondition satisfied.

The nice thing about this approach is that we can separate the problem of proving that a function computes the right answer from the problem of proving that the function terminates.

Consider the following recursive implementation of a function lmax that computes the maximum element of a nonempty list:

(* Returns: lmax xs  is the maximum element in the list xs.
 * Checks: xs != []
 *)
let rec lmax (xs : int list) : int =
    match xs with
        [] -> raise (Failure "checks clause of lmax violated")
      | [x] -> x
      | x :: t -> max x (lmax t)
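Assuming the definitions of max and lmax above are in scope, a couple of quick checks (again, not a substitute for the proof that follows):

let () =
  assert (lmax [7] = 7);         (* the singleton case *)
  assert (lmax [3; 1; 4] = 4)    (* the recursive case *)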

Let's prove partial correctness. First of all, we need to understand our postcondition a little more precisely. What are the elements of a list? We can define a function that gives us the elements by induction on the list:

elements([]) = ∅
elements(h :: t) = {h} ∪ elements(t)

Like an abstraction function, this function maps from the concrete domain of OCaml lists to a more abstract mathematical domain, that of sets.
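Although elements is a mathematical function, we could also transcribe it into executable OCaml (a sketch, representing sets with the standard Set module):

module IntSet = Set.Make (Int)

(* elements xs is the set of elements of xs, following the
 * inductive definition above. *)
let rec elements (xs : int list) : IntSet.t =
  match xs with
  | [] -> IntSet.empty
  | h :: t -> IntSet.add h (elements t)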

Expression: lmax xs
Assumptions: xs ≠ []
Justification: Consider an arbitrary call satisfying the precondition.

Expression: match xs with [] -> ...
Assumptions: xs ≠ []
Justification: Expand the function body. Now we need to create cases to know which way the match will go. Here are three exhaustive cases: xs = [], xs = [x], and xs = x::t with t ≠ [].

Case: xs = []

  Expression: raise (Failure ...)
  Assumptions: xs ≠ []
  Justification: This case can't happen according to our assumptions. The postcondition is vacuously satisfied.

Case: xs = [x] = x :: []

  Expression: x
  Assumptions: xs = [x]
  Justification: The result is the only element in xs, so elements(xs) = {x}, and therefore x is the maximum element in elements(xs).

Case: xs = x::t and t ≠ []

  Expression: max x (lmax t)
  Assumptions: xs = x::t; t ≠ []
  Justification: Now lmax can be applied to t; crucially, t is not the empty list, so the precondition of lmax is satisfied, and (as discussed above) we may assume the recursive call satisfies its spec.

  Expression: max x n1
  Assumptions: xs = x::t; t ≠ []; n1 = maximum of elements(t)
  Justification: Now we can apply the function max, using its spec, obtaining some value n2 that satisfies the postcondition of max.

  Expression: n2
  Assumptions: xs = x::t; t ≠ []; n1 = maximum of elements(t); (n2 = x or n2 = n1); n2 ≥ x; n2 ≥ n1
  Justification: From the mathematical definition of the maximum element of a set, the value n1 must be an element of t and must be at least as large as any element of t. The specification says that n2 must be an element of xs and at least as large as any element of xs. We know from the definition of elements that elements(xs) = {x} ∪ elements(t). So n2 must be an element of xs, because it's either x or it's n1, which is an element of t. Further, n2 is at least as large as x (n2 ≥ x), and it is at least as large as any element of elements(t), since it's at least as large as n1, which is at least as large as any element of t. Therefore n2 is the maximum element of xs in this case.

QED

Total Correctness

The key to proving total correctness is to prove that the recursion cannot go on forever. We need to be able to map the function arguments onto a set that has a least element. Typically we do this by giving a decrementing function d(x) that maps the function argument x onto the natural numbers. The decrementing function d has two properties:

1. When d(x) = 0, the function terminates without making any recursive calls.
2. On every recursive call, d strictly decreases: if a call with argument x makes a recursive call with argument x', then d(x') < d(x).

These conditions ensure that the decrementing function keeps getting smaller on every recursive call, but cannot get smaller forever.

For example, in lmax an appropriate decrementing function is d(xs) = List.length(xs) − 1. It is nonnegative whenever the precondition xs ≠ [] holds; when it is zero, xs = [x] and the function terminates without recursing; and the recursive call to lmax is on the shorter list t, so d decreases on every recursive call.
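Written out in OCaml (purely for concreteness; the decrementing function is a proof device, not part of the program):

(* d xs is the number of recursive calls that lmax xs will make. *)
let d (xs : int list) : int = List.length xs - 1

In the recursive case xs = x::t, the call is on t, and d(t) = d(xs) − 1 < d(xs), so d strictly decreases; when d(xs) = 0, xs = [x] and lmax returns without recursing.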