If we are trying to prove the correctness of a function with respect to a formal specification, the task can often be broken down into two separate activities: proving partial correctness with respect to a precondition and postcondition, and proving termination (halting).
A partial correctness specification for a function is a pair of properties φ (the precondition) and ψ (the postcondition) describing a relation between the inputs and outputs. The precondition φ is a restriction on the inputs. It is the caller's responsibility to ensure that the function is only ever called on inputs satisfying φ. The function may then assume that the inputs satisfy φ (and may wish to check that they do). The postcondition ψ is the function's responsibility: the function must ensure that ψ holds when it returns, provided the input satisfies φ. Thus the precondition/postcondition pair is like a contract between the caller and the function. A program is partially correct with respect to a partial correctness specification φ, ψ if, whenever the input satisfies the precondition φ, the postcondition ψ holds when the function returns. This does not say that the function must return, only that if it does, then ψ holds upon return.
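As a minimal sketch of this contract in OCaml (the function `isqrt` and its specification are hypothetical, chosen only to illustrate the idea), the precondition can be checked at entry and the postcondition just before returning:

```ocaml
(* Hypothetical contract for an integer square root.
   Precondition phi: n >= 0.
   Postcondition psi: the result r satisfies r*r <= n < (r+1)*(r+1). *)
let isqrt n =
  assert (n >= 0);                 (* the function may check phi *)
  let rec go r = if (r + 1) * (r + 1) <= n then go (r + 1) else r in
  let r = go 0 in
  assert (r * r <= n && n < (r + 1) * (r + 1));  (* psi must hold on return *)
  r

let () = Printf.printf "%d\n" (isqrt 10)  (* prints 3 *)
```

If the caller violates φ (say, by passing a negative argument), the contract is void and the function owes nothing; the entry `assert` simply makes that violation fail fast.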
In addition to partial correctness, we would also like to ensure that the program halts. The combination of partial correctness and halting is called total correctness. We usually separate the two tasks of proving partial correctness and halting because different techniques are used. For halting, one usually identifies a data value that decreases strictly with each step or each recursive call, but for which it is impossible to decrease infinitely. For example, in the Fibonacci program
```ocaml
let rec fib n = if n <= 1 then n else fib (n - 1) + fib (n - 2)
```

the value of n decreases strictly with each recursive call, but can only decrease finitely many times before the halting condition n <= 1 becomes true. The property that something can only decrease finitely many times is known as well-foundedness. Most of the time the natural numbers suffice for this purpose, but sometimes it is necessary to use more complicated well-founded relations.
The verification process typically uses many tools.
GCD stands for greatest common divisor. The GCD of two integers m and n, denoted GCD(m,n), exists and is unique for all pairs of integers m, n such that m, n ≥ 0 and not both m, n = 0 (this will actually be our precondition on the inputs). We would like to develop a program to compute the GCD of two given integers. We will first develop a formal partial correctness specification φ, ψ. After that, there are two approaches we can take: we can write the program and then prove it correct with respect to the specification, or we can give a constructive proof that an output satisfying the specification exists and extract the program from the proof. Here we take the second approach.
Our formal partial correctness specification is: the precondition φ is m ≥ 0 and n ≥ 0 and not both m, n = 0; the postcondition ψ is that the value returned is GCD(m,n).
Here are some properties of the integers we will use:
With this definition, we can show that the divisibility relation is a partial order on positive integers as defined above; that is, it is reflexive, antisymmetric, and transitive.
In this view of divisibility as a partial order, property 3 above says that GCD(m,n) is the greatest lower bound of m and n with respect to the partial order of divisibility.
Lemma 1. Let m ≥ 0 and n > 0. Let q = m/n (integer division), r = m mod n. Then for all k,
k | m and k | n if and only if k | n and k | r.
Proof. Suppose k | m and k | n. Then for some a and b, ka = m and kb = n. Using property 4 above, r = m − qn = ka − qkb = k(a − qb), therefore k | r, and also k | n by assumption. Conversely, if k | n and k | r, then for some c and d, kc = n and kd = r. Again using property 4, m = qn + r = qkc + kd = k(qc + d), therefore k | m, and also k | n by assumption. QED
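Lemma 1 can also be spot-checked mechanically on small values (a sanity check under the definitions above, not a substitute for the proof): for a given m ≥ 0 and n > 0 with r = m mod n, every k either divides both of m, n and both of n, r, or divides neither pair.

```ocaml
(* divides k x: k | x in the sense of the definition above (k >= 1). *)
let divides k x = x mod k = 0

(* Check Lemma 1 exhaustively for one pair (m, n) with m >= 0, n > 0:
   k | m and k | n  iff  k | n and k | r, where r = m mod n. *)
let lemma1_holds m n =
  let r = m mod n in
  let ok = ref true in
  for k = 1 to max m n do
    if (divides k m && divides k n) <> (divides k n && divides k r) then
      ok := false
  done;
  !ok

let () = assert (lemma1_holds 42 12)
```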
Lemma 2. Let m ≥ 0 and n > 0. Let q = m/n (integer division), r = m mod n. Then GCD(m,n) = GCD(n,r).
Proof. We have GCD(n,r) | n and GCD(n,r) | r by property 3. Taking k = GCD(n,r) in Lemma 1, we have GCD(n,r) | m and GCD(n,r) | n. Since by property 3 GCD(m,n) is the greatest lower bound of m and n in the order of divisibility, GCD(n,r) | GCD(m,n). A symmetric argument (note the symmetry of Lemma 1) shows that GCD(m,n) | GCD(n,r). Therefore by antisymmetry, GCD(m,n) = GCD(n,r). QED
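Lemma 2 likewise admits a brute-force sanity check, using a naive GCD computed directly as the largest common divisor (an illustrative check, not part of the development):

```ocaml
(* Largest k dividing both m and n; assumes m, n >= 0, not both 0. *)
let naive_gcd m n =
  let g = ref 1 in
  for k = 1 to max m n do
    if m mod k = 0 && n mod k = 0 then g := k
  done;
  !g

(* Lemma 2: GCD(m, n) = GCD(n, m mod n) for m >= 0, n > 0. *)
let () =
  for m = 0 to 30 do
    for n = 1 to 30 do
      assert (naive_gcd m n = naive_gcd n (m mod n))
    done
  done
```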
Now our main constructive existence theorem that will give us the algorithm almost for free is the following.
Theorem. Let m, n ≥ 0, not both = 0. There exist integers s,t such that GCD(m,n) = sm + tn.
Proof. By induction on n. Basis: n = 0. Then GCD(m,n) = GCD(m,0) = m, and we can take s = 1 and t = 0, giving GCD(m,n) = m = 1m + 0n.
Induction step: n > 0. Let q = m/n and r = m mod n. By property 4, m = qn + r and 0 ≤ r < n, thus the precondition holds for n and r. By the induction hypothesis, there exist s' and t' such that GCD(m,n) = GCD(n,r) = s'n + t'r = s'n + t'(m − qn) = t'm + (s' − t'q)n, so we can take s = t' and t = s' − t'q. QED
The program can now be read off directly from the proof of the Theorem.
```ocaml
let rec gcd m n =
  if n = 0 then (m, 1, 0)          (* from the basis *)
  else                             (* n > 0 if we get here *)
    let q = m / n in
    let r = m mod n in
    let (g, s', t') = gcd n r in   (* from the induction step *)
    (g, t', s' - t' * q)
```

We have essentially extracted a recursive program from a constructive proof of the existence of an output satisfying the postcondition. The nice thing about this approach is that the program already comes with a proof of partial correctness!
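As a quick check that the extracted program satisfies the postcondition, we can run it on a sample input and verify both that g is the GCD and that g = sm + tn holds (the program is repeated here so the check is self-contained):

```ocaml
(* The extracted program, repeated so this check runs on its own. *)
let rec gcd m n =
  if n = 0 then (m, 1, 0)
  else
    let q = m / n in
    let r = m mod n in
    let (g, s', t') = gcd n r in
    (g, t', s' - t' * q)

(* Postcondition check on a sample: GCD(1071, 462) = 21. *)
let () =
  let (g, s, t) = gcd 1071 462 in
  assert (g = 21);
  assert (s * 1071 + t * 462 = g)
```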
For total correctness, we need only observe that the program always halts, for the same reason that the induction in the proof of the Theorem is sound: the second argument to the gcd function decreases strictly with each recursive call and is never negative, and therefore must eventually become 0.
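The termination measure can be made visible by recording the second argument at each call; under the same precondition on the inputs, the recorded sequence is strictly decreasing and nonnegative (an illustrative trace, not part of the proof):

```ocaml
(* Record the second argument of each call in the recursion
   gcd m n -> gcd n (m mod n) -> ... until it reaches 0. *)
let rec trace m n acc =
  if n = 0 then List.rev (n :: acc)
  else trace n (m mod n) (n :: acc)

let () =
  let ns = trace 1071 462 [] in   (* [462; 147; 21; 0] *)
  let rec check = function
    | a :: (b :: _ as rest) -> assert (a > b && b >= 0); check rest
    | _ -> ()
  in
  check ns
```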