VerifyCompiler: A Verified Compiler for CS 3110

Source Language

Arithmetic expressions with constants and addition:
    e ::= c | e + e
In OCaml, we could represent these with a data type:
    type expr = 
    | Const of int 
    | Plus of expr * expr
In Coq, they have a very similar representation:

Inductive expr : Type :=
| Const : nat -> expr
| Plus : expr -> expr -> expr.

In fact, if we extract that Coq expr to OCaml, we get essentially what we expect.

Extraction expr.
(* type expr =
   | Const of nat
   | Plus of expr * expr *)

The one mismatch is that Coq uses nat, whereas in OCaml we'd normally use int.
  • nat is (theoretically) unbounded and non-negative
  • int is definitely bounded and can be negative.
There is a library called Int31 in Coq that provides the equivalent to OCaml's 31-bit int.
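To make the difference concrete, here is a small OCaml sketch. The `nat` type below is written out by hand in the unary shape of Coq's nat, and the helpers `nat_of_int` and `int_of_nat` are hypothetical names introduced only for this illustration.

```ocaml
(* OCaml's int is bounded: adding 1 to the largest int wraps around. *)
let wraps_around = (max_int + 1 = min_int)

(* A hand-written unary natural-number type, shaped like Coq's nat. *)
type nat = O | S of nat

(* Hypothetical helper: negative inputs are clamped to O. *)
let rec nat_of_int (n : int) : nat =
  if n <= 0 then O else S (nat_of_int (n - 1))

(* Hypothetical helper: count the S constructors. *)
let rec int_of_nat (n : nat) : int =
  match n with
  | O -> 0
  | S m -> 1 + int_of_nat m

let () =
  assert wraps_around;
  assert (int_of_nat (nat_of_int 3) = 3);
  assert (nat_of_int (-1) = O)
```

A unary nat can never be negative and has no built-in upper bound, but it costs one constructor per unit, which is why it is impractical for large numbers.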

Source: Semantics

The dynamic semantics of expressions is something we can easily implement. Here's a simple interpreter that evaluates expressions:

Fixpoint evalExpr (e : expr) : nat :=
  match e with
    | Const n => n
    | Plus e1 e2 => plus (evalExpr e1) (evalExpr e2)
  end.

Again, this extracts to OCaml as we would expect:

Extraction evalExpr.
  let rec evalExpr = function
  | Const n -> n
  | Plus (e1, e2) -> plus (evalExpr e1) (evalExpr e2)

Source: Unit tests of semantics

Here are a couple test cases for our interpreter:

Example source_test_1 : evalExpr (Const 42) = 42.
Proof. reflexivity. Qed.

Example source_test_2 : evalExpr (Plus (Const 2) (Const 2)) = 4.
Proof. reflexivity. Qed.
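For comparison, the same two tests could be written in OCaml as runtime assertions. This is a hand-written sketch, not extracted code: it assumes int in place of nat, and the name `eval_expr` is chosen here for illustration.

```ocaml
type expr =
  | Const of int
  | Plus of expr * expr

(* Interpreter for the source language, using int instead of nat. *)
let rec eval_expr (e : expr) : int =
  match e with
  | Const n -> n
  | Plus (e1, e2) -> eval_expr e1 + eval_expr e2

(* Unlike a Coq Example, these checks happen only when the program runs. *)
let () =
  assert (eval_expr (Const 42) = 42);
  assert (eval_expr (Plus (Const 2, Const 2)) = 4)
```

The Coq versions are checked once, when the file is compiled: `reflexivity` succeeds only if both sides compute to the same value.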

Target Language

One of the primary tasks of a compiler is to translate from a high-level language to a low-level language. For example,
  • The Java compiler translates from Java to JVM bytecode.
  • The OCaml compiler translates from OCaml to Zinc machine bytecode.
Both compilers can additionally produce native code that runs on a particular machine architecture.
JVM and OCaml bytecode are both based on a stack machine model, in which a stack is used as the main data structure, rather than a set of registers.

Target: Syntax

So as a target language, let's use the following stack-machine instruction set:
      inst ::= PUSH c | ADD
An inst is a machine instruction. A program prog is a list of instructions.

Inductive inst : Type :=
| PUSH : natinst
| ADD : inst.

Definition prog := list inst.

These extract to OCaml as we would expect.

Extraction inst.
  type inst =
  | PUSH of nat
  | ADD

Extraction prog.
  type prog = inst list

Target: Semantics

To define the dynamic semantics of this target language, we need a notion of a stack:

Definition stack := list nat.

Now it's time to write an interpreter for the target language.
Evaluation of a program takes in an initial stack, and returns the final stack. But since evaluation could fail (if we try to ADD when there aren't at least two values on the stack), we wrap the return in an option, and return None if an error occurs.

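(* The list notations [], :: and ++ come from the List library. *)
Require Import List.
Import ListNotations.

Fixpoint evalProg (p : prog) (s : stack) : option stack :=
  match p,s with
    | (PUSH n)::p', s => evalProg p' (n::s)
    | ADD::p', x::y::s' => evalProg p' ((x+y)::s')
    | [], s => Some s
    | _, _ => None
  end.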

Extraction of the deep pattern matching doesn't turn out quite so nicely:

Extraction evalProg.
let rec evalProg p s =
  match p with
  | Nil -> Some s
  | Cons (i, p') ->
    (match i with
     | PUSH n -> evalProg p' (Cons (n, s))
     | ADD ->
       (match s with
        | Nil -> None
        | Cons (x, l) ->
          (match l with
           | Nil -> None
           | Cons (y, s') -> evalProg p' (Cons ((plus x y), s')))))
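By contrast, a version written by hand in OCaml can keep the deep pattern match. The sketch below assumes int in place of nat and OCaml's native lists in place of the extracted Nil and Cons; the name `eval_prog` is chosen here for illustration.

```ocaml
type inst = PUSH of int | ADD
type prog = inst list
type stack = int list

(* Run program p on stack s; None signals that ADD found too few operands. *)
let rec eval_prog (p : prog) (s : stack) : stack option =
  match p, s with
  | PUSH n :: p', s -> eval_prog p' (n :: s)
  | ADD :: p', x :: y :: s' -> eval_prog p' ((x + y) :: s')
  | [], s -> Some s
  | _, _ -> None

let () =
  assert (eval_prog [PUSH 42] [] = Some [42]);
  assert (eval_prog [PUSH 2; PUSH 2; ADD] [] = Some [4]);
  (* ADD on an empty stack is an error. *)
  assert (eval_prog [ADD] [] = None)
```

The four cases line up one-to-one with the Coq definition; the nested matches in the extracted code are what the single tuple match expands into.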

Target: Unit tests

Here are a couple unit tests for the target language interpreter.

Example target_test_1 : evalProg [PUSH 42] [] = Some [42].
Proof. reflexivity. Qed.

Example target_test_2 : evalProg [PUSH 2; PUSH 2; ADD] [] = Some [4].
Proof. reflexivity. Qed.


Now we're ready to translate from the source language to the target language.
  • To translate a constant c, we just push c onto the stack.
  • To translate an addition e1 + e2, we translate e2, translate e1, then append the instructions together, followed by an ADD instruction.

(* returns: compile e produces a program p, such that 
   evaluation of p leaves a single new value at the top 
   of the stack, and that value would be the result of 
   evaluating e. *)

Fixpoint compile (e : expr) : prog :=
  match e with
    | Const n => [PUSH n]
    | Plus e1 e2 => compile e2 ++ compile e1 ++ [ADD]
  end.

Note that ++ is the Coq append operator, analogous to OCaml's @.
We can extract the compiler to its own file:

Extraction "" compile.

Try using that file in the OCaml REPL!
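The extracted file is not reproduced here, and it works over Coq's nat rather than int. As a stand-in for experimenting in the REPL, the following hand-written sketch ports compile to OCaml, assuming int constants and native lists.

```ocaml
type expr = Const of int | Plus of expr * expr
type inst = PUSH of int | ADD

(* Translate an expression to stack-machine code: the code for e2 comes
   first, then the code for e1, then a single ADD. *)
let rec compile (e : expr) : inst list =
  match e with
  | Const n -> [PUSH n]
  | Plus (e1, e2) -> compile e2 @ compile e1 @ [ADD]

let () =
  assert (compile (Const 42) = [PUSH 42]);
  (* The right operand is pushed before the left one. *)
  assert (compile (Plus (Const 1, Const 2)) = [PUSH 2; PUSH 1; ADD])
```

Using distinct constants, as in the second assertion, makes the operand order visible in the generated program.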

Compiler: Unit tests

Here are a couple unit tests for our compiler:

Example compile_test_1 : compile (Const 42) = [PUSH 42].
Proof. reflexivity. Qed.

Example compile_test_2 : compile (Plus (Const 2) (Const 2))
  = [PUSH 2; PUSH 2; ADD].
Proof. reflexivity. Qed.

These tests show that the compiler produces programs that appear to correspond to the input expressions. But we haven't really tested the postcondition of compile: we want to know whether evaluating the compiled program leaves on the stack the same value that evaluating the source expression produces.

Example post_test_1 : evalProg (compile (Const 42)) [] = Some [evalExpr (Const 42)].
Proof. reflexivity. Qed.

Example post_test_2 : evalProg (compile (Plus (Const 2) (Const 2))) []
  = Some [evalExpr (Plus (Const 2) (Const 2))].
Proof. reflexivity. Qed.

So far, so good. But as Dijkstra observed, "program testing can be used to show the presence of bugs, but never to show their absence." How could we show that the compiler is correct for every input expression?
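One step short of a proof is to test the postcondition on many randomly generated expressions. The OCaml sketch below is a hand-written port of the three functions (assuming int in place of nat); the generator `random_expr`, the seed, and the count of 1000 are arbitrary choices made for illustration.

```ocaml
type expr = Const of int | Plus of expr * expr
type inst = PUSH of int | ADD

let rec eval_expr = function
  | Const n -> n
  | Plus (e1, e2) -> eval_expr e1 + eval_expr e2

let rec eval_prog p s =
  match p, s with
  | PUSH n :: p', s -> eval_prog p' (n :: s)
  | ADD :: p', x :: y :: s' -> eval_prog p' ((x + y) :: s')
  | [], s -> Some s
  | _, _ -> None

let rec compile = function
  | Const n -> [PUSH n]
  | Plus (e1, e2) -> compile e2 @ compile e1 @ [ADD]

(* Hypothetical generator: a random expression of depth at most depth. *)
let rec random_expr st depth =
  if depth = 0 || Random.State.bool st then Const (Random.State.int st 100)
  else Plus (random_expr st (depth - 1), random_expr st (depth - 1))

(* The postcondition of compile, checked on one expression. *)
let post_holds e = eval_prog (compile e) [] = Some [eval_expr e]

(* Check 1000 random expressions, using a fixed seed for repeatability. *)
let all_pass =
  let st = Random.State.make [| 3110 |] in
  List.for_all post_holds (List.init 1000 (fun _ -> random_expr st 6))

let () = assert all_pass
```

Even a thousand passing cases say nothing about the expressions that were not generated, which is the gap a proof closes.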

Compiler Verification

The following theorem is a specification that says what it means for compile to be correct.

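Theorem compile_correct : forall e,
  evalProg (compile e) [] = Some [evalExpr e].
Proof.
  (* crush is the automation tactic from CPDT's CpdtTactics library. *)
  intros; rewrite (app_nil_end (compile e));
  assert (lemma : forall e' s p,
    evalProg (compile e' ++ p) s = evalProg p (evalExpr e' :: s)) by
    (induction e'; crush);
  rewrite lemma; reflexivity.
Qed.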

Now we have a verified compiler: a machine-checked proof that, for every source expression, the compiled program leaves exactly the interpreter's result on the stack. The code we extracted is certified correct with respect to that specification!
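The lemma inside the proof is stronger than the theorem: it allows any remaining program p and any starting stack s, which is what lets the induction go through for Plus. As an illustration only, the OCaml sketch below checks that statement on two concrete instances; it is a hand-written port assuming int in place of nat, and `lemma_holds` is a name introduced here.

```ocaml
type expr = Const of int | Plus of expr * expr
type inst = PUSH of int | ADD

let rec eval_expr = function
  | Const n -> n
  | Plus (e1, e2) -> eval_expr e1 + eval_expr e2

let rec eval_prog p s =
  match p, s with
  | PUSH n :: p', s -> eval_prog p' (n :: s)
  | ADD :: p', x :: y :: s' -> eval_prog p' ((x + y) :: s')
  | [], s -> Some s
  | _, _ -> None

let rec compile = function
  | Const n -> [PUSH n]
  | Plus (e1, e2) -> compile e2 @ compile e1 @ [ADD]

(* Running compile e followed by p on s is the same as running p on s
   with the value of e pushed on top. *)
let lemma_holds e p s =
  eval_prog (compile e @ p) s = eval_prog p (eval_expr e :: s)

let () =
  let e = Plus (Const 1, Const 2) in
  (* Both sides succeed with Some [13]. *)
  assert (lemma_holds e [ADD] [10]);
  (* Both sides fail with None: the trailing ADD has only one operand. *)
  assert (lemma_holds e [ADD] [])
```

Taking p to be the empty program and s to be the empty stack recovers the statement of compile_correct.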


CompCert is a certified C compiler.
  • Source language: ISO C 99, mostly.
  • Target language: PowerPC, ARM, x86.
  • Specified, programmed, proved correct in Coq.
  • Not verified: parser, assembler, linker
  • Performance: about 10 percent slowdown compared to gcc -O1.
The main theorem from the CompCert Coq source code:

Theorem transf_c_program_correct:
  forall p tp,
  transf_c_program p = OK tp ->
  backward_simulation (Csem.semantics p) (Asm.semantics tp).


This lecture is inspired by an example in a textbook by Adam Chlipala titled "Certified Programming with Dependent Types".