A5: Prove it!

# A5: Prove it! **Deadline:** Monday, 11/21/16, 11:59 pm. Because of Thanksgiving Break, the following late penalties will be in effect: Tuesday -10%, after Tuesday -100%. *This assignment is to be done as individuals, not with partners nor with teams. Sharing of solutions is not permitted, especially not between you and your project team.* ## Overview This assignment is a collection of written problems about correctness. There are no programming problems on this assignment. Depending on the kind of problems you prefer, it might take you longer or shorter than a programming assignment. ## Objectives * Prove the correctness of functions. * Learn how to prove the correctness of data structure operations. * Practice proof by induction. * Write a monad and prove that it satisfies the monad laws. ## Recommended reading * [Lectures and recitations 19–23][web] [web]: http://www.cs.cornell.edu/Courses/cs3110/2016fa/ ## What we provide * Only this writeup. ## What to turn in Submit the following files on [CMS][cms]: * `a5solution.pdf` or `a5solution.txt`, containing your solution. You are welcome to use plain text, LaTeX, or any other word processing system to produce your solution. Please do not scan handwritten solutions. * optionally, `karma.zip`, containing your solution to karma problems. [cms]: https://cms.csuglab.cornell.edu/ ## Grading issues * **Late submissions:** Carefully review the [course policies][syllabus] on submission and late assignments. Verify before the deadline on CMS that you have submitted the correct version. * **Clarity matters:** Your written solutions will be evaluated on clarity. A "correct" solution that a grader can't understand is incorrect. [syllabus]: http://www.cs.cornell.edu/Courses/cs3110/2016fa/syllabus.php ## Git As always, you should backup your files, and using git is a good way to do that. We won't ask you to turn in a git log for this assignment. But we also won't have sympathy for anyone who loses their solution because they weren't keeping it backed up. ## Problem 1: Correctness of data structures In this problem, we'll look at proving the correctness of entire data structures, not just individual functions. Consider data structures of the following form: ``` sig (* RI: Inv(v) must hold for v : t *) type t val value : t val constructor : 'a -> t val destructor : t -> 'b val transform : 'c -> t -> t end ``` A *value* of the data structure is a value of the representation type `t`. A *constructor* is an operation that can produce values of type `t` out of some type `'a`. (Note, this use of the word "constructor" is distinct from yet closely related to the use of that word with variant types in OCaml.) A *destructor* is an operation that can turn a value of type `t` into a value of some type `'b`. And a *transform* is an operation that transforms one value of the data structure into another. The tranform might use an additional input of some type `'c`, or there might not be any additional input required. The `list` data structure can be understood in this way: * `[]` is a value. * `hd` and `length` are destructors. * `tl` and `rev` are transforms. * Although there is no function in the `List` module that behaves exactly like a constructor, we could think of the syntactic sugar `[ e ]` as being a kind of constructor. When proving the correctness of data structures, the *representation invariant* is important: * for any module value `v:t`, we prove `Inv(v)`. * for any constructor `c` that takes input `x`, we add `Inv(c x)` to the postcondition of `c`. * for any destructor `d` that takes value `v:t` as input, we add `Inv(v)` to the precondition of `d`. * for any transform `t` that takes value `v1` as input and produces value `v2` as output, we add `Inv(v1)` to the precondition and `Inv(v2)` to the postcondition of `t`. Let's make that concrete by proving the correctness of sets represented as lists without duplicates: ``` module type SET = sig type 'a set val mem : 'a -> 'a set -> bool (* precondition: none * postcondition: for all x, mem x empty ~ false *) val empty : 'a set (* precondition: none * postcondition: mem x (add x s) ~ true *) val add : 'a -> 'a set -> 'a set (* precondition: none * postcondition: size empty ~ 0 && for all x, mem x s ~ true implies size (add x s) ~ size s && for all x, mem x s ~ false implies size (add x s) ~ 1 + size s *) val size : 'a set -> int val rem : 'a -> 'a set -> 'a set val union : 'a set -> 'a set -> 'a set val inter : 'a set -> 'a set -> 'a set end module ListSet : SET = struct (* RI: the list contains no duplicates *) type 'a set = 'a list let mem = List.mem let empty = [] let add x l = if mem x l then l else x::l let size l = List.length l let rem x l = List.filter ((<>) x) l let union l1 l2 = List.fold_left (fun a x -> if mem x l2 then a else x::a) l2 l1 let inter l1 l2 = List.filter (fun h -> mem h l2) l1 end ``` We can make the rep invariant precise by writing an OCaml function that checks it: ``` let rec inv = function | [] -> true | h::t -> not (mem h t) && inv t ``` *Hint: none of the proofs below needs induction.* ### Part A `empty` is a value of type `'a set`. We need to show that it satisfies its specification, including the rep invariant. When we add the rep invariant to the specification, we obtain ``` precondition: none postcondition: (for all x, mem x empty ~ false) && (inv empty ~ true) ``` So **prove two theorems:** ``` Theorem. for all x, mem x empty ~ false. Theorem. inv empty ~ true. ``` Use the following as the specification for `List.mem`: ``` (* precondition: none * postcondition: for all x, mem x [] ~ false && for all x and l, mem x (x::l) ~ true && for all x, h, and t, if x <> h then mem x (h::t) ~ mem t *) val mem : 'a -> 'a list -> bool ``` *Note that this specification of `List.mem` is in fact not 100% accurate. Because `List.mem` is implemented with `Pervasives.compare`, and because `Pervasives.compare` is not total—it raises an exception when applied to functions and it diverges when applied to cyclic values—`List.mem` really needs a stronger precondition. Also, `List.mem` itself would diverge when applied to a cyclic list. We will ignore those issues in this problem.* ### Part B `add` is a transform of type `'a -> 'a set -> 'a set`. We need to show that it satisfies its specification, including the rep invariant. When we add the rep invariant to the specification, we obtain ``` precondition: inv s ~ true postcondition: (mem x (add x s) ~ true) && (inv(add x s) ~ true) ``` So **prove two theorems:** ``` Theorem. for all x and s, (inv s ~ true) implies (mem x (add x s) ~ true). Theorem. for all x and s, (inv s ~ true) implies (inv(add x s) ~ true). ``` ### Part C `size` is a destructor of type `'a set -> int`. We need to show that it satisfies its specification, including the rep invariant. **State and prove a theorem (or theorems)** that do so. Use the following as the specification for `List.length`: ``` (* precondition: none * postcondition: length [] ~ 0 and length (h::t) ~ 1 + length t *) val length : 'a list -> int ``` *As in Part A, this specification of `List.length` is in fact not 100% accurate. `List.length` would diverge when applied to a cyclic list. We will ignore that issue in this problem.* ## Problem 2: Proof by induction Here is a type that represents a small language of integer expressions: ``` type expr = | Int of int (* integer constants *) | Neg of expr (* unary negation *) | Plus of expr * expr (* addition *) ``` ### Part A State the induction principle for `expr`. ### Part B Here is code that implements evaluation of expressions, as well as an *optimizer* that removes certain subexpressions that never need to be evaluated: ``` (* [eval e] is the integer to which [e] evaluates. *) let rec eval = function | Int n -> n | Neg e -> -(eval e) | Plus(e1,e2) -> (eval e1) + (eval e2) (* [opt e] is an expression that evaluates to the same value as [e], * but does not syntactically contain any subexpressions of the form 0+e'. *) let rec opt = function | Int n -> Int n | Neg e -> Neg (opt e) | Plus(e1,e2) -> let e1' = opt e1 in if e1'=Int 0 then opt e2 else Plus(e1',opt e2) ``` **Formally restate** the informal specification given above for `opt` and **prove the correctness** of the function. *Hint 1: define a function `no_zero_plus` that formalizes what it means to "not syntactically contain any subexpressions of the form `0+e'`". For example, `no_zero_plus (Plus(Int 0,Int 1))` would be `false`, and `no_zero_plus (Plus(Plus(Int 1,Neg(Int 1)), Int 2))` would be `true`. Use that function in your formal specification, much like we used `mem` and `inv` in Problem 1.* *Hint 2: you will likely need to nest a proof by case analysis inside a proof by induction.* ## Problem 3: Monads In lecture, we studied a monad for debuggable functions, which had the *effect* of producing a debug string in addition to computing a value. That monad made it easy for us to pipeline debuggable functions. Suppose that instead of producing a single string, we wanted debuggable functions to produce a list of strings. Think of that list as being like a log: as the computation of the function proceeds, it appends entries to the end of the log. ### Part A **Implement the `Logger` module** by completing the structure below. Your implementation should be linear time or better. ``` module type Monad = sig type 'a t val (>>=) : 'a t -> ('a -> 'b t) -> 'b t val return : 'a -> 'a t end module type LogMonad = sig include Monad val log : string list -> unit t end module Logger : (LogMonad with type 'a t = 'a * string list) = struct (* TODO *) end ``` Your implementation should be compatible with the following code, which logs the computation of a greatest common divisor function: ``` open Logger let rec gcdlog a b = let str = string_of_int in if b = 0 then log ["gcd is " ^ str a] >>= fun () -> return a else let s = str a ^ " mod " ^ str b ^ " = " ^ str (a mod b) in log [s] >>= fun () -> gcdlog b (a mod b) ``` Sample usage: ``` # gcdlog 31 10;; - : int t = (1, ["31 mod 10 = 1"; "10 mod 1 = 0"; "gcd is 1"]) # gcdlog 20 15;; - : int t = (5, ["20 mod 15 = 5"; "15 mod 5 = 0"; "gcd is 5"]) ``` ### Part B **Prove that the monad laws hold** for your implementation of `Logger`. You may use without proof any theorems and lemmas about `@` that we have stated in recitation. ``` LAW 1: return x >>= f ~ f x LAW 2: m >>= return ~ m LAW 3: (m >>= f) >>= g ~ m >>= (fun x -> f x >>= g) ``` *Hint: these proofs should be relatively simple calculations; no induction is needed.* ## Karma You can submit your karma solutions as part of an optional `karma.zip` file in CMS. <img src="camel_orange.png" height="40" width="40"> <img src="camel_black.png" height="40" width="40"> <img src="camel_black.png" height="40" width="40"> **Coq:** [Coq][coq] is a proof assistant that has been used to prove the correctness of compilers and to formalize a great deal of mathematics. The first three chapters of the [*Software Foundations*][sf] textbook by Benjamin C. Pierce et al. teach some functional programming in Coq and include exercises about the correctness of functions on lists. Download the textbook and complete the 1 through 3 star non-optional exercises in those chapters. Submit your Coq source files. [coq]: https://coq.inria.fr/ [sf]: https://www.cis.upenn.edu/~bcpierce/sf/current/index.html <img src="camel_orange.png" height="40" width="40"> <img src="camel_orange.png" height="40" width="40"> <img src="camel_black.png" height="40" width="40"> **Mergesort:** Submit a solution to the challenge exercise on mergesort in [this recitation](http://www.cs.cornell.edu/Courses/cs3110/2016fa/l/21-specifications/rec.html). <img src="camel_orange.png" height="40" width="40"> <img src="camel_orange.png" height="40" width="40"> <img src="camel_orange.png" height="40" width="40"> **Yup:** [Yup][yup] is a relatively new proof assistant that performs automatic checking of simple program-correctness proofs of functional programs through inductive and equational reasoning. Try to formalize some of the proofs from lecture, recitation, and this assignment using it. Submit your proof scripts. We honestly don't know how hard or even possible this will be, as none of the course staff has ever used it! [yup]: https://github.com/LaifsV1/ProofChecker