A more realistic language: IMP

  op in Op  ::= + | * | <=
  e  in Exp ::= x | i | true | false | e1 op e2
  c  in Com ::= x := e | skip | c1;c2 | if e then c1 else c2 | while e do c

  s  in Store : Var -> Int

Two kinds of configuration:  (e,s) and (c,s)

Two transition relations:  (e,s) => (e',s)  and  (c,s) => (c',s')

(Expressions never change the store, so wherever an expression rule
below writes s', it is the same store as s.)

Expressions:

  var    -----------------
         (x,s) => (s(x),s)

  plus   ------------------   (i = i1+i2)
         (i1+i2,s) => (i,s)

  leq    ------------------------   (i1 <= i2)
         (i1 <= i2,s) => (true,s)

  gt     -------------------------   (i1 > i2)
         (i1 <= i2,s) => (false,s)

  times  ------------------   (i = i1*i2)
         (i1*i2,s) => (i,s)

               (e1,s) => (e1',s')
  left   ------------------------------
         (e1 op e2,s) => (e1' op e2,s')

              (e2,s) => (e2',s')
  right  ----------------------------
         (i op e2,s) => (i op e2',s')

Commands:

  set       ------------------------------
            (x := i, s) => (skip, s[x->i])

                 (e,s) => (e',s')
  assign    --------------------------
            (x := e,s) => (x := e',s')

  seq-skip  ---------------------
            (skip;c, s) => (c, s)

                (c1,s) => (c1',s')
  seq       --------------------------
            (c1;c2, s) => (c1';c2, s')

  if-true   -------------------------------------
            (if true then c1 else c2,s) => (c1,s)

  if-false  --------------------------------------
            (if false then c1 else c2,s) => (c2,s)

                              (e,s) => (e',s')
  if        -----------------------------------------------------
            (if e then c1 else c2,s) => (if e' then c1 else c2, s')

  while     ---------------------------------------------------------------
            (while e do c, s) => (if e then (c; while e do c) else skip, s)

-------------------------------------------------------------------

Evaluation of a simple program in some arbitrary store s. (Each step
that completes an assignment also folds in the seq-skip step that
follows it, e.g. (x := 42; c, s) => (skip; c, s[x->42]) => (c, s[x->42]).)

    (x := 42; y := 30; while (y <= x) do { y := y + 13 }, s)
 => (y := 30; while (y <= x) do { y := y + 13 }, s[x->42])
 => (while (y <= x) do { y := y + 13 }, s[x->42,y->30])
 => (if (y <= x) then { y := y + 13; while (y <= x) do { y := y + 13 } } else skip, s[x->42,y->30])
 => (if (30 <= x) then { y := y + 13; while (y <= x) do { y := y + 13 } } else skip, s[x->42,y->30])
 => (if (30 <= 42) then { y := y + 13; while (y <= x) do { y := y + 13 } } else skip, s[x->42,y->30])
 => (if (true) then { y := y + 13; while (y <= x) do { y := y + 13 } } else skip, s[x->42,y->30])
 => (y := y + 13; while (y <= x) do { y := y + 13 }, s[x->42,y->30])
 => (y := 30 + 13; while (y <= x) do { y := y + 13 }, s[x->42,y->30])
 => (y := 43; while (y <= x) do { y := y + 13 }, s[x->42,y->30])
 => (while (y <= x) do { y := y + 13 }, s[x->42,y->43])
 => (if (y <= x) then { y := y + 13; while (y <= x) do { y := y + 13 } } else skip, s[x->42,y->43])
 => (if (43 <= x) then { y := y + 13; while (y <= x) do { y := y + 13 } } else skip, s[x->42,y->43])
 => (if (43 <= 42) then { y := y + 13; while (y <= x) do { y := y + 13 } } else skip, s[x->42,y->43])
 => (if (false) then { y := y + 13; while (y <= x) do { y := y + 13 } } else skip, s[x->42,y->43])
 => (skip, s[x->42,y->43]).

----------------------------------------------------------------

One thing that compilers have to do is transform code. Sometimes
these transformations are needed to eliminate high-level language
features, and sometimes just to do optimizations.

How can we prove that an optimization is valid? In general, we want
the compiler to only replace commands with other commands that are
"equivalent" (and faster). But what do we mean by "equivalent"? They
shouldn't be exactly the same, since we want to eliminate unnecessary
computational steps.

What we usually do is define some notion of external observations and
optimize relative to those observations. For instance, we might say
that commands c1 and c2 are equivalent if, given any input state s,

    (c1,s) =>* (skip,s')  iff  (c2,s) =>* (skip,s').

That is, the two commands take equal states to equal states.

What are some optimizations we might consider? A simple one is to get
rid of unnecessary skip statements. For instance:

    skip;c  =  c

and this is really easy to prove. But it also doesn't show up that
much in practice (unless introduced by some other simplifying
optimization).
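The transition rules above can be turned into a small interpreter,
which makes it possible to test an equivalence such as skip;c = c on
particular stores (a test, not a proof). The following is only a
sketch under assumptions that are not part of the notes: expressions
and commands are encoded as nested Python tuples, variables are
strings, the store is a dict, and the names SKIP, step_exp, step_com
and run are made up for illustration.

```python
SKIP = ("skip",)

def is_value(e):
    # Integer and boolean literals are fully evaluated expressions.
    return isinstance(e, (int, bool))

def step_exp(e, s):
    """One step (e,s) => (e',s). The store never changes, so only e' is returned."""
    if isinstance(e, str):                        # rule var
        return s[e]
    op, e1, e2 = e
    if not is_value(e1):                          # rule left
        return (op, step_exp(e1, s), e2)
    if not is_value(e2):                          # rule right
        return (op, e1, step_exp(e2, s))
    if op == "+":                                 # rule plus
        return e1 + e2
    if op == "*":                                 # rule times
        return e1 * e2
    if op == "<=":                                # rules leq and gt
        return e1 <= e2
    raise ValueError(f"unknown operator {op!r}")

def step_com(c, s):
    """One step (c,s) => (c',s'). Returns the pair (c', s')."""
    tag = c[0]
    if tag == "assign":
        _, x, e = c
        if is_value(e):                           # rule set
            return SKIP, {**s, x: e}
        return ("assign", x, step_exp(e, s)), s   # rule assign
    if tag == "seq":
        _, c1, c2 = c
        if c1 == SKIP:                            # rule seq-skip
            return c2, s
        c1p, sp = step_com(c1, s)                 # rule seq
        return ("seq", c1p, c2), sp
    if tag == "if":
        _, e, c1, c2 = c
        if isinstance(e, bool):                   # rules if-true and if-false
            return (c1 if e else c2), s
        return ("if", step_exp(e, s), c1, c2), s  # rule if
    if tag == "while":                            # rule while
        _, e, body = c
        return ("if", e, ("seq", body, c), SKIP), s
    raise ValueError(f"no rule applies to {c!r}")

def run(c, s, fuel=10_000):
    """Iterate => until skip is reached; fuel is an assumed bound against divergence."""
    for _ in range(fuel):
        if c == SKIP:
            return s
        c, s = step_com(c, s)
    raise RuntimeError("out of fuel (possible nontermination)")

# The example program: x := 42; y := 30; while (y <= x) do { y := y + 13 }
loop = ("while", ("<=", "y", "x"), ("assign", "y", ("+", "y", 13)))
prog = ("seq", ("assign", "x", 42), ("seq", ("assign", "y", 30), loop))
final = run(prog, {})
print(final)  # {'x': 42, 'y': 43}
```

Running prog and ("seq", SKIP, prog) from the same store gives the
same final store, which is one instance of skip;c = c. Unlike the
trace above, this interpreter takes the seq-skip steps explicitly.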
Another optimization we might want to support is to replace

    if e then c else c

with just c. Again, it's very easy to prove that these are equivalent
commands, but this situation occurs very rarely.

Here's a more interesting situation: Suppose c is a command that does
not assign to the variable x, but only reads from it. Then we can
rewrite:

    x := i; c   to   x := i; c[i/x]

(i.e., the command c with i substituted for x). This is called
constant propagation. Similarly, we can rewrite

    x := y; c   to   x := y; c[y/x]

(as long as neither x nor y is assigned within c). This is called
copy propagation.

Finally, we can eliminate an assignment to a variable if the variable
is never read. In particular, if we have:

    x := e; c

and c never reads the value out of x, then it should be safe to
replace this with just c.

Another example is called loop-invariant removal. Consider a loop:

    while e1 do { c1; x := e2; c2 }

If e1, c1, and c2 never read or write x, then we might change this to:

    x := e2; while e1 do { c1; c2 }

Under what circumstances? Can you prove this is valid?

Note: here is the definition of substituting an expression e for a
variable x within a command c (written c[e/x]) or within another
expression (written e'[e/x]):

    skip[e/x]                   = skip
    (y := e')[e/x]              = y := (e'[e/x])
    (c1;c2)[e/x]                = (c1[e/x]);(c2[e/x])
    (if e' then c1 else c2)[e/x] = if e'[e/x] then c1[e/x] else c2[e/x]
    (while e' do c)[e/x]        = while e'[e/x] do c[e/x]

    x[e/x]                      = e
    y[e/x]                      = y     (y != x)
    i[e/x]                      = i
    true[e/x]                   = true
    false[e/x]                  = false
    (e1 op e2)[e/x]             = (e1[e/x]) op (e2[e/x])

In the assignment case the left-hand side is left alone; substitution
only replaces reads of x, which is why the rewrites above require
that c not assign to x.
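The substitution equations translate almost line for line into code.
The sketch below is illustrative only and rests on assumptions that
are not in the notes: commands are nested tuples such as
("assign", "y", e), ("seq", c1, c2), ("if", e, c1, c2),
("while", e, c) and ("skip",); binary expressions are (op, e1, e2);
variables are strings; and the helper names subst_exp, subst_com,
assigned and constant_propagate are hypothetical.

```python
def subst_exp(e2, e, x):
    """Return e2[e/x]: every occurrence of variable x in e2 replaced by e."""
    if isinstance(e2, str):                      # x[e/x] = e,  y[e/x] = y
        return e if e2 == x else e2
    if isinstance(e2, tuple):                    # (e1 op e2)[e/x]
        op, a, b = e2
        return (op, subst_exp(a, e, x), subst_exp(b, e, x))
    return e2                                    # i, true, false are unchanged

def subst_com(c, e, x):
    """Return c[e/x], following the equations for commands."""
    tag = c[0]
    if tag == "skip":
        return c
    if tag == "assign":                          # left-hand side is not touched
        _, y, rhs = c
        return ("assign", y, subst_exp(rhs, e, x))
    if tag == "seq":
        _, c1, c2 = c
        return ("seq", subst_com(c1, e, x), subst_com(c2, e, x))
    if tag == "if":
        _, g, c1, c2 = c
        return ("if", subst_exp(g, e, x), subst_com(c1, e, x), subst_com(c2, e, x))
    if tag == "while":
        _, g, body = c
        return ("while", subst_exp(g, e, x), subst_com(body, e, x))
    raise ValueError(f"not a command: {c!r}")

def assigned(c):
    """Set of variables that c may assign to (the side condition's check)."""
    tag = c[0]
    if tag == "assign":
        return {c[1]}
    if tag == "seq":
        return assigned(c[1]) | assigned(c[2])
    if tag == "if":
        return assigned(c[2]) | assigned(c[3])
    if tag == "while":
        return assigned(c[2])
    return set()

def constant_propagate(x, i, c):
    """Rewrite x := i; c to x := i; c[i/x], but only if c never assigns x."""
    if x in assigned(c):
        return ("seq", ("assign", x, i), c)      # side condition fails: no change
    return ("seq", ("assign", x, i), subst_com(c, i, x))

# c is:  y := x + 1; while (y <= x) do { y := y * 2 }
c = ("seq", ("assign", "y", ("+", "x", 1)),
            ("while", ("<=", "y", "x"), ("assign", "y", ("*", "y", 2))))
optimized = constant_propagate("x", 42, c)
```

Here optimized is x := 42; y := 42 + 1; while (y <= 42) do
{ y := y * 2 }. Copy propagation would use the same subst_com with a
variable name in place of the constant, plus the extra check that y
is not assigned in c.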