Background on type systems:

A type error occurs when we attempt to perform some primitive operation
(such as function application) but the operation is not defined for the
given operands. For instance, there is no transition rule for applying
an integer to an integer, so if we ever get into a situation where we
are doing this, we have a type error:

    (\f.f 42) 3 => 3 42    stuck!

In the pure, untyped lambda calculus, the only built-in operation we
had was function application, and the only values we had were
functions. Thus, for a (closed) expression, there was no way to get a
type error. As soon as we add something besides functions, we have the
potential for type errors.

We say that a language is *type safe* if no well-formed program can get
stuck due to a type error. The goal of a type system is to determine
whether or not a given expression can ever evaluate to a configuration
that is stuck due to a type error, and if so, rule out that expression
from the language. Note that in general, it is undecidable whether or
not a program can get stuck (i.e., you could solve the halting problem
if you had a perfect type checker). So, all static type checkers are
conservative.

An alternative to static type checking is dynamic type checking.
Dynamic type checking augments the semantics so that, no matter what
primitive operation we're doing, and no matter what values we have,
there is always a transition we can take. For instance, in Scheme, if
you attempt to apply 3 to 42, then this transitions to a special error
value:

    (\f.f 42) 3 => 3 42 => error

Similarly, if you attempt to add "error" to a number, you get the error
value, and so forth.

Dynamic type checking has its benefits, because it does not suffer from
the conservative aspects of static typing. However, dynamic type
checking has two drawbacks:

First, it requires run-time type information and run-time checks for
primitive operations. For example, in Scheme, when you see v1 applied
to v2, you have to check whether or not v1 is a function, and if so,
use the normal function-call semantics. Otherwise, you have to return
the error value. Thus, we need to be able to tell at run time whether
or not a value is a function. Similarly, if you attempt to add two
values, we have to dynamically check that the two values are numbers,
and if not, return the error value. So, we must be able to tell numbers
apart from all other values.

The second problem with dynamic type checking is that you must
exhaustively test your program to rule out type errors. Modern
languages make it easy to write statically-typed programs (i.e., the
type systems are not so conservative that they rule out lots of useful
programs), so static type-checking is a good way to establish, once and
for all, that your program will not have a type error when run. From a
software engineering standpoint, this is a crucial advantage, as it
allows you to concentrate on real bugs instead of silly little type
errors. (And if you think that type errors don't really happen in
practice for dynamically typed languages, you are either very foolish
or omniscient.)

In practice, all type-safe languages use a mix of static and dynamic
typing. For instance, ML is mostly statically typed, but some things
are checked dynamically, including division by zero and array bounds.
This is because a simple analysis won't easily be able to prove integer
constraints (e.g., x != 0 or 0 <= x < size(A)); the sketch below shows
these dynamic checks in action.
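Here is a minimal SML sketch of those two dynamic checks (Div and
Subscript are the exceptions the SML Basis Library raises for them;
the bindings x, a, y are mine, for illustration only):

    (* Both lines type-check statically, yet each still needs a
       run-time check that may fail by raising an exception. *)
    val x = (1 div 0) handle Div => 0                 (* divide-by-zero *)
    val a = Array.fromList [1, 2, 3]
    val y = Array.sub (a, 7) handle Subscript => 0    (* bounds check *)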
That's not to say that we couldn't devise a type system which
statically enforces these things, rather that such type systems tend to
become complicated. When designing a language, you want to keep your
type system simple, yet expressive enough that you can rule out most
type errors without making the programmer mad at you.

Java is also mostly statically typed, but unfortunately, it requires
many more (implicit) run-time checks than ML. For instance, in addition
to divide-by-zero and array subscripts, Java implementations must
check, when you write into an array, that the type of the object is
equal to the element type of the array. As we'll see, a slight change
to the Java type system would've made this check unnecessary. As
another example, Java implementations must check for NULL pointers on
method call or member access, and they must check a sub-typing relation
on a downcast.

More expressive type systems can often avoid run-time checks. For
instance, in ML, by default, "pointers" cannot be NULL. You can code up
"pointers that might be NULL" using a datatype:

    datatype 'a option = None | Some of 'a

In Java, you can't even express a non-null reference. So, you're forced
to do a lot of testing that's really unnecessary. There are proposals
to strengthen Java's type system to support "not-null" references.
Surprisingly, if we chose "not-null" as the default, most code would
type-check as is! This was a real bug in the Java design as far as I'm
concerned. Similarly, downcasts are not needed nearly as often if you
have some form of parametric polymorphism (as in ML). Here, Sun has
finally gotten its act together and incorporated polymorphism (aka
generics) in the next release of Java. Sadly, they didn't fix the array
update problem, which they could've (generics solve this nicely).

But ML isn't without its faults. There's no subtyping in ML, as there
is in Java. So, in practice, you're often forced to duplicate code in
ML for different types, even though it's the same code. (Actually, you
can often encode the subtyping with polymorphism, but not always.)
Duplicating code is a bad thing because if there's a bug in one copy,
it's likely there's a bug in the other copy. Testing might only reveal
one bug, so you'll have to remember to hunt down all of the other
copies and make the bug fixes there too. Or, if you go to
performance-tune the code, you'll have to do it in multiple places.

A key principle of language design is that the language should provide
enough power to abstract out common bits of code so that you only have
to write them once. This simplifies testing and debugging, and tends to
lead to smaller and more robust code. A restrictive type system (as,
say, in Pascal) prevents code sharing, which is why some people still
think that dynamic typing is the best thing. But with the right type
structure, you can usually avoid code duplication without having to do
exhaustive testing.
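Returning to the option example above, here is a small SML sketch of
how "pointers that might be NULL" force a static case analysis (the
function name deref is my own, for illustration):

    datatype 'a option = None | Some of 'a

    (* The type system will not let you touch the int inside an
       int option until you have handled the None case. *)
    fun deref (p : int option) : int =
      case p of
        None => 0       (* the "null check", forced statically *)
      | Some n => n     (* here the checker knows n : int *)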
Proving the soundness of the simply-typed lambda calculus:

Here's the simply-typed lambda calculus:

    types       t ::= b | t1 -> t2
    expressions e ::= c | x | \x.e | e1 e2
    values      v ::= c | \x.e

where we have some set of constants (c) described by a base type (b).
Think c is drawn from {1,2,3,...} and b = int if you like.

The (call-by-value) operational semantics is given by:

    (\x.e1) v => e1[v/x]

       e1 => e1'
    ---------------
    e1 e2 => e1' e2

      e2 => e2'
    -------------
    v e2 => v e2'

Recall that substitution e1[e2/x] is defined by:

    x[e2/x]      = e2
    y[e2/x]      = y               (y != x)
    c[e2/x]      = c
    (e e')[e2/x] = (e[e2/x]) (e'[e2/x])
    (\x.e)[e2/x] = \x.e
    (\y.e)[e2/x] = \y.(e[e2/x])    (y != x)

(We are only substituting closed terms, so we don't have to worry about
capture in the last case.)

The following rules define when, under a set of typing assumptions G, a
given expression e has type t (written G |- e : t). Recall that our
typing assumptions G form a partial function from variables to types.
We can think of G as the "symbol table" in a compiler that records
which variables are in scope, and what their types are.

    (const)  G |- c : b          (integers have type int)

    (var)    G |- x : G(x)       (look up x in the symbol table)

             G[x -> t1] |- e : t2
    (lam)    --------------------    (assume x has type t1,
             G |- \x.e : t1 -> t2     check the body has type t2)

             G |- e1 : t'->t   G |- e2 : t'
    (app)    ------------------------------
                    G |- e1 e2 : t

When there are no assumptions, we write |- e : t.

[Type Soundness]: If |- e : t and e =>* e', then e' is not stuck due to
a type error.

Our proof will be broken into a number of pieces. First, there are some
lemmas to get out of the way:

[Substitution]: If {x -> t'} |- e1 : t and |- e2 : t', then
|- e1[e2/x] : t.
Proof: by induction on the height of the proof that
{x -> t'} |- e1 : t. (Note that we only need to consider substituting
values, but in a call-by-name setting, you'd have to consider general
[closed] expressions, so it's good to do the more general case here.)

[Canonical Forms]: If |- v : t then
    (1) if t = b, then v = c for some constant c
    (2) if t = t1->t2, then v = \x.e for some function \x.e
Proof: by inspection of the typing rules.

[Preservation]: If |- e : t and e => e', then |- e' : t.
Proof: by induction on the height of the proof that e => e' (using
Substitution as a lemma).

[Progress]: If |- e : t, then either e = v for some value v, or else
there exists an e' such that e => e' (using Canonical Forms as a
lemma).
Proof: by induction on the height of the proof that |- e : t.

Finally, we can prove our Type Soundness theorem by induction on the
length n of the sequence e =>n e'. We are assuming that |- e : t, so
when n = 0, e' = e and thus it is trivially the case that |- e' : t. By
Progress, e' is either a value or else it can step, so it is not stuck.
Suppose the theorem holds up to n and we have e =>n+1 e'. Then we have
e =>n e'' => e'. By our induction hypothesis, |- e'' : t. So by
Preservation, |- e' : t. Then by Progress, e' is either a value or else
it can take a step, so it is not stuck.

---------------------------------------------------------------------
Type Checking and Type Inference:

In class, we discussed Milner's algorithm W at a very high level. The
algorithm looks something like this, where TC(G,e) is a function that
takes a set of typing assumptions (G) and an expression to type-check
(e) and returns a type t, together with a set of equations S on types:

    TC(G,c) = (b, {})

    TC(G,x) = (G(x), {})

    TC(G,\x.e) =
      let ? = fresh_type_variable()
          (t,S) = TC(G[x->?],e)
      in
        (?->t, S)
      end

    TC(G,e1 e2) =
      let (t1,S1) = TC(G,e1)
          (t2,S2) = TC(G,e2)
          ? = fresh_type_variable()
          S = S1 + S2 + { t1 = t2 -> ? }
      in
        (?, S)
      end
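Here is a small SML sketch of this equation-generating pass (the names
are mine, not a fixed interface; I represent unknowns "?" as numbered
type variables and the equation set S as a list of pairs of types):

    datatype ty = Int_t
                | Arrow_t of ty * ty
                | Var_t of int                  (* an unknown "?" *)

    datatype exp = Const of int
                 | Var of string
                 | Lam of string * exp
                 | App of exp * exp

    val counter = ref 0
    fun fresh () = (counter := !counter + 1; Var_t (!counter))

    (* TC G e returns (t, S): a type plus equations between types *)
    fun TC (G : string -> ty) (e : exp) : ty * (ty * ty) list =
      case e of
        Const _ => (Int_t, [])
      | Var x => (G x, [])
      | Lam (x, body) =>
          let val a = fresh ()
              val (t, S) = TC (fn y => if y = x then a else G y) body
          in (Arrow_t (a, t), S) end
      | App (e1, e2) =>
          let val (t1, S1) = TC G e1
              val (t2, S2) = TC G e2
              val a = fresh ()
          in (a, (t1, Arrow_t (t2, a)) :: S1 @ S2) end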
Suppose TC(G,e) = (t,S), and suppose that we solve for the unknown type
variables in S (i.e., we discover that ?1 = b, ?2 = b->b, etc.). If we
apply the solution to t (i.e., compute t[b/?1, b->b/?2, ...]), then
that is the type of the expression. Of course, there may not be a
solution to the equations in S. They could contain nonsensical things
like b = b->b or ? = ?->b. If this is the case, then there is a type
error in the program.

We can solve for the unknowns by rewriting the equations until they are
simplified:

    S + { b = b } => S

    S + { t1->t2 = t1'->t2' } => S + { t1=t1', t2=t2' }

    ? not in t    S[t/?] => S'
    ---------------------------
    S + {? = t} => S' + {? = t}

If we keep simplifying the equations, we'll either get into a situation
where there is some unsatisfiable equation, such as:

    b = t1 -> t2    or    ? = ? -> t

or else we'll end up with equations of the form:

    ?1 = t1, ?2 = t2, ..., ?n = tn

Then the final answer is obtained by taking the result type of the
expression and substituting the types for the unknown type variables:

    t[t1/?1, ..., tn/?n]

If we get stuck in the simplification process, then there was a type
error somewhere. That is, there is no way to prove that the expression
is well-typed.

In practice, the process of solving the equations is done online with a
fast algorithm called unification (which comes from Robinson's 1965
work on resolution). The algorithm runs in (almost) linear time, which
makes it very fast and very scalable. It's implemented by representing
unknowns (?) as reference cells that are initially NULL. When we
encounter an equation of the form ? = t, we simply set the pointer in ?
to point to t. When comparing two types, we always chase through the ?
pointers (if present). And to get the (almost) linear time algorithm,
we must do path compression in the form of some sort of union/find data
structure. (See the sketch at the end of this section.) Solving the
equations online tends to produce better error messages since we detect
inconsistencies earlier. But still, you can get some strange error
messages because the equation simplification process (unification) is
so disconnected from the program itself.

Type *checking*, as opposed to type inference, does not generate
unknowns or have to solve for them. Rather, it just checks that two
types are appropriately related (i.e., equal, or subtypes, or
whatever). We can eliminate the need for unknowns in this little
language by simply requiring that all functions be labelled with their
argument types:

    \x:t.e

Then type-checking simplifies to:

    TC(G,c) = b

    TC(G,x) = G(x)

    TC(G,\x:t1.e) = t1 -> TC(G[x->t1],e)

    TC(G,e1 e2) =
      let t1 = TC(G,e1)
          t2 = TC(G,e2)
      in
        case t1 of
          ta->tb => if (t2 != ta) then type error else tb
        | _ => type error
      end

That's why most programming languages force you to put types on
function parameters.
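Here is an SML sketch of the online unification described above (the
representation and names are my own; for brevity I use a naive occurs
check and simple path compression rather than a full union/find
structure):

    datatype ty = Base
                | Arrow of ty * ty
                | Unknown of ty option ref    (* NONE until solved *)

    (* chase through solved unknowns, compressing as we go *)
    fun compress (Unknown (r as ref (SOME t))) =
          let val t' = compress t in r := SOME t'; t' end
      | compress t = t

    (* the occurs check: rules out equations like ? = ? -> t *)
    fun occurs r t =
      case compress t of
        Arrow (t1, t2) => occurs r t1 orelse occurs r t2
      | Unknown r' => r = r'
      | Base => false

    exception TypeError

    fun unify (t1, t2) =
      case (compress t1, compress t2) of
        (Base, Base) => ()
      | (Arrow (a1, b1), Arrow (a2, b2)) =>
          (unify (a1, a2); unify (b1, b2))
      | (Unknown r1, Unknown r2) =>
          if r1 = r2 then () else r1 := SOME (Unknown r2)
      | (Unknown r, t) =>
          if occurs r t then raise TypeError else r := SOME t
      | (t, Unknown r) =>
          if occurs r t then raise TypeError else r := SOME t
      | _ => raise TypeError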
---------------------------------------------------------------------
Scaling the language up:

In class, we talked about a lot of extensions that you can add to the
language, including unit, pairs, n-tuples, records, void, binary sums,
n-ary sums, and datatypes. We also talked about adding a constant
"fix" for achieving recursion. I will summarize the discussion by
writing down the extensions to the syntax, the evaluation rules, and
the typing rules.

Unit:

    types       t ::= ... | unit
    expressions e ::= ... | ()
    values      v ::= ... | ()

    G |- () : unit

Pairs:

    types       t ::= ... | t1*t2
    expressions e ::= ... | (e1,e2) | #1 e | #2 e
    values      v ::= ... | (v1,v2)

    #1 (v1,v2) => v1
    #2 (v1,v2) => v2

         e1 => e1'               e => e'             e => e'
    -------------------      ---------------      -------------
    (e1,e2) => (e1',e2)      (v,e) => (v,e')      #i e => #i e'

    G |- e1 : t1   G |- e2 : t2      G |- e : t1*t2    G |- e : t1*t2
    ---------------------------      --------------    --------------
       G |- (e1,e2) : t1*t2          G |- #1 e : t1    G |- #2 e : t2

Void:

    types       t ::= ... | void
    expressions e ::= ... | cast(e,t)
    values      v ::= ...    (no new values!)

     G |- e : void
    ------------------
    G |- cast(e,t) : t

Some explanation is in order here -- there are no (closed) values of
type void. So, if I write a function such as:

    \x:void.x

I know that at run time, the function can never be called (since there
are no values to pass to the function). In general, whenever we have a
value of an empty type, we can pretend as if we can create an object of
any other type out of thin air. This corresponds to the logical notion
that if you assume "false" then you can prove anything you like.

Binary Sums:

    types       t ::= ... | t1+t2
    expressions e ::= ... | inl(t1+t2,e) | inr(t1+t2,e)
                    | (case e of inl(x1) => e1 | inr(x2) => e2)
    values      v ::= ... | inl(t1+t2,v) | inr(t1+t2,v)

    (case inl(t1+t2,v) of inl(x1) => e1 | inr(x2) => e2) => e1[v/x1]
    (case inr(t1+t2,v) of inl(x1) => e1 | inr(x2) => e2) => e2[v/x2]

                 e => e'
    -----------------------------------
    (case e of ...) => (case e' of ...)

          G |- e : t1                     G |- e : t2
    -------------------------       -------------------------
    G |- inl(t1+t2,e) : t1+t2       G |- inr(t1+t2,e) : t1+t2

    G |- e : t1+t2    G[x1->t1] |- e1 : t    G[x2->t2] |- e2 : t
    ------------------------------------------------------------
       G |- (case e of inl(x1) => e1 | inr(x2) => e2) : t

Note that for sums, inl(-) and inr(-) are generic data constructors.
If you like, you can pretend that ML has only one datatype definition:

    datatype ('a,'b) sum = inl of 'a | inr of 'b

I've added type information to the data constructors so that you do not
have to infer what the sum type is.

Records and datatypes are generalizations of products and sums. In the
case of records, we have a collection of values indexed by labels. In
the case of datatypes, we have a disjoint union tagged by labels.
Alternatively, you can think of a product as nothing more than a record
with labels 1 and 2.
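As a quick SML illustration of the case typing rule for sums (a
sketch; the bindings v1, v2, and elim are my own), note how both arms
of the case must produce the same result type t:

    datatype ('a,'b) sum = inl of 'a | inr of 'b

    (* two values of type (int, int -> int) sum *)
    val v1 : (int, int -> int) sum = inl 42
    val v2 : (int, int -> int) sum = inr (fn x => x + 1)

    (* case analysis: both arms have result type int *)
    fun elim (s : (int, int -> int) sum) : int =
      case s of
        inl n => n
      | inr f => f 0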
---------------------------------------------------------------------
Recursion:

Even with these additions, it's impossible to construct a well-typed
program that loops forever. Thus, this extended language is not Turing
complete. We can regain Turing completeness by adding a few more
things, notably some way to loop or recurse. The simplest way to do
this is to add a new constant called "fix". In a call-by-value setting,
fix behaves as follows:

    fix(\f.e) => (\f.e) (fix(\f.e))

You can use fix to code up recursive functions as follows. Suppose you
want to write the recursive factorial function. In ML, we'd write:

    fun fact(n) = if (n <= 1) then 1 else n * fact(n-1)

Note that, again, this is an equation, not really a definition, so we
want to solve the equation for the "real" fact function. One way to do
this is to define:

    val fact_body = fn f => fn n => if (n <= 1) then 1 else n * f(n-1)

Then we can define:

    val fact = fix(fact_body)

Note that

    fact = fix(fact_body)
        => fact_body (fix(fact_body))
        => fn n => if (n <= 1) then 1 else n * (fix(fact_body))(n-1)
         = fn n => if (n <= 1) then 1 else n * fact(n-1)

So, if you apply fact to a number (e.g., 3) here's what happens:

    fact(3)
      = (fn n => if (n <= 1) then 1 else n * (fix(fact_body))(n-1))(3)
     => if (3 <= 1) then 1 else 3 * (fix(fact_body))(3-1)
     => 3 * (fix(fact_body))(2)
     => 3 * (fn n => if (n <= 1) then 1 else n * (fix(fact_body))(n-1))(2)
     => 3 * (if (2 <= 1) then 1 else 2 * (fix(fact_body))(2-1))
     => 3 * 2 * (fix(fact_body))(1)
     => 3 * 2 * (fn n => if (n <= 1) then 1 else n * (fix(fact_body))(n-1))(1)
     => 3 * 2 * (if (1 <= 1) then 1 else 1 * (fix(fact_body))(1-1))
     => 3 * 2 * 1
     => 6

What is the type of fix? In a call-by-value setting it needs to be
something like:

    G |- e : (t1->t2)->(t1->t2)
    ---------------------------
       G |- fix(e) : t1->t2

[NB: I had a typo in the original notes here. The above rule is now
right.]

So, we pass fix a function which abstracts the recursive call as an
extra argument (the first t1->t2). The function then takes in a t1
value and returns a t2 (typically by recursing). If we pass such a
function to fix, then it goes ahead and unrolls the loop for us one
time, giving us back a function which takes a t1 to a t2. Every time we
call the function, it gets unrolled one more time. This is exactly the
same thing as what happened in the denotational semantics of
while-loops in IMP.

---------------------------------------------------------------------
Homework:

1. Prove the Substitution lemma. Include your proof in an ML comment
   at the beginning of your code for problem 2.

2. Write a type-checker for the simply-typed lambda calculus with
   unit, pairs, and sums. Use the following definitions:

    datatype Type = Int_t
                  | Arrow_t of Type*Type
                  | Unit_t
                  | Prod_t of Type*Type
                  | Sum_t of Type*Type

    type var = string

    datatype exp =
        Var of var                (* x *)
      | Int of int                (* i *)
      | Plus of exp*exp           (* e1 + e2 *)
      | Fn of var*Type*exp        (* \x:t.e *)
      | App of exp*exp            (* e1(e2) *)
      | Unit                      (* () *)
      | Pair of exp*exp           (* (e1,e2) *)
      | Num1 of exp               (* #1 e *)
      | Num2 of exp               (* #2 e *)
      | Inl of Type*Type*exp      (* Inl[t1+t2](e) *)
      | Inr of Type*Type*exp      (* Inr[t1+t2](e) *)
      | Case of exp * (var*exp) * (var*exp)
                (* case e of Inl(x) => e1 | Inr(y) => e2 *)

    exception TypeError of string

    type assump = var -> Type     (* typing assumptions *)

    val empty_assump : assump =
      fn x => raise TypeError("unbound variable")

   Your job is to write the function:

    val type_check : assump * exp -> Type
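(As a starting point for problem 2 -- a sketch, not part of the
required interface -- you will probably want a helper that extends the
assumptions with a new binding:

    (* extend G with x -> t, shadowing any previous binding for x *)
    fun extend (G : assump) (x : var) (t : Type) : assump =
      fn y => if y = x then t else G y

Then, for example, checking Fn(x,t1,e) amounts to checking e under
extend G x t1.)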