CS312 Lecture 18: Types

Type Unification

Is the expression below well-typed in Mini-SML? Yes, and its type is determined to be (int -> int) list list.

let fun f(x: int): int = 1 fun g(x: undefined): undefined = 2 in [[f],[],[g]] end

What about SML? No, even if we replace undefined with type variables.

- let fun f(x: int): int = 1 fun g(x: 'a): 'b = 2 in [[f],[],[g]] end;
stdIn:40.28-40.48 Error: right-hand-side of clause doesn't agree with function result type [literal]
  expression:  int
  result type:  'b
  in declaration:
    g = (fn x : 'a => 2: 'b)

This example shows that type-correct Mini-SML programs are not necessarily type-correct in SML. The differences stem partly from our desire to keep the Mini-SML implementation simple, but mostly from our desire to illustrate the liberty that the designer and implementer of a programming language have in defining its features.

How does Mini-SML reach its conclusion? Let's take a look at the types of various subexpressions:

f  :  int -> int
[f]: (int -> int) list
g  :  undefined -> undefined
[g]: (undefined -> undefined) list
[] :  undefined list

At the top level, the expression [[f],[],[g]] is a list. Mini-SML knows that all elements of a list must have the same type, and it tries to unify the types of the various elements. This unification is 'optimistic': unless the system finds an irreconcilable type clash, it will match (or unify) the candidate types. This is a two-step, top-to-bottom process, in which the type of [f] is unified with the type of [], and the resulting type is unified with the type of [g]. Both steps yield (int -> int) list, so the whole expression has type (int -> int) list list.

Here is the code that achieves type unification in Mini-SML:

fun unifyTypes (t: typ, t': typ): typ =
  case (t, t') of
    (* an undefined type unifies with anything *)
    (Undef_t,      _       )         => t'
  | (_,            Undef_t )         => t
    (* a base type unifies only with itself *)
  | (Int_t,        Int_t   )         => Int_t
  | (Real_t,       Real_t  )         => Real_t
  | (Bool_t,       Bool_t  )         => Bool_t
  | (Char_t,       Char_t  )         => Char_t
  | (String_t,     String_t)         => String_t
    (* compound types are unified component by component *)
  | (Tuple_t(tl), Tuple_t(tl'))      =>
      if List.length(tl) <> List.length(tl') then raise TypeUnification
      else Tuple_t(ListPair.map (fn (ta,tb) => unifyTypes(ta,tb)) (tl, tl'))
  | (List_t(tl), List_t(tl'))        =>  List_t(unifyTypes(tl, tl'))
  | (Fn_t(fa, rt), Fn_t(fa', rt'))   =>
      Fn_t(unifyTypes(fa, fa'), unifyTypes(rt, rt'))
    (* anything else is an irreconcilable clash *)
  | _                                => raise TypeUnification
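
The two unification steps for [[f],[],[g]] can be replayed with a minimal, self-contained sketch. The datatype declaration below is an assumption: the notes use these constructors but do not show their declaration, and only the cases needed for this example are included.

```sml
(* Assumed, trimmed-down representation of Mini-SML types. *)
datatype typ = Undef_t
             | Int_t
             | List_t of typ
             | Fn_t of typ * typ

exception TypeUnification

(* Same algorithm as above, restricted to the constructors declared here. *)
fun unifyTypes (t: typ, t': typ): typ =
  case (t, t') of
    (Undef_t, _)                 => t'
  | (_, Undef_t)                 => t
  | (Int_t, Int_t)               => Int_t
  | (List_t e, List_t e')        => List_t (unifyTypes (e, e'))
  | (Fn_t (a, r), Fn_t (a', r')) => Fn_t (unifyTypes (a, a'), unifyTypes (r, r'))
  | _                            => raise TypeUnification

val tf   = List_t (Fn_t (Int_t, Int_t))       (* [f]: (int -> int) list *)
val tnil = List_t Undef_t                     (* [] : undefined list *)
val tg   = List_t (Fn_t (Undef_t, Undef_t))   (* [g]: (undefined -> undefined) list *)

val step1 = unifyTypes (tf, tnil)    (* List_t (Fn_t (Int_t, Int_t)) *)
val step2 = unifyTypes (step1, tg)   (* List_t (Fn_t (Int_t, Int_t)) *)
val whole = List_t step2             (* (int -> int) list list *)
```

Note that the undefined parts of the type of g are simply filled in from the type of f. This is the 'optimistic' behavior that SML itself rejects.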

We can now understand how dynamic type checking is done in Mini-SML. Keep in mind that the result of each computation is a (value, value type) pair.

A simple case:

and evaluateBinop (bop:binop, (v1,t1):value*typ,(v2,t2):value*typ):value*typ =
  ...
  (* binary operator *)
  case (bop, v1, v2) of
      (Plus,  Int_v  a, Int_v  b)         => (Int_v  (a+b), Int_t )
    | (Plus,  Real_v a, Real_v b)         => (Real_v (a+b), Real_t)
    | (Plus,  _,        _       )         => err "type error (+)"
    ...

... and a more complicated one:

fun evaluate (ex: exp, en: env): value * typ =
  ...
  case ex of
    ...
  | List_e elist          =>
       let
         val (v, t) = foldr (fn ((v, t), (vl, tl)) => (v::vl, t::tl))
                            ([],[])
                            (map (fn(e) => evaluate (e, en)) elist)
       in
         (List_v v,
          List_t (foldl (fn (ta, tb) => unifyTypes (ta, tb)) Undef_t t))
         handle TypeUnification => err "typewise inhomogeneous list"
       end
  ...
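
The type computation in the List_e case can be isolated as a small sketch. Everything here is an assumption made for illustration: the trimmed typ datatype, and the helper listType, which does not appear in the Mini-SML sources and returns NONE where the interpreter would call err.

```sml
(* Assumed, trimmed-down representation of Mini-SML types. *)
datatype typ = Undef_t | Int_t | Bool_t | List_t of typ

exception TypeUnification

fun unifyTypes (t: typ, t': typ): typ =
  case (t, t') of
    (Undef_t, _)          => t'
  | (_, Undef_t)          => t
  | (Int_t, Int_t)        => Int_t
  | (Bool_t, Bool_t)      => Bool_t
  | (List_t e, List_t e') => List_t (unifyTypes (e, e'))
  | _                     => raise TypeUnification

(* Hypothetical helper: the type of a list whose elements have types ts,
   or NONE when the element types cannot be unified. Starting the fold
   from Undef_t is what gives the empty list the type "undefined list". *)
fun listType (ts: typ list): typ option =
  SOME (List_t (foldl unifyTypes Undef_t ts))
  handle TypeUnification => NONE

val empty = listType []                 (* SOME (List_t Undef_t) *)
val ints  = listType [Int_t, Int_t]     (* SOME (List_t Int_t)   *)
val mixed = listType [Int_t, Bool_t]    (* NONE                  *)
```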

Static Type Checking

Static type-checking can be performed by "executing" the program without performing computations, retrieving and checking only type information. Unlike dynamic type checking, "execution" here requires that all alternatives (e.g. both branches of an if) be examined. As discussed above, in general we can't predict the execution path, so we have to examine types along all possible execution paths. The execution context is kept in a type environment, that is, an environment that records only names and their types; there is no need to store values as well.

Static type checking can be easily formalized in a manner analogous to that employed when we defined the substitution model:

(* Constants:            *)
tcheck(env, c)  = bool   (when c is a boolean)
tcheck(env, c)  = int    (when c is an integer)
tcheck(env, c)  = real   (when c is a real)
tcheck(env, c)  = char   (when c is a char)
tcheck(env, c)  = string (when c is a string)
tcheck(env, []) = undefined list

(* Variables:            *)
tcheck(env, id) = lookupBinding(env,id)

(* Anonymous Functions:  *)
tcheck(env, fn (id: t1): t2 => e) = t1 -> t4
  when tcheck(insertBinding(env, id, t1), e) = t3
   and unifyTypes(t2, t3) = t4

(* Function Applications:*)
tcheck(env, e1(e2)) = t2
  when tcheck(env, e1) = t1 -> t2
   and tcheck(env, e2) = t3
   and unifyTypes(t1, t3) = _

(* Unary Operations:     *)
tcheck(env, u e) = unary_op_result_type(u, t1)
  when tcheck(env, e) = t1

(* Binary Operations:    *)
tcheck(env, e1 b e2) = binary_op_result_type(b, t1, t2)
  when tcheck(env, e1) = t1
   and tcheck(env, e2) = t2

(* Lists:                *)
tcheck(env, [e1, e2, ..., en]) = t list
  when tcheck(env, ei) = ti (for 1 <= i <= n)
   and unifyTypes(t1, t2, ..., tn) = t

(* Tuples:               *)
tcheck(env, (e1, e2, ..., en)) = (t1 * t2 * ... * tn)
  when tcheck(env, ei) = ti (for 1 <= i <= n)

(* Tuple Projections:    *)
tcheck(env, #i e) = ti
  when tcheck(env, e) = (t1 * t2 * ... * tn) 
   and (1 <= i <= n)

(* If:                   *)
tcheck(env, if c then e1 else e2) = t3
  when tcheck(env, c) = bool
   and tcheck(env, e1) = t1
   and tcheck(env, e2) = t2
   and unifyTypes(t1, t2) = t3

(* Let:                  *)
tcheck(env, let d in e end) = t
  when declcheck(env, d) = env'
   and tcheck(env', e) = t

(* Val Declarations:     *)
declcheck(env, val v:t1 = e) = env'
  when tcheck(env, e) = t2
   and unifyTypes(t1, t2) = t3
   and insertBinding(env, v, t3) = env'

(* Fun Declarations:     *)
declcheck(env, fun id1(id2: t1): t2 = e) = env''
  when env' = insertBinding(insertBinding(env, id1, t1 -> t2), id2, t1)
   and tcheck(env', e) = t3
   and unifyTypes(t2, t3) = t4
   and insertBinding(env, id1, t1 -> t4) = env''

If any of the conditions being checked is false, or if a type unification fails, the entire type checking procedure fails. The initial type environment should contain bindings for all predefined functions and special forms.

The functions unary_op_result_type and binary_op_result_type express the fact that the result type of an overloaded operator depends, in general, on both the operation and the types of its arguments. For example, binary_op_result_type(op +, int, int) = int and binary_op_result_type(op +, real, real) = real, but binary_op_result_type(op +, int, real) produces an error.
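
A possible shape for binary_op_result_type is sketched below. The binop constructors, the reduced typ datatype, and the TypeError exception are assumptions chosen for the example, not the actual Mini-SML definitions.

```sml
(* Assumed, reduced sets of types and operators. *)
datatype typ   = Int_t | Real_t | Bool_t
datatype binop = Plus | Times | Less

exception TypeError of string

(* The result type depends on the operator and on both argument types. *)
fun binary_op_result_type (b: binop, t1: typ, t2: typ): typ =
  case (b, t1, t2) of
    (Plus,  Int_t,  Int_t)  => Int_t
  | (Plus,  Real_t, Real_t) => Real_t
  | (Times, Int_t,  Int_t)  => Int_t
  | (Times, Real_t, Real_t) => Real_t
  | (Less,  Int_t,  Int_t)  => Bool_t
  | (Less,  Real_t, Real_t) => Bool_t
  | _                       => raise TypeError "operand types do not fit the operator"

val intSum  = binary_op_result_type (Plus, Int_t, Int_t)     (* Int_t  *)
val realSum = binary_op_result_type (Plus, Real_t, Real_t)   (* Real_t *)
(* binary_op_result_type (Plus, Int_t, Real_t) raises TypeError *)
```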

The rules above do not cover the entire language, but they can easily be extended to handle all constructs in Mini-SML (and SML).

Notice that checking the type of a function relies on the assumption that the function has the very type that we are trying to check! This assumption is only used when type checking recursive functions, and it is similar to assuming that P(n) is true in order to prove P(n+1) in an induction.

Let us examine whether the definition of fact below type-checks:

[00] declcheck([], fun fact(n: int): int = if n = 0 then 1 else n * fact(n - 1)) = [(fact, int->int)]
      when env' = insertBinding(insertBinding([], fact, int->int), n, int)
                = insertBinding([(fact, int->int)], n, int)
                = [(n, int), (fact, int->int)]
       and [01] tcheck([(n, int), (fact, int->int)], if n = 0 then 1 else n * fact(n - 1)) = int
       and unifyTypes(int, int) = int
       and insertBinding([], fact, int->int) = [(fact, int->int)]

[01] tcheck([(n, int), (fact, int->int)], if n = 0 then 1 else n * fact(n - 1)) = int
      when [02] tcheck([(n, int), (fact, int->int)], n = 0) = bool
       and [04] tcheck([(n, int), (fact, int->int)], 1) = int
       and [05] tcheck([(n, int), (fact, int->int)], n * fact(n - 1)) = int
       and unifyTypes(int, int) = int

[02] tcheck([(n, int), (fact, int->int)], n = 0) = binary_op_result_type(op =, int, int) = bool
      when [03] tcheck([(n, int), (fact, int->int)], n) = int
       and tcheck([(n, int), (fact, int->int)], 0) = int

[03] tcheck([(n, int), (fact, int->int)], n) = lookupBinding([(n, int), (fact, int->int)], n) = int

[04] tcheck([(n, int), (fact, int->int)], 1) = int

[05] tcheck([(n, int), (fact, int->int)], n * fact(n - 1)) = binary_op_result_type(op *, int, int) = int
      when [06] tcheck([(n, int), (fact, int->int)], n) = int
       and [07] tcheck([(n, int), (fact, int->int)], fact(n - 1)) = int

[06] tcheck([(n, int), (fact, int->int)], n) = lookupBinding([(n, int), (fact, int->int)], n) = int

[07] tcheck([(n, int), (fact, int->int)], fact(n - 1)) = int
      when [08] tcheck([(n, int), (fact, int->int)], fact) = int->int
       and [09] tcheck([(n, int), (fact, int->int)], n - 1) = int
       and unifyTypes(int, int) = _

[08] tcheck([(n, int), (fact, int->int)], fact) = lookupBinding([(n, int), (fact, int->int)], fact) = int->int

[09] tcheck([(n, int), (fact, int->int)], n - 1) = binary_op_result_type(op -, int, int) = int
      when [10] tcheck([(n, int), (fact, int->int)], n) = int
       and [11] tcheck([(n, int), (fact, int->int)], 1) = int

[10] tcheck([(n, int), (fact, int->int)], n) = lookupBinding([(n, int), (fact, int->int)], n) = int

[11] tcheck([(n, int), (fact, int->int)], 1) = int
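
The derivation above can be replayed mechanically. The following is only a sketch of a static checker for the few constructs that fact uses. The exp datatype, the TypeError exception, and the function declcheckFun are hypothetical names, and because this sketch has no undefined type, unifyTypes reduces to an equality test.

```sml
datatype typ = Int_t | Bool_t | Fn_t of typ * typ

(* Assumed abstract syntax, just large enough for fact. *)
datatype exp = Int_e  of int
             | Var_e  of string
             | Eq_e   of exp * exp          (* e1 = e2, ints only here *)
             | Sub_e  of exp * exp
             | Mul_e  of exp * exp
             | If_e   of exp * exp * exp
             | App_e  of exp * exp          (* e1(e2) *)

exception TypeError of string

type env = (string * typ) list

fun insertBinding (en: env, id: string, t: typ): env = (id, t) :: en

fun lookupBinding (en: env, id: string): typ =
  case List.find (fn (id', _) => id' = id) en of
    SOME (_, t) => t
  | NONE        => raise TypeError ("unbound identifier " ^ id)

(* Simplification: without Undef_t, two types unify only if they are equal. *)
fun unifyTypes (t: typ, t': typ): typ =
  if t = t' then t else raise TypeError "type mismatch"

fun tcheck (en: env, e: exp): typ =
  case e of
    Int_e _          => Int_t
  | Var_e id         => lookupBinding (en, id)
  | Eq_e (e1, e2)    => (arith (en, e1, e2); Bool_t)
  | Sub_e (e1, e2)   => arith (en, e1, e2)
  | Mul_e (e1, e2)   => arith (en, e1, e2)
  | If_e (c, e1, e2) =>
      (unifyTypes (tcheck (en, c), Bool_t);
       unifyTypes (tcheck (en, e1), tcheck (en, e2)))
  | App_e (e1, e2)   =>
      (case tcheck (en, e1) of
         Fn_t (t1, t2) => (unifyTypes (t1, tcheck (en, e2)); t2)
       | _             => raise TypeError "applying a non-function")

(* Both operands must be ints; the result is Int_t. *)
and arith (en, e1, e2) =
  (unifyTypes (tcheck (en, e1), Int_t); unifyTypes (tcheck (en, e2), Int_t))

(* Rule for: fun id1(id2: t1): t2 = body *)
fun declcheckFun (en: env, id1, id2, t1, t2, body): env =
  let
    (* assume id1 has the declared type while checking its own body *)
    val en' = insertBinding (insertBinding (en, id1, Fn_t (t1, t2)), id2, t1)
    val t4  = unifyTypes (t2, tcheck (en', body))
  in
    insertBinding (en, id1, Fn_t (t1, t4))
  end

(* fun fact(n: int): int = if n = 0 then 1 else n * fact(n - 1) *)
val factBody =
  If_e (Eq_e (Var_e "n", Int_e 0),
        Int_e 1,
        Mul_e (Var_e "n", App_e (Var_e "fact", Sub_e (Var_e "n", Int_e 1))))

val factEnv = declcheckFun ([], "fact", "n", Int_t, Int_t, factBody)
(* factEnv = [("fact", Fn_t (Int_t, Int_t))] *)
```

As in the rules, tcheck visits both branches of the if, whichever one would be taken at run time.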

Type environments allow for shadowing. The type of variable x in environment [(s, string), (x, int), (b, bool), (x, string)] is int because the first binding for x we find in the environment when examining it from left to right is (x, int).
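
Shadowing falls out naturally if the environment is a list with the newest binding in front. The sketch below assumes that representation; the name tenv is hypothetical, and this variant of lookupBinding returns an option rather than failing on an unbound name.

```sml
datatype typ = Int_t | Bool_t | String_t

(* Assumed representation: an association list, newest binding first. *)
type tenv = (string * typ) list

fun insertBinding (en: tenv, id: string, t: typ): tenv = (id, t) :: en

(* Scan from left to right and stop at the first binding for id. *)
fun lookupBinding ([]: tenv, id: string): typ option = NONE
  | lookupBinding ((id', t) :: rest, id) =
      if id' = id then SOME t else lookupBinding (rest, id)

(* Builds [(s, string), (x, int), (b, bool), (x, string)]. *)
val en =
  insertBinding (insertBinding (insertBinding (insertBinding
    ([], "x", String_t), "b", Bool_t), "x", Int_t), "s", String_t)

val tx = lookupBinding (en, "x")   (* SOME Int_t: the later binding shadows the earlier one *)
```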

Type Inference

Take a look at this SML example:

- fun nice(s) = s ^ ", please.";
val nice = fn : string -> string

How did SML determine the type of function nice? Well, it knows that the type of op ^ is string * string -> string, which means that s must be of type string. Thus the function takes a string and returns the result of op ^ (a string), which means that its type must be string -> string.
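
The reasoning for nice can be imitated in a very small sketch that treats an unknown type as Undef_t and refines it by unification. This is not how SML actually performs inference (SML uses type variables and a more general unification). The exp datatype and the functions infer and inferFn are hypothetical and handle only a single parameter and string concatenation.

```sml
datatype typ = Undef_t | Int_t | String_t | Fn_t of typ * typ

(* Assumed abstract syntax: variables, literals and e1 ^ e2. *)
datatype exp = Var_e of string
             | Int_e of int
             | Str_e of string
             | Concat_e of exp * exp

exception TypeUnification

fun unifyTypes (Undef_t, t) = t
  | unifyTypes (t, Undef_t) = t
  | unifyTypes (Int_t, Int_t) = Int_t
  | unifyTypes (String_t, String_t) = String_t
  | unifyTypes (Fn_t (a, r), Fn_t (a', r')) =
      Fn_t (unifyTypes (a, a'), unifyTypes (r, r'))
  | unifyTypes _ = raise TypeUnification

(* infer (x, tx, e, expected): tx is what is known so far about the type of
   the parameter x, and expected is the type the context demands of e.
   Returns (type of e, refined type of x). *)
fun infer (x: string, tx: typ, e: exp, expected: typ): typ * typ =
  case e of
    Int_e _ => (unifyTypes (Int_t, expected), tx)
  | Str_e _ => (unifyTypes (String_t, expected), tx)
  | Var_e y =>
      if y = x
      then let val t = unifyTypes (tx, expected) in (t, t) end
      else raise Fail ("unbound identifier " ^ y)
  | Concat_e (e1, e2) =>
      (* op ^ : string * string -> string, so both operands must be strings *)
      let
        val (_, tx1) = infer (x, tx,  e1, String_t)
        val (_, tx2) = infer (x, tx1, e2, String_t)
      in
        (unifyTypes (String_t, expected), tx2)
      end

(* Type of fn x => body, starting with nothing known about x. *)
fun inferFn (x: string, body: exp): typ =
  let val (tr, tx) = infer (x, Undef_t, body, Undef_t)
  in Fn_t (tx, tr) end

(* fun nice(s) = s ^ ", please." *)
val niceType = inferFn ("s", Concat_e (Var_e "s", Str_e ", please."))
(* Fn_t (String_t, String_t), i.e. string -> string *)
```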

Mini-SML does not do type inference, except when it infers the type of a function, given the type of its arguments and that of the return value. It is a good exercise for you to think of how to implement type inference in Mini-SML.


© 2002 Cornell University Computer Science