MoreStlcMore on the Simply Typed Lambda-Calculus

Simple Extensions to STLC

The simply typed lambda-calculus has enough structure to make its theoretical properties interesting, but it is not much of a programming language.
In this chapter, we begin to close the gap with real-world languages by introducing a number of familiar features that have straightforward treatments at the level of typing.

Numbers

Adding types, constants, and primitive operations for natural numbers is easy (as we saw in exercise stlc_arith).

Let Bindings

A more interesting extension... Let-bindings.
When writing a complex expression, it is often useful to give names to some of its subexpressions: this avoids repetition and often increases readability.
Syntax:
       t ::=                Terms
           | ...               (other terms same as before)
           | let x=t in t      let-binding

Reduction:
t1 --> t1' (ST_Let1)  

let x=t1 in t2 --> let x=t1' in t2
   (ST_LetValue)  

let x=v1 in t2 --> [x:=v1]t2
Typing:
Gamma ⊢ t1 ∈ T1      x>T1; Gamma ⊢ t2 ∈ T2 (T_Let)  

Gamma ⊢ let x=t1 in t2 ∈ T2

Pairs

In Coq, the primitive way of extracting the components of a pair is pattern matching. An alternative is to take fst and snd -- the first- and second-projection operators -- as primitives. Just for fun, let's do our pairs this way. For example, here's how we'd write a function that takes a pair of numbers and returns the pair of their sum and difference:
       \x : Nat*Nat.
          let sum = x.fst + x.snd in
          let diff = x.fst - x.snd in
          (sum,diff)

Syntax:
       t ::=                Terms
           | ...
           | (t,t)             pair
           | t.fst             first projection
           | t.snd             second projection

       v ::=                Values
           | ...
           | (v,v)             pair value

       T ::=                Types
           | ...
           | T * T             product type

Reduction...
t1 --> t1' (ST_Pair1)  

(t1,t2) --> (t1',t2)
t2 --> t2' (ST_Pair2)  

(v1,t2) --> (v1,t2')
t1 --> t1' (ST_Fst1)  

t1.fst --> t1'.fst
   (ST_FstPair)  

(v1,v2).fst --> v1
t1 --> t1' (ST_Snd1)  

t1.snd --> t1'.snd
   (ST_SndPair)  

(v1,v2).snd --> v2

Typing:
Gamma ⊢ t1 ∈ T1     Gamma ⊢ t2 ∈ T2 (T_Pair)  

Gamma ⊢ (t1,t2) ∈ T1*T2
Gamma ⊢ t ∈ T1*T2 (T_Fst)  

Gamma ⊢ t.fst ∈ T1
Gamma ⊢ t ∈ T1*T2 (T_Snd)  

Gamma ⊢ t.snd ∈ T2

Unit

Another handy base type, found especially in languages in the ML family, is the singleton type Unit.
Syntax:
       t ::=                Terms
           | ...               (other terms same as before)
           | unit              unit

       v ::=                Values
           | ...
           | unit              unit value

       T ::=                Types
           | ...
           | Unit              unit type
Typing:
   (T_Unit)  

Gamma ⊢ unit ∈ Unit

Is unit the only term of type Unit?
(1) Yes
(2) No

Sums

Many programs need to deal with values that can take two distinct forms. For example, we might identify students in a university database using either their name or their id number. A search function might return either a matching value or an error code.
These are specific examples of a binary sum type (sometimes called a disjoint union), which describes a set of values drawn from one of two given types, e.g.:
       Nat + Bool

We create elements of these types by tagging elements of the component types, telling on which side of the sum we are putting them. E.g.,
   inl 42  ∈ Nat + Bool
   inr tru ∈ Nat + Bool
In general, the elements of a type T1 + T2 consist of the elements of T1 tagged with the token inl, plus the elements of T2 tagged with inr.

As we've seen in Coq programming, one important use of sums is signaling errors:
      div ∈ Nat -> Nat -> (Nat + Unit)
      div =
        \x:Nat. \y:Nat.
          test iszero y then
            inr unit
          else
            inl ...

Values of sum type are "destructed" by case analysis:
    getNat ∈ Nat+Bool -> Nat
    getNat =
      \x:Nat+Bool.
        case x of
          inl n => n
        | inr b => test b then 1 else 0

Syntax:
       t ::=                Terms
           | ...               (other terms same as before)
           | inl T t           tagging (left)
           | inr T t           tagging (right)
           | case t of         case
               inl x => t
             | inr x => t

       v ::=                Values
           | ...
           | inl T v           tagged value (left)
           | inr T v           tagged value (right)

       T ::=                Types
           | ...
           | T + T             sum type

Reduction:
t1 --> t1' (ST_Inl)  

inl T2 t1 --> inl T2 t1'
t2 --> t2' (ST_Inr)  

inr T1 t2 --> inr T1 t2'
t0 --> t0' (ST_Case)  

case t0 of inl x1 => t1 | inr x2 => t2 -->
case t0' of inl x1 => t1 | inr x2 => t2
   (ST_CaseInl)  

case (inl T2 v1) of inl x1 => t1 | inr x2 => t2
--> [x1:=v1]t1
   (ST_CaseInr)  

case (inr T1 v2) of inl x1 => t1 | inr x2 => t2
--> [x2:=v2]t2

Typing:
Gamma ⊢ t1 ∈ T1 (T_Inl)  

Gamma ⊢ inl T2 t1 ∈ T1 + T2
Gamma ⊢ t2 ∈ T2 (T_Inr)  

Gamma ⊢ inr T1 t2 ∈ T1 + T2
Gamma ⊢ t ∈ T1+T2
x1>T1; Gamma ⊢ t1 ∈ T
x2>T2; Gamma ⊢ t2 ∈ T (T_Case)  

Gamma ⊢ case t of inl x1 => t1 | inr x2 => t2 ∈ T
We use the type annotation in inl and inr to make the typing relation simpler, similarly to what we did for functions.

What does the following term step to (in one step)?

    let f = \x : Nat + Bool.
              case x of inl nn + 3 | inr b ⇒ 0 in
    f (inl Bool 4)
(1)
  (\x : Nat + Bool.
     case x of inl nn + 3 | inr b ⇒ 0) (inl Bool 4)
(2)
  7
(3)
  case inl Bool 4 of inl nn + 3 | inr b ⇒ 0
(4)
  f (inl Bool 4)
What about this one?

  (\x : Nat + Bool.
     case x of inl nn + 3 | inr b ⇒ 0) (inl Bool 4)
(1)
  7
(2)
  case inl Bool 4 of inl nn + 3 | inr b ⇒ 0
(3)
  4 + 3
What about this one?

  case inl Bool 4 of inl nn + 3 | inr b ⇒ 0
(1) 4 + 3
(2) 7
(3) 0

Lists

Syntax:
       t ::=                Terms
           | ...
           | nil T
           | cons t t
           | lcase t of nil  => t
                      | x::x => t

       v ::=                Values
           | ...
           | nil T             nil value
           | cons v v          cons value

       T ::=                Types
           | ...
           | List T            list of Ts

Reduction:
t1 --> t1' (ST_Cons1)  

cons t1 t2 --> cons t1' t2
t2 --> t2' (ST_Cons2)  

cons v1 t2 --> cons v1 t2'
t1 --> t1' (ST_Lcase1)  

(lcase t1 of nil => t2 | xh::xt => t3) -->
(lcase t1' of nil => t2 | xh::xt => t3)
   (ST_LcaseNil)  

(lcase nil T of nil => t2 | xh::xt => t3)
--> t2
   (ST_LcaseCons)  

(lcase (cons vh vt) of nil => t2 | xh::xt => t3)
--> [xh:=vh,xt:=vt]t3

Typing:
   (T_Nil)  

Gamma ⊢ nil T ∈ List T
Gamma ⊢ t1 ∈ T      Gamma ⊢ t2 ∈ List T (T_Cons)  

Gamma ⊢ cons t1 t2 ∈ List T
Gamma ⊢ t1 ∈ List T1
Gamma ⊢ t2 ∈ T
(h>T1; t>List T1; Gamma) ⊢ t3 ∈ T (T_Lcase)  

Gamma ⊢ (lcase t1 of nil => t2 | h::t => t3) ∈ T

General Recursion

Another facility found in most programming languages (including Coq) is the ability to define recursive functions. For example, we would like to be able to define the factorial function like this:
      fact = \x:Nat.
                test x=0 then 1 else x * (fact (pred x)))
Note that the right-hand side of this binder mentions the variable being bound -- something that is not allowed by our formalization of let above.
Directly formalizing this "recursive definition" mechanism is possible, but it requires some extra effort: in particular, we'd have to pass around an "environment" of recursive function definitions in the definition of the step relation.

Here is another way of presenting recursive functions that is a bit more verbose but equally powerful and much more straightforward to formalize: instead of writing recursive definitions, we will define a fixed-point operator called fix that performs the "unfolding" of the recursive definition in the right-hand side as needed, during reduction.
For example, instead of
      fact = \x:Nat.
                test x=0 then 1 else x * (fact (pred x)))
we will write:
      fact =
          fix
            (\f:Nat->Nat.
               \x:Nat.
                  test x=0 then 1 else x * (f (pred x)))

We can derive the latter from the former as follows:
  • In the right-hand side of the definition of fact, replace recursive references to fact by a fresh variable f.
  • Add an abstraction binding f at the front, with an appropriate type annotation. (Since we are using f in place of fact, which had type NatNat, we should require f to have the same type.) The new abstraction has type (NatNat) (NatNat).
  • Apply fix to this abstraction. This application has type NatNat.
  • Use all of this as the right-hand side of an ordinary let-binding for fact.

Syntax:
       t ::=                Terms
           | ...
           | fix t             fixed-point operator
Reduction:
t1 --> t1' (ST_Fix1)  

fix t1 --> fix t1'
   (ST_FixAbs)  

fix (\xf:T1.t2) --> [xf:=fix (\xf:T1.t2)] t2
Typing:
Gamma ⊢ t1 ∈ T1->T1 (T_Fix)  

Gamma ⊢ fix t1 ∈ T1

Let's see how ST_FixAbs works by reducing fact 3 = fix F 3, where
    F = (\f. \x. test x=0 then 1 else x * (f (pred x)))
(type annotations are omitted for brevity).
    fix F 3
--> ST_FixAbs + ST_App1
    (\x. test x=0 then 1 else x * (fix F (pred x))) 3
--> ST_AppAbs
    test 3=0 then 1 else 3 * (fix F (pred 3))
--> ST_Test0_Nonzero
    3 * (fix F (pred 3))
--> ST_FixAbs + ST_Mult2
    3 * ((\x. test x=0 then 1 else x * (fix F (pred x))) (pred 3))
--> ST_PredNat + ST_Mult2 + ST_App2
    3 * ((\x. test x=0 then 1 else x * (fix F (pred x))) 2)
--> ST_AppAbs + ST_Mult2
    3 * (test 2=0 then 1 else 2 * (fix F (pred 2)))
--> ST_Test0_Nonzero + ST_Mult2
    3 * (2 * (fix F (pred 2)))
--> ST_FixAbs + 2 x ST_Mult2
    3 * (2 * ((\x. test x=0 then 1 else x * (fix F (pred x))) (pred 2)))
--> ST_PredNat + 2 x ST_Mult2 + ST_App2
    3 * (2 * ((\x. test x=0 then 1 else x * (fix F (pred x))) 1))
--> ST_AppAbs + 2 x ST_Mult2
    3 * (2 * (test 1=0 then 1 else 1 * (fix F (pred 1))))
--> ST_Test0_Nonzero + 2 x ST_Mult2
    3 * (2 * (1 * (fix F (pred 1))))
--> ST_FixAbs + 3 x ST_Mult2
    3 * (2 * (1 * ((\x. test x=0 then 1 else x * (fix F (pred x))) (pred 1))))
--> ST_PredNat + 3 x ST_Mult2 + ST_App2
    3 * (2 * (1 * ((\x. test x=0 then 1 else x * (fix F (pred x))) 0)))
--> ST_AppAbs + 3 x ST_Mult2
    3 * (2 * (1 * (test 0=0 then 1 else 0 * (fix F (pred 0)))))
--> ST_Test0Zero + 3 x ST_Mult2
    3 * (2 * (1 * 1))
--> ST_MultNats + 2 x ST_Mult2
    3 * (2 * 1)
--> ST_MultNats + ST_Mult2
    3 * 2
--> ST_MultNats
    6

Records

As a final example, records can be presented as a generalization of pairs:
  • they are n-ary (rather than binary);
  • they are accessed by label (rather than position).

Syntax:
       t ::=                          Terms
           | ...
           | {i1=t1, ..., in=tn}         record
           | t.i                         projection

       v ::=                          Values
           | ...
           | {i1=v1, ..., in=vn}         record value

       T ::=                          Types
           | ...
           | {i1:T1, ..., in:Tn}         record type
This is a quite informal definition compared to previous ones:
  • it uses "..." in the syntax for records
  • it omits a usual side condition that the labels of a record should not contain repetitions.

Reduction:
ti --> ti' (ST_Rcd)  

{i1=v1, ..., im=vm, in=ti , ...}
--> {i1=v1, ..., im=vm, in=ti', ...}
t1 --> t1' (ST_Proj1)  

t1.i --> t1'.i
   (ST_ProjRcd)  

{..., i=vi, ...}.i --> vi
  • In the first rule, ti must be the leftmost field that is not a value;
  • In the last rule, there should be only one field called i, and all the other fields must contain values.

The typing rules are also simple:
Gamma ⊢ t1 ∈ T1     ...     Gamma ⊢ tn ∈ Tn (T_Rcd)  

Gamma ⊢ {i1=t1, ..., in=tn} ∈ {i1:T1, ..., in:Tn}
Gamma ⊢ t ∈ {..., i:Ti, ...} (T_Proj)  

Gamma ⊢ t.i ∈ Ti
Because of all the informality in the notations we've chosen, formalizing all this takes some work. See the Records chapter for more details.