MoreStlcMore on the Simply Typed Lambda-Calculus

Simple Extensions to STLC

The simply typed lambda-calculus has enough structure to make its theoretical properties interesting, but it is not much of a programming language.

In this chapter, we begin to close the gap with real-world languages by introducing a number of familiar features that have straightforward treatments at the level of typing.

Numbers

Adding types, constants, and primitive operations for natural numbers is easy (as we saw in exercise stlc_arith).

Let Bindings

A more interesting extension... Let-bindings.

When writing a complex expression, it is often useful to give names to some of its subexpressions: this avoids repetition and often increases readability.

Syntax:

       t ::=                Terms
           | ...               (other terms same as before)
           | let x=t in t      let-binding

Reduction:

t₁ --> t₁'	(ST_Let1)

let x=t₁ in t₂ --> let x=t₁' in t₂

	(ST_LetValue)

let x=v₁ in t₂ --> [x:=v₁]t₂

Typing:

Gamma ⊢ t₁ ∈ T₁ x⊢>T₁; Gamma ⊢ t₂ ∈ T₂	(T_Let)

Gamma ⊢ let x=t₁ in t₂ ∈ T₂

Pairs

In Coq, the primitive way of extracting the components of a pair is pattern matching. An alternative is to take fst and snd -- the first- and second-projection operators -- as primitives. Just for fun, let's do our pairs this way. For example, here's how we'd write a function that takes a pair of numbers and returns the pair of their sum and difference:

       \x : Nat*Nat.
          let sum = x.fst + x.snd in
          let diff = x.fst - x.snd in
          (sum,diff)

Syntax:

       t ::=                Terms
           | ...
           | (t,t)             pair
           | t.fst             first projection
           | t.snd             second projection

       v ::=                Values
           | ...
           | (v,v)             pair value

       T ::=                Types
           | ...
           | T * T             product type

Reduction...

t₁ --> t₁'	(ST_Pair1)

(t₁,t₂) --> (t₁',t₂)

t₂ --> t₂'	(ST_Pair2)

(v₁,t₂) --> (v₁,t₂')

t₁ --> t₁'	(ST_Fst1)

t₁.fst --> t₁'.fst

	(ST_FstPair)

(v₁,v₂).fst --> v₁

t₁ --> t₁'	(ST_Snd1)

t₁.snd --> t₁'.snd

	(ST_SndPair)

(v₁,v₂).snd --> v₂

Typing:

Gamma ⊢ t₁ ∈ T₁ Gamma ⊢ t₂ ∈ T₂	(T_Pair)

Gamma ⊢ (t₁,t₂) ∈ T₁*T₂

Gamma ⊢ t ∈ T₁*T₂	(T_Fst)

Gamma ⊢ t.fst ∈ T₁

Gamma ⊢ t ∈ T₁*T₂	(T_Snd)

Gamma ⊢ t.snd ∈ T₂

Unit

Another handy base type, found especially in languages in the ML family, is the singleton type Unit.

Syntax:

       t ::=                Terms
           | ...               (other terms same as before)
           | unit              unit

       v ::=                Values
           | ...
           | unit              unit value

       T ::=                Types
           | ...
           | Unit              unit type

Typing:

	(T_Unit)

Gamma ⊢ unit ∈ Unit

Is unit the only term of type Unit?

(1) Yes

(2) No

Sums

Many programs need to deal with values that can take two distinct forms. For example, we might identify students in a university database using either their name or their id number. A search function might return either a matching value or an error code.

These are specific examples of a binary sum type (sometimes called a disjoint union), which describes a set of values drawn from one of two given types, e.g.:

       Nat + Bool

We create elements of these types by tagging elements of the component types, telling on which side of the sum we are putting them. E.g.,

   inl 42  ∈ Nat + Bool
   inr tru ∈ Nat + Bool

In general, the elements of a type T₁ + T₂ consist of the elements of T₁ tagged with the token inl, plus the elements of T₂ tagged with inr.

As we've seen in Coq programming, one important use of sums is signaling errors:

      div ∈ Nat -> Nat -> (Nat + Unit)
      div =
        \x:Nat. \y:Nat.
          test iszero y then
            inr unit
          else
            inl ...

Values of sum type are "destructed" by case analysis:

    getNat ∈ Nat+Bool -> Nat
    getNat =
      \x:Nat+Bool.
        case x of
          inl n => n
        | inr b => test b then 1 else 0

Syntax:

       t ::=                Terms
           | ...               (other terms same as before)
           | inl T t           tagging (left)
           | inr T t           tagging (right)
           | case t of         case
               inl x => t
             | inr x => t

       v ::=                Values
           | ...
           | inl T v           tagged value (left)
           | inr T v           tagged value (right)

       T ::=                Types
           | ...
           | T + T             sum type

Reduction:

t₁ --> t₁'	(ST_Inl)

inl T₂ t₁ --> inl T₂ t₁'

t₂ --> t₂'	(ST_Inr)

inr T₁ t₂ --> inr T₁ t₂'

t₀ --> t₀'	(ST_Case)

case t₀ of inl x₁ => t₁ \| inr x₂ => t₂ -->
case t₀' of inl x₁ => t₁ \| inr x₂ => t₂

	(ST_CaseInl)

case (inl T₂ v₁) of inl x₁ => t₁ \| inr x₂ => t₂
--> [x₁:=v₁]t₁

	(ST_CaseInr)

case (inr T₁ v₂) of inl x₁ => t₁ \| inr x₂ => t₂
--> [x₂:=v₂]t₂

Typing:

Gamma ⊢ t₁ ∈ T₁	(T_Inl)

Gamma ⊢ inl T₂ t₁ ∈ T₁ + T₂

Gamma ⊢ t₂ ∈ T₂	(T_Inr)

Gamma ⊢ inr T₁ t₂ ∈ T₁ + T₂

Gamma ⊢ t ∈ T₁+T₂
x₁⊢>T₁; Gamma ⊢ t₁ ∈ T
x₂⊢>T₂; Gamma ⊢ t₂ ∈ T	(T_Case)

Gamma ⊢ case t of inl x₁ => t₁ \| inr x₂ => t₂ ∈ T

We use the type annotation in inl and inr to make the typing relation simpler, similarly to what we did for functions.

What does the following term step to (in one step)?

    let f = \x : Nat + Bool.
              case x of inl n ⇒ n + 3 | inr b ⇒ 0 in
    f (inl Bool 4)

(1)
(\x : Nat + Bool.
case x of inl n ⇒ n + 3 | inr b ⇒ 0) (inl Bool 4)

(2)
7

(3)
case inl Bool 4 of inl n ⇒ n + 3 | inr b ⇒ 0

(4)
f (inl Bool 4)

What about this one?

(\x : Nat + Bool.
case x of inl n ⇒ n + 3 | inr b ⇒ 0) (inl Bool 4)

(1)
7

(2)
case inl Bool 4 of inl n ⇒ n + 3 | inr b ⇒ 0

(3)
4 + 3

What about this one?

case inl Bool 4 of inl n ⇒ n + 3 | inr b ⇒ 0

(1) 4 + 3

(2) 7

(3) 0

Lists

Syntax:

       t ::=                Terms
           | ...
           | nil T
           | cons t t
           | lcase t of nil  => t
                      | x::x => t

       v ::=                Values
           | ...
           | nil T             nil value
           | cons v v          cons value

       T ::=                Types
           | ...
           | List T            list of Ts

Reduction:

t₁ --> t₁'	(ST_Cons1)

cons t₁ t₂ --> cons t₁' t₂

t₂ --> t₂'	(ST_Cons2)

cons v₁ t₂ --> cons v₁ t₂'

t₁ --> t₁'	(ST_Lcase1)

(lcase t₁ of nil => t₂ \| xh::xt => t₃) -->
(lcase t₁' of nil => t₂ \| xh::xt => t₃)

	(ST_LcaseNil)

(lcase nil T of nil => t₂ \| xh::xt => t₃)
--> t₂

	(ST_LcaseCons)

(lcase (cons vh vt) of nil => t₂ \| xh::xt => t₃)
--> [xh:=vh,xt:=vt]t₃

Typing:

	(T_Nil)

Gamma ⊢ nil T ∈ List T

Gamma ⊢ t₁ ∈ T Gamma ⊢ t₂ ∈ List T	(T_Cons)

Gamma ⊢ cons t₁ t₂ ∈ List T

Gamma ⊢ t₁ ∈ List T₁
Gamma ⊢ t₂ ∈ T
(h⊢>T₁; t⊢>List T₁; Gamma) ⊢ t₃ ∈ T	(T_Lcase)

Gamma ⊢ (lcase t₁ of nil => t₂ \| h::t => t₃) ∈ T

General Recursion

Another facility found in most programming languages (including Coq) is the ability to define recursive functions. For example, we would like to be able to define the factorial function like this:

      fact = \x:Nat.
                test x=0 then 1 else x * (fact (pred x)))

Note that the right-hand side of this binder mentions the variable being bound -- something that is not allowed by our formalization of let above.

Directly formalizing this "recursive definition" mechanism is possible, but it requires some extra effort: in particular, we'd have to pass around an "environment" of recursive function definitions in the definition of the step relation.

Here is another way of presenting recursive functions that is a bit more verbose but equally powerful and much more straightforward to formalize: instead of writing recursive definitions, we will define a fixed-point operator called fix that performs the "unfolding" of the recursive definition in the right-hand side as needed, during reduction.

For example, instead of

      fact = \x:Nat.
                test x=0 then 1 else x * (fact (pred x)))

we will write:

      fact =
          fix
            (\f:Nat->Nat.
               \x:Nat.
                  test x=0 then 1 else x * (f (pred x)))

We can derive the latter from the former as follows:

In the right-hand side of the definition of fact, replace recursive references to fact by a fresh variable f.
Add an abstraction binding f at the front, with an appropriate type annotation. (Since we are using f in place of fact, which had type Nat→Nat, we should require f to have the same type.) The new abstraction has type (Nat→Nat) → (Nat→Nat).
Apply fix to this abstraction. This application has type Nat→Nat.
Use all of this as the right-hand side of an ordinary let-binding for fact.

Syntax:

       t ::=                Terms
           | ...
           | fix t             fixed-point operator

Reduction:

t₁ --> t₁'	(ST_Fix1)

fix t₁ --> fix t₁'

	(ST_FixAbs)

fix (\xf:T₁.t2) --> [xf:=fix (\xf:T₁.t2)] t₂

Typing:

Gamma ⊢ t₁ ∈ T₁->T₁	(T_Fix)

Gamma ⊢ fix t₁ ∈ T₁

Let's see how ST_FixAbs works by reducing fact 3 = fix F 3, where

    F = (\f. \x. test x=0 then 1 else x * (f (pred x)))

(type annotations are omitted for brevity).

    fix F 3

--> ST_FixAbs + ST_App1

    (\x. test x=0 then 1 else x * (fix F (pred x))) 3

--> ST_AppAbs

    test 3=0 then 1 else 3 * (fix F (pred 3))

--> ST_Test0_Nonzero

    3 * (fix F (pred 3))

--> ST_FixAbs + ST_Mult2

    3 * ((\x. test x=0 then 1 else x * (fix F (pred x))) (pred 3))

--> ST_PredNat + ST_Mult2 + ST_App2

    3 * ((\x. test x=0 then 1 else x * (fix F (pred x))) 2)

--> ST_AppAbs + ST_Mult2

    3 * (test 2=0 then 1 else 2 * (fix F (pred 2)))

--> ST_Test0_Nonzero + ST_Mult2

    3 * (2 * (fix F (pred 2)))

--> ST_FixAbs + 2 x ST_Mult2

    3 * (2 * ((\x. test x=0 then 1 else x * (fix F (pred x))) (pred 2)))

--> ST_PredNat + 2 x ST_Mult2 + ST_App2

    3 * (2 * ((\x. test x=0 then 1 else x * (fix F (pred x))) 1))

--> ST_AppAbs + 2 x ST_Mult2

    3 * (2 * (test 1=0 then 1 else 1 * (fix F (pred 1))))

--> ST_Test0_Nonzero + 2 x ST_Mult2

    3 * (2 * (1 * (fix F (pred 1))))

--> ST_FixAbs + 3 x ST_Mult2

    3 * (2 * (1 * ((\x. test x=0 then 1 else x * (fix F (pred x))) (pred 1))))

--> ST_PredNat + 3 x ST_Mult2 + ST_App2

    3 * (2 * (1 * ((\x. test x=0 then 1 else x * (fix F (pred x))) 0)))

--> ST_AppAbs + 3 x ST_Mult2

    3 * (2 * (1 * (test 0=0 then 1 else 0 * (fix F (pred 0)))))

--> ST_Test0Zero + 3 x ST_Mult2

    3 * (2 * (1 * 1))

--> ST_MultNats + 2 x ST_Mult2

    3 * (2 * 1)

--> ST_MultNats + ST_Mult2

    3 * 2

--> ST_MultNats

Records

As a final example, records can be presented as a generalization of pairs:

they are n-ary (rather than binary);
they are accessed by label (rather than position).

Syntax:

       t ::=                          Terms
           | ...
           | {i₁=t₁, ..., in=tn}         record
           | t.i                         projection

       v ::=                          Values
           | ...
           | {i₁=v₁, ..., in=vn}         record value

       T ::=                          Types
           | ...
           | {i₁:T₁, ..., in:Tn}         record type

This is a quite informal definition compared to previous ones:

it uses "..." in the syntax for records
it omits a usual side condition that the labels of a record should not contain repetitions.

Reduction:

ti --> ti'	(ST_Rcd)

{i₁=v₁, ..., im=vm, in=ti , ...}
--> {i₁=v₁, ..., im=vm, in=ti', ...}

t₁ --> t₁'	(ST_Proj1)

t₁.i --> t₁'.i

	(ST_ProjRcd)

{..., i=vi, ...}.i --> vi

In the first rule, ti must be the leftmost field that is not a value;
In the last rule, there should be only one field called i, and all the other fields must contain values.

The typing rules are also simple:

Gamma ⊢ t₁ ∈ T₁ ... Gamma ⊢ tn ∈ Tn	(T_Rcd)

Gamma ⊢ {i₁=t₁, ..., in=tn} ∈ {i₁:T₁, ..., in:Tn}

Gamma ⊢ t ∈ {..., i:Ti, ...}	(T_Proj)

Gamma ⊢ t.i ∈ Ti

Because of all the informality in the notations we've chosen, formalizing all this takes some work. See the Records chapter for more details.