The Lambda Calculus

Lambda calculus is a notation for describing mathematical functions and
programs. It is a mathematical system for studying the interaction of
**functional abstraction** and **functional application**.
It captures some
of the essential, common features of a wide variety of programming
languages. Because it directly supports abstraction, it is a more
natural model of universal computation than a Turing machine is.

A lambda calculus term is one of the following:

- a variable *x* ∈ **Var**, where **Var** is a countably infinite set of variables;
- an **application**, a function *e*_{0} applied to an argument *e*_{1}, usually written *e*_{0} *e*_{1} or *e*_{0}(*e*_{1}); or
- a **lambda abstraction**, an expression λ*x*.*e* representing a function with input parameter *x* and body *e*.

Where a mathematician would write *x* ↦ *x*^{2}, or an SML programmer
would write `fn x => x*x`, in the λ-calculus we write λ*x*.*x*^{2}.

In BNF notation,

e ::= x | λx.e | e_{0} e_{1}

We use the metavariable *e* (possibly subscripted) to stand for an arbitrary term.

Parentheses are used just for grouping; they have no meaning on their
own. Like other familiar binding constructs from mathematics (e.g.,
sums, integrals), lambda terms are greedy, extending as far to the right
as they can. Therefore, the term λ*x*. *x* λ*y*. *y* is the
same as λ*x*.(*x* (λ*y*.*y*)), not
(λ*x*.*x*) (λ*y*.*y*). As in SML, application terms are
left-associative, so *x* *y* *z* is the same thing as (*x* *y*) *z*.

For simplicity, multiple variables may be placed after the lambda, and
this is considered shorthand for having a lambda in front of each
variable. For example, we write λ*xy*.*e* as shorthand for
λ*x*.λ*y*.*e*. This shorthand is an example of syntactic
sugar. The process of removing this shorthand is **currying**,
just as in SML, and adding it is called **uncurrying**.

We can apply a curried function like λ*x*.λ*y*.*x* one argument
at a time. Applying it to one argument results in a function that takes
in a value for *x* and returns a constant function, one that returns the
value of *x* no matter what argument it is applied to. As this
suggests, functions are just ordinary values, and can be the results of
functions or passed as arguments to functions (even to themselves!).
Thus, in the lambda calculus, functions are first-class values: lambda
terms serve both as functions and data.
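As a quick illustration outside the calculus proper, partial application of the curried term λ*x*.λ*y*.*x* can be sketched with Python lambdas:

```python
# λx.λy.x as a Python lambda: a curried function of two arguments.
k = lambda x: lambda y: x

const3 = k(3)      # apply to one argument: a constant function
print(const3(99))  # prints 3: the second argument is ignored
print(const3(k))   # functions are first-class values; still prints 3
```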

Just as in SML, occurrences of variables in a term can be bound or free.
The term λx.e binds all the free occurrences of
*x* in *e*.
The scope of *x* is *e*.
This is called **lexical scoping**, because the variable's scope is
defined by the text of the program. It is “lexical” because it is
possible to determine its scope before the program runs, just by inspecting the
program text. A term is **closed** if all variables are bound.
A term is **open** if it is not closed.

How do we run a λ-calculus program?
The main computational rule is β-reduction, which we are already familiar with
from SML. This rule applies
whenever there is a subterm of the form (λ*x*.*e*) *e*' representing the
application of a function λ*x.e* to an argument *e*'.
To perform β-reduction, we substitute the argument
*e*' for all free occurrences
of the formal parameter *x* in the body *e*, obtaining *e*{*e*'/*x*}.
This agrees with SML and
corresponds to our intuition for what the function λ*x*.*e* means.

We have to be a little careful; we cannot always substitute *e*'
blindly for *x* in *e*, because bad things could happen which could alter
the meaning of expressions in undesirable ways. We only want to
replace the free occurrences of x within e, because any other
occurrences are bound to a different binding; they are really different
variables. There are some additional subtleties to substitution when *e*' is not closed, in particular the problem of **variable capture**.

The β reduction (λ*x*.*e*) *e*' →
*e*{*e*'/*x*} is the basic computational step of the
λ-calculus. In the pure λ-calculus, we can start with a term and perform
β-reductions on subterms in any order. However, for modeling programming
languages, it is useful to restrict which β reductions are allowed and in what
order they can be performed.
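To make substitution concrete, here is a small Python sketch of one β-step with capture-avoiding substitution. The term representation and the names `Var`, `Lam`, `App`, `subst`, and `beta` are our own invention for this sketch, not standard API:

```python
from dataclasses import dataclass
from itertools import count

@dataclass
class Var:
    name: str

@dataclass
class Lam:
    param: str
    body: object

@dataclass
class App:
    fn: object
    arg: object

fresh = (f"v{i}" for i in count())  # fresh names (assumed unused elsewhere)

def free_vars(t):
    if isinstance(t, Var):
        return {t.name}
    if isinstance(t, Lam):
        return free_vars(t.body) - {t.param}
    return free_vars(t.fn) | free_vars(t.arg)

def subst(t, x, s):
    """Compute t{s/x}, renaming binders to avoid variable capture."""
    if isinstance(t, Var):
        return s if t.name == x else t
    if isinstance(t, App):
        return App(subst(t.fn, x, s), subst(t.arg, x, s))
    if t.param == x:              # x is rebound inside t; leave it alone
        return t
    if t.param in free_vars(s):   # naive substitution would capture; rename
        new = next(fresh)
        return Lam(new, subst(subst(t.body, t.param, Var(new)), x, s))
    return Lam(t.param, subst(t.body, x, s))

def beta(t):
    """One β-step at the root: (λx.e) e' → e{e'/x}."""
    assert isinstance(t, App) and isinstance(t.fn, Lam)
    return subst(t.fn.body, t.fn.param, t.arg)

# (λx.λy.x) y: naive substitution would give λy.y; renaming keeps y free.
result = beta(App(Lam("x", Lam("y", Var("x"))), Var("y")))
# result is λv0.y (up to the choice of fresh name)
```

The renaming branch is exactly the subtlety about variable capture mentioned above: the binder is renamed before the free variable of the argument is brought into its scope.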

The lambda calculus is all about functions. It's easy to define higher-order functions that operate on other functions. For example, we can define a function COMPOSE that operates on two functions f and g to produce the composition of f and g, written f∘g. The definition is that (f∘g)(x) = f(g(x)), which we can then directly abstract:

COMPOSE = λfg. λx. f (g x)

We can use COMPOSE to define a higher-order function TWICE that operates on a function f to produce a new function that applies f twice:

TWICE = λf. COMPOSE f f

= λf. (λfg. λx. f (g x)) f f

= λf. λx. f (f x) (by β-reduction inside the lambda abstraction)

If we want to define a function that applies another function *four* times, we can just apply TWICE to itself: FOURTIMES = TWICE TWICE.
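These definitions translate directly into Python lambdas, which lets us check the reductions above:

```python
# COMPOSE = λfg. λx. f (g x), TWICE = λf. COMPOSE f f, FOURTIMES = TWICE TWICE
COMPOSE = lambda f: lambda g: lambda x: f(g(x))
TWICE = lambda f: COMPOSE(f)(f)
FOURTIMES = TWICE(TWICE)  # self-application: apply the argument four times

succ = lambda n: n + 1
print(TWICE(succ)(0))      # 2
print(FOURTIMES(succ)(0))  # 4
```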

In general there may be many possible β-reductions that can be performed on a given term. How do we choose which β-reduction to perform next? Does it matter?

A specification of which β-reduction to perform next is called a
**reduction strategy**. Under a given reduction strategy,
a value is a term on which no β-reductions are permitted by that
strategy. For example, λ*x*.*x* would always be a
value, whereas (λ*x*.*x*) 1 would
not be under the usual reduction strategies.

Most real programming languages based on
the λ-calculus, such as SML,
use a reduction strategy known as **call by value** (CBV),
in which functions may only be applied to (called on) values.
Thus (λ*x*.e) e'
only β-reduces if *e*' is a value *v*.
Here is an example of a CBV evaluation sequence,
assuming 3, 4, and S
(the successor function) are appropriately defined:

((λx.λy.y x) 3) S

→ (λy.y 3) S

→ S 3

→ 4

Another important strategy is **call by name** (CBN). We defer
evaluation of arguments until as late as possible, applying reductions
from left to right within the expression. In other words, we can pass
an incomplete computation to a function as an argument. Terms are
evaluated only once their value is really needed.

Let us define an expression we call Ω:

Ω = (λx.x x) (λx.x x)

What happens when we try to evaluate it?

Ω = (λx.x x) (λx.x x) → (x x){(λx.x x)/x} = Ω

We have just coded an infinite loop! When an expression e can go through an
infinite sequence of evaluation steps, we say e **diverges**.
This corresponds to nontermination.

What happens if we try using Ω as a parameter? It depends on the reduction strategy. Consider this example:

e = (λx.(λy.y)) Ω

Using the CBV evaluation strategy, we must first reduce Ω. This puts the evaluator into an infinite loop, so e diverges. On the other hand, CBN reduces the term above to λy.y, the identity function. So CBN avoids divergence because it is lazy about evaluating arguments. CBN has an important property: CBN will not loop infinitely unless every other semantics would also loop infinitely, yet it agrees with CBV whenever CBV terminates successfully.
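We can simulate the difference in Python, which is itself call-by-value; wrapping Ω in a thunk (a zero-argument function) stands in for CBN's unevaluated argument:

```python
import sys

# Ω = (λx. x x)(λx. x x), wrapped in a thunk so merely defining it is safe.
omega = lambda: (lambda x: x(x))(lambda x: x(x))

# e = (λx.(λy.y)) Ω under CBV: the argument must be evaluated first.
sys.setrecursionlimit(1000)
try:
    (lambda x: (lambda y: y))(omega())  # forcing Ω loops
    diverged = False
except RecursionError:
    diverged = True
print(diverged)  # True: CBV diverges on this term

# CBN, simulated by passing the thunk itself: the argument is never forced,
# because the function discards x.
identity = (lambda x: (lambda y: y))(omega)
print(identity(5))  # 5
```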

One feature we seem to be missing is the ability to declare local
variables. For example, in SML we introduce a new local
variable with the "let" expression:
```
let val x = e1 in e2 end
```

We expect this expression to evaluate *e*_{1} to some value *v*_{1} and then to substitute
*v*_{1} for the free occurrences of *x* in *e*_{2}, obtaining
*e*_{2}{*v*_{1}/*x*}. But we
know another term that evaluates to the same expression:

(λx.e_{2}) e_{1} → ... → (λx.e_{2}) v_{1} → e_{2}{v_{1}/x}

Therefore, we can view a `let` expression as simply syntactic sugar
for an application of a lambda abstraction. We don't really need to have
`let` in the language.
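For example, the desugaring can be mimicked directly in Python:

```python
# let x = 5 in x * x  desugars to  (λx. x * x) 5
result = (lambda x: x * x)(5)
print(result)  # 25
```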

Perhaps the simplest interesting kind of value is a Boolean. We would like to define terms that act like the Boolean constants TRUE and FALSE and the Boolean operators IF, AND, OR, NOT, so that all these terms behave in the expected way, obeying the boolean abstraction. There are many reasonable encodings into lambda calculus. The standard approach is to define TRUE and FALSE functions that return the first and second of their two arguments, respectively:

TRUE = λx.λy.x

FALSE = λx.λy.y

It's important to realize that the tokens TRUE and FALSE are not part of the lambda calculus. We are just using them as abbreviations for the terms λx.λy.x and λx.λy.y.

Now, what about the conditional test IF? We would like
IF to take three arguments *b, t, f*, where *b* is a Boolean value
and *t*, *f* are arbitrary terms.
The function should return *t*
if *b*=TRUE and *f* if *b*=FALSE.
Now the reason for defining TRUE and FALSE the way we did
becomes clear. Since TRUE *t* *f* → *t* and
FALSE *t* *f* → *f*, all IF has to do is apply its
Boolean argument to the other two arguments:

IF = λb.λt.λf. b t f

The other Boolean operators can be defined using IF, TRUE, and FALSE:

AND = λbb'. IF b b' FALSE

OR = λbb'. IF b TRUE b'

NOT = λb.IF b FALSE TRUE

These operators work correctly when given Boolean values as we have defined them, but all bets are off if they are applied to any other term. There is no guarantee of any kind of reasonable behavior. Evaluation will proceed and reductions will be applied, but there is no way to usefully interpret the result. With the untyped λ-calculus, it is garbage in, garbage out.
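Here is the Boolean encoding transcribed into Python lambdas. Note that Python, being eager, evaluates both branch arguments of IF before selecting one, which is harmless for pure values like these:

```python
TRUE = lambda x: lambda y: x    # λx.λy.x
FALSE = lambda x: lambda y: y   # λx.λy.y
IF = lambda b: lambda t: lambda f: b(t)(f)  # λb.λt.λf. b t f
AND = lambda b: lambda c: IF(b)(c)(FALSE)
OR = lambda b: lambda c: IF(b)(TRUE)(c)
NOT = lambda b: IF(b)(FALSE)(TRUE)

# Helper of our own: convert a Church Boolean to a Python bool.
as_bool = lambda b: b(True)(False)
print(as_bool(AND(TRUE)(FALSE)))  # False
print(as_bool(NOT(FALSE)))        # True
```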

We can encode natural numbers using **Church numerals**,
introduced by Alonzo Church. The Church numeral for the
number *n* is denoted __n__. It is the function that takes in a function f
and produces a new function that is the *n*-fold composition of f with itself:

__n__ = λf.f^{n}

Of course, we can't write the exponential directly in lambda calculus. But we can build it up inductively:

__0__ = λf.λx. x

__1__ = λf.λx. f x

__2__ = λf.λx. f (f x) (= TWICE)

__3__ = λf.λx. f (f (f x))

...

__n__ = λf.λx. f (f (... (f x)))

The most basic operation on a number *n* is to find its successor
*S*(*n*), which is *n*+1.
We can define the successor function *S* as

S = λn.λf.λx.f((n f) x)

In words, S on input __n__ returns a function that takes a function *f* as input, applies __n__ to it to get the *n*-fold
composition of *f* with itself, then composes that
with one more *f* to get the (*n*+1)-fold composition of *f*
with itself. Then,

S __n__ = (λn.λf.λx.f((n f) x)) __n__

→ λf.λx.f((__n__ f) x)

= __n+1__.
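The numerals and the successor function can be checked in Python; `church` and `to_int` are helper names of our own for converting to and from Python integers:

```python
ZERO = lambda f: lambda x: x                      # 0 = λf.λx. x
SUCC = lambda n: lambda f: lambda x: f(n(f)(x))   # S = λn.λf.λx. f ((n f) x)

def church(k):
    """Build the Church numeral for k by iterating SUCC."""
    n = ZERO
    for _ in range(k):
        n = SUCC(n)
    return n

# Apply a numeral to integer successor, starting from 0.
to_int = lambda n: n(lambda k: k + 1)(0)

print(to_int(SUCC(church(3))))  # 4
```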

We can perform more interesting arithmetic with Church numerals. We might define addition as follows:

ADD = λmn.λfx. (m f) ((n f) x)

If applied to arguments __m__ and __n__, ADD returns a number
that if applied to *f*, returns the function that applies *f* the desired number of times:
*m*+*n*.
This works because we are composing *f*^{m} with *f*^{n} to get *f*^{m+n}. Therefore, we could
have defined ADD = λ*mn*.λ*f*. COMPOSE (*m*
*f*) (*n* *f*).

As another alternative, recall that Church numerals act on a function to apply that function repeatedly, and addition is equivalent to repeated application of the successor function S, so we could define ADD as:

ADD = λmn. (m S) n

If we apply ADD in curried fashion to one argument __n__,
we will get a function that adds *n* to whatever it
is applied to.

Similarly, multiplication is just iterated addition, and exponentiation is iterated multiplication:

MUL = λmn. m (ADD n) __0__

EXP = λmn. m (MUL n) __1__

These last two functions can be defined even more compactly. COMPOSE acts like MUL on Church numerals, since composing __m__ with __n__ applies *f* a total of *mn* times. And EXP can be defined as:

EXP = λmn. n m

Note that this compact version takes its arguments in the opposite order from the definition above: applying __n__ to __m__ composes __m__ with itself *n* times, producing the Church numeral for *m*^{n}.

Defining subtraction, division, and other arithmetic operations is possible but more tricky. A good exercise is to figure out how to construct a predecessor function on the natural numbers.
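All of these arithmetic definitions can be transcribed into Python and tested; `to_int` is a conversion helper of our own:

```python
ZERO = lambda f: lambda x: x
ONE = lambda f: lambda x: f(x)
SUCC = lambda n: lambda f: lambda x: f(n(f)(x))
to_int = lambda n: n(lambda k: k + 1)(0)

ADD = lambda m: lambda n: lambda f: lambda x: m(f)(n(f)(x))  # λmn.λfx. (m f) ((n f) x)
ADD2 = lambda m: lambda n: m(SUCC)(n)                        # λmn. (m S) n
MUL = lambda m: lambda n: m(ADD(n))(ZERO)                    # λmn. m (ADD n) 0
EXP = lambda m: lambda n: m(MUL(n))(ONE)                     # λmn. m (MUL n) 1

TWO = SUCC(ONE)
THREE = SUCC(TWO)
print(to_int(ADD(TWO)(THREE)))   # 5
print(to_int(ADD2(TWO)(THREE)))  # 5
print(to_int(MUL(TWO)(THREE)))   # 6
print(to_int(EXP(TWO)(THREE)))   # 9, i.e. n^m with the first argument as exponent
```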

Church numerals only encode natural numbers, but they can be used to build more complex kinds of numbers such as integers and floating point numbers.

Logic and arithmetic are good places to start, but we still have no
useful data structures. For example, consider ordered pairs. We
expect to have a pairing constructor PAIR with projection
operations FIRST and SECOND that obey the following equations
for any *e*_{1}, *e*_{2}, *p*:

FIRST (PAIR e_{1} e_{2}) = e_{1}

SECOND (PAIR e_{1} e_{2}) = e_{2}

PAIR (FIRST p) (SECOND p) = p

provided *p* is a pair. We can take a hint from IF.
Recall that IF selects one of its two branch options depending on
its Boolean argument. PAIR can do something similar, wrapping its
two arguments for later extraction by some function *f*:

PAIR = λab. λf. f a b

Thus PAIR a b evaluates to a function that, when applied to f, applies f to a and b. The projections then just pass in a function that selects the right component, playing the role of SML's `#1` and `#2`:

FIRST = λp. p TRUE

SECOND = λp. p FALSE

Again, if *p* isn't a term constructed using PAIR, expect the unexpected.
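The pair encoding also transcribes directly into Python:

```python
TRUE = lambda x: lambda y: x
FALSE = lambda x: lambda y: y

PAIR = lambda a: lambda b: lambda f: f(a)(b)  # λab. λf. f a b
FIRST = lambda p: p(TRUE)                     # λp. p TRUE
SECOND = lambda p: p(FALSE)                   # λp. p FALSE

p = PAIR(1)(2)
print(FIRST(p))   # 1
print(SECOND(p))  # 2
```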

With an encoding for IF, we have some control over the flow of a
program. We can also write simple `for` loops using the Church
numerals __n__, by applying the Church numeral to the loop body
expressed as a function. However, we do not yet have the ability to
write an unbounded `while` loop or a recursive function.

In ML, we can write the factorial function recursively as

```
fun fact(n) = if n <= 1 then 1 else n * fact(n-1)
```

But how can we write this in the lambda calculus, where all the functions are anonymous? We must somehow construct a term FACT that satisfies the following equation, assuming appropriate definitions for SUB and a less-than-or-equal comparison LEQ:

FACT = λn. IF (LEQ n 1) 1 (MUL n (FACT (SUB n 1)))

We can't just define FACT with this equation, because FACT appears on the right-hand side. Therefore it doesn't make sense as a shorthand for the term on the right. But it turns out we can define a term that acts in the way we want.

Suppose we break up the recursion into two steps. Since FACT can't
refer to itself directly, we need some way to pass FACT into
itself so it can be used for the recursion. We start by defining a function F
that adds an extra argument *f* for passing in something
similar to FACT:

F = λf. λn. IF (LEQ n 1) 1 (MUL n (f (SUB n 1)))

Now, if we just had FACT available, we could pass it to this
function and get the FACT we want. Of course, that's begging the
question. But we do have something rather similar to FACT:
this function F itself. That won't quite work either, because this
function expects to receive the extra argument *f*. So we just
change the function definition so that instead of calling *f*, it calls
*f* applied to itself, (*f* *f*). Call this new function FACT':

FACT' = λf. λn. IF (LEQ n 1) 1 (MUL n ((f f) (SUB n 1)))

FACT' has the property that if a function just like it is passed
to it, it will do what FACT should. Since
(FACT' FACT') should behave as we want FACT to, we
define FACT as exactly this **self-application**:

FACT = FACT' FACT'

If we try this definition of FACT, we can see the recursion working:

FACT(4) = (FACT' FACT') 4

= (λn. IF (LEQ n 1) 1 (MUL n ((FACT' FACT') (SUB n 1)))) 4

= IF (LEQ 4 1) 1 (MUL 4 (FACT (SUB 4 1)))

= MUL 4 (FACT (SUB 4 1))

= MUL 4 (FACT 3)

...

= MUL 4 (MUL 3 (MUL 2 1)) = 24

This self-application trick can be used to implement all recursive functions. It even works in SML, though there you have to use a recursive datatype to convince the type system that it is type-safe.
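The same trick transcribes into Python, whose dynamic typing needs no such convincing; the conditional expression plays the role of IF (a strict IF that evaluated both branches would diverge here):

```python
# FACT' = λf. λn. IF (LEQ n 1) 1 (MUL n ((f f) (SUB n 1))),
# using Python's built-in arithmetic and its lazy conditional expression.
fact_ = lambda f: lambda n: 1 if n <= 1 else n * f(f)(n - 1)
FACT = fact_(fact_)  # FACT = FACT' FACT'

print(FACT(4))  # 24
```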

We've seen that many interesting features of SML can be encoded using just lambda calculus functions. The lambda calculus is in fact a universal model of computation that is just as powerful as other realizable models such as Turing machines (which we can easily prove by implementing a Turing machine using lambda calculus).