Lecture 6: Scope and Substitution

Administrivia

Problem set #2 was handed out on Wednesday evening. Don’t do it in pairs (you can work in pairs for PS3-5).

RDZ is away next week.

Syntax and Semantics of Procedures

Now we need to add various things to our BNF table, to make fn part of the syntax, and to eval, to give fn the correct semantics. We also need to add identifiers, which are variable names. Both identifiers and anonymous functions are expressions, as is a particular expression called a combination. Finally, we need to add types.

syntactic class	syntactic variable(s) and grammar rule(s)	examples
identifiers	x, y	`a`, `x`, `y`, `x_y`, `ford1000`, ...
constants	c	...`~2`, `~1`, `0`, `1`, `2` (integers) `1.0`, `~0.001`, `3.141` (reals) `true`, `false` (booleans) `"hello"`, `""`, `"!"` (strings) `#"A"`, `#" "` (characters)
unary operator	u	`~`, `not`, `size`, ...
binary operators	b	`+`, `*`, `-`, `>`, `<`, `>=`, `<=`, `^`, ...
expressions (terms)	e ::= x \| c \| u e \| e₁ b e₂ \| `if` e `then` e `else` e \| `fn` `(`x₁:t₁`,` ...`,` x_n:t_n`):` t `=>` e \| e `(`e₁`,` ...`,` e_n`)`	`ford`, `~0.001`, `not` `b`, `2 + 2`
types	t ::= `int` \| `real` \| `bool` \| `string` \| `char` \| t₁`*`...`*`t_n `->` t	`int`, `string`, `int->int`, `bool*int->bool`

Rule #E6 [combinations]: to evaluate e1(e2), evaluate e1 to a function (fn (id:t) => e), then evaluate e2 to a value v, then substitute v for the formal parameter id within the body of the function e to yield an expression e'. Finally, evaluate e' to a value v'. The result is v'.

eval(e1(e2)) = v'  where

  (0) eval(e1) = (fn (id:t) => e)

  (1) eval(e2) = v

  (2) substitute([(id,v)],e) = e'

  (3) eval(e') = v'
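
As a small illustration of Rule #E6, here is one application traced step by step. The expression and the name e6_result are chosen for this sketch only; they do not appear elsewhere in the notes.

```sml
(* Trace of Rule #E6 on (fn (n:int) => n + 1)(2 * 3):
 *   (0) eval(fn (n:int) => n + 1) = (fn (n:int) => n + 1)   a function is already a value
 *   (1) eval(2 * 3)               = 6
 *   (2) substitute([(n,6)], n + 1) = 6 + 1
 *   (3) eval(6 + 1)               = 7
 *)
val e6_result = (fn (n:int) => n + 1)(2 * 3)   (* 7 *)
```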

Scope, Identifiers, and Substitution

Having briefly introduced let, we can now turn our attention to the issue of what it means to substitute a value for a variable.

Scope

We can define various functions, but we need to avoid name collisions. Often we only "need" a certain name within a certain piece of code (literally within). The region of the program text in which an identifier's definition is visible is called its scope. This issue can be very confusing when you type things into ML interactively, as opposed to loading a file into a fresh ML.

[Leave this example for section]

Here is a more complex function declaration which finds (an approximation to) the square root of a real number.

Underlying math fact: for any positive x, g, it is the case that g, x/g lie on opposite sides of sqrt(x).

(* Computes the square root of x using Heron of Alexandria's

 * algorithm (circa 100 AD). We "guess" that the square root

 * is 1.0 and then continue improving the guess until we're

 * within delta of the real answer.  The improvement is achieved

 * by averaging the current guess with x/guess.

*)

fun square_root(x: real): real =

let

    (* used to tell when the approximation is good enough *)

    val delta = 0.0001

    (* returns true iff the guess is good enough *)

    fun good_enough(guess: real): bool =

      abs(guess*guess - x) < delta

    (* improve the guess by averaging it with x/guess *)

    fun improve(guess: real): real =

      (guess + x/guess) / 2.0

    (* try a particular guess -- looping and improving the

     * guess if it's not good enough. *)

    fun try_guess(guess: real): real =

      if good_enough(guess) then guess

      else try_guess(improve(guess))

in

    (* start with a guess of 1.0 *)

    try_guess(1.0)

end

This example shows a number of things. First, you can declare local values (such as delta) and local functions (such as good_enough, improve, and try_guess). Notice that "inner" functions, such as improve, can refer to outer variables (such as x). Also notice that later definitions can refer to earlier definitions. For instance, try_guess refers to both good_enough and improve. Finally, notice that try_guess is a recursive function -- it's really a loop. It's similar to writing something like:

while (!good_enough(guess)) {

   guess = improve(guess);

}

in an imperative language such as Java or C.

If you type the square_root declaration above into the SML top-level, it responds with:

val square_root = fn : real -> real

indicating that you've declared a variable (square_root), that its value is a function (fn), and that its type is a function from reals to reals. All of the internal structure of the function definition is hidden; all we know from the outside is that its value is a simple function. In particular, the function try_guess is not defined at top level!

After typing in the function, you might try it out on a real number such as 9.0:

- square_root(9.0);

  val it = 3.00000000014 : real

SML has evaluated the expression "square_root(9.0)" and printed the value of the expression (3.00000000014) and the type of the value (real).

At the moment we have only a sloppy, imprecise notion of exactly what happens when you type this expression into ML. In a few weeks we'll have a precise understanding (hopefully!)

If you try to apply square_root to an expression that does not have type real (say an integer or a boolean), then you'll get a type error:

- square_root(9);

stdIn:27.1-27.14 Error: operator and operand do not agree [literal]

operator domain: real

operand:         int

in expression:

  square_root 9

Binding and identifiers

Consider an ML expression like the one below:

let val ford = 3

    val ford:int->bool =

            fn(ford:int) => ford = ford

in

 ford(3)  (* how about ford(42)? *)

end

Note: we will occasionally use examples like this for pedagogical purposes. They are a good way to ensure that you understand the language. We do not expect to see this in your code!

There are three different ways that one can use an identifier:

A binding occurrence, which binds the identifier to a particular value or type. For example, in the expression let val z:int = 1 in z end, the first occurrence is a binding occurrence that binds z to 1. Each binding occurrence introduces a new variable, and this new variable has a scope: a part of the program in which uses of that identifier refer to the variable. In this case the scope of the variable z is the body of the let expression.
A bound occurrence is a use of a variable in the scope of a variable binding. For each bound occurrence of a variable, there is a single corresponding binding of that variable. For example, in the expression (fn(z:int)=>z), the second occurrence of z is a bound occurrence; its corresponding binding occurrence is the first occurrence. At run time this variable will be bound to whatever value is passed to the function when it is invoked.
An unbound or free occurrence is a use of an identifier with no corresponding binding occurrence in whose scope it lies. For example, in the expression let val y:ford = z+1 in y end, the use of z is an unbound occurrence because there is no containing binding of z. The identifier ford is also an unbound occurrence of a type identifier. A legal SML program cannot contain an unbound occurrence of an identifier. However, for the purpose of understanding how SML works, sometimes it is useful to write down syntactically legal fragments of SML programs and talk about the unbound variables that occur in them.

Note that an occurrence depends upon context! If I simply write “z” on the board there is no way of telling what this is. It’s not even necessarily an identifier. Consider:

z

“zero”

zero

let val zero:int = 0 in E end

(fn(z:int) => z)

Given an occurrence of an identifier that is not a binding occurrence, there is a simple way to figure out whether it is bound or unbound, and if the former, to which binding occurrence the identifier is bound. An identifier is bound if it is in the scope of a binding occurrence. For ML programs, the scope of a variable can be seen by simply looking at the program text. If the variable lies within the scope of more than one binding occurrence, then one of those bindings shadows the rest. It will be the binding occurrence whose scope most tightly encloses the use of the identifier.
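
A minimal sketch of shadowing, using made-up names (shadow_demo, inner) that are not part of the lecture's examples. The use of z inside the inner let is bound by the most tightly enclosing declaration, while the use of z in the outer body still refers to the outer declaration.

```sml
val shadow_demo =
  let
    val z = 1
    (* inside this inner let, z refers to the inner binding (2) *)
    val inner = let val z = 2 in z * 10 end    (* 20 *)
  in
    (* here the inner binding is out of scope, so z refers to the outer binding (1) *)
    inner + z                                  (* 20 + 1 = 21 *)
  end
```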

In SML it is possible to figure out just by looking at the program code which occurrence binds each use of a variable. A language with this property is said to have lexical scoping : the scope of each variable is apparent from the lexical form of the program, without knowing anything about how the program runs.

The alternative to lexical scoping is dynamic scoping, in which a given variable occurrence may have different binding occurrences depending on how the program runs. In most modern languages, such as Java or C, variables have lexical scope. However, some languages offer dynamic scoping: older Lisp dialects were dynamically scoped, and Perl variables declared with local (rather than my) are dynamically scoped. Dynamic scope is harder to implement efficiently, and can lead to unpleasant surprises for programmers because variables don't always mean what they expect.

To see an example of dynamic versus lexical scoping, consider the following functions (perhaps written in different files):

(* Dynamically scoped example *)

fun arthur():int =

  answer

fun ford():int =

  let val answer:int = 42

in

    arthur()

end

In order to communicate with a function in a statically (lexically) scoped language, we pass in arguments! This has the advantage of explicitly annotating what information is being supplied (i.e., it makes it easier to think about the contract).

(* Lexically scoped example *)

fun arthur2(answer:int):int =

  answer

fun ford2():int =

  let val answer:int = 42

in

    arthur2(answer)

end
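
To see that SML really does resolve names lexically, here is a sketch that can be typed at the top level. The names arthur3, ford3, and lexical_result are hypothetical, introduced only for this illustration. The local answer inside ford3 has no effect on arthur3, whose body was bound to the top-level answer when it was defined.

```sml
val answer = 42

(* answer here refers to the top-level binding above *)
fun arthur3 () : int = answer

fun ford3 () : int =
  let val answer = 7    (* a different variable; invisible to arthur3 *)
  in
    arthur3 ()
  end

(* 42 under lexical scoping; a dynamically scoped language would give 7 *)
val lexical_result = ford3 ()
```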

Substitution

Earlier we saw some rewriting rules that explained how to evaluate terms of the SML language. For example, we said that a simple expression evaluates according to the following rewrite rule:

let val x:t = v in e end --> e (with occurrences of x replaced by v)

Remember that we use e to stand for an arbitrary expression (term), x to stand for an arbitrary identifier, and v to stand for a value -- that is, a fully evaluated term.

We now know this cannot be the full story, because e may contain occurrences of x whose binding occurrence is not this binding x:t = v. It doesn't make sense to substitute v for these occurrences. For example, consider evaluation of the expression:

let val x:int = 1

    fun f(x:int) = x

    val y:int = x+1

in

  fn(a:string) => x*2

end

The next step of evaluation replaces with 1 the occurrences of x in the declaration of y (x+1) and in the body of the final fn expression (x*2), because these occurrences have the first declaration as their binding occurrence. Notice that the two occurrences of x inside the function f, which are respectively a binding and a bound occurrence, are not replaced. Thus, the result of this rewriting step is

let fun f(x:int) = x

    val y:int = 1+1

in

  fn(a:string) => 1*2

end

This is actually a very important example, and illustrates referential transparency. The person writing the function f expects it to compute the identity! Nothing involving any other variable named x that happens before or after (in the code or during execution) should affect this.

Let's write the substitution e{v/x} to mean the expression e with all unbound occurrences of x replaced by the value v. To remember which way the slash goes, think of multiplication: we want x{v/x} = v, which wouldn’t work if we wrote x{x/v}.

Then we can restate the rule for evaluating let more simply:

let val x:t = v in e end --> e{v/x}

This works because any occurrence of x that is unbound within e must be bound by exactly this declaration val x:t = v. Here are some examples of substitution:

x{2/x} = 2
x{2/y} = x
(fn(y:int)=>x) {"hi"/x} = (fn(y:int)=>"hi")
f(x) { fn(y:int)=>y / f } = (fn(y:int)=>y)(x)
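
The substitution operator can be sketched as an SML function over a tiny expression datatype. Everything here (the datatype exp, its constructors, and the function subst) is an assumption made for illustration, not the course's actual implementation. Types are omitted, and the sketch assumes the value v contains no unbound variables, so variable capture cannot occur.

```sml
(* A tiny expression language, hypothetical and untyped *)
datatype exp =
    Num of int
  | Var of string
  | Plus of exp * exp
  | Fn of string * exp           (* fn x => body *)
  | App of exp * exp             (* e1(e2) *)
  | Let of string * exp * exp    (* let val x = e1 in e2 end *)

(* subst (v, x, e) computes e{v/x}: only unbound occurrences of x are replaced *)
fun subst (v: exp, x: string, e: exp) : exp =
  case e of
      Num n => Num n
    | Var y => if y = x then v else Var y
    | Plus (a, b) => Plus (subst (v, x, a), subst (v, x, b))
    | App (a, b) => App (subst (v, x, a), subst (v, x, b))
      (* a parameter named x shadows the outer x, so the body is left alone *)
    | Fn (y, body) => if y = x then e else Fn (y, subst (v, x, body))
      (* x is still visible in e1, but in e2 only if the let does not rebind it *)
    | Let (y, e1, e2) =>
        Let (y, subst (v, x, e1), if y = x then e2 else subst (v, x, e2))
```

For instance, subst (Num 1, "x", Fn ("x", Var "x")) returns the function unchanged, mirroring the way the function f was left untouched in the example above.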

One of the features that distinguishes ML is the ability to write complex patterns containing binding occurrences of variables. Pattern matching in ML causes these variables to be bound in such a way that the pattern matches the supplied value. This can be a very concise and convenient way of binding variables. We can generalize the notation used above by writing e{v/p} to mean the expression e with all unbound occurrences of variables appearing in the pattern p replaced by the values obtained when p is matched against v. Using this notation, we can express the let rule simply:

let val p = v in e end --> e{v/p}

Example:

let val (x,y) = (40,2) in x+y end --> (x+y){(40,2)/(x,y)} = 40+2

What if a let expression introduces multiple declarations? Such an expression is identical in effect to a series of nested let expressions. Thus, we can use the following rewrite that pulls out the first declaration so the rules above apply.

let d₁...d_n  in e end  -->

  let d₁ in let d₂...d_n in e end end
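
A quick way to convince yourself of this rewrite is to evaluate both forms. The names multi_decl and nested_decl are made up for this sketch.

```sml
(* one let with two declarations *)
val multi_decl =
  let
    val (x, y) = (40, 2)
    val z = x + y
  in
    z * 2
  end

(* the same computation as nested single-declaration lets *)
val nested_decl =
  let val (x, y) = (40, 2) in
    let val z = x + y in
      z * 2
    end
  end
```

Both evaluate to 84, since in each case z is bound to 40 + 2 before the body is evaluated.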

We can use the same substitution operator to give a more precise rule for what happens when a function is called. Consider a function declared as fun f(p) = e, where f is the identifier naming the function. Then the expression for a function call whose argument has been evaluated, f(v), is rewritten as follows:

f(v) --> e{v/p}

Similarly, consider a call to an anonymous function:

(fn( p )=> e )( v ) --> e{v/p}

Let syntax and semantics

OK, we now need to add a syntax and semantics for let. Conceptually it’s pretty easy, but there are a few details. We’ll start by giving a much more complete syntax for ML, including a bunch of things whose precise semantics we won’t cover for a while, if ever (such as datatypes).

syntactic class	syntactic variables and grammar rule(s)	examples
identifiers	x, y	`a`, `x`, `y`, `x_y`, `foo1000`, ...
datatypes, datatype constructors	X, Y	`Nil`, `Cons`, `list`
constants	c	...`~2`, `~1`, `0`, `1`, `2` (integers) `1.0`, `~0.001`, `3.141` (reals) `true`, `false` (booleans) `"hello"`, `""`, `"!"` (strings) `#"A"`, `#" "` (characters)
unary operator	u	`~`, `not`, `size`, ...
binary operators	b	`+`, `*`, `-`, `>`, `<`, `>=`, `<=`, `^`, ...
expressions (terms)	e ::= c \| x \| u e \| e₁ b e₂ \| `if` e₁ `then` e₂ `else` e₃ \| `let` d₁...d_n `in` e `end` \| e `(`e₁`,` ...`,` e_n`)` \| `(`e₁`,`...`,`e_n`)` \| `#`n e \| `{`x₁`=`e₁`,` ...`,` x_n`=`e_n`}` \| `#`x e \| X`(`e`)` \| `case` e `of` p₁`=>`e₁ `\|` ... `\|` p_n`=>`e_n	`~0.001`, `foo`, `not` `b`, `2 + 2`, `Cons(2, Nil)`
patterns	p ::= c \| x \| `(`p₁`,`...`,` p_n`)` \| `{`x₁`=` p₁`,`...`,`x_n`=` p_n`}` \| X \| X `(` p `)`	`a:int`, `(x:int,y:int)`, `I(x:int)`
declarations	d ::= `val` p = e \| `fun` y p `:` t = e \| `datatype` Y = X₁ [`of` t₁] `\|` ... `\|` X_n[`of` t_n]	`val one = 1` `fun square(x: int): int` `datatype d = N \| I of int`
types	t ::= `int` \| `real` \| `bool` \| `string` \| `char` \| t₁`->`t₂ \| t₁`*`...`*`t_n \| `{`x₁:t₁, x₂:t₂,..., x_n:t_n`}` \| Y	`int`, `string`, `int->int`, `bool*int->bool`
values	v ::= c \| `(`v₁`,`...`,`v_n`)` \| `{`x₁`=`v₁`,` ...`,` x_n`=`v_n`}` \| X`(`v`)`	`2`, `(2,"hello")`, `Cons(2,Nil)`

A program is now an expression or a declaration.

A substitution is a finite map from identifiers (variables) to expressions. We represent the substitution as a list of identifiers and expressions (i.e., [(x1,e1),(x2,e2),...,(xn,en)]). We can perform a substitution S = [(x1,e1),(x2,e2),...,(xn,en)] on an expression or declaration by first substituting e1 for x1, then e2 for x2, ..., then en for xn. This is what the specification substitute does.

Rule #E7 [let]: to evaluate let d in e end, evaluate the declaration d to get a substitution S. Perform the substitution S on e yielding a new expression e'. Then evaluate e' to get the final answer.

eval(let d in e end) = v where

  (0) eval_decl(d) = S

  (1) substitute(S,e) = e'

  (2) eval(e') = v

Rule #D1 [val declarations]: to evaluate a declaration val p = e, evaluate e to a value v and match v against the pattern p to yield a substitution S. The substitution S is the result of the declaration.

eval_decl(val p = e) = S where

  (0) eval(e) = v

  (1) match(v,p) = S
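
One way match might look, sketched over a cut-down universe of values and patterns. The datatypes value and pat, the exception MatchFailure, and the helper match_all are all hypothetical names for this illustration; the notes do not give an implementation of match.

```sml
(* Cut-down values and patterns: integers, tuples, and variables only *)
datatype value = VInt of int | VTuple of value list
datatype pat = PVar of string | PConst of int | PTuple of pat list

exception MatchFailure

(* match (v, p) returns the substitution produced by matching v against p,
 * represented as a list of (identifier, value) pairs *)
fun match (v: value, p: pat) : (string * value) list =
  case (v, p) of
      (_, PVar x) => [(x, v)]                  (* a variable matches anything *)
    | (VInt n, PConst c) => if n = c then [] else raise MatchFailure
    | (VTuple vs, PTuple ps) => match_all (vs, ps)
    | _ => raise MatchFailure

(* match tuple components pairwise; the lengths must agree *)
and match_all (vs: value list, ps: pat list) : (string * value) list =
  case (vs, ps) of
      ([], []) => []
    | (v :: vs', p :: ps') => match (v, p) @ match_all (vs', ps')
    | _ => raise MatchFailure
```

Matching the value (40,2) against the pattern (x,y) in this encoding yields [("x", VInt 40), ("y", VInt 2)], which is the substitution S that Rule #E7 then applies to the body of the let.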

Top-level loop, and variable scope

Remember that to find the value of a variable we just look textually (lexically) "up" from the piece of code where it is referenced, and find the first binding occurrence. You can think of the ML interpreter as having a giant LET statement before your code that gives lots of things their bindings. A decent model of val at top level is that it adds to this set of definitions.

val x = 27;

E ==

let val x:int = 27 in

  E

end

So we can say

val triple = (fn(z:int):int => 3 * z);

triple(14) ==

let val triple:int->int =  (fn(z:int):int => 3 * z) in

  triple(14)

end
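
This model also explains what happens when a name is declared twice at top level. In the sketch below (add_x, a, and b are names invented for the illustration), the second val x introduces a new variable that shadows the first; it does not change the x that add_x already refers to.

```sml
val x = 27

(* x here is the binding above, fixed when add_x is declared *)
fun add_x (n:int):int = n + x

(* a new variable named x; it shadows the old one from here on *)
val x = 100

val a = add_x 1    (* 28: add_x still sees x = 27 *)
val b = x + 1      (* 101: this use sees the new x *)
```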

Higher-order procedures (or, where the substitution model first shows its value)

So far the substitution model looks pretty simplistic; it's a set of rules that tells you things you already know. However, understanding the substitution model is the key to prelim #1 and the first 1/3 of the course (as well as the final).

HOPs are the first example of something that is easy to understand if you really get the substitution model, and impossible otherwise.

We need one more thing in our language subset, namely fun. (fun fun…)

For the moment only, we will assume that fun is used just as a declaration, as in

let fun triple(z:int):int = 3 * z in

  triple(14)

end

We will assume this is shorthand for saying

let val triple:int->int =  (fn(z:int):int => 3 * z) in

  triple(14)

end

Note that this is NOT the final story (like Newtonian physics), but a simplification for the moment. We will tell you when we change the story (relativity?)

Higher-order procedures : first examples

Functions are values just like any other value in SML. What does that mean exactly? This means that we can pass functions around as arguments to other functions, that we can store functions in data structures, that we can return functions as a result from other functions. The full implication of this will not hit you until later, but believe us, it will.

Let us look at why it is useful to have higher-order functions. The first reason is that it allows you to write more general code, hence more reusable code. As a running example, consider these functions

fun triple (z:int):int = 3 * z

fun cube (z:int):int = z * z * z

Let us now come up with a function to multiply a number by 9. We could do it directly, but for utterly twisted motives decide to use the function triple above

fun mul9 (z:int):int = triple (triple (z))

Now let's find a way to get the 9th power of a number:

fun pow9 (z:int):int = cube (cube (z))

There is a totally unexpected and unplanned similarity between these two functions: what they do is apply a given function twice to a value. By passing in the function to apply twice as an argument, we can reuse code:

fun apply_twice (f:int->int, x:int):int = f (f (x))

Using this we can now write

val x = apply_twice(triple,4)           (* 36 *)

fun new_mul9(z:int):int = apply_twice(triple,z)

val x2 = new_mul9(4)                    (* 36 *)

fun new_pow9(z:int):int = apply_twice(cube,z)

val x3 = new_pow9(4)                    (* 262144 *)

I strongly recommend that those of you who wish to pass CS312 (and especially prelim #1) work out these examples by hand using the substitution model.

The advantage is that the similarity between these two functions has been made manifest. Doing this is very helpful. If someone comes up with an improved version of apply_twice, then every function that uses it profits from the improvement.

The function apply_twice is a so-called higher-order function: it is a function from functions to other values. Notice that the type of apply_twice is ((int -> int) * int) -> int.
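
As one example of working through a call by hand, here is a substitution-model trace of apply_twice(triple,4), written as a comment above the call. The name traced is introduced only for this sketch.

```sml
fun triple (z:int):int = 3 * z
fun apply_twice (f:int->int, x:int):int = f (f (x))

(* apply_twice(triple, 4)
 *   --> triple(triple(4))     substitute triple for f and 4 for x
 *   --> triple(3 * 4)         substitute 4 for z in the body of triple
 *   --> triple(12)
 *   --> 3 * 12                substitute 12 for z in the body of triple
 *   --> 36
 *)
val traced = apply_twice (triple, 4)
```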

In order not to pollute the top level namespace, it can be useful to locally define the function to pass in as an argument. For example:

fun fourth (x:int):int =

let

    fun square (y:int):int = y * y

in

    apply_twice (square,x)

end

However, it seems silly to define and name a function simply to pass it in as an argument to another function. After all, all we really care about is that apply_twice gets a function that squares its argument. So let's do that, using new notation:

fun fourth (x:int):int = apply_twice (fn (y:int) => y*y,x)

Anonymous functions are useful for creating functions to pass as arguments to other functions, but are also useful for writing functions that return other functions! Let us revisit the apply_twice function. We now write a function twice_func which takes a function as an argument and returns a new function that applies the original function twice:

type base = int

type func = int -> int

fun twice_func (f:func):func =

  fn (x:base) => f (f (x))

val newer_mul9 = twice_func(triple);

val newer_pow9 = twice_func(cube);

fun compose (f:func, g:func):func =

  fn (x:base) => f (g (x))
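
A short usage sketch for twice_func and compose, repeating the definitions so that the block can be loaded on its own. The names r1, r2, r3, triple_then_cube, and cube_then_triple are invented for the illustration. Note that compose(f,g) applies g first and then f, so the order of the arguments matters.

```sml
type base = int
type func = int -> int

fun triple (z:int):int = 3 * z
fun cube (z:int):int = z * z * z

fun twice_func (f:func):func = fn (x:base) => f (f (x))
fun compose (f:func, g:func):func = fn (x:base) => f (g (x))

val newer_mul9 = twice_func(triple)
val r1 = newer_mul9(4)                        (* triple(triple(4)) = 36 *)

val triple_then_cube = compose(cube, triple)
val r2 = triple_then_cube(2)                  (* cube(triple(2)) = cube(6) = 216 *)

val cube_then_triple = compose(triple, cube)
val r3 = cube_then_triple(2)                  (* triple(cube(2)) = triple(8) = 24 *)
```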