Notes on Dylan

Dylan: A Dynamic Language

1. Introduction

Dylan is the programming language that we will use in CS212. Dylan incorporates ideas from a number of languages. Like Lisp (and now Java) it has no explicit access to pointers. Also like Lisp it uses a fully parenthesized prefix notation. Like Algol (and later Pascal) it is a block structured language. Dylan also supports the object-oriented programming style as one of several styles rather than as the only style (in this way it is more akin to C++ than to Java). Dylan is different from most languages that you probably have used previously. Following are the major differences that you will see in the early part of the course:

There is a simple uniform syntax for all expressions.
There are no procedures, only functions; indeed every "statement" has an associated "value" -- we call them expressions rather than statements to emphasize this.
There are no iteration constructs (e.g. for, while).

The chief differences between Dylan and a language such as C or Pascal are the fundamental role played by functions, the pervasive use of recursion, and support for object-oriented programming. (We will not cover the object-oriented features of Dylan until later in the semester.) Programs written in Dylan have quite a different flavor than those written in an imperative language like Pascal. In Dylan, one almost never uses assignment (e.g., x := y). Instead, the value returned by a function call is passed directly to another function to be processed. For example, in Pascal one might write:

x := 10 ; {assign x to be 10}
foo(x,y) ; {call foo with value param x and assign to var param y}
z := y*y ; {set z to the square of y}

while in Dylan one would write

(square (foo 10))

This can be read, from the inside out, as call foo with parameter 10, then call square with the result of the call to foo as its parameter. Of course, one could write this in Pascal, too, but instead one tends to use assignment, while in Dylan one tends not to. This leads to a programming style in which Dylan functions tend to be short (10 lines or so), and the work is done by calling many simple functions to build up a result.

2. Dylan Syntax

Dylan programs are built up out of expressions. An expression either is atomic or is a combination. An atomic expression is a single sequence of characters that doesn't contain any spaces or parentheses. The following are all atomic expressions:

26
14.265
+
square

A combination is one or more expressions enclosed in parentheses, where the first expression is the operator to be applied to the remaining expressions, which are called operands. Dylan is a prefix notation language, because the operator comes before its operands. Dylan is also fully parenthesized. The following are all combinations:

 (+ 1 2 3)
 (* 6 (+ 4 5))
 (+ 6.15 1)
 (square (+ 1 2))

In the first expression above, the operator + is an atomic expression that represents the addition operation. Each of the three remaining expressions (1, 2, and 3) is also atomic. Evaluation of the expression yields the sum of the three integers 1, 2 and 3. The second expression consists of the multiplication operation applied to two values, 6 and the result of the combination (+ 4 5). This inner combination is a sum of two values, 4 and 5, so the final value of the outer combination is 6 * 9, or 54. Note that the arithmetic operations +, -, *, and / can take an arbitrary number of parameters. They also accept both integer and floating point numbers.

This is virtually all the syntax of Dylan. The language is fully parenthesized, which makes interpretation of expressions unambiguous. In contrast to Dylan (and Lisp languages in general), standard programming languages like Pascal have many more syntactic rules. At first, the large number of parentheses can appear confusing. Just use the simple rule of (operator operand ... operand) to break up any expression into its parts.
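
As an illustration of that rule (the expression is made up for this sketch and uses only the arithmetic operators mentioned above), the combination

 (* (+ 1 2) (- 10 4))

breaks up into the following parts:

 operator:        *
 first operand:   (+ 1 2)     a combination with operator + and operands 1 and 2
 second operand:  (- 10 4)    a combination with operator - and operands 10 and 4

Each operand that is itself a combination can be broken up the same way, until only atomic expressions remain.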

3. Evaluating Dylan Expressions

In Dylan, every expression has a value. The following simple rule is used to determine the value of a compound expression (an expression with multiple subexpressions enclosed in parentheses):

To evaluate a combination, first evaluate each of its subexpressions; then apply the operator (the value of the first item of the combination) to the values of the operands (the remaining items of the combination).

There are about half a dozen exceptions to this basic evaluation rule. The exceptions are called special forms, because they have their own special evaluation rules. Other than the few special forms, the above rule holds for determining the value of any Dylan expression. We will cover a number of special forms below. For example, the value of (+ 1 2) is obtained by evaluating each subexpression and then applying the addition operator to the values of the operands, 1 and 2. A slightly more complex example is the expression,

(+ (* 1 2) (+ 3 4))

Its value is obtained by evaluating all the subexpressions and applying the operator to the operands. The first subexpression is +, which is atomic. The next subexpression is (* 1 2), which is itself a combination. Thus, we recursively apply the evaluation rule to this expression: evaluate the subexpressions of (* 1 2) and apply the operator to the operands. Each subexpression, *, 1, and 2, is atomic, so we apply the multiplication operator to 1 and 2 to obtain 2 as the value of (* 1 2). Now we evaluate the third element of the original combination, which is also a combination. Again we recursively invoke the evaluation rule and, analogously to the previous case, obtain 7 as the value of (+ 3 4). Now we have three values: the addition operator, 2, and 7. Applying the operator to 2 and 7 yields the result 9.

We can illustrate the process of evaluating this expression as follows:

 (+ (* 1 2) (+ 3 4))
= (+ 2 7)
= 9

This example may seem tedious, since the value is relatively obvious from inspection. What is important here is to note the simple uniform rule that is applied to all expressions. This rule is the basic evaluation mechanism of all Dylan expressions, not just numerical expressions. The substitution model, described in Handout #4, provides a more formal description of the evaluation process.
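
The same rule handles deeper nesting without any new machinery. As a further sketch (the expression is invented for illustration), each line below replaces one innermost combination by its value:

  (* (+ 1 2) (- 10 (* 2 3)))
 = (* 3 (- 10 (* 2 3)))
 = (* 3 (- 10 6))
 = (* 3 4)
 = 12

At every step the only work done is evaluating a combination whose operands are already values.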

4. Naming

Names in Dylan are typed, which means that a given name can only have certain kinds of values associated with it. We say that a name is bound when that name has been associated with a given value. There are two special forms, define and bind, that are used to bind names to values. A global variable is defined using define, and a local variable is defined using bind. (There are also some shorthands for common uses of define and bind, which we will consider later.) Variables are also bound when functions are called; we will cover function definitions below.

The syntax of define is

(define (name type) expr)

define is a special form (i.e., it does not follow the normal evaluation rule) because it would be meaningless to try to evaluate name when we are trying to give it a value. Recall that the normal evaluation rule would be to determine the value of all of the subexpressions, but name in general has no value when we are defining it (and thus it would be an error to try to determine its value).

To evaluate a define expression, evaluate the last subexpression. If the type of that value matches the given type then bind the given name to that value.

For example, the sequence of expressions

 (define (x <number>) 10)
 (define (y <number>) (+ x 20))

binds x to 10 and y to 30. The names x and y are global variables (they exist "forever" --- which is until the interpreter is re-started). On the other hand the expression

(define (z <function>) (+ 10 20))

results in an error, because the value of (+ 10 20) is a number and thus does not match the specified type <function>. There are several basic types in the Dylan language, as well as the ability for users to add new types. By convention, the name of a type starts and ends with angle brackets. We will briefly discuss the pre-defined (or basic) types in the following section.
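
As a small sketch of how a sequence of define expressions builds on earlier ones (the names price, quantity, and total are made up for illustration):

 (define (price <number>) 12)
 (define (quantity <number>) 5)
 (define (total <number>) (* price quantity))

After these three expressions are evaluated in order, total is bound to 60. The order matters: if the third expression were evaluated first, price and quantity would not yet have values, and by the rule above trying to determine their values would be an error.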

Local variables are bound using the bind special form. The syntax of bind is

 (bind (((name type) expr)
        ((name type) expr)
        ...)
    body ... body)

There are two parts to a bind expression: the bindings, a list of associations of names with values, and the body, a sequence of expressions to evaluate with the given bindings in effect (i.e., with these local variables defined). Each binding consists of a name (name), a type specifier (type), and an expression to evaluate (expr). The name is bound to the value of expr, as long as the type of that value matches type. The bindings take place in sequential order; that is, each expr can refer to variables defined in any of the previous bindings.

So, the expression

 (bind (((y <integer>) 30)
        ((x <integer>) (+ y 1)))
       (* x 2))

yields 62. First y is bound to 30, then x is bound to 31, and finally (* x 2) is evaluated with that value of x. The bindings of names to values specified by a bind exist only during the evaluation of the body forms; after that the bindings are no longer in effect. That is, a bind sets up local bindings that exist only for the given extent.

When bind expressions are nested, the names are searched in inside-out order. For example,

 (bind (((x <integer>) 10))
   (bind (((y <integer>) (+ x 2)))
     (bind (((x <integer>) 1))
       (+ x y))))

has the value 13, because the value of x in the body expression (+ x y) is 1 (it is the most recent binding of x). The value of y on the other hand is 12, because it is (+ x 2) but x has the value 10 at the point that y's value is bound.
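
A single bind with several bindings can often replace a chain of nested ones, because each binding may use the names bound before it. A minimal sketch (width, height, and area are hypothetical names chosen for this example):

 (bind (((width <integer>) 4)
        ((height <integer>) (+ width 1))
        ((area <integer>) (* width height)))
   (+ area area))

Here width is bound to 4, height to 5, and area to 20, so the body yields 40. Once the bind has been evaluated, none of the three names is bound any longer.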

5. Types

As noted above, names in Dylan are typed. There are a number of primitive or pre-defined types. Users can also add new types, but we will defer a discussion of user-defined types until a bit later in the semester. By convention, type names in Dylan begin with < and end with >, in order to make them distinctive in program texts. Types in Dylan are arranged in a hierarchy. The root of the type tree is the type <object>; in other words, all values in Dylan are of type <object>. There are several types of <object>s in Dylan, including <function>s (which are the result of evaluating a method special form), <number>s, <string>s (anything enclosed in double quotes), and <boolean>s (#t and #f). The different types of <number>s include <real>, <rational>, and <integer>, among others. For instance, 1 and 1.0 are both of type <real>, but only 1 is of type <integer>.

The types lower in the tree are "more specific" in that they refer to a smaller class of values. For example, the <integer>s are a subset of the <number>s which are in turn a subset of all <object>s.
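
The hierarchy determines which define expressions succeed. The following sketch applies the matching rule from Section 4 to the facts stated above; the names n-items, ratio, anything, and whole are invented for illustration.

 ;; succeeds: 1 is an <integer>
 (define (n-items <integer>) 1)

 ;; succeeds: 1.0 is a <real>
 (define (ratio <real>) 1.0)

 ;; succeeds: every value is an <object>
 (define (anything <object>) 1.0)

 ;; error: 1.0 is not an <integer>
 (define (whole <integer>) 1.0)

Declaring a name with a type higher in the tree accepts more values; declaring it with a more specific type catches more mistakes.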

6. Control Structures

Conditionals, such as "if-then-else" are another case where the normal evaluation rules do not make sense (the first case that we saw which needed special forms was naming of variables using define and bind). The basic conditional in Dylan is if. Before we introduce it, let us discuss boolean values briefly. The symbols #t and #f stand for true and false, respectively. These primitive expressions are of type <boolean>. Any value other than #f is interpreted as being true when used in a conditional.

A number of relational operators are defined in Dylan. These include the usual numerical comparators as well as a number of equality tests. We will mention these as they arise. We will use = for numerical equality in most of our examples. Finally, the usual boolean operators and, or, and not are also available.

Now let us introduce the conditional if. Its syntax is

(if test consequent alternate)

The evaluation rule for if is to evaluate test; if its value is true, evaluate consequent and return its value; otherwise, evaluate alternate and return its value. For example,

(if (> x 0) 1 0)

evaluates to 1 if x is a positive number, and to 0 otherwise.
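
To see why if must be a special form, consider the following sketch (an invented example; it uses bind from Section 4 to give x a value):

 (bind (((x <integer>) 0))
   (if (= x 0) 0 (/ 10 x)))

The value is 0. Because the test is true, only the consequent 0 is evaluated; the alternate (/ 10 x) is never evaluated. Under the normal evaluation rule all three subexpressions would be evaluated first, including the division of 10 by 0.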

There is also a more general conditional, cond, with the form

 (cond (test conseq conseq ...)
       (test conseq conseq ...)
       ...)

Each test and its consequences is called a clause of the cond. The tests are evaluated in the order given. If no true test is found, the value of cond is undefined. If a true test is found, the consequences in that clause are evaluated in the order written, and the value of the cond is the value of the last expression in that clause. The symbol else: in a test is always true.

For example,

 (cond ((< x 0) -1)
       ((> x 0) 1)
       (else: 0))

evaluates to -1 if x is negative, 1 if x is positive and otherwise 0. The astute reader will have noticed the ability to have more than one expression (consequence) in a given cond clause. This should appear strange because an expression has only a single value. When there are multiple expressions in a cond clause, all expressions but the last are evaluated for their effect only, and the value of the last expression is the result of the cond. Since we rarely write Dylan functions with side effects (e.g., changing the value of a variable), we rarely use more than one expression per clause. One case where we do use side effects frequently is printing to the display (i.e., it makes sense to execute a sequence of printing operations, or print and then return a value).
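
Because the tests are tried in the order given, a later clause can rely on the earlier tests having failed. A small sketch (the example is made up, with n bound locally using bind):

 (bind (((n <integer>) 15))
   (cond ((< n 10) 1)
         ((< n 100) 2)
         (else: 3)))

The value is 2: the first test fails because 15 is not less than 10, and the second test succeeds. If n were 5, both tests would be true, but the value would be 1 because the first true test determines the result.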

Similar to cond, there is a special form case, which we will not discuss here.

7. Functions

Functions are constructed in Dylan using the special form method. The syntax of the form is,

 (method ((param type) ... (param type))
     body ... body)

The method form creates a function (an executable piece of code) with the specified parameters, where the type of each parameter param is given by the corresponding type, and where the body (or code) of the function is a sequence of expressions.

For example,

(method ((x <number>)) (* x x))

creates a function of one parameter, a number x, that multiplies the given number by itself. Note that each parameter and its type are together in parentheses, and there is also a set of parentheses enclosing all of the parameters (in the above example there is only one parameter, thus the double parentheses around it). Simply creating a function, however, is not very useful. We would like to be able to use a function to do something. The most natural way to do that is to name the function, and then refer to it by its name. This can be done using a combination of define and method.

(define (square <function>) (method ((x <number>)) (* x x)))

Note that a function is a value just like any other value, and we associate that value with the name square just as we associated the value 10 with the name x in Section 4 on naming. This is quite different from Pascal or most other programming languages, where functions are special kinds of entities. In Dylan functions are named and passed as parameters just like any other object (e.g., a number).

Having defined a function and named it, the function can then be used just like any primitive function in the language. For example, the expression

(square (+ 1 2))

has value 9. Recall that to evaluate an expression, we evaluate each of the subexpressions and apply the operator in first position to the remaining values. The value of square is a function, and the value of (+ 1 2) is obtained by applying the evaluation rule to this compound expression to obtain the value 3. Then the function (named square) is applied to 3. In order to apply a function, the parameters of the function are bound to the given values, and then the body of the function is evaluated with the new bindings. Thus, applying the function named square binds the parameter x to the given value 3, and then evaluates the body (* x x), the result of which is 9. This example is somewhat involved, and you probably will not understand it completely yet.
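
Written out step by step in the style of Section 3, the evaluation just described looks roughly like this:

  (square (+ 1 2))
 = (square 3)
 = (* 3 3)
 = 9

The step from the second line to the third is the function application: the parameter x is bound to 3, and the body (* x x) is evaluated with that binding.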

A function need not be named; it can simply be created and used immediately. The following expression has the same value as (square (+ 1 2)), but rather than using the name square, we create and use an anonymous (unnamed) function,

((method ((x <number>)) (* x x)) (+ 1 2))

Look carefully at how the parentheses are nested; the outermost pair of parentheses (the top level expression) contains two subexpressions,

(method ((x <number>)) (* x x)) and (+ 1 2)

Thus by the evaluation rule for combinations, the given (anonymous) function is applied to 3, which has the same effect as evaluating (square (+ 1 2)) (because square names a function that does the same thing).

Functions in Dylan may be defined recursively without any special syntactic indication, and several functions may be defined mutually recursively, again without any special indication (unlike Pascal, which requires forward declarations).

The following example is everyone's favorite,

;; n!

 (define (factorial <function>)
   (method ((n <number>))
     (if (= n 0) 1 (* n (factorial (- n 1))))))
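
Tracing a small call by hand shows how the recursion unwinds. In this sketch the evaluation of each if test is left out to keep the trace short:

  (factorial 3)
 = (* 3 (factorial 2))
 = (* 3 (* 2 (factorial 1)))
 = (* 3 (* 2 (* 1 (factorial 0))))
 = (* 3 (* 2 (* 1 1)))
 = (* 3 (* 2 1))
 = (* 3 2)
 = 6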

This function takes a <number> n as an argument. If n is 0 it returns 1; otherwise it returns n times the result of a recursive call to factorial with n-1, yielding n * (n-1) * ... * 1 = n! as a value. Note that the only way we know that this function is recursive is that it refers to itself in the body (there are no special declarations about recursive functions). Functions in Dylan can be parameters to other functions. Consider the three function definitions,

 (define (add-one <function>) (method ((n <number>)) (+ n 1)))
 (define (add-two <function>) (method ((n <number>)) (+ n 2)))

 (define (compose <function>)
   (method ((f <function>) (g <function>) (n <number>))
     (f (g n))))

Given these definitions, the value of the expression,

(compose add-one add-two 5)

is 8. The parameters f and g of compose are themselves functions: g is applied to the third parameter, n, and f is applied to the result. Thus the body of compose, which is (f (g n)), is executed with f being the function named by add-one, g being the function named by add-two, and n being the number 5. This results in the value 8. Notice that not only are f and g declared to be of type <function>, they are also used in the first position in a combination (which by the evaluation rule means that their values must be functions). [Note: this is a tricky example; you may want to read it again.]
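
Written as a step-by-step trace, the evaluation goes roughly as follows:

  (compose add-one add-two 5)
 = (add-one (add-two 5))
 = (add-one (+ 5 2))
 = (add-one 7)
 = (+ 7 1)
 = 8

The order of the function arguments matters in general. Assuming square is defined as earlier in this section, (compose square add-one 2) squares the result of adding one and yields 9, while (compose add-one square 2) adds one to the square and yields 5.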

Functions like compose, that have other functions as parameters or yield a function as a result, are called higher order functions. We will spend some time in a couple of weeks carefully understanding higher order functions.

Another example of a higher order function is twice, which applies a given function twice:

 (define (twice <function>)
   (method ((f <function>) (x <number>))
     (f (f x))))

The following expression then has the value 5:

(twice add-one 3)
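
Any function of one number can be passed to twice. As a further sketch, assuming square is defined as earlier in this section:

  (twice square 3)
 = (square (square 3))
 = (square 9)
 = 81

Similarly, (twice add-two 3) yields 7.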

Note that when functions are defined, not every identifier needs to be predefined (as would be the case in Pascal), but they must all be defined by the time the function is used. This allows mutually recursive functions to be written without forward declarations. When a function refers to variables that are not in its argument list, these variable bindings are obtained from the surrounding program text. For example, in the code

 (bind (((x <integer>) 1))
   (bind (((f <function>) (method ((u <number>)) (+ u x))))
     (bind (((x <integer>) 2))
       (f 0))))

the value of x in the function named f is 1, because that is the value of x where the method form is evaluated. When f is applied in (f 0), there is a different value of x, but that is not the one referred to by the name x in f. Thus the above expression yields 1 as a value.
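
The same rule is what makes it useful for a function to yield another function as its result. The following is a sketch only: make-adder and add-three are hypothetical names, and the example assumes that the inner method keeps the binding of k that was in effect where it was created, as the rule above describes.

 (define (make-adder <function>)
   (method ((k <number>))
     (method ((u <number>)) (+ u k))))

 (define (add-three <function>) (make-adder 3))

 (add-three 4)

Under that assumption the last expression yields 7, because the function returned by (make-adder 3) refers to the k that was bound to 3 when the inner method form was evaluated.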

What does the following expression evaluate to?

 (bind (((x <integer>) 1))
   (bind (((f <function>) (method ((u <number>)) (+ u x))))
     (bind (((x <integer>) 2))
       (f x))))

Note that when bind is used to bind a name to a function, that function cannot be recursive (it cannot refer to itself in the body of the function). This is because the names in a bind are only in effect in following bindings and in the body of the bind (recall the rule for bind from above).

There is also a form of bind specifically for binding functions. This form handles the issue, mentioned above, that recursive functions cannot be named with bind because the names are only defined in the following bindings and in the body. The bind-methods form is used to define local methods.

The syntax of bind-methods is,

 (bind-methods
   ((name ((param type) ...) expr ...)
    (name ((param type) ...) expr ...)
     ...)
   body ... body)

All of the names exist when the expr are evaluated, but the names do not have values until the body of the bind-methods. This allows functions to be defined that refer to themselves (or to other functions defined in the bind-methods) because the names do not need values until the functions are used. For example, the expression

 (bind-methods
   ((bar ((x <integer>)) (if (= x 0) (foo x) (bar (- x 1))))
    (foo ((x <integer>)) x))
   (bar 10))

has the value 0, because bar is called recursively until x becomes 0, and then foo is called with x = 0 and this value is returned by foo. The bind-methods form allows bar to call itself, and to call foo, whereas in a standard bind the names would not be defined until later bindings or the body.
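
The same form handles functions that call each other. A minimal sketch of mutual recursion (is-even and is-odd are made-up names, and the argument is assumed to be a non-negative integer):

 (bind-methods
   ((is-even ((n <integer>)) (if (= n 0) #t (is-odd (- n 1))))
    (is-odd ((n <integer>)) (if (= n 0) #f (is-even (- n 1)))))
   (is-even 10))

This has the value #t. Each call passes n - 1 to the other function until n reaches 0, at which point is-even answers #t and is-odd answers #f. Neither function could be written with a standard bind, since is-even refers to is-odd before is-odd's binding appears.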