(Note: these notes are a little out of sync with what I actually presented in class.)

Today: data structures and more on higher-order procedures

Thus far we've seen:

  * integers, numbers, etc.
  * functions (methods)

But we haven't seen any data structures (e.g., records or other kinds of objects that can contain data). Today I want to convince you that, at least in principle, we don't need anything besides (higher-order) functions. Instead, we can encode data structures as functions.

Consider the most primitive data structure -- the pair. Intuitively, a pair holds two objects. We need a way to create pairs and a way to access their components.

  (make-pair V1 V2)
     * should somehow implement a pair of values V1 and V2

  (first (make-pair V1 V2))
     * should return the value V1

  (second (make-pair V1 V2))
     * should return the value V2

We can express these requirements as axioms (i.e., equations), with the intended meaning that the left-hand side of each equation should evaluate to the same thing as the right-hand side.

  (first (make-pair V1 V2)) = V1
  (second (make-pair V1 V2)) = V2

Dylan provides a built-in mechanism for building pairs, but I want to show you that we don't really need it -- we can use higher-order functions to define make-pair, first, and second.

First we define make-pair. The intuition is that we'll represent a pair (V1,V2) as a function f which, when given a selector function g, calls the selector, passing it both V1 and V2. The selector is responsible for returning either V1 or V2.

  (define (make-pair )
    (method ((x ) (y ))
      (method ((selector ))
        (selector x y))))

Notice that by the substitution model:

  (make-pair 3 4) -> {proc ((selector )) (selector 3 4)}

Now we define first. Here, the idea is that we take in a pair p (which is a function like the proc above). This function expects us to call it with a selector function. In turn, the selector function will be called with the two values in the "pair" (hidden within the proc p).
In the case of first, we want to return the first component. So, we pass to the pair a selector function which takes the two values and returns the first one:

  (define (first )
    (method ((p ))
      (p (method ((x ) (y )) x))))
         ^^^^^^^^^^^^^^^^^^^^^^
         This is the selector function we pass to p

In the case of second, we want to return the second component. So, we pass to the pair a selector function which takes the two values and returns the second one:

  (define (second )
    (method ((p ))
      (p (method ((x ) (y )) y))))

Notice that the only difference between first and second is the selector function we pass in to the pair: one returns x, the other returns y.

Now consider the evaluation of (first (make-pair 3 4)), where I'm leaving off the type information to make things a bit easier to write down:

     (first (make-pair 3 4))

substitute the definition of first to get:

  -> ((method (p) (p (method (x y) x))) (make-pair 3 4))

evaluate the method to a value (i.e., a proc):

  -> ({proc (p) (p (method (x y) x))} (make-pair 3 4))

now evaluate (make-pair 3 4) to a value -- we know what this yields from above:

  -> ({proc (p) (p (method (x y) x))} {proc (selector) (selector 3 4)})

now do an application, substituting the result of (make-pair 3 4) for p within the body of the first function:

     [{proc (p) (p (method (x y) x))} {proc (selector) (selector 3 4)}]
  -> ({proc (selector) (selector 3 4)} (method (x y) x))

the proc is already a value, but we must evaluate the method to a value:

  -> ({proc (selector) (selector 3 4)} {proc (x y) x})

another application, substituting {proc (x y) x} for selector:

     [{proc (selector) (selector 3 4)} {proc (x y) x}]
  -> ({proc (x y) x} 3 4)

and 3 and 4 are already values (though I've left off the braces), so we do yet another application, substituting 3 for x and 4 for y:

     [{proc (x y) x} 3 4]
  -> 3

So, as we hoped, (first (make-pair 3 4)) returns 3. And it's easy to see that (second (make-pair 3 4)) would return 4, since the only difference is in the selector function, which returns y instead of x.
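The same encoding can be tried out in a language that runs today. What follows is a minimal sketch in Python rather than Dylan, assuming Python closures as a stand-in for Dylan methods. The names make_pair, first, and second simply mirror the definitions above, and the two prints check the axioms for one sample pair.

```python
def make_pair(x, y):
    """Represent the pair (x, y) as a function that awaits a selector."""
    return lambda selector: selector(x, y)


def first(p):
    """Hand the pair a selector that keeps the first value."""
    return p(lambda x, y: x)


def second(p):
    """Hand the pair a selector that keeps the second value."""
    return p(lambda x, y: y)


p = make_pair(3, 4)
print(first(p))   # 3
print(second(p))  # 4
```

Note that no record, tuple, or object is created anywhere: the two values live only in the closure returned by make_pair.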
In fact, we can show that for any two values V1 and V2, the axioms above regarding make-pair, first, and second are satisfied by this set of definitions.

As I said earlier, Dylan provides built-in mechanisms for building and destructuring pairs:

  (pair V1 V2) -- creates a pair of the values (like our make-pair)
  (head p)     -- returns the first component of a pair (like our first)
  (tail p)     -- returns the second component of a pair (like our second)

Intuitively, (pair V1 V2) returns a pointer to a "pair of boxes":

     +---+---+
  -->| * | * |
     +-|-+-|-+
       v   v
       V1  V2

We can string boxes together as follows:

     +---+---+     +---+---+     +---+---+     +---+---+
  -->| * | *------>| * | *------>| * | *------>| * | *--->'()
     +-|-+---+     +-|-+---+     +-|-+---+     +-|-+---+
       v             v             v             v
       V1            V2            V3            V4

in order to form a list. Note that the list is terminated by a pair whose second component is '() (null). We'll talk more about lists in sections.

Because we can build lists out of pairs, and those lists can point to other pairs, we can build almost any data structure we like (e.g., lists, trees, DAGs, etc.) with just pairs.

Historically, pair, head, and tail were called:

  pair --> cons
  head --> car
  tail --> cdr

In the early development of Lisp on old IBM machines (I think the 704), memory words were divided into two pieces, the address register and the decrement register:

  +------------+------------+
  |  address   | decrement  |
  |  register  |  register  |
  +------------+------------+

sort of like our boxes above for pairs. Hence, it was convenient to represent a pair using a single machine word. The machine provided two instructions:

  car: contents of address register
  cdr: contents of decrement register

Cons stood for "construct". The Lisp designers adopted this terminology in their language. Of course, the machine (and its instructions) are long gone now, so I prefer to use the more intuitive (and portable) names of pair, head, and tail. You can use either set.
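To tie the box-and-pointer picture of lists back to the function encoding of pairs, here is a small sketch, again in Python rather than Dylan. It assumes None stands in for '(). The helpers make_list and to_python_list are hypothetical names introduced only for illustration, and head and tail are defined here with selectors rather than taken from any built-in.

```python
def make_pair(x, y):
    """A pair is a function that passes both values to a selector."""
    return lambda selector: selector(x, y)


def head(p):
    return p(lambda x, y: x)


def tail(p):
    return p(lambda x, y: y)


EMPTY = None  # assumption: plays the role of '() as the list terminator


def make_list(*values):
    """Chain pairs so that each tail points at the rest of the list."""
    result = EMPTY
    for v in reversed(values):
        result = make_pair(v, result)
    return result


def to_python_list(lst):
    """Walk the chain of pairs, collecting each head until EMPTY."""
    out = []
    while lst is not EMPTY:
        out.append(head(lst))
        lst = tail(lst)
    return out


lst = make_list("V1", "V2", "V3", "V4")
print(to_python_list(lst))  # ['V1', 'V2', 'V3', 'V4']
```

Each call to make_pair corresponds to one pair of boxes in the diagram: the head box holds a value and the tail box holds the rest of the chain.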