CS212 Fall 1998
Pairs & Lists


  • Pairs
  • Lists
  • Quotation
  • Procedures on Lists
  • Reasoning about Lists
  • Equality of Lists

  • Pairs

    A pair is an ordered collection of two objects, called the head and the tail of the pair. Dylan has a built-in type <pair> for pairs. A pair with head x and tail y can be created with the operator pair:

       (pair x y)
    

    Pair is a regular function, not a special form. Therefore x and y are evaluated before creating the pair. The head and tail can be accessed using the primitive operators head and tail:

       (head (pair x y)) = x
       (tail (pair x y)) = y
    

    When printing a pair, Dylan prints the head and tail from left to right, enclosed in parentheses and separated by a period:

    ? (pair 4 5)
    ==> (4 . 5)
    

    Lists

    A list is an ordered sequence of zero or more objects. Dylan has a type <list> for lists. A list of any number of elements can be constructed using the operator list. The elements are evaluated before the list is constructed. When printing a list, Dylan prints the elements in order from left to right, enclosed in parentheses and separated by spaces:

    ? (list 1 2 3)
    ==> (1 2 3)
    
    ? (list 5 5 5 5 5)
    ==> (5 5 5 5 5)
    

    There is a special list () called the empty or null list, which is the unique list with no elements. This is just an object in Dylan with a special internal representation that identifies it as the null list. The exact form of the internal representation need not concern us. The null list can be specified by typing '() or (list). In the first form, the quote ' tells Dylan not to try to evaluate (). Quote can be used more generally, and we will discuss it more later.

    The null list has its very own type <empty-list>. You can test whether an object is the null list with the Dylan primitive operator null?:

    ? (null? '())
    ==> #t
    
    ? (null? (list 1 2 3))
    ==> #f
    

    Nonempty lists are built by repeated pairing. Thus lists are built with pair, as opposed to defining an independent data type. This is an historical artifact that dates back to Dylan's grandparent language Lisp, and can make things a bit confusing at first.

    A list with n+1 elements is a pair whose head is the first element of the list and whose tail is the rest of the list, i.e. a list of n elements. The key thing to remember is that the tail of a nonempty list is always a list. The functions head and tail are used to access the first element and the rest of the list, respectively, and the function pair can be used to append a new element to the front of an existing list.

    ? (head (list 1 2 3 4))
    ==> 1
    
    ? (tail (list 1 2 3 4))
    ==> (2 3 4)
    
    ? (head (list 5))
    ==> 5
    
    ? (tail (list 5))
    ==> ()
    
    ? (list (list 1 2) (list 3 4))
    ==> ((1 2) (3 4))
    
    ? (head (list (list 1 2) (list 3 4)))
    ==> (1 2)
    
    ? (tail (list (list 1 2) (list 3 4)))
    ==> ((3 4))
    
    ? (pair 5 (list 1 2 3 4))
    ==> (5 1 2 3 4)
    
    ? (pair 5 '())
    ==> (5)
    
    ? (pair 1 (pair 2 (pair 3 '())))
    ==> (1 2 3)
    

    The type <list> is simply the union of the types <pair> and <empty-list>.

    Note that with this representation of lists, head gives an element of the list, but tail gives a sublist. This asymmetry is often a source of great confusion. Just remember that pair is not symmetric when used with lists. The first argument is an element that is being appended to the front of the list given in the second argument.

    Also, this may be obvious, but it is still worth saying: a single element is not the same thing as the list containing that element.

    ? 5
    ==> 5
    
    ? (list 5)
    ==> (5)
    
    ? (list (list 5))
    ==> ((5))
    

    The expression (pair 1 (pair 2 3)) evaluates to something that is known as a ``dotted tail''.

    ? (pair 1 (pair 2 3))
    ==> (1 2 . 3)
    

    This is not really a list, because 3 is not a list, thus (pair 2 3) is not a list, thus the entire expression is not a list. Remember: the tail of a nonempty list is a list. Although this expression is not a list, it is an instance of the Dylan type <list>, since it is a <pair>, and all<pair>'s are <list>'s.

    Here is an example to help you understand the difference between pair (append an element to the front of a list) and list (construct a list out of zero or more elements).

    ? (pair 1 (list 2 3))
    ==> (1 2 3)
    
    ? (list 1 (list 2 3))
    ==> (1 (2 3))
    

    The first expression appends the element 1 to the beginning of the list (2 3), yielding a list of three elements. The second expression makes a list of two elements, the first 1, and the second the list (2 3).

    Quotation

    In constructing list structures, the issue arises of how to specify symbols rather than their values. For example, we have no way of producing the list that prints as (+ * -). The expression

       (list + * -)
    

    will produce a list with the values of these symbols, which are procedure objects, not the symbols themselves. This is because list is not a special form, so the standard evaluation rule applies: the subexpressions are evaluated first. One possible approach would be to have a special form that builds a list without evaluating its arguments, but instead we take a more general approach.

    The underlying issue is how to distinguish a value from the thing that denotes that value. This is an issue in natural language as well as in programming languages. For example, consider the two sentences

    The first sentence is a request to say your name (the value denoted by the phrase ``your name''). The second sentence is a request to say ``your name'' (the actual phrase, rather than the value it denotes). Dylan has a notion of quotation, which allows us to refer to literal expressions without evaluating them. A single quotation mark ' before a Dylan expression means the literal (unevaluated) expression rather than the value it denotes. We saw the use of this above with the null list. We can also use the special form quote.

    For example, consider the following sequence of expressions and their values:

    ? +
    ==> {Method : + ()}
    
    ? '+
    ==> +
    
    ? (quote +)
    ==> +
    
    ? (define (x <number>) 3)
    
    ? x
    ==> 3
    
    ? 'x
    ==> x
    
    ? (list + x x)
    ==> ({the generic function + (n1::<number>,n2::<number>)} 3 3)
    
    ? (list '+ 'x x)
    ==> (+ x 3)
    
    ?  '(+ x x)
    ==> (+ x x)
    
    ? (+ x x)
    ==> 6

    In each case, the quotation mark takes the entire following expression (either a symbol or a compound expression between parentheses) and produces the value that prints the same way as the expression following the quote (the literal value). This contrasts sharply with what would normally happen if the expression were evaluated.

    Procedures on Lists

    There are many useful procedures for manipulating lists. The function length counts the number of elements in a list:

    (define (length <function>)
      (method ((l <list>))
        (if (null? l)
            0
            (inc (length (tail l))))))
    
    ? (length '(1 2 3))
    ==> 3
    
    ? (length '(1 (2 3) 4))
    ==> 3
    
    ? (length '())
    ==> 0
    

    Another very useful function is map, which applies a given function to every element of a list, and returns a list of the values:

    (define (map <function>)
      (method ((f <function>) (l <list>))
        (if (null? l)
            '()
            (pair (f (head l))
                  (map f (tail l))))))
    
    ? (map head '((1 2) (3 4) (5)))
    ==> (1 3 5)
    
    ? (map tail '((1 2) (3 4) (5)))
    ==> ((2) (4) ())
    
    ? (map length '((1 2) (3 4) (5)))
    ==> (2 2 1)
    

    The functions first, second, and third pick out the first, second, and third elements of a list, respectively; first is a synonym for head, and second and third can be defined by

    (define (second <function>) (compose first tail))
    (define (third <function>) (compose second tail))
    

    The function append can be used to stick two lists together end-to-end:

    (define (append <function>)
      (method ((l1 <list>) (l2 <list>))
        (if (null? l1)
            l2
            (pair (head l1) (append (tail l1) l2)))))
    
    ? (append '(1 2 3) '(3 4))
    ==> (1 2 3 3 4)
    

    Reasoning about Lists

    Having defined lists and some simple procedures for manipulating them, we now turn to some tools for reasoning about them. Consider the following simple procedure for copying a list:

    (define (copy <function>)
      (method ((l <list>))
        (if (null? l)
            '()
            (pair (head l) (copy (tail l))))))
    

    How would we go about showing that for all lists l, (copy l) = l? Clearly this is a job for induction and the substitution model, but the inductions that we have done so far have all been over the natural numbers. We can either turn this into an induction on the natural numbers (i.e., induction on the length of the list, which is a natural number), or induction on the list structures themselves. We will go over the first method here and discuss the second method in a later handout on structural induction.

    Let's write h.l for the pair with head h and tail l, i.e. the result of evaluating (pair h l), and () for the empty list, i.e. the result of evaluating '().

    We wish to show that for all lists l of length n, (copy l) = l. We prove this by induction on n.

    Basis n = 0.
    In this case l = (), since this is the only list of length 0. Using the substitution model,
    (copy l)
    ==> ((method ((l <list>))
           (if (null? l) '() (pair (head l) (copy (tail l))))) l)
    ==> (if (null? l) '() (pair (head l) (copy (tail l))))
    ==> ()
    
    Induction step n > 0.
    Assume that for any list l of length n, (copy l) = l. We wish to show the same for any list l1 of length n+1. Let h and l2 be the head and tail of l1, respectively. Then l1 = h.l2, and l2 is of length n. By the induction hypothesis, (copy l2) = l2.

    We know that (null? l1) = #f because the length of l1 is m+1, which is greater than zero. Using the substitution model,

    (copy l1)
    ==> ((method ((l <list>))
           (if (null? l) '() (pair (head l) (copy (tail l))))) l1)
    ==> (if (null? l1) '() (pair (head l1) (copy (tail l1))))
    ==> (pair (head l1) (copy (tail l1)))
    ==> (pair h (copy l2))
    ==> (pair h l2)   <-- by the induction hypothesis
    ==> h.l2
    = l1
    

    Equality of Lists

    Finally, we turn briefly to the issues surrounding the equality of list structures. There are two notions of equality: a stronger notion tested by the Dylan procedure id?, which tests whether two structures are identical (i.e., point to the same object in memory); and a weaker notion tested by the Dylan procedure =, which tests whether two structures "look alike" (i.e., print the same way).

    For lists, the primitive = could be defined in terms of id? as follows:

    (define (= <function>)
      (method ((l1 <list>) (l2 <list>))
        (or (id? l1 l2)
            (and (instance? l1 <pair>)
                 (instance? l2 <pair>)
                 (= (head l1) (head l2))
                 (= (tail l1) (tail l2))))))

    If both l1 and l2 are the same symbol, or both are the empty list, then the first clause of the or will be true, therefore the procedure will return #t without evaluating the second clause. Otherwise, if they are both pairs, then we recursively check that their heads and tails are equal. Anything else means that the structures are not the same, so we return #f.

    In the second clause of the or, the test is

    (and (instance? l1 <pair>)
         (instance? l2 <pair>)
         (= (head l1) (head l2))
         (= (tail l1) (tail l2)))
    

    The order of the four tests is significant. The reason for structuring the code this way is that we don't want the recursive calls to occur unless both l1 and l2 are pairs. We are making use of the fact that and evaluates its arguments in the order given and returns #f as soon as any one of them evaluates to #f without evaluating the remaining arguments.

    To help understand = versus id?, consider the following examples:

    ? (define (x <list>) '(a b c))
    
    ? (define (y <list>) x)
    
    ? (define (z <list>) '(a b c))
    
    ? (id? x y)
    ==> #t
    
    ? (id? x z)
    ==> #f
    
    ? (id? y z)
    ==> #f
    
    ? (and (= x y) (= x z) (= y z))
    ==> #t
    

    It is instructive to think about equality of lists produced using the procedure append defined above. Suppose we type

    ? (define (u <list>) (append x y))
    
    ? (define (v <list>) (append x y))
    

    What then would be the result of evaluating (id? u v)? ...(= u v)?


    CS212 home page


    Last Modified: 9/16/98 by dck