Recitation 8: Data structures, folding and currying

Over the past few weeks, we have seen many examples of data structures such as lists, dictionaries, stacks and queues which keep an arbitrary number of values in some kind of order. For all of those structures, we have (or can) define functions 'map' and 'app' to walk over the structures. For example, we have seen the functions:

List.map : ('a -> 'b) -> 'a list -> 'b list
Stack.map : ('a -> 'b) -> 'a stack -> 'b stack
Queue.map : ('a -> 'b) -> 'a queue -> 'b queue

These functions all apply a function (called a transformation function) of type 'a -> 'b over the list/stack/queue elements. For lists, for example, map f [x1,...,xn] produces a list [f (x1), ..., f (xn)]. The class of functions 'app' apply a function of type 'a -> unit for its side effects to all the elements of the data structure, in some specific order.

It turns out that such data structures often support a much more general operation called ``folding'', from which many higher-order functions can be derived automatically, including map and app, but also functions such as filter.

Folding is actually supported through a pair of operations, foldl and foldr (which stand for fold-left and fold-right). They all have type:

(('a * 'b) -> 'b) -> 'b -> 'a list -> 'b (for lists)
(('a * 'b) -> 'b) -> 'b -> 'a stack -> 'b (for stacks)
(('a * 'b) -> 'b) -> 'b -> 'a queue -> 'b (for queues)

We only discuss folding for lists in this recitation, but it should be clear that whatever we say for lists can be said for stacks, queues, or any other structure which supports folding.

Intuitively, folding applies a given operation to the elements of a list. Given a function f of type ('a * 'b) -> 'b, the expression

foldr f b [x1,x2,...,xn]

evaluates to

f (x1,f (x2,f (...,f(xn,b))))

and

foldl f b [x1,x2,...,xn]

evaluates to

f (xn,f (..., f (x2,f (x1,b))))

Here are the implementations: [NOTE TO INSTRUCTOR: it may be worthwhile to point out the differences between foldl and foldr in the code, as it relates to the above]]

fun foldl (f: ('a * 'b) -> 'b) (base : 'b) (l: 'a list) : 'b =
  case l of
    [] => b
  | x::xs => foldl f (f (x,base)) xs

fun foldr (f: ('a * 'b) -> 'b) (base : 'b) (l: 'a list): 'b =
  case l of
    [] => b
  | x::xs => f (x,foldr f base xs)

A lot of natural operations on lists can be implemented using folding. For example, getting the length of a list can be written as

fun length (l:'a list):int =
  foldl (fn (x:'a,len:int) => 1+len) 0 l

Similarly, computing the sum of a list of integers can be written

fun sum (l:int list):int =
  foldl (fn (x:int,sum:int) => x+sum) 0 l

Actually, in the above, both a foldl and a foldr can be used, since the operation + is commutative (!). We can implement map using folding (notice the foldr!):

fun map (f:'a -> 'b) (l:'a list) :'b list =
  foldr (fn (x:'a,r:'b list) => f (x)::r) [] l

and app as well:

fun app (f:'a -> unit) (l:'a list): unit =
  foldl (fn (x:'a,_:unit) => f (x)) () l

Filtering a list, that is keeping all the elements of a list for which a given predicate function returns true, can also be easily
implemented:

fun filter (p:'a -> bool) (l:'a list):'a list =
  foldr (fn (x:'a,r:'a list) => if p (x) then x::r else r) [] l

Even reversing a list can be implemented using a simple fold:

fun rev (l:'a list):'a list =
  foldl (fn (x:'a,xs:'a list) => x::xs) [] l

(which can be simply written as 'foldl (op ::) [] l' )

Folding can be used to derive most higher-order functions that are often used. For many data structures, it is sufficient to provide folding functions over the structure to be able to derive functions such as map, app, filter, and others. In a precise sense, folding functions act as ``iterators'' over the elements of the data structure.

In a world where one programs with higher-order functions and folding, curried functions become more and more important. Consider the function Int.fmt, of type StringCvt.radix -> int -> string, which converts an integer to a string in the given radix representation (decimal, octal, binary, hexadecimal). Suppose we have a non-curried version of the function:

fun fmtNC (r:StringCvt.radix,i:int):string = Int.fmt r i

Suppose we wanted to convert a list L of integers to binary with such a function. Then we would use a map function as follows:

map (fn (x:int) => fmtNC (StringCvt.BIN,x)) L

whereas using the curried form, we get the more satisfying:

map (Int.fmt StringCvt.BIN) L

Notice how we do not have to define a new anonymous function in the second example, as it is automatically created by the currying process. As another example, consider the function String.isPrefix, of type string -> string -> bool. The call String.isPrefix s1 s2 returns true if s1 is a prefix of s2, and false otherwise. For the sake of arguments, suppose we have a non-curried version:

fun isPrefixNC (s1:string,s2:string):bool = String.isPrefix s1 s2

Suppose we have a list L of URLs from which we want to keep only the ftp sites (starting with ftp://), using the filter function above. If we want to use the non-curried function, we have to write:

filter (fn (x:string) => isPrefixNC ("ftp://",x)) L

whereas with the curried function, the form is much simpler:

filter (String.isPrefix "ftp://") L

In summary, using curried functions with higher-order functions often saves us from having to introduce spurious 'fn' to construct functions that simply perform bookkeeping on the arguments.