Lecture 19:
Streams with Side Effects

The streams that we have discussed up to now all suffer from the problem that they do not have "memory" - they do not remember that they "gave out" certain values.

Before we illustrate this point, let us repeat the stream definition that we have used previously:

exception Empty 

datatype 'a stream = Null | Cons of 'a * (unit -> 'a stream)

Let us now consider a function that generates an infinite stream by circularly reusing the values from a regular list:

fun circular(l: 'a list): 'a stream =
  case l of 
    []   => Null
  | h::t => Cons(h, fn () => circular (t @ [h]))

Here is an example that illustrates how circular works:

- val sc = circular ["one", "two", "three", "four"];
val sc = Cons ("one",fn) : string stream

We have previously defined function takeN, which takes a stream s and an integer parameter n, and returns the first n values (if available) from s:

fun takeN(s: 'a stream, n: int): 'a list =
  case (s, n) of
    (_, 0) => []
  | (Null, _) => raise Empty
  | (Cons(h, t), n) => h :: (takeN(t(), n-1))

Now, what happens if we take five values from stream sc twice?

- takeN(sc, 5);
val it = ["one","two","three","four","one"] : string list
- takeN(sc, 5);
val it = ["one","two","three","four","one"] : string list

The stream does not "remember" that it already gave out the first values; when function takeN uses it again, it produces exactly the same sequence of values as the first time. It is easy to see that this is a characteristic of all the streams that we have written up to now, not only of circular streams.
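To check that this behavior is not specific to circular, here is a minimal sketch using a purely functional stream of integers. The helper natsFrom is introduced only for this illustration; the stream type and takeN are the ones defined above.

```sml
exception Empty

datatype 'a stream = Null | Cons of 'a * (unit -> 'a stream)

(* The original takeN: returns the first n values of s. *)
fun takeN(s: 'a stream, n: int): 'a list =
  case (s, n) of
    (_, 0) => []
  | (Null, _) => raise Empty
  | (Cons(h, t), n) => h :: (takeN(t(), n-1))

(* natsFrom is a hypothetical helper (not part of the lecture):
   the purely functional stream k, k+1, k+2, ... *)
fun natsFrom(k: int): int stream =
  Cons(k, fn () => natsFrom(k + 1))

val ns = natsFrom 0
val first  = takeN(ns, 3)   (* [0,1,2] *)
val second = takeN(ns, 3)   (* [0,1,2] again: ns itself never changes *)
```

Since ns is an immutable value, every traversal starts from the same head and rebuilds the same tail.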

It might be that this is the semantics that we are interested in, but in many cases we will want to have streams that never give out the value in the ith position of the stream twice. Note that this property does not preclude the same value from being produced several times, but this is only possible if the respective value appears several times in the stream.

One solution to our problem is to rewrite function takeN so that it returns not only the list of values it extracted from the stream, but also the "leftover" stream. Here is one such possible rewrite:

fun takeN(s: 'a stream, n: int): 'a list * 'a stream =
  case (s, n) of
    (_, 0) => ([], s)
  | (Null, _) => raise Empty
  | (Cons(h, t), n) => let
                         val (l, s') = takeN(t(), n-1)
                       in
                         (h::l, s')
                       end

Now we can recover the stream after takeN has removed all the needed elements from it. We have to be careful, however, to retain the returned stream, as below:

- val (l2, sc2) = takeN(sc, 5);
val l2 = ["one","two","three","four","one"] : string list
val sc2 = Cons ("two",fn) : string stream
- val (l3, sc3) = takeN(sc2, 5);
val l3 = ["two","three","four","one","two"] : string list
val sc3 = Cons ("three",fn) : string stream
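Threading the leftover stream by hand quickly becomes tedious. As a sketch of how the bookkeeping could be packaged, the function chunks below (a hypothetical name, not part of the lecture) extracts k consecutive lists of n values each and passes the leftover stream from one call of takeN to the next. The other definitions are repeated from above so that the fragment can be run on its own.

```sml
exception Empty

datatype 'a stream = Null | Cons of 'a * (unit -> 'a stream)

fun circular(l: 'a list): 'a stream =
  case l of
    []   => Null
  | h::t => Cons(h, fn () => circular (t @ [h]))

(* The version of takeN that also returns the leftover stream. *)
fun takeN(s: 'a stream, n: int): 'a list * 'a stream =
  case (s, n) of
    (_, 0) => ([], s)
  | (Null, _) => raise Empty
  | (Cons(h, t), n) => let
                         val (l, s') = takeN(t(), n-1)
                       in
                         (h::l, s')
                       end

(* chunks is hypothetical: it returns k lists of n values each,
   feeding the leftover stream s' into the next round. *)
fun chunks(s: 'a stream, n: int, k: int): 'a list list =
  if k = 0
  then []
  else let
         val (l, s') = takeN(s, n)
       in
         l :: chunks(s', n, k - 1)
       end

val cs = chunks(circular ["one", "two", "three", "four"], 5, 2)
(* cs = [["one","two","three","four","one"],
         ["two","three","four","one","two"]] *)
```

The two lists in cs are the same as l2 and l3 in the session above.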

But what if we want to implement streams and takeN so that it is the stream's "duty" to remember the values that it gave out? We cannot do this in a purely functional context, but side effects will help us out. Here is one possible solution for the stream of natural numbers:

local
  val count: int ref = ref ~1;
in
  fun nats(): int stream =
    (count := (!count) + 1; Cons(!count, nats))
end

The only role of the local declaration is to restrict the scope of count's declaration to the part of the program between the in and end keywords of the respective declaration. Such restriction of visibility is useful to prevent other parts of the program from accessing count and examining or modifying its value. The local declaration allows us to achieve an effect similar to that of declaring a private variable in a C++/C#/Java class.
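The visibility rule can be seen in a smaller setting. The sketch below is an illustration only; the names hits, hit and resetHits are hypothetical. The two functions share a hidden reference cell that no other part of the program can name.

```sml
local
  (* hits is visible only to the declarations between in and end. *)
  val hits: int ref = ref 0
in
  fun hit(): int = (hits := !hits + 1; !hits)
  fun resetHits(): unit = hits := 0
end

val a = hit()         (* 1 *)
val b = hit()         (* 2 *)
val _ = resetHits()
val c = hit()         (* 1: the cell was reset through resetHits *)

(* Mentioning hits at this point (for example !hits) would be
   rejected by the compiler as an unbound variable. *)
```

The only way to read or change the cell from outside is through hit and resetHits, much like public methods that guard a private field.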

Consider now the following example, which uses the original version of takeN (the version that returns the list of values only):

- takeN(nats(), 5);
val it = [0,1,2,3,4] : int list
- takeN(nats(), 5);
val it = [6,7,8,9,10] : int list

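It appears that we have achieved our goal - the stream has memory and it will not give out the same value twice. (The value 5 is missing from the second list because takeN evaluates t() once more before it reaches the n = 0 case, and that extra call to nats increments count.) We now employ the same idea to redefine circular: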

local
  val lst: int list ref = ref []
in
  fun setStream(l: int list) = lst := l
  fun getList() = lst
  fun circular(): int stream =
    case !lst of
      [] => Null
    | h::t => (lst := (t @ [h]);
               Cons(h, circular))
end

Our stream cannot be used directly by calling circular; we must first set up the list of values that will be used to produce the stream. Once we perform this initialization step, we can then use circular:

- setStream [9, 8, 7, 6, 5, 4, 3, 2];
val it = () : unit
- val sc = circular();
val sc = Cons (9,fn) : int stream
- takeN(sc, 3);
val it = [9,8,7] : int list
- takeN(sc, 3);
val it = [9,5,4] : int list

The first call to takeN seems to work well, but the second one returns incorrect results; we would have expected to get [6, 5, 4]. What is going on?

After some further experimentation we conclude that every call to takeN will return 9 in the first position of the list. With the help of the getList function we can examine the value of lst; for example, this value is ref [5,4,3,2,9,8,7,6] after the first call to takeN. Despite the fact that 5 is the first in the list, the second call to takeN will still return 9 in the first position.

If we examine the definition of circular trying to explain what is happening, we notice that h and t are evaluated at the time of the function call. These two values are determined at the time when circular is called (i.e. when sc is defined), and not at the time takeN is called. This, in turn, implies that the value of h that holds at the time circular is called will be stuck in the first position of the resulting stream. Subsequent calls to takeN will always find this value (in our case 9) in the first position. Our stream has memory, but its memory is flawed.
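The difference between a value computed at the time of the call and a value computed on demand can be isolated in a few lines. This is a minimal sketch; the names cell, frozen and delayed are hypothetical and do not appear in the stream code.

```sml
val cell: int ref = ref 9

(* Evaluated now: frozen is bound to the int 9, not to the cell. *)
val frozen: int = !cell

(* Evaluated later: the body runs each time delayed is called. *)
val delayed: unit -> int = fn () => !cell

val _ = cell := 5

val seenByFrozen  = frozen      (* still 9 *)
val seenByDelayed = delayed()   (* 5 *)
```

In circular, the head h plays the role of frozen: it is read from lst once, when the Cons cell is built, and later updates to lst cannot change it.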

If you have now become suspicious of the stream of natural numbers, you are right:

- val naturals = nats();
val naturals = Cons (0,fn) : int stream
- takeN(naturals, 5);
val it = [0,1,2,3,4] : int list
- takeN(naturals, 5);
val it = [0,6,7,8,9] : int list

The stream of natural numbers also has flawed memory! The problem is caused, again, by the fact that the call to nats "freezes" 0 as the first value of the stream.

So what can we do? One solution is to delay the evaluation of the head of the stream until the value is actually needed. Our original streams were already "lazy" in that they did not evaluate their tail fully (as we noted before, this evaluation would never finish for an infinite stream). Our new, upgraded streams will be even lazier; they will never evaluate a value unless it is needed.

Super-Lazy Streams

We now proceed with the new definition of streams:

datatype 'a stream = Null | Cons of (unit -> 'a) * (unit -> 'a stream)

We need a slightly modified definition of takeN:

fun takeN(s: 'a stream, n: int): 'a list =
  case (s, n) of
    (_, 0) => []
  | (Null, _) => raise Empty
  | (Cons(h, t), n) => (h()) :: (takeN(t(), n-1))

Let us now define the stream of naturals:

local
  val count: int ref = ref ~1;
in
  fun nats(): int stream =
    Cons(fn () => (count := (!count) + 1; !count), nats)
end

Does this change solve our problem?

- val naturals = nats();
val naturals = Cons (fn,fn) : int stream
- takeN(naturals, 5);
val it = [0,1,2,3,4] : int list
- takeN(naturals, 5);
val it = [5,6,7,8,9] : int list

Apparently, yes. What about circular?

local
  val lst: int list ref = ref []
in
  fun setStream(l: int list) = lst := l;
  fun circular(): int stream =
    if null (!lst)
    then Null
    else Cons(fn () => case !lst of
                           [] => raise Fail "internal error"
                         | h::t => (lst := t @ [h]; h),
              fn () => circular())
end

We can now test our new implementation:

- setStream [9, 8, 7, 6, 5, 4, 3, 2];
val it = () : unit
- val sc = circular();
val sc = Cons (fn,fn) : int stream
- takeN(sc, 3);
val it = [9,8,7] : int list
- takeN(sc, 3);
val it = [6,5,4] : int list

Apparently, circular streams work as well.

There is, however, yet one more subtle problem with our new streams. Consider the following example:

- val n1 = nats();
val n1 = Cons (fn,fn) : int stream
- val n2 = nats();
val n2 = Cons (fn,fn) : int stream
-
- takeN(n1, 4);
val it = [0,1,2,3] : int list
- takeN(n2, 4);
val it = [4,5,6,7] : int list
- takeN(n1, 4);
val it = [8,9,10,11] : int list

In the example above we have defined two streams of integers by calling nats twice. The two streams are not independent, however; they are entangled: each natural number will be produced exactly once, either in one stream or in the other. This implies that stream n1 will have gaps in the sequence of its values that depend on the values that have been extracted from stream n2 (and vice versa). It is possible that on occasion we will find such semantics useful. In most cases, however, it is likely that we want to define independent streams whose sequences of values do not depend on each other.

But how can we achieve this? The entanglement of streams is ultimately due to the fact that they use the same variable to store information related to their state. If we could create a situation in which each stream had its own independent "memory," then our streams would become independent.

Here is one possible solution for the stream of natural numbers:

fun nats(): int stream = 
  let
    val count: int ref = ref ~1;
    fun helper(): int stream =
      Cons(fn () => (count := (!count) + 1; !count), helper)
  in
    helper()
  end

By defining function nats we created a "shell" around helper. The main role of this "shell" is to allow for the definition of count inside the body of nats. Each function call to nats will generate an instance of count, which will serve as local memory for the respective stream of integers. The details of this process will be understood better when we discuss the environment model.

- val n1 = nats();
val n1 = Cons (fn,fn) : int stream
- val n2 = nats();
val n2 = Cons (fn,fn) : int stream
-
- takeN(n1, 4);
val it = [0,1,2,3] : int list
- takeN(n2, 4);
val it = [0,1,2,3] : int list
- takeN(n1, 4);
val it = [4,5,6,7] : int list
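The mechanism that gives each stream its own count can be seen in isolation, without streams. In the sketch below, makeCounter is a hypothetical function written only for this illustration; each call allocates a fresh reference cell and returns a function that is the only way to reach that cell.

```sml
fun makeCounter(): unit -> int =
  let
    (* A new cell is allocated on every call to makeCounter. *)
    val count: int ref = ref ~1
  in
    fn () => (count := !count + 1; !count)
  end

val c1 = makeCounter()
val c2 = makeCounter()

val x = c1()   (* 0 *)
val y = c1()   (* 1 *)
val z = c2()   (* 0: c2 has its own cell, untouched by c1 *)
```

The function helper inside nats plays the same role as the anonymous function returned here: it refers to the count that was created by the particular call to nats that produced it.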

Can we fix circular? Yes, we can:

fun circular(l: int list): int stream = 
  let
    val lst: int list ref = ref l
    fun helper(): int stream = 
      if null (!lst)
      then Null
      else Cons(fn () => case !lst of
                           [] => raise Fail "internal error"
                         | h::t => (lst := t @ [h]; h),
                fn () => helper())
  in
    helper()
  end

And here is a demonstration that the circular streams we are generating are independent:

- val sc = circular [9, 8, 7, 6, 5, 4, 3, 2];
val sc = Cons (fn,fn) : int stream
- takeN(sc, 3);
val it = [9,8,7] : int list
- val sc2 = circular [9, 8, 7, 6, 5, 4, 3, 2];
val sc2 = Cons (fn,fn) : int stream
- takeN(sc2, 3);
val it = [9,8,7] : int list
- takeN(sc, 3);
val it = [6,5,4] : int list
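The final versions of nats and circular share the same shape: a private reference cell plus a helper that updates it each time a head is requested. As a sketch only, this shape could be factored into a single function. The names generate and rotate below are hypothetical and not part of the lecture; the stream type and takeN are the super-lazy versions defined above.

```sml
exception Empty

datatype 'a stream = Null | Cons of (unit -> 'a) * (unit -> 'a stream)

fun takeN(s: 'a stream, n: int): 'a list =
  case (s, n) of
    (_, 0) => []
  | (Null, _) => raise Empty
  | (Cons(h, t), n) => (h()) :: (takeN(t(), n-1))

(* generate is hypothetical: it builds an infinite stream with private
   state. step maps the current state to (value to emit, next state). *)
fun generate(init: 's, step: 's -> 'a * 's): 'a stream =
  let
    val state: 's ref = ref init
    fun helper(): 'a stream =
      Cons(fn () => let
                      val (v, s') = step (!state)
                    in
                      (state := s'; v)
                    end,
           helper)
  in
    helper()
  end

fun nats(): int stream = generate(~1, fn c => (c + 1, c + 1))

(* rotate is hypothetical: emit the head and move it to the back. *)
fun rotate(l: int list): int * int list =
  case l of
    [] => raise Fail "internal error"
  | h::t => (h, t @ [h])

fun circular(l: int list): int stream =
  if null l then Null else generate(l, rotate)

val n1 = nats()
val n2 = nats()
val r1 = takeN(n1, 4)   (* [0,1,2,3] *)
val r2 = takeN(n2, 4)   (* [0,1,2,3] *)
val r3 = takeN(n1, 4)   (* [4,5,6,7] *)
```

Each call to generate allocates its own state cell, so streams built this way are independent for the same reason as before.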