Recitation 19: Vector, Array, and Array2; Comparison of environment and substitution models

  1. Environment model
  2. Vectors, Arrays, and Array2
    1. Selected operations

Environment model

This writeup covers only some high-level issues in comparing the substitution and environment models. To avoid excessive overlap with the lecture notes, I have omitted the details of the environment model; a rough set of notes covering those topics is available upon request.

The key difference between the environment model and the substitution model is that the environment model never substitutes a value for a variable, except when the variable expression (e.g., x) is itself evaluated. At that point, the environment, a mapping from variables to values, is consulted for the value currently bound to that variable.

The environment model provides many advantages over the substitution model.

  1. Only one recurrence needs to be defined. With the environment model, you only need to define evaluate(). With the substitution model, you need to define both evaluate() and substitute(). This can become difficult if you add new syntactic constructs to the language, since each new construct needs a case in both functions.
  2. The environment model is closer to how real interpreters work. Implementing an interpreter straight from the mathematical description of the substitution model results in an interpreter that processes some tree nodes multiple times, once for each substitution. This can be very slow. With the environment model, tree nodes are only traversed a single time each time a particular expression is evaluated (nodes corresponding to a function body are visited once per function invocation).
    The environment model fits closely with how the computer works at the low level: a variable is the name of some memory location, which is used to store the value of the variable. Each time you access that variable, the value is derived from the memory location. Substitution does not have a natural correspondence with a low level operation.
  3. The environment model interacts more cleanly with how functions are supposed to work. Remember that function bodies are not supposed to be evaluated until they are applied. The substitution model cheats: substitution goes into the function body at declaration time to fix up the free variables (relative to the formal parameters) of the function, but the evaluator does not enter the body until application time.
  4. As you will see later in the class, it is quite easy to add refs and side effects to the environment model. These changes are difficult to implement in the substitution model.
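
To make the first and third points concrete, here is a minimal sketch of an environment-model evaluator. The expression language (exp), the value type, and the closure representation are all hypothetical choices made for illustration; they are not the interpreter from the lecture notes. The sketch shows that evaluate() is the only recurrence, that a variable is replaced by its value only when the Var node itself is evaluated, and that a function body is left untouched until application.

```sml
(* Hypothetical mini-language for illustration only. *)
datatype exp =
    Num of int
  | Var of string
  | Add of exp * exp
  | Let of string * exp * exp        (* let x = e1 in e2 *)
  | Fn  of string * exp              (* fn x => body *)
  | App of exp * exp

datatype value =
    IntV of int
  | Closure of string * exp * env    (* parameter, body, defining environment *)
withtype env = (string * value) list

exception Unbound of string
exception TypeError

(* Return the most recent binding of x; newer bindings sit at the front. *)
fun lookup (x : string, env : env) : value =
  case List.find (fn (y, _) => y = x) env of
      SOME (_, v) => v
    | NONE => raise Unbound x

(* The only recurrence: there is no substitute() function. *)
fun evaluate (e : exp, env : env) : value =
  case e of
      Num n => IntV n
    | Var x => lookup (x, env)
    | Add (e1, e2) =>
        (case (evaluate (e1, env), evaluate (e2, env)) of
             (IntV a, IntV b) => IntV (a + b)
           | _ => raise TypeError)
    | Let (x, e1, e2) => evaluate (e2, (x, evaluate (e1, env)) :: env)
    | Fn (x, body) => Closure (x, body, env)   (* body is not visited here *)
    | App (f, arg) =>
        (case evaluate (f, env) of
             Closure (x, body, cenv) =>
               evaluate (body, (x, evaluate (arg, env)) :: cenv)
           | _ => raise TypeError)

(* let y = 10 in (fn x => x + y) 5 *)
val result =
  evaluate (Let ("y", Num 10,
                 App (Fn ("x", Add (Var "x", Var "y")), Num 5)), [])
```

Here result is IntV 15. Adding a new syntactic construct to this sketch means adding one constructor and one case in evaluate, with no second function to keep in sync.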

Vectors, Arrays, and Array2

So far, none of the SML collection data structures we have considered provide O(1) access to all elements. O(1) access is essential for efficiently implementing many algorithms. Lists require O(n) time to access arbitrary elements, and binary search trees require O(log n) time.

In other programming languages, such as Java, arrays provide O(1) access to the elements of a collection. SML provides arrays through the Vector, Array, and Array2 basis library structures. Vector and Array each implement one-dimensional arrays and provide very similar interfaces, while Array2 implements two-dimensional arrays. Vector is an immutable, functional data structure, whereas Array and Array2 are mutable.

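Note: Equality on mutable structures such as Array and Array2 is defined in the same way as for refs: two values are = if and only if they are the same array, that is, they were created by the same call to a constructor. Two separately created arrays are never equal, even if their contents match. For instance: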

  val x = Array.fromList([1,2,3,4]);
  val y = Array.fromList([1,2,3,4]);
  val z = x
  val a = x = y (* false *)
  val b = x = z (* true *)

Selected operations

  fromList(list : 'a list) : 'a vector
  fromList(list : 'a list) : 'a array

Creates a Vector or Array from the provided list.

  sub(vec : 'a vector, k : int) : 'a
  sub(arr : 'a array, k : int) : 'a

Returns the kth value in the Vector or Array. Indices start at 0; an out-of-range index raises the Subscript exception.

  update(vec : 'a vector, k : int, newValue : 'a) : 'a vector
  update(arr : 'a array, k : int, newValue : 'a) : unit

Updates the kth element with the new value. Note that Vector.update returns a new Vector, which is identical to the original vector save for the kth element. Array.update mutates the existing array in place, and so returns unit.
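
The difference between the two update functions can be sketched as follows. The binding names are arbitrary, and the sketch assumes a Basis Library version that includes Vector.update.

```sml
val v  = Vector.fromList [10, 20, 30]
val v' = Vector.update (v, 1, 99)      (* builds a new vector; v is unchanged *)
val oldElem = Vector.sub (v, 1)        (* 20 *)
val newElem = Vector.sub (v', 1)       (* 99 *)

val a = Array.fromList [10, 20, 30]
val () = Array.update (a, 1, 99)       (* mutates a in place, returns unit *)
val changed = Array.sub (a, 1)         (* 99 *)

(* Valid indices are 0..2 here; index 3 raises Subscript. *)
val outOfRange = (Array.sub (a, 3); false) handle Subscript => true
```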

Other interesting related structures include VectorSlice and ArraySlice. Slices are derived data structures associated with a particular range of elements in an underlying array or vector, and support a similar set of operations, but on a restricted range.
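
As a small illustration of slices, the sketch below builds an ArraySlice over the middle of an array. The names arr and s are arbitrary. It shows that slice indices are relative to the start of the slice and that updates through a slice are visible in the underlying array.

```sml
val arr = Array.fromList [0, 10, 20, 30, 40, 50]

(* Slice covering array indices 2, 3, 4: start at 2, length 3. *)
val s = ArraySlice.slice (arr, 2, SOME 3)

val len   = ArraySlice.length s            (* 3 *)
val first = ArraySlice.sub (s, 0)          (* 20: slice index 0 is array index 2 *)
val total = ArraySlice.foldl (op +) 0 s    (* 20 + 30 + 40 = 90 *)

(* Updating through the slice mutates the underlying array. *)
val () = ArraySlice.update (s, 0, 21)
val seen = Array.sub (arr, 2)              (* 21 *)
```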

Vectors and arrays support most of the same higher-order operations as lists (fold, app, find, exists, all). Vector also provides map; the mutable Array and Array2 instead provide modify, which updates elements in place. There are also "indexed" variants, such as mapi, foldli, appi, and findi (foldi and appi for Array2). These variants pass the function the current index, in addition to the value at that index. Thus, the signature of Array.foldli is:

  val foldli : (int * 'a * 'b -> 'b) -> 'b -> 'a array -> 'b

which requires the function being folded to take the array index as a parameter. The function can use the index for operations such as determining the parity (evenness or oddness) of the current position or looking at the neighboring elements in the array.

Why doesn't foldl on lists provide this functionality? One reason might be to discourage programmers from writing bad code: given an index, it becomes tempting to call List.nth, which takes O(n) time on a list, whereas sub with a computed index takes O(1) time on an array or vector.
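
The two uses of the index mentioned above (parity and neighboring elements) can be sketched with Array.foldli. This assumes the current Basis Library signature, in which foldli takes the array directly; the names evenSum and rises are made up for the example.

```sml
val arr = Array.fromList [5, 1, 4, 2, 3]

(* Sum of the elements at even indices: 5 + 4 + 3. *)
val evenSum =
  Array.foldli (fn (i, x, acc) => if i mod 2 = 0 then acc + x else acc) 0 arr

(* Count the positions whose element is larger than its left neighbor.
   Array.sub with a computed index is O(1), so the whole fold is O(n). *)
val rises =
  Array.foldli
    (fn (i, x, acc) =>
       if i > 0 andalso x > Array.sub (arr, i - 1) then acc + 1 else acc)
    0 arr
```

Here evenSum is 12 and rises is 2 (the steps from 1 to 4 and from 2 to 3).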

Array2 is a two-dimensional array. Like Array, it is mutable. Array2 is well suited for implementing matrices. As with arrays and vectors, higher-order functions such as app, fold, and modify are defined for Array2, along with indexed versions (appi, foldi, modifyi). Since Array2 is two-dimensional, there are more ways to define traversal than for one-dimensional aggregates like lists, vectors, and arrays. Hence, the traversal order must be specified. The options are row-major and column-major order. In row-major order, the elements are visited row by row, from left to right within each row, then top to bottom. In column-major order, elements are visited column by column, from top to bottom within each column, then left to right. The grids below number the cells of a 4-by-4 array in the order they are visited.

Row major order

 1  2  3  4
 5  6  7  8
 9 10 11 12
13 14 15 16

Column major order

 1  5  9 13
 2  6 10 14
 3  7 11 15
 4  8 12 16
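
The two traversal orders can be observed directly with Array2.fold, which takes the traversal as its first argument. The following is a small sketch on a 2-by-3 array; the helper elements is hypothetical and exists only to collect the visiting order into a list.

```sml
val m = Array2.fromList [[1, 2, 3],
                         [4, 5, 6]]

(* Collect the elements in visiting order. The fold conses each element
   onto the front, so the accumulated list is reversed at the end. *)
fun elements (order : Array2.traversal) : int list =
  List.rev (Array2.fold order (fn (x, acc) => x :: acc) [] m)

val byRows = elements Array2.RowMajor   (* [1, 2, 3, 4, 5, 6] *)
val byCols = elements Array2.ColMajor   (* [1, 4, 2, 5, 3, 6] *)

(* Element access takes a row index and then a column index, both from 0. *)
val () = Array2.update (m, 1, 2, 60)
val corner = Array2.sub (m, 1, 2)       (* 60 *)
```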