Problem Set 2 Even More SML

Issued: January 29
Due: February 10, 11:59pm


Instructions: you will do this problem set by modifying the source files in ps2.zip and submitting the program that results. The program that you submit must compile without any warnings. Programs that do not compile or compile with warnings will receive an automatic zero.

Updated Instructions: Beginning with this project, you will use the SML Compilation Manager (CM) to compile all of your files together. It will also let you get at some of the libraries that were previously inaccessible. It's actually very easy, but the directions are a bit complicated at first. There are several things to do:

Read First

This problem set is considerably more difficult than the first one.  Start early and you will finish.  Many problems on this problem set are not simple coding exercises but require a great deal of thought.  We are looking for concise, clear solutions. We encourage you to think about the problems well before the deadline and come to consulting/office hours with your questions.  Good luck and enjoy.

Part 1: Lists of Lists

First, a warm-up exercise:

  1. The mathematical description of many physical phenomena requires the use of a construct called the dot product.  The dot product of two real lists of equal length, [x1,...,xn] and [y1,...,yn] is defined as the sum

    x1*y1 + x2*y2 + ... + xn*yn
    We define the dot product of [] and [] to be 0.  Write a function dotProduct of type real list*real list->real that returns the dot product of the given two lists of reals.  The function should raise Fail if the given dot product does not exist, for example if the two lists are of unequal length.

Now onto the real stuff.

Background: A matrix is a two-dimensional array of data. Matrices whose numbers of rows and columns are equal are said to be square.  We will represent a matrix in SML by nested lists of elements.  That is,
type 'a matrix = 'a list list

For the moment we will consider only real matrices.  One possible matrix is

[[1.0, 2.0, 3.0],
 [4.0, 5.0, 6.0]]

For a given matrix A, the element at row i column j is denoted as Ai,j.

The transpose of a matrix A is defined as the matrix B where Ai,j=Bj,i. So the transpose of the above matrix is:

[[1.0, 4.0],
 [2.0, 5.0],
 [3.0, 6.0]]
It is simply the matrix with its rows and columns switched (hence the name transpose).

The product of two real matrices A and B is defined as the matrix C such that
  Ci,j = the dot product of the ith row of A and the jth column of B.
Clearly the matrix product is defined only if the number of columns in A is equal to the number of rows in B.

So for example,

[[1.0, 1.0, 1.0],     [[1.0, 0.0, 1.0],
 [2.0, 2.0, 2.0]]  *   [1.0, 0.0, 2.0],
                       [1.0, 0.0, 3.0]]

    [[1.0+1.0+1.0, 0.0+0.0+0.0, 1.0+2.0+3.0],
  =  [2.0+2.0+2.0, 0.0+0.0+0.0, 2.0+4.0+6.0]]

     [[3.0, 0.0, 6.0],
  =   [6.0, 0.0, 12.0]]
The real identity matrix I of n rows and n columns is the square matrix with 1's along the diagonal stretching from the upper left to lower right corner (Ii,i=1.0 for all i) and 0.0 everywhere else.

For more information and examples of matrix math please see a textbook on linear algebra or come to office hours.

First, let's write some functions for real matrices:

  1. Write a function identity of type int->real matrix that for a given integer n returns the real identity matrix of n rows and n columns.

  2. Write a function transpose of type real matrix->real matrix that returns the transpose of a given real matrix.  Hint: Think of the rows as columns and try to "merge" the columns together.

  3. Write a function multiply of type real matrix*real matrix->real matrix that returns the product of the given real matrices. You may assume that the sizes of the matrices are compatible. To receive full credit, your solution must take time proportional to n3 where n is the max of the number of rows and columns.  Solutions that use List.nth are not likely to reach this performance bound.


Part 2: Twisted Polymorphic Matrix Fun

Real matrices are nice, but now let's generalize to matrices of strings, bools, maybe even matrices!  Again our type is
type 'a matrix = 'a list list

We will first generalize matrix multiplication.  Real matrix multiplication depended on two operations:  real multiplication and real addition.  So instead of using the real * and + operators, we will use whatever plus or times function is appropriate for the situation.  We can pass this into a generic multiply function to get polymorphic matrix multiplication.

Why is this useful?  For one example, consider boolean matrices.  You can think of a boolean matrix A as a graph.  Each node is represented by a row and column in A.  Two nodes i and j share an edge if Ai,j = true.  If we continually multiply A with itself, we will eventually get what is called the transitive closure of A, denoted A*.  An important fact about A* is that two nodes i and j are connected along some path of edges if A*i,j = true.

To multiply boolean matrices, we use boolean AND instead of * and boolean OR instead of +.  So,

[[true,  false],     [[false, true]
 [false, false]]  x   [true,  true]

  [[(true and false) or (false and true),   (true and true) or (false and true)]
=  [(false and false) or (false and true), (false and true) or (false and true)]]

  [[false,  true]
=  [false, false]]

Note that it is possible to compute a dot product between a row and column (as you need to do in matrix multiplication) by "adding" the pairwise products to a zero value.  So for example, we could have gotten the element in the first row and first column of the matrix above by computing

false or (true and false) or (false and true)

since false is the analog of 0.0 in a boolean matrix.  This will be helpful in generalizing matrix multiplication.

Boolean matrices also have other applications in logic gates.  For more information, see a textbook on discrete mathematics.  We can also multiply integer matrices and all kinds of other types.  As a thought exercise, you can think about multiplying two matrices where the elements themselves are matrices.

And now the exercises:

  1. Generalize some functions from Part 1 to have the following types:
  2. Write a function isEqual of type ('a*'a->bool)->'a matrix*'a matrix->bool  that given a function that determines element equality and two matrices, returns true if the two matrices are the same.

  3. Use the above code to concisely write the following functions:

Some standard functional programming exercises:

  1. Write a function map of type ('a->'b)->'a matrix->'b matrix that given a function f and a matrix m, returns a new matrix that is the same size as m but with f applied to each corresponding element.

  2. Write a function foldRow of type ('a*'b->'b)->'b->'a matrix->'b list that given a function f, a value acc, and the matrix of m rows and n columns:

    [[x1, ... , xn],
     [xn+1, ..., x2n],
     ...
     [xn(m-1)+1, ..., xmn]]
    computes a list, each of whose elements is the result of folding a single row:
    [ f(xn, f(xn-1, ...f(x2, f(x1, acc))...)),
      f(x2n, f(x2n-1, ...f(xn+2, f(xn+1, acc))...)),
      ...
      f(xmn, f(xmn-1, ...f(xn(m-1)+2, f(xn(m-1)+1, acc))...)) ]

  3. Use the above functions to write the following functions:


Part 3 - Datatypes and Pattern Matching

For this part we will consider prefix arithmetic expressions.  You may have learned about these in 211, but if you don't remember here is a quick refresher.

A prefix arithmetic expression is an expression where the operators precede the operands.  For this problem we will restrict ourselves to the following kinds of expressions:

Examples of prefix expressions of this form are +1 2, which is 1+2 in infix (standard mathematics notation), or *~2 5, which is ~2*5 in infix..

We will use the following datatype for prefix arithmetic expressions.

datatype exp = Int of int | Neg of exp | Plus of exp*exp | Minus of exp*exp | Times of exp*exp

Some examples of possible exp's include

Plus(Int 3, Int 4)    which stands for +3 4  (3+4 in infix)
Times(Neg (Int 2), Minus(Int 7, Int 5))    which stands for *~2-7 5   ( ~2*(7-5) in infix )

  1. Write a function eval of type exp->int which for a given expression returns the integer evaluation of that expression.  For example,

    eval(Int 5) = 5
    eval(Plus(Int 3, Int 4)) = 7
    eval(Times(Neg (Int 2), Minus(Int 7, Int 5))) = ~4

  2. This question is hard. Stop and think before you code.

    Write a function parse of type string list->exp which for a given list of tokens returns the expression corresponding to that list.  Here, tokens are any of "+", "-", "*", "~", or any positive integer number.  If the given list is invalid (does not form a legal expression) you should throw the InvalidExpression exception provided for you in the Part4 signature.  Note that a list that contains extra tokens is also considered to be invalid.

    Some examples:

    parse ["5"] = Int 5
    parse ["+", "3", "4"] = Plus(Int 3, Int 4)
    parse ["*", "~", "2", "-", "7", "5"] = Times(Neg (Int 2), Minus(Int 7, Int 5))
    parse ["*", "3", "4", "5"] = ERROR
    parse ["*", "~2", "4"] = ERROR
    parse ["~", "3", "*", "4"] = ERROR
    eval(parse ["-", "*", "5", "1", "4"]) = 1