Problem Set 2: Folding, Diff, and Symbolic Differentiation
Due Thursday, February 14, 23:59
Version: 3
Last modified: Saturday, February 7, 20:43
Changes:

Version 2: Added part 3, changed commas in part 2 examples to semicolons
Version 3: Added part 3, changed commas in part 2 examples to semicolons
Instructions
This problem set has three parts. You should
write your solution to each part in a separate file: warmup.ml
, diff.ml
,
and expression.ml
. To help get you started,
we have provided a stub file for each part on CMS. You should download and
edit these files. Once you have finished, submit your solution using CMS,
linked from the course website.
Note: Unless otherwise mentioned, you may not use the rec
keyword in this problem set. You may find the standard List module functions, namely fold_left, particularly useful. You should consider making your solution tailrecursive, as points will be taken off if your function stackoverflows with large input. Points will also be deducted for abuse of the @ operator. Side effects are not allowed.
Important notes about grading:
 Compile errors: All code you submit must compile.
Programs that do not compile will likely receive an
automatic zero. If you are having trouble getting your
assignment to compile, please visit consulting hours. If you run out
of time, it is better to comment out the parts that do not compile,
than hand in a more complete file that does not compile.
 Naming: We will be using an automatic grading
script, so it is crucial that you name your functions
and order their arguments according to the problem set instructions,
and that you place the functions in the correct files. Otherwise you
may not receive credit for a properly written function.
 Code style: Finally, please pay attention to style. Refer
to the CS 3110 style guide
and lecture notes. Ugly code that is
functionally correct may still lose points. Take the extra time to
think out the problems and find the most elegant solutions before
coding them up.
 Late assignments: Please carefully review the course
website's policy on late assignments, as all assignments handed
in after the deadline will be considered late. Verify on CMS that you
have submitted the correct version, before the deadline.
Submitting the incorrect version before the deadline and
realizing that you have done so after the deadline will be counted as a
late submission.
Part 1: WarmUp to Folding (10 pts)

Write a function simple_stats : float list > float * float * float * float * int
Given a float list, return a tuple of (min, max, mean, range, number of elements) from the input (range is the difference between the min and the max). For full credit, you must use only one list folding function, and all of the work must be done inside of this function.
Note: You may raise a suitable exception if the list is empty.
Part 2: Diff (45 pts)
 Write a function
prepend_all : 'a list list > 'a > 'a list list
that takes in a list of lists of any
type and an element of that type and returns all of the lists with that
element added to the front of the list. For example:
prepend_all [[2; 4; 5]; []; [3; 10]] 1 =
[1; 2; 4; 5]; [1]; [1; 3; 10]]
 Write a function
union_lists : 'a list > 'a list > 'a list
that takes in two lists of the same type
and returns a list which contains the union of the elements of the two
lists (i.e. there should be no duplicates in the list returned)
You may assume that the argument lists are already valid sets.
 Write a function
lcs : 'a list > 'a list > 'a list list
that takes two lists and computes
the longest common subsequence (LCS) of the elements in the two lists. This
function should return a list of lists, and which contains all possible
LCS in case of a tie. Note that a subsequence is not the same as a substring. If a sequence has some of its elements removed without the order of the remaining ones changed, then the resulting sequence is a subsequence of the original sequence. For example, [A; C] is a subsequence of [A; B; C; D]. The LCS of two lists is then the longest subsequence that is in both lists.
The general algorithm for doing computing the LCS is described below.
Consider two nonempty lists, lst_{1} and lst_{2}. Let
fin_{1} be the last element in lst_{1} and
start_{1} be the rest of lst_{1}, and fin_{2}
and start_{2} be the same for lst_{2}.
Say that we want to find the LCS of lst_{1} and lst_{2}. If
fin_{1} = fin_{2}, then this element must be in subsequence,
and the rest of the subsequence can be found by recursing on start_{1}
and start_{2}. If fin_{1} <> fin_{2}, then they
cannot both be in the LCS. Therefore,
we take the best out of three cases:
 fin_{1} is not in the subsequence, so we can find the
subsequence by recursing on start_{1} and lst_{2} and take
the longest subsequence of the two
 fin_{2} is not in the subsequence, so we can find the
subsequence by recursing on lst_{1} and start_{2} and take
the longest subsequence of the two
 Neither fin_{1} nor fin_{2} is in the subsequence, so we can
find the subsequence by recursing on start_{1} and start_{2} and
take the longest subsequence of the two
The recursive algorithm described above will work, but it will be
extremely slow because it runs in exponential time, and will end up repeating work a very large
number of times.
To fix this problem, we are going to use a technique called dynamic programming, and incrementally build up to the solution starting from the base case of finding the LCS of two empty lists.
In the picture below, the list [G; A; C] is lined up against the first column, and the list [A; G; C; A; T] is lined up across the first row. In this matrix,
finding the value at position (row, column) is the same as finding
the LCS between the lists represented by that
column and that row. For example, if we label the cell at the intersection of the two empty lists as (0, 0), then cell (2, 3) represents the LCS of [G; A] and [A; G; C]. This means that if we can fill the bottom right corner of the matrix, then we have found the LCS of [G; A; C] and [A; G; C; A; T].

Figure 1 Image from Wikipedia
Say we want to fill in cell (2, 4). Since fin_{1} = fin_{2}, we simply take the result in (1, 3) and add
A to all of them.
Suppose instead that we are trying to fill in cell (2, 3).
We can see that A and C are not the same. This is the fin_{1} <> fin_{2} case, and we use the best of the three cases listed above. The result of
start_{1} and lst_{2} is in the cell at (1, 3).
The result of lst_{1} and start_{2} is in the cell at
(2, 2). The result of start_{1} and start_{2} will be
included in one of those two cells already, so we do not have to
worry about it. We have two cases when comparing results:

If the length of LCS in one cell is longer than the length of LCS in the other cell, then we use the result from the first cell

If the length of LCS in both cells is equal, then we take the union of the results
We can run this algorithm by filling in the matrix
starting from the top left and going across each row. Notice how it is not necessary to store the entire matrix.
Since each row only depends on the row before, it is enough
to keep track of only the previous row and the row that you are
currently building.
If you would like to read more about the longest common subsequence,
you can check out Wikipedia. Example output to the problem shown in the table is given
below. Note that the order in which you give the best subsequences
does not matter.
lcs [A; G; C; A; T] [G; A; C] = [[A; C]; [G; C]; [G; A]]
 You may use the
rec
keyword for this problem.
Write a function diff : string > string > string list that takes the names of two files as strings and returns a list of strings representing
the changes between the two files. The first file should be the
original file, and the second should be the updated version. After loading the two files using the provided function, you can use the result from your lcs to produce the output (you may choose any of the subsequences in case of a tie). The output should be computed as follows:
 Any line that appears in both the original file and the LCS or in both the updated file and the LCS is unchanged, and should have no effect on the output.
 Any line that is in the original file but not in the
LCS must have been deleted. Deletions from the original file should start with " " prepended to the beginning of the line.
 Any line that is in the updated file but not in the LCS must have been added. Additions to the updated file should start with "+ " prepended to the beginning of the line.
 A change should be represented as a deletion and then an
addition. If multiple lines are changed, all of the deletions
should appear before all of the additions.
The relative ordering of lines must be consistent in the output. However, you can treat changes to multiple lines as deleted first then added second. The figure below shows an example of how the function works:

Figure 2: A diff Example
Each nongreen color group represents a "change", and the output always shows the deletion first and addition second.
As you test your code, you may come across cases where
the diff produces a valid output, but it could have been cleaner. For example, you can run your diff on these two inputs
programming programming
language using
ocaml programming
is language
awesome! ocaml
is
awesome!
and get this output:
"+ using"
"+ programming"
This is a bit bothersome, as it is clear that the first two lines in
the new file are the ones that were added, not the second and third.
We are not concerned about this, but there are some different algorithms
that do a better job of handling it.
If you are interested in learning more
you can find an example here.
Exceptional diff implementations may receive karma.
Part 3: Symbolic Differentiation (45 points)
In the Summer of 1958, John McCarthy (recipient of the Turing Award in
1971) made a major contribution to the field of programming
languages. With the objective of writing a program that performed
symbolic differentiation of algebraic expressions, he noticed that
some features that would have helped him to accomplish this task were
absent in the programming languages of that time. This led him to the
invention of LISP (published in Communications of the ACM in 1960) and
other ideas, such as list processing (the name Lisp derives from "List
Processing"), recursion and garbage collection, which are essential to
modern programming languages, including Java. Nowadays, symbolic
differentiation of algebraic expressions is a task that can be
conveniently accomplished with modern mathematical packages, such as
Mathematica and Maple.
The objective of this part is to build a language that can
differentiate and evaluate symbolically represented mathematical
expressions that are functions of a single variable x
. Symbolic
expressions consist of numbers, variables, and standard functions
(plus, minus, times, divide, sin, cos, etc). To get
you started, we have provided the datatype that defines the
abstract syntax for such expressions.
(* abstract syntax tree *)
(* Binary Operators *)
type bop = Add  Sub  Mul  Div  Pow
(* Unary Operators *)
type uop = Sin  Cos  Ln  Neg
type expr = Num of float
 Var
 BinOp of bop * expr * expr
 UnOp of uop * expr
This datatype provides an internal representation of an expression
that the user might type in. The constructor
Var
represents the single variable x
, and
UnOp (Ln, Var)
represents the natural logarithm of x.
Neg
refers to unary negation and is denoted by the symbol ~.
(the symbol  is only used for subtraction). The meaning of the others
should be clear.
Mathematical expressions can be constructed using the
constructors in the above datatype definition. For example, the expression
x^2 + sin(~x)
would be
represented internally as:
BinOp (Add, BinOp (Pow, Var, Num 2.0), UnOp (Sin, UnOp (Neg, Var)))
This represents a tree in which nodes are the type constructors
and the children of each node are an operator and the
arguments of that operator. Such a tree is called an abstract
syntax tree (or AST for short).
For your
convenience, we have provided a function parse_expr
which
translates a string in infix form (such as x^2 +
sin(~x)
) into a value of type expr
.
The parse_expr
function
parses according to the standard precedence of operations, so
5 + x * 8
will be read as
5 + (x * 8)
. We have also provided two functions
to_string
and to_string_smart
that print expressions in a readable form
using infix notation. The former produces a fullyparenthesized
expression so you can see how the parser parsed the expression,
whereas the latter omits unnecessary parentheses.
# let e = parse_expr "x^2 + sin(~x)";;
val e : Expression.expr =
BinOp (Add, BinOp (Pow, Var, Num 2.), UnOp (Sin, UnOp (Neg, Var)))
# to_string e;;
 : string = "((x^2.)+(sin((~(x)))))"
# to_string_smart e;;
 : string = "x^2.+sin(~(x))"
For full credit, only use rec
in your implementation of fold_expr.
Do not use rec
in evaluate and derivative.
 Write a function
fold_expr : (expr > 'a) > (uop > 'a
> 'a) > (bop > 'a > 'a > 'a) > expr > 'a
such that
fold_expr atom unary binary expr
performs a postorder traversal of
expr
. During the traversal, it applies atom :
expr > 'a
to every leaf node (i.e. Var
or
Num
). It applies unary : uop > 'a > 'a
to every unary expression. The first argument to unary
is the unary operator and the second is the result of applying
fold_expr
recursively to that expression's operand. Finally,
fold_expr
applies binary : bop > 'a > 'a >
'a
to every binary expression. The arguments to
binary
are the binary operator and the results of
recursively applying fold_expr
to the left and right operands.
For example, here is a function that uses fold_expr
to
determine whether the given expression contains a variable.
let contains_var : expr > bool =
let atom = function
 Var > true
 _ > false in
let unary op x = x in
let binary op l r = l  r in
fold_expr atom unary binary
You may use rec in your implementation of fold_expr without penalty.
 Using
fold_expr
, write a function eval : float
> expr > float
such that eval x e
evaluates
e
, setting Var
equal to the value
x
. For full credit, do not use rec in evaluate.
Example: eval 2.0 (BinOp (Add, Num 2.0, Var)) = 4.0
.
 Using
fold_expr
, write a function derivative :
expr > expr
that returns the derivative of an expression with
respect to Var
. The expression it returns need not be
simplified. For full credit, do not use rec in derivative.
Example: derivative (BinOp (Pow, Var, Num 2.0))
= BinOp (Mul, Num 2., BinOp (Mul, Num 1., BinOp (Pow, Var, BinOp (Sub,
Num 2., Num 1.))))
When implementing this function, recall the
chain rule from your freshman calculus course:
This rule tells how to form the derivative of the composition of two functions if
we know the derivatives of the two functions.
With the help of this rule, we can write the derivatives
for all the operators in our language:
There are two cases provided for calculating the derivative of f(x) ^ g(x)
,
one for where g(x) = h
does not contain any variables, and
one for the general case.
The first is a special case of the second, but it is useful to treat
them separately, because when the first case applies, the second case
produces
unnecessarily complicated expressions.