Problem Set 2
Due April 13, 11:59:59 pm.
In this problem set, we will investigate the use of Hidden Markov Models (HMMs)
to recover the most likely sequence of hidden states, given that the output and
transition probabilities are known. More specifically, consider K
coins with probabilities p0, p2, ...,
pK-1 of coming up heads when flipped. You know each of these
probabilities and you also know that there is a probability q that the
coin being flipped will change at any time step. The coin that starts the
sequence can be any coin with uniform probability. If the coin being flipped
changes, it does so to one of the other coins with uniform probability. The
task is to recover the most likely sequence of hidden states (coins being
flipped). For example, if there were two coins, with probabilities
p1 = 0.99 and p2 = 0.01 and q =
0.1 and you observe the sequence heads, heads, tails, tails, then the
most likely sequence of hidden states is 0, 0, 1, 1.
Part 0
Download the structure and partial HMM Code.
Part 1
Use the Viterbi
algorithm to find the most likely sequence of states. Put this code in the
structure HMM_DP (don't change the function signature). The parameters to your
function are the list of heads probabilities, the list of observed flip
outcomes (true for heads, false for tails), and the probability of not changing
to a different coin at each step (1-q). You need not worry about
cases where probabilties are outside the range [0,1] (but be careful about the
case where they are exactly 0 or 1). For instance, to recover the states for
the case given in the introduction, you would call
recoverCoins([0.99,0.01],[true,true,false,false],0.9), which should return
[0,0,1,1].
Part 2
To avoid overflow in part 1, you should have added negative logarithms instead
of multiplying probabilities (if not you may want to go back). This suggests an
alternative approach to solving the problem. We can think of every pair
(position in flip sequence, current coin) as a node in a graph. There is an
edge from a node (pos,x) to node (pos+1,y) whose weight is equal to
-log(probability of coin y coming up as it did in position pos) - log(probability of going
from coin x to coin y). Thinking about things this way, recovering the
state sequence is the same as finding the shortest path from some node (0,x) to
some other node (N,y) where N is the number of coin flips. With this in mind,
use Dijkstra's shortest path algorithm to find the state sequence.
To help you in this part, we have provided code for some of the key operations.
The following functions are already written:
(*Takes the length of the flip sequence and number of coins as input*)
empty:int*int->hmmheap
(* Removes and return the element on top of the heap
* (the next node to expand)*)
removeFirst: hmmheap->int*int*real
(* This does everything else. Given a hmmheap, a position, a state, and a
* distance, this function inserts the (position,state) node into the heap
* with the correct distance if the node has never been inserted. If it is
* already in the heap, the function updates its distance to the minimum of
* the old value and new value. Elements which have already been inserted
* and removed are ignored in calls to reduce. The function returns true
* if the call changed the priority queue, and false otherwise*)
reduce:hmmheap*int*int*real->hmmheap
With the code provided, you should be able to implement Dijkstras by a sequence
of removeFirst and reduce operations: at each step you removeFirst and then
reduce all the removed node's neighbors. Since you are trying to find a
shortest path from any node (0,x) to any node (N,y), you should start by
inserting (by calling reduce) each node (0,x) with distance 0. As soon as you
find some node (N,y), you should stop and recover the sequence of states along
the shortest path. By using the return value of reduce, you can store (in a 2D
array) the previous nodes along the shortest path, and use these for
backtracking once you reach a node (N,y). It is important that you stop as soon
as a removeFirst call gives you (N,y), so you don't do any extra work. It is
important that you wait until removeFirst gives you (N,y) -- don't stop as soon
as you insert (N,y).
Part 3
When you run the above two implementations (which should give the same answer in
most cases -- they may be different in the event of ties) which one is faster?
Does the speed depend on the parameters? Explain which one you think is faster
and why, or explain when one is faster and when the other is faster. This does
not need to be particularly formal. You will need to use fairly long sequences
to get accurate timing.
Part 4
Consider the following poorly documented function foo:
fun foo(l,rr,f,g) =
case (l,rr) of
(h1::t1,h2::t2) =>
if(f(h2,h1)) then foo(g(h2,h1)::t1,t2,f,g)
else foo(t1,h1::h2::t2,f,g)
| (h::t,[]) => foo(t,[h],f,g)
| _ => List.rev(rr)
Assuming that the functions f and g take constant time, what is the worst case
runtime of calling foo(ls,[],f,g) in terms of N, the length of ls.
Give your answer in Big-O notation, along with an analysis proving you are
correct.
Part 5
Consider the following code:
let
val x = 5
fun f(y) =
let
fun f(x) = x
in
f(f)
end
in
f 5
end
Using the environment model, draw the state of the environment when the number
of frames is maximal. That is, in the environment model we typically create
frames for some things, and get rid of them for something else. Show the state
of the environment at the point right before any frames are removed. Indicate
where the final return of f 5 comes from in your diagram. This part
may be written by hand and turned in in class.