You have already seen how an asymptotic analysis can give us some indication of how efficiently a procedure runs. However, up to this point, you have determined the worst-case running time iteratively: taking the recurrence relation, solving it for specific values of n, and looking for a pattern. Is there a better way?
We can use induction to prove that a procedure runs in the time we claim it does. Such a proof has the advantages of giving a more definitive answer and not requiring tedious computation.
Recall the code you saw for insertion sort:
    (* insert e into the already sorted list l *)
    fun insert(e:int, l:int list):int list =
      case l of
        [] => [e]
      | x::xs => if (e < x) then e::l
                 else x::(insert (e,xs))

    (* move the elements of l1 one by one into the sorted list l2 *)
    fun isort' (l1: int list, l2:int list) =
      case l1 of
        [] => l2
      | x::xs => isort'(xs, insert(x, l2))

    fun isort(l:int list) = isort'(l, [])
First, we want to prove that the running time of insert is O(n). Let us begin with its recurrence relation:

$$T(1) = c_1$$
$$T(n) = T(n-1) + c_2$$
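The recurrence reflects that each call to insert does a constant amount of work and then either stops or recurses once on the tail. As an informal illustration (not a proof), the sketch below counts the comparisons `e < x` that insert would perform. The name `insertSteps` is a hypothetical helper introduced only for this example, and counting comparisons is an assumed stand-in for the cost model with both constants equal to 1.

```sml
(* Hypothetical helper: counts the comparisons (e < x) that insert
   would make on the list l. It mirrors the structure of insert. *)
fun insertSteps (e : int, l : int list) : int =
  case l of
    [] => 0
  | x :: xs => if e < x then 1 else 1 + insertSteps (e, xs)

(* Worst case: e is larger than every element, so all n = 5 are compared. *)
val worst = insertSteps (10, [1, 2, 3, 4, 5])

(* Best case: e is smaller than the head, so one comparison suffices. *)
val best = insertSteps (0, [1, 2, 3, 4, 5])
```

Here `worst` evaluates to 5 and `best` to 1, so on this input the count never exceeds the length of the list.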
We will assume that both $c_1$ and $c_2$ are 1. We will now prove the running time using induction:

Claim: the running time of insert(e,l) is linear, i.e., $T(n) \le n$, where n is the length of l.

Proof by induction on n.

Base case: $T(1) = 1 \le 1$.

Inductive step: assume $T(n) \le n$. Then

$$\begin{aligned}
T(n+1) &= T(n) + 1 \\
       &\le n + 1 && \text{by the induction hypothesis}
\end{aligned}$$

Thus the running time of insert is O(n).

Now, we need the recurrence relation for isort. It uses the relation we have for our function insert:
$$T(1) = c_1$$
$$T(n) = T(n-1) + T_{\text{insert}}(n)$$
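Unrolling this recurrence shows where the quadratic bound comes from. The sketch below assumes $c_1 = 1$ and uses the bound $T_{\text{insert}}(k) \le k$ established for insert:

```latex
\begin{align*}
T(n) &= T(n-1) + T_{\text{insert}}(n) \\
     &= T(n-2) + T_{\text{insert}}(n-1) + T_{\text{insert}}(n) \\
     &\;\;\vdots \\
     &= T(1) + \sum_{k=2}^{n} T_{\text{insert}}(k) \\
     &\le 1 + \sum_{k=2}^{n} k = \frac{n(n+1)}{2} \le n^2 \qquad (n \ge 1)
\end{align*}
```

This is the pattern-spotting approach; the induction proof confirms the resulting bound without relying on the unrolled form.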
We will again assume that $c_1$ is 1. We will now prove the running time using induction:

Claim: the running time of isort(l) is quadratic, i.e., $T(n) \le n^2$, where n is the length of l.

Proof by induction on n.

Base case: $T(1) = 1 \le 1^2$.

Inductive step: assume $T(n) \le n^2$. Then

$$\begin{aligned}
T(n+1) &= T(n) + T_{\text{insert}}(n+1) \\
       &\le n^2 + T_{\text{insert}}(n+1) && \text{by the induction hypothesis} \\
       &\le n^2 + n + 1 && \text{by the proof above} \\
       &\le n^2 + 2n + 1 \\
       &= (n+1)^2
\end{aligned}$$

Thus the running time of isort is $O(n^2)$.

Next, we look at a slightly harder example. Consider merge sort, which divides a list of length n into two lists of length n/2 and recursively sorts them. The base case is a list of length 1, which is already sorted. The algorithm then merges the two sorted lists of length n/2 into a single sorted list of length n. The SML code for a merge sort could be:
    (* deal the elements alternately into two lists;
       foldl is the left fold from the SML Basis Library *)
    fun split(lst:int list):int list * int list =
      foldl (fn (x, (left,right)) => (right,x::left)) ([],[]) lst

    (* merge two sorted lists into one sorted list *)
    fun merge(left:int list,right:int list):int list =
      case (left,right) of
        (nil,_) => right
      | (_,nil) => left
      | (x::xs,y::ys) => if x < y then x::merge(xs,right)
                         else y::merge(left,ys)

    fun msort(lst:int list):int list =
      case lst of
        [] => []
      | [x] => [x]
      | _ => let
               val (left,right) = split lst
               val lsorted = msort left
               val rsorted = msort right
             in
               merge(lsorted,rsorted)
             end
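The running-time analysis of msort depends on split dividing its input almost evenly. As a quick, informal check (not a proof), the sketch below runs split on two small lists and reports the lengths of the two halves. split is restated here with the Basis Library's `foldl` so that the block runs on its own; `halfLengths`, `evenCase`, and `oddCase` are hypothetical names introduced only for this illustration.

```sml
(* split, restated with foldl so this block is self-contained *)
fun split (lst : int list) : int list * int list =
  foldl (fn (x, (left, right)) => (right, x :: left)) ([], []) lst

(* Hypothetical helper: lengths of the two lists produced by split *)
fun halfLengths (lst : int list) : int * int =
  let
    val (left, right) = split lst
  in
    (length left, length right)
  end

val evenCase = halfLengths [1, 2, 3, 4]     (* (2, 2) *)
val oddCase = halfLengths [1, 2, 3, 4, 5]   (* (2, 3) *)
```

For an odd-length input the halves differ in length by one, so "length n/2" should be read as n/2 rounded down or up.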
We need to know the recurrence relations and running times for split and merge. You should be able to prove that the running times of both are O(n), where merge(l,m) merges two lists of length n/2. We will now use this information to prove that the running time of msort is at most $n \log n + n$, which is $O(n \log n)$. First, we must determine the recurrence relation:
$$T(1) = c_1$$
$$T(n) = T_{\text{split}}(n) + 2\,T(n/2) + T_{\text{merge}}(n)$$
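To see where a bound of the form $n \log n + n$ comes from, the recurrence can be unrolled. The sketch below makes three simplifying assumptions that are not proved here: $n = 2^k$ is a power of two, $c_1 = 1$, and $T_{\text{split}}(n) + T_{\text{merge}}(n) = n$.

```latex
\begin{align*}
T(n) &= 2\,T(n/2) + n \\
     &= 4\,T(n/4) + 2n \\
     &\;\;\vdots \\
     &= 2^{k}\,T(n/2^{k}) + k\,n \\
     &= n\,T(1) + n \log_2 n && (n = 2^{k}) \\
     &= n \log_2 n + n
\end{align*}
```

Each of the $\log_2 n$ levels of recursion contributes a total cost of n for splitting and merging, and the n base cases contribute 1 each.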
Note that we assume that split splits a list of length n into two lists, each of length n/2. To be complete, one would have to prove that this is true. For now, we will just prove that the running time is O(n log n):
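Claim: msort(l) runs in n log n time, i.e., $T(n) \le n \log n + n$, where n is the length of l and log is the base-2 logarithm.

Proof by strong induction on n, taking $c_1 = 1$ as before.

Base case: $T(1) = 1 \le 1 \cdot \log 1 + 1$.

Inductive step: assume $T(m) \le m \log m + m$ for all $m \le n$. Since $(n+1)/2 \le n$, the hypothesis applies to both recursive calls:

$$\begin{aligned}
T(n+1) &= T_{\text{split}}(n+1) + 2\,T\!\left(\tfrac{n+1}{2}\right) + T_{\text{merge}}(n+1) \\
       &\le T_{\text{split}}(n+1) + 2\left(\tfrac{n+1}{2}\log\tfrac{n+1}{2} + \tfrac{n+1}{2}\right) + T_{\text{merge}}(n+1) && \text{by the induction hypothesis} \\
       &= 2\left(\tfrac{n+1}{2}\log\tfrac{n+1}{2} + \tfrac{n+1}{2}\right) + (n+1) && \text{where } T_{\text{split}}(n+1) + T_{\text{merge}}(n+1) = n+1 \\
       &= (n+1)\left(\log(n+1) - 1\right) + (n+1) + (n+1) \\
       &= (n+1)\log(n+1) + (n+1)
\end{aligned}$$

Thus the running time of msort is O(n log n).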
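As an informal sanity check of this bound, the sketch below instruments merge sort to count the comparisons made while merging. The names `mergeCount` and `msortCount` are hypothetical, split is written with the Basis Library's `foldl` so the block runs on its own, and counting only comparisons is an assumed simplification of the cost model, so the count illustrates the bound rather than proving it.

```sml
(* split, written with foldl so this block is self-contained *)
fun split (lst : int list) : int list * int list =
  foldl (fn (x, (left, right)) => (right, x :: left)) ([], []) lst

(* Hypothetical variant of merge that also returns the number of
   comparisons (x < y) it performed. *)
fun mergeCount (left : int list, right : int list) : int list * int =
  case (left, right) of
    (nil, _) => (right, 0)
  | (_, nil) => (left, 0)
  | (x :: xs, y :: ys) =>
      if x < y then
        let val (rest, c) = mergeCount (xs, right) in (x :: rest, c + 1) end
      else
        let val (rest, c) = mergeCount (left, ys) in (y :: rest, c + 1) end

(* Hypothetical variant of msort that returns the sorted list together
   with the total number of comparisons made by all merges. *)
fun msortCount (lst : int list) : int list * int =
  case lst of
    [] => ([], 0)
  | [x] => ([x], 0)
  | _ =>
      let
        val (left, right) = split lst
        val (ls, cl) = msortCount left
        val (rs, cr) = msortCount right
        val (merged, cm) = mergeCount (ls, rs)
      in
        (merged, cl + cr + cm)
      end

(* n = 8, so the bound n log n + n is 8 * 3 + 8 = 32 *)
val (sorted, comparisons) = msortCount [5, 2, 8, 1, 7, 3, 6, 4]
val withinBound = comparisons <= 8 * 3 + 8
```

For this input `sorted` is the list in ascending order and `withinBound` is true.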