You have already seen how asymptotic analysis can indicate how efficiently a procedure runs. However, up to this point, you have determined the worst-case running time iteratively: taking the recurrence relation, solving it for specific values of n, and looking for a pattern. Is there a better way?
We can use induction to prove that a procedure runs in the time we claim it does. Such a proof has the advantages of giving a more definitive answer and not requiring tedious computation.
Recall the code you saw for insertion sort:
fun insert(e:int, l:int list):int list =
  case l of
    [] => [e]
  | x::xs => if (e < x) then e::l else x::(insert (e,xs))

fun isort' (l1: int list, l2:int list) =
  case l1 of
    [] => l2
  | x::xs => isort'(xs, insert(x, l2))

fun isort(l:int list) = isort'(l, [])
First, we want to prove that the running time of
insert is O(n). Let us start with its
recurrence relation:
| T(1) = c1 |
| T(n) = T(n-1) + c2 |
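One way to see where this recurrence comes from is to instrument insert so that it reports how many calls it makes. The following is a minimal sketch, not part of the original code: insertCount is a hypothetical helper that charges one unit per call, which amounts to taking both constants to be 1.

```sml
(* Hypothetical instrumented copy of insert: returns the new list paired
   with the number of calls made, charging one unit per call. *)
fun insertCount (e:int, l:int list) : int list * int =
  case l of
    [] => ([e], 1)                       (* base case: one unit *)
  | x::xs =>
      if e < x then (e::l, 1)            (* found the spot: stop early *)
      else
        let val (rest, steps) = insertCount (e, xs)
        in (x::rest, steps + 1)          (* one unit plus the recursive cost *)
        end

(* Worst case: 9 is larger than every element, so the whole list is walked. *)
val (sorted, steps) = insertCount (9, [1, 3, 5, 7])
(* sorted = [1,3,5,7,9], steps = 5 *)
```

In this worst case a list of length 4 costs 5 calls: four recursive steps plus the base case. Each extra element adds exactly one call, so the cost grows linearly with the length of the list.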
We will assume that both c1 and c2 are 1. We will now prove the running time using induction:
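insert(e,l) is linear, i.e., T(n) ≤ n, where
the length of l is n. Proof by induction on n.
Base case: T(1) = c1 = 1 ≤ 1.
Induction step: assume T(n) ≤ n (the induction hypothesis). Then:
| T(n+1) | = | T(n) + 1 | by the recurrence, with c2 = 1 |
| | ≤ | n + 1 | by the induction hypothesis |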
Thus the running time of insert is O(n).
Now we need the recurrence relation for isort. It
uses the relation we already have for our function insert:
| T(1) = c1 |
| T(n) = T(n-1) + T_insert(n) |
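Unrolling this recurrence suggests what bound to expect before proving it formally. The derivation below is a sketch that assumes c1 = 1 and uses the bound on insert from the proof above, written here as T_insert(k) ≤ k:

```latex
\begin{align*}
T(n) &= T(n-1) + T_{\mathrm{insert}}(n) \\
     &= T(1) + \sum_{k=2}^{n} T_{\mathrm{insert}}(k) \\
     &\le 1 + \sum_{k=2}^{n} k \\
     &= \frac{n(n+1)}{2} \\
     &\le n^2 \qquad \text{for } n \ge 1 .
\end{align*}
```

The sum of the first n integers is what makes the bound quadratic: each of the n insertions can cost time proportional to the length of the list built so far.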
We will again assume that c1 is 1. We will now prove the running time using induction:
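isort(l) is quadratic, i.e., T(n) ≤ n^2, where
the length of l is n. Proof by induction on n.
Base case: T(1) = c1 = 1 ≤ 1^2.
Induction step: assume T(n) ≤ n^2 (the induction hypothesis). Then:
| T(n+1) | = | T(n) + T_insert(n+1) | by the recurrence |
| | ≤ | n^2 + T_insert(n+1) | by the induction hypothesis |
| | ≤ | n^2 + n + 1 | by the proof above, T_insert(n+1) ≤ n + 1 |
| | ≤ | n^2 + 2n + 1 | since n ≥ 0 |
| | = | (n+1)^2 | |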
Thus the running time of isort is O(n^2).
Next, we look at a slightly harder example. Consider merge sort, which divides a list of length n into two lists of length n/2 and recursively sorts them. The base case is a list of length 1, which is inherently sorted. The algorithm then merges the two sorted lists of length n/2 into a single sorted list of length n. The SML code for a merge sort could be:
(* Deal the elements alternately into two lists; foldl is the Basis Library fold. *)
fun split(lst:int list):int list * int list =
  foldl (fn (x, (left,right)) => (right,x::left)) ([],[]) lst

fun merge(left:int list,right:int list):int list =
  case (left,right) of
    (nil,_) => right
  | (_,nil) => left
  | (x::xs,y::ys) => if x < y then x::merge(xs,right) else y::merge(left,ys)

fun msort(lst:int list):int list =
  case lst of
    [] => []
  | [x] => [x]
  | _ => let val (left,right) = split lst
             val lsorted = msort left
             val rsorted = msort right
         in
           merge(lsorted,rsorted)
         end
We need to know the recurrence relations and running times for
split and merge. You should be able to
prove that both run in O(n) time, where
merge(l,m) merges two lists of length n/2. We
will now use this information to prove that the running time of
msort is at most n log n + n, which is O(n log n). First, we must
determine the recurrence relation:
| T(1) = c1 |
| T(n) = T_split(n) + 2*T(n/2) + T_merge(n) |
Note that we make an assumption that split splits a
list of length n into two lists each of length n/2. To
be complete, one would have to prove that this is true. For now, we
will just prove that the running time is O(n log n):
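msort(l) runs in n log n time, i.e., T(n)
≤ n log n + n, where
the length of l is n and log is base 2. Proof by (strong) induction on n.
Base case: T(1) = c1 = 1 ≤ 1 log 1 + 1.
Induction step: assume T(k) ≤ k log k + k for all k ≤ n (the induction hypothesis). Then:
| T(n+1) | = | T_split(n+1) + 2*T((n+1)/2) + T_merge(n+1) | by the recurrence |
| | ≤ | T_split(n+1) + 2((n+1)/2 log((n+1)/2) + (n+1)/2) + T_merge(n+1) | by the induction hypothesis |
| | ≤ | 2((n+1)/2 log((n+1)/2) + (n+1)/2) + (n+1) | taking T_split(n+1) + T_merge(n+1) = n+1 |
| | = | (n+1)(log(n+1) - 1) + 2(n+1) | since log((n+1)/2) = log(n+1) - 1 |
| | = | (n+1) log(n+1) + (n+1) | |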
Thus the running time of msort is O(n log n).
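The closed form n log n + n can also be found directly by unrolling the recurrence. The derivation below is a sketch under stated assumptions: n is a power of two, c1 = 1, and the combined cost of split and merge on a list of length n is exactly n.

```latex
\begin{align*}
T(n) &= 2\,T(n/2) + n \\
     &= 4\,T(n/4) + 2n \\
     &= 8\,T(n/8) + 3n \\
     &\;\;\vdots \\
     &= 2^{k}\,T(n/2^{k}) + k\,n \\
     &= n\,T(1) + n \log_2 n \qquad \text{taking } k = \log_2 n \\
     &= n \log_2 n + n .
\end{align*}
```

Each level of the recursion does n units of splitting and merging in total, and there are log n levels before the lists reach length 1, which is where the n log n term comes from.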