Solving recurrences

Note: this page uses the following special characters: Greek capital letter theta: (Θ), Greek capital letter omega (Ω), minus sign (−). If these characters do not appear correctly, your browser is not able to fully handle HTML 4.0, and some of the following text will likely not have the correct appearance.

In last lecture we saw how to solve recurrences, with the example of a simple multiplication function using only additions, and running in time O(log n). Today we will see more examples of those, by proving the complexity of mergesort, as well as the complexity of a function calculating Fibonacci numbers.

Merge sort

Implementation of merge sort

let rec split (l: 'a list) : 'a list * 'a list = 
   match l with
    [] -> [],[]
  | [x] -> [x],[]
  | x::y::t -> let l,r=split t in x::l,y::r

(* A simpler way to write split. Recall the definition of
   List.fold_right. What is the asymptotic performance of 
   List.fold_right f lst acc0 where f is an O(1) function
   and lst is an n-element list? O(n). *)
let split' (l: 'a list) : 'a list * 'a list = 
  List.fold_right (fun x (left,right) -> (x::right,left)) l ([],[])
	
let rec merge (left: 'a list) (right: 'a list): 'a list =
  match (left, right) with
    ([],_) -> right
  | (_,[]) -> left
  | (x::rest_left, y::rest_right) -> 
      if x > y then y::(merge left rest_right)
               else x::(merge rest_left right)

(* merge_sort l is a list containing the same elements as xs but in
 * ascending (nondescending) sorted order.  *)
let rec merge_sort (l: 'a list) : 'a list =
(* Implementation: lists of size 0 or 1 are already sorted. Otherwise,
 * split the list into two lists of equal size, recursively sort
 * them, and then merge the two lists back together. *)
  match l with
   ([]|[_]) -> l
  | _ -> let (left, right) = split l in 
           merge (merge_sort left) (merge_sort right)

Merge sort asymptotic timing analysis

Now let's show that merge_sort is not only a correct but also an efficient algorithm for sorting lists of numbers. We start by observing without proof that the performance of the split function is linear in the size of the input list. This can be shown by the same approach we will take for merge, so let's just look at merge instead.

The merge function too is linear-time—that is, O(n)—in the total length of the two input lists. We will first find a recurrence relation for the execution time. Suppose the total length of the input lists is zero or one. Then the function must execute one of the two O(1) arms of the case expression. These take at most some time c₀to execute. So we have

T(0) = c₀
T(1) = c₀

Now, consider lists of total length n. The recursive call is on lists of total length n−1, so we have

T(n) = T(n−1) + c₁

where c₁ is an constant upper bound on the time required to execute the if statement and the operator :: (which takes constant time for usual implementations of lists). This gives us a recurrence relation to solve for T. We can apply the iterative method to solve the recurrence relation by expanding out the recurrence relation inequalities for the first few steps.

T(0) = c₀
T(1) = c₀T(2) = T(1) + c₁= c₀+ c₁T(3) = T(2) + c₁= c₀+ 2c₁T(4) = T(3) + c₁= c₀+ 3c_{1

...}T(n) = T(n−1) + c₁= c₀+ (n−1)c₁= (c₀- c₁)+ c₁n

We notice a pattern which the last line captures. This pattern can be proved more rigorously by induction: let us prove by induction that for n >=0, T(n)=(c₀- c₁)+ c₁n. For n=0, the result is true (proved above), and if it is true for n-1, it is true for n using the last line above.

Recall that T(n) is O(n) if for all n greater than some n₀, we can find a constant k such that T(n) < kn. For n at least 1, this is easily satisfied by setting k = c₀ + 2c₁. Or we can just remember that any first-degree polynomial is O(n) and also Θ(n). An even simpler way to find the right bound is to observe that the choice of constants c₀ and c₁ doesn't matter; if we plug in 1 for both of them we get T(1) = 1, T(2)=2, T(3)=3, etc., which is clearly O(n).

Now let's consider the merge_sort function itself. Again, for zero- and one-element lists we compute in constant time. For n-element lists we make two recursive calls, but to sublists that are about half the size, and calls to split and merge that each take Θ(n) time. For simplicity we'll pretend that the sublists are exactly half the size. The recurrence relation we obtain has this form:

T(0) = c₀
T(1) = c₀T(n) = 2 T(n/2) + c₁n + c₂n + c₃

Let's use the iterative method to figure out the running time of merge_sort. We know that any solution must work for arbitrary constants c₀ and c₄, so again we replace them both with 1 to keep things simple. That leaves us with the following recurrence equations to work with:

T(1) = 1
T(n) = 2 T(n/2) + n

Starting with the iterative method, we can start expanding the time equation until we notice a pattern:

T(n) = 2T(n/2) + n
     = 2(2T(n/4) + n/2) + n
     = 4T(n/4) + n + n
     = 4(2T(n/8) + n/4) + n + n
     = 8T(n/8) + n + n + n
     = nT(n/n) + n + ... + n + n + n
     = n + n + ... + n + n + n

Counting the number of repetitions of n in the sum at the end, we see that there are lg n + 1 of them. Thus the running time is n(lg n + 1) = n lg n + n. We observe that n lg n + n < n lg n + n lg n = 2n lg n for n>0, so the running time is O(n lg n). So now we've done the analysis by using the iterative method, let's use strong induction to verify that the bound is correct.

Merge sort analysis using strong induction

Property P(n) to prove:

n ≥ 1 ⇒ T(n) = n lg n + n

Proof by strong (course-of-values) induction on n. For arbitrary n, show P(n) is true assuming the induction hypothesis T(m) = m lg m + m for all m<n.

Case n = 0: vacuously true

Case n = 1: T(1) = 1 = 1 lg 1 + 1

Case n > 1:

Induction Hypothesis:

Proof:

T(n) = 2 T(n/2) + n
    = (n/2) lg (n/2) + 2(n/2) + n              (by induction hypothesis)

    = n lg (n/2) + 2n
    = n lg n − 1) 1) + 2n

    = n lg n + n

Since n lg n + n is Θ(n lg n), we have shown that merge sort is Θ(n lg n).

The Fibonacci numbers

The Fibonacci numbers, written F(n), are defined by F(0)=0, F(1)=1, and for n>1, F(n)=F(n-1)+F(n-2). The first few Fibonacci numbers are 0,1,1,2,3,5,8,13,21,34,55,89,...

A first implementation and its complexity

We would like to write a function that would calculate the n-th Fibonacci number. A first implementation would be:

(* requires n>=0 *)
let rec fibo=function
  0 -> 0
| 1 -> 1
| n -> (fibo (n-1))+(fibo (n-2))

This function follows directly from the definition of Fibonacci numbers, thus its correctness. Now, if we try running it in OCaml on say 100, it takes forever to complete. Let us see why by analyzing its asymptotic time.

The asymptotic time taken for n=0 and n=1 is constant, let us call it c₀, so that T(0)=T(1)=c₀. To calculate T(n) we make two recursive call, so that T(n)=T(n-1)+T(n-2).

In mathematics, it can be shown that a solution of this recurrence relation is of the form T(n)=a₁*r₁ⁿ+a₂*r₂ⁿ, where r₁ and r₂ are the solutions of the equation r²=r+1. We get r₁=(1+sqrt(5))/2 and r₂=(1-sqrt(5))/2. Then with T(0)=T(1)=c₀, we get a₁+a₂=a₁r₁+a₂r₂=c₀, leading to a₁=c₀r₁/sqrt(5) and a₂=-c₀r₂/sqrt(5).

We can see that r₂<1, therefore r₂ⁿ is o(1). Therefore T(n) is Θ(r₁ⁿ), with r₁=(1+sqrt(5))/2. The algorithm thus takes an exponential time to complete.

A better implementation

(* requires n>=0 *)
let fibo' n=
 if(n=0) then 0 else 
   (* a is F(i-2) and b is F(i-1) *)
   let a=ref 0 and b=ref 1 and c=ref 0 in
   for i=2 to n do
      c:= !b;
      b:= !a + !b; 
      a:= !c;
   done; !b

This implementation is clearly linear: each loop takes a constant time to complete, there are order of n loops. Like before, the correctness can be obtained by a recurrence on n (by stating that at each loop, a=F(i-2) and b=F(i-1)).