Analyzing running time with recurrence relations


While asymptotic complexity has its limitations, it is still a useful tool for thinking about the performance of programs. We can use asymptotic complexity to express and analyze the performance of SML functions, and the use of big-O notation simplifies our task. We assume that the primitive operations of our language, such as arithmetic operations and pattern matching, each take at most some constant time, and that every reduction performed during evaluation similarly takes constant time.

This assumption may be surprising if you think about the substitution
work that seems to be required by the substitution model of evaluation, but
the SML implementation avoids doing the work of substitution, instead keeping
track of all the in-scope substitutions in a separate **environment** in
which variables are looked up when needed.

Now, consider the following multiplication routine:

```
fun times1 (a: int, b: int): int =
  if b = 0 then 0 else a + times1 (a, b - 1)
```
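For example, `times1 (3, 4)` adds `3` to itself four times; in the top level we would expect something like:

```
- times1 (3, 4);
val it = 12 : int
```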

What is the order of growth of the time required by `times1` as a
function of *n*, where *n* is the magnitude of the parameter `b`?
Note that the "size" of a number can be measured either in terms of its
magnitude or in terms of the number of digits (the space it takes to write the
number down). Often the number of digits is used, but here we use the magnitude.
Note that it takes only about log₁₀ *x* digits to write down a
number of magnitude *x*.

We assume that all the primitive operations in the `times1` function
(`if`, `+`, `=`, and `-`) and the overhead for function
calls take constant time. Thus if *n* = 0, the routine takes constant time.
If *n* > 0, the time taken on an input of magnitude *n* is constant
time plus the time taken by the recursive call on *n* − 1. In other words, there are
constants c₁ and c₂ such that *T*(*n*) satisfies:

T(n) = T(n−1) + c₁        for n > 0

T(0) = c₂

This is called a **recurrence relation**. It simply states that the time
to multiply a number *a* by another number *b* of magnitude *n* > 0
is the time required to multiply *a* by a number of magnitude *n* − 1, plus a
constant amount of work (the primitive operations performed).

This recurrence relation has a unique closed-form solution, namely

T(n) = c₂ + c₁n

which is *O*(*n*), so the algorithm is linear in the magnitude of `b`.
One can obtain this closed form by generalizing from small values
of *n*, and then prove that it is indeed a solution to the recurrence relation
by induction on *n*.
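For example, evaluating the recurrence at small values of *n* makes the pattern clear:

T(1) = T(0) + c₁ = c₂ + c₁
T(2) = T(1) + c₁ = c₂ + 2c₁
T(3) = T(2) + c₁ = c₂ + 3c₁
...
T(n) = T(n−1) + c₁ = c₂ + c₁n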

Now consider the following procedure for multiplying two numbers:

```
fun times2 (a: int, b: int): int =
  if b = 0 then 0
  else if even(b) then times2(double(a), half(b))
  else a + times2(a, b - 1)
```
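The functions `even`, `double`, and `half` are assumed rather than defined here; one simple constant-time way to write them (one possible implementation, not the only one) is:

```
(* Constant-time helpers assumed by times2: *)
fun even (n: int): bool = n mod 2 = 0
fun double (n: int): int = n * 2
fun half (n: int): int = n div 2
```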

Again we want an expression for the running time in terms of *n*, the
magnitude of the parameter `b`. We assume that the `double` and `half`
operations, like the standard primitives, take constant time (they could be
implemented in constant time using arithmetic shifts). The recurrence relation
for this problem is more complicated than the previous one:

T(n) = T(n−1) + c₁        if n > 0 and n is odd

T(n) = T(n/2) + c₂        if n > 0 and n is even

T(0) = c₃

We somehow need to figure out how often the first versus the second branch of
this recurrence relation will be taken. It's easy if *n* is a power of two, i.e., if
*n* = 2ᵐ for some integer *m* ≥ 0. In that case the second branch is taken at
(essentially) every call, and the recurrence becomes

T(n) = T(n/2) + c₂        for n > 0 a power of 2

T(0) = c₃

for some constants c₂ and c₃. For powers of 2, the closed-form solution of this is:

T(n) = c₃ + c₂ log₂ n

which is *O*(log *n*).
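To see roughly where this closed form comes from, we can unwind the recurrence for *n* = 2ᵐ:

T(n) = T(n/2) + c₂
     = T(n/4) + 2c₂
     = T(n/8) + 3c₂
     ...
     = T(n/2ᵐ) + m·c₂

After m = log₂ n halvings the argument reaches 1, so T(n) is within an additive constant of c₃ + c₂ log₂ n.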

What if *n* is not a power of 2? The running time is still *O*(log *n*)
even in this more general case. Intuitively, this is because if *n* is odd,
then *n* − 1 is even, so on the next recursive call the input will be halved.
Thus the input is halved at least once in every two recursive calls, which is
all you need to get a logarithmic bound.

A good way to handle this formally is to *charge to the cost of a call to* `times2`
*on an odd input the cost of the recursive call on an even input that must immediately
follow it*. We reason as follows: on an even input *n*, the cost is the
cost of the recursive call on *n*/2 plus a constant, or

T(n) = T(n/2) + c₂

as before. On an odd input *n*, we recursively call the procedure on *n* − 1,
which is even, so we immediately call the procedure again on (*n* − 1)/2.
Thus the total cost on an odd input is the cost of the recursive call on (*n* − 1)/2
plus a constant. In this case we get

T(n) = T((n−1)/2) + c₁ + c₂

In either case,

T(n) ≤ T(n/2) + (c₁ + c₂)

whose solution is still *O*(log *n*). This approach is more or less the
same as explicitly unwinding the `else` clause that handles odd inputs:

```
fun times2 (a: int, b: int): int =
  if b = 0 then 0
  else if even(b) then times2(double(a), half(b))
  else a + times2(double(a), half(b - 1))
```

then analyzing the rewritten program, without actually doing the rewriting.
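Unwinding the combined bound T(n) ≤ T(n/2) + (c₁ + c₂) for a few steps shows why its solution is logarithmic:

T(n) ≤ T(n/2) + (c₁ + c₂)
     ≤ T(n/4) + 2(c₁ + c₂)
     ...
     ≤ T(n/2ᵏ) + k(c₁ + c₂)

After about log₂ n halvings the argument reaches a constant, so the total work is at most (c₁ + c₂) log₂ n plus a constant, which is *O*(log *n*).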

Charging one operation to another (bounding the number of times one thing can happen by the number of times that another thing happens) is a common technique for analyzing the running time of complicated algorithms.

Order notation is a useful tool, and should not be thought of as just a
theoretical exercise. For example, the practical difference in running times
between the linear `times1` and the logarithmic `times2` is
noticeable even for moderate values of *n*.
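One quick way to see the gap is to count recursive calls rather than measure time. The helper functions below (call them `calls1` and `calls2`; they are illustrative, not part of the multiplication routines themselves) mirror the recursion structure of `times1` and `times2`:

```
(* Count the recursive calls made on input b by the two routines. *)
fun calls1 (b: int): int =
  if b = 0 then 0 else 1 + calls1 (b - 1)

fun calls2 (b: int): int =
  if b = 0 then 0
  else if b mod 2 = 0 then 1 + calls2 (b div 2)
  else 1 + calls2 (b - 1)

(* calls1 1000000 is 1,000,000, while calls2 1000000 is only a few
 * dozen (at most about 2 * log2 1000000, i.e. roughly 40). *)
```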

The key points are:

- We can use the asymptotic growth rates of functions (as *n* gets large) to bound the resources required by a given algorithm and to compare the relative efficiency of different algorithms.
- Big-O notation provides a way of expressing rough bounds on the resources required in a form that is meaningful yet easy to work with.
- Recurrence relations can be used to express the running times of recursive programs, and can often be solved for a closed-form expression of the running time.

Let's take a look at a useful algorithm in more detail and show that it is
not only correct but that its worst-case performance is *O*(*n* lg *n*).
The algorithm we'll look at is **merge sort**, a recursive algorithm for
sorting a list of items. Merge sort is an example of a **divide-and-conquer**
algorithm. It sorts a list by dividing it into two smaller sublists, recursively
sorting the sublists, and then merging the two sorted lists together to produce
the final result. Merging two lists is pretty simple if they themselves are already sorted.
To prove the correctness and run time of merge sort we will want a stronger
proof technique: **strong induction**.

Strong induction has the same five steps as ordinary induction, but the induction hypothesis is a little different:

- State the proposition to be proved in terms of *P*(*n*).
- Base case: show that *P*(*n*₀) is true.
- Induction hypothesis: assume that *P*(*m*) is true for all *n*₀ ≤ *m* ≤ *n*. This is different from ordinary induction, where we only get to assume that *P*(*m*) is true for *m* = *n*.
- Induction step: using the induction hypothesis, prove that *P*(*n*+1) is true.
- Conclusion: *P*(*n*) is true for all *n* ≥ *n*₀.

It is often easier to prove asymptotic complexity bounds using strong
induction than it is using ordinary induction, because you have a stronger
induction hypothesis to work with when trying to prove *P*(*n*+1).

```
(* split(xs) is a pair (ys,zs) where half (rounding up) of
 * the elements of xs are found in ys and the rest are in zs. *)
fun split (xs: int list): int list * int list =
  let
    fun loop (xs: int list, left: int list, right: int list)
        : int list * int list =
      case xs of
        nil => (left, right)
      | x::nil => (x::left, right)
      | x::y::rest => loop (rest, x::left, y::right)
  in
    loop (xs, [], [])
  end

(* A simpler way to write split. Recall the definition of foldl.
 * What is the asymptotic performance of foldl f lst0 lst, where f is
 * an O(1) function and lst is an n-element list?  O(n). *)
fun split' (xs: int list): int list * int list =
  foldl (fn (x, (left, right)) => (x::right, left)) ([], []) xs

(* merge(left,right) is a sorted list (in ascending order)
 * containing all the elements of left and right.
 * Requires: left and right are sorted lists *)
fun merge (left: int list, right: int list): int list =
  case (left, right) of
    (nil, _) => right
  | (_, nil) => left
  | (x::left_tail, y::right_tail) =>
      if x > y then y :: merge (left, right_tail)
      else x :: merge (left_tail, right)
```
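For example, evaluating these functions in the SML top level gives results along these lines:

```
- split [1,2,3,4,5];
val it = ([5,3,1],[4,2]) : int list * int list
- split' [1,2,3,4,5];
val it = ([5,3,1],[4,2]) : int list * int list
- merge ([1,3,5], [2,4]);
val it = [1,2,3,4,5] : int list
```

Note that `split` reverses the order of elements within each half; since `merge_sort` sorts the halves anyway, this does not matter.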

How do we know that `merge` works? By induction on the sum of the
lengths of the two input lists (i.e., `length(left) + length(right)`).
Clearly if that sum is zero, the function works, because one of the
first two cases applies and they are trivially correct. What about the general
case? We are trying to show that `merge` works on lists `left` and `right`
whose total length is *n* + 1, and we are
allowed to assume that it works on lists `left` and `right`
whose total length is *n* or less. If one of
the two lists is empty the function works. What if both lists are non-empty? By
the precondition (**requires** clause) we know that `x` is the
smallest element of `left` and `y` the smallest element of
`right`, and that `left_tail` and `right_tail`
are sorted lists. Our inductive hypothesis lets us assume that `merge`
works correctly in the recursive calls, because the total length of the two lists
is smaller than the total length of `left` and `right`, and the precondition of
`merge` in the recursive calls is satisfied (it is being applied to sorted lists).
If the **then** branch executes, `y` is no larger than any
element in either list; therefore `y::(merge(left, right_tail))` is
a sorted list. Conversely, in the **else** branch `x` is smaller
than or equal to any element in either list; therefore `x::(merge(left_tail, right))`
is also a sorted list. And we can see that `merge`
doesn't "lose" any elements of `left` or `right`,
assuming that the recursive calls don't either.

Now we can write the merge-sort function itself. Note how we explicitly
separate the *specification* of the function from the description of the
algorithm that implements it. With `merge` and `split`
specified as above, we don't really need even this much description of how
`merge_sort` works.

```
(* merge_sort(xs) is a list containing the same elements as xs but in
 * ascending (nondescending) sorted order.
 *
 * Implementation: lists of size 0 or 1 are already sorted.  Otherwise,
 * split the list into two lists of equal size, recursively sort
 * them, and then merge the two lists back together. *)
fun merge_sort (xs: int list): int list =
  case xs of
    [] => []
  | [x] => [x]
  | _ =>
      let
        val (left, right) = split xs
      in
        merge (merge_sort(left), merge_sort(right))
      end
```
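For example, in the top level we would expect something like:

```
- merge_sort [6,2,9,2,5];
val it = [2,2,5,6,9] : int list
```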

Again, we can see by induction on the length of the input list that this
function works. For lists of length 0 or 1 it clearly works. For larger lists we
observe from the specification for `split` that both `left` and `right`
must contain some elements, and together they contain all
the elements of `xs`. By the inductive hypothesis, `merge_sort`
applied to each of these lists results in a sorted list. From the specification
for `merge`, the result must be a sorted list containing all the
elements of `xs`. Therefore the `merge_sort` function works
correctly.

Now let's show that `merge_sort` is not only a *correct* but
also an *efficient* algorithm for sorting lists of numbers. We start by
observing, without proof, that the performance of the `split` function
is linear in the size of the input list. This can be shown by the same approach
we will take for `merge`, so let's just look at `merge` instead.

The `merge` function, too, is linear time, that is, *O*(*n*),
in the total length of the two input lists. We will first find a recurrence
relation for the execution time. Suppose the total length of the input lists is zero or
one. Then the function must execute one of the two *O*(1)
arms of the case expression. These take at most some constant time c₀ to execute.
So we have

T(0) = c₀

T(1) = c₀

Now, consider lists of total length *n*.
The recursive call is on lists of total length *n* − 1,
so we have

T(n) = T(n−1) + c₁

where c₁ is a constant upper
bound on the time required to execute the `if` expression and the operator `::`
(which takes constant time for usual implementations of lists). This gives us a
recurrence relation to solve for *T*. We can apply the iterative method to
solve the recurrence relation by expanding out its equations for the first
few steps.

T(0) = c₀
T(1) = c₀
T(2) = T(1) + c₁ = c₀ + c₁
T(3) = T(2) + c₁ = c₀ + 2c₁
T(4) = T(3) + c₁ = c₀ + 3c₁
...
T(n) = T(n−1) + c₁ = c₀ + (n−1)c₁ = (c₀ − c₁) + c₁n

We notice a pattern, which the last line captures. Recall that *T*(*n*)
is *O*(*n*) if there is a constant *k* such that *T*(*n*) ≤ *kn*
for all *n* greater than some *n*₀. For *n* ≥ 1, this is easily satisfied by setting
*k* = c₀ + 2c₁. Or we can just remember
that any first-degree polynomial is *O*(*n*)
and also Θ(*n*).
An even simpler way to find the right bound is to observe that the choice of
constants c₀ and c₁ doesn't matter; if we plug in 1 for both of them we get *T*(1)
= 1, *T*(2) = 2, *T*(3) = 3,
etc., which is clearly *O*(*n*).

Now let's consider the `merge_sort` function itself. Again, for
zero- and one-element lists we compute in constant time. For *n*-element
lists we make two recursive calls, but to sublists that are about half the size,
plus calls to `split` and `merge` that each take Θ(*n*)
time. For simplicity we'll pretend that the
sublists are exactly half the size. The recurrence relation we obtain has this form:

T(0) = c₀

T(1) = c₀

T(n) = 2T(n/2) + c₁n + c₂n + c₃        for n > 1

Let's use the iterative method to figure out the running time of `merge_sort`.
For *n* ≥ 1 we can absorb the terms c₁n + c₂n + c₃ into a single term c₄n,
where c₄ = c₁ + c₂ + c₃.
We know that any solution must work for arbitrary constants c₀
and c₄, so again we replace them
both with 1 to keep things simple. That leaves us with the following recurrence
equations to work with:

T(1) = 1

T(n) = 2T(n/2) + n

Using the iterative method, we keep expanding the time equation until we notice a pattern:

T(n) = 2T(n/2) + n
     = 2(2T(n/4) + n/2) + n
     = 4T(n/4) + n + n
     = 4(2T(n/8) + n/4) + n + n
     = 8T(n/8) + n + n + n
     ...
     = nT(n/n) + n + ... + n + n + n
     = n + n + ... + n + n + n

Counting the number of repetitions of *n* in
the sum at the end, we see that there are lg *n* + 1
of them. Thus the running time is *n*(lg *n*
+ 1) = *n* lg *n* + *n*. We
observe that *n* lg *n* + *n* ≤ *n*
lg *n* + *n* lg *n* = 2*n* lg *n*
for *n* ≥ 2, so the running time is *O*(*n*
lg *n*). Now that we've done the analysis using the iterative
method, let's use induction to verify that the bound is correct. It will be
convenient to use the slightly different version of the induction proof technique
known as **strong** or **course-of-values induction**.

Property of *n* to prove: for all *n* ≥ 1,

T(n) = n lg n + n

Proof by strong (course-of-values) induction on *n*.

Base case: *n* = 1.

T(1) = 1 = 1 lg 1 + 1

Induction step. Induction hypothesis:

T(k) = k lg k + k        for all k with 1 ≤ k ≤ n

Property to prove for *n* + 1:

T(n+1) = (n+1) lg(n+1) + (n+1)

Proof (pretending, as before, that the input splits exactly in half):

T(n+1) = 2T((n+1)/2) + (n+1)
       = 2((n+1)/2 lg((n+1)/2) + (n+1)/2) + (n+1)        (by the induction hypothesis)
       = (n+1) lg((n+1)/2) + (n+1) + (n+1)
       = (n+1)(lg(n+1) − 1) + 2(n+1)
       = (n+1) lg(n+1) − (n+1) + 2(n+1)
       = (n+1) lg(n+1) + (n+1)

Thus we have shown that merge sort is Θ(*n*
lg *n*).