CS 3110 Lecture 19
Analyzing running time with recurrence relations

Asymptotic complexity has limitations, but it is still a useful tool for analyzing and improving program performance. The use of big-O notation simplifies the task of analyzing performance. For OCaml, we assume that all the reductions performed during evaluation take constant time, including all arithmetic operations and pattern matching.

This assumption may be surprising if you think about the substitution work that seems to be required by the substitution model of evaluation: a substituted variable may appear at many different places in the code. The OCaml implementation avoids doing the work of substitution, by instead keeping track of all in-scope substitutions in a separate environment that allows variables to be looked up when their bindings are needed.

The following multiplication routine is not very efficient:

let rec times1(a, b) =
  if (b = 0) then 0
  else a + times1(a, b-1)
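
To get a feel for why the cost grows with the magnitude of b, the instrumented sketch below (times1_counted and its calls counter are not part of the lecture code) counts how many calls are made:

let calls = ref 0

(* Same structure as times1, but counts every call. *)
let rec times1_counted(a, b) =
  incr calls;
  if b = 0 then 0
  else a + times1_counted(a, b-1)

let () =
  calls := 0;
  ignore (times1_counted(7, 1000));
  (* prints 1001: one call for each value of b from 1000 down to 0 *)
  Printf.printf "calls: %d\n" !calls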

What is the order of growth of the time required by times1 as a function of n, where n is the magnitude of the parameter b? Note that the "size" of a number can be measured either in terms of its magnitude or in terms of the number of digits (the space it takes to write the number down). Often the number of digits is used, but here we use the magnitude. Since it takes only about log10 x digits to write down a number of magnitude x, these two measures are very different.

We assume that all the primitive operations in the times1 function (if, +, =, and -) and the overhead for function calls take constant time. Thus if n=0, the routine takes at most some constant time c2. If n>0, the time taken on an input of magnitude n is at most some constant c1, plus the time taken by the recursive call on n-1. In other words, there are constants c1 and c2 such that T(n) satisfies:

T(n) = T(n-1) + c1     for n > 0
T(0) = c2

This is a recurrence relation (or simply recurrence) defining a function T(n). It states that the time to multiply a number a by another number b of size n > 0 is the time required to multiply a by a number of size n-1, plus a constant amount of work (the primitive operations performed).

The recurrence relation is an inductive definition of a function. This particular recurrence relation has a unique closed-form solution that defines T(n) without any recursion. In order to determine the running time of recursive functions we will write recurrences that describe the running time, find closed-form solutions to them, and express those solutions in terms of big-O notation.

Expanding out this particular recurrence gives some idea of how to convert it to closed form:

T(n) = T(n-1) + c1     for n > 0
     = T(n-2) + c1 + c1
     = T(n-3) + c1 + c1 + c1
     = T(n-k) + kc1    for n > 0 and 0 < k ≤ n
     = T(0) + nc1    for n > 0
     = c2 + nc1    for n > 0

which is O(n).

More generally, we guess a closed form solution for a recurrence, and then prove by induction that it holds. Sometimes expanding out the recurrence for a few steps can help with guessing.
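
For example, we can check by induction that the closed form T(n) = c2 + nc1 found above satisfies the times1 recurrence:

Base case: T(0) = c2 = c2 + 0·c1.
Inductive step: assuming T(n-1) = c2 + (n-1)c1, we have
T(n) = T(n-1) + c1 = c2 + (n-1)c1 + c1 = c2 + nc1.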

Now consider a different procedure for multiplying two numbers:

let even b = (b mod 2 = 0)

let rec times2(a, b) =
  if (b = 0) then 0
  else if even(b) then times2(a*2, b/2)
  else a + times2(a, b-1)
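
Before analyzing this formally, a quick experiment suggests how few calls it makes. The sketch below (times2_counted and its calls counter are not part of the lecture code; it reuses even from above) counts the calls for a large b:

let calls = ref 0

(* Same structure as times2, but counts every call. *)
let rec times2_counted(a, b) =
  incr calls;
  if b = 0 then 0
  else if even(b) then times2_counted(a*2, b/2)
  else a + times2_counted(a, b-1)

let () =
  calls := 0;
  ignore (times2_counted(7, 1_000_000));
  (* prints 27: roughly lg b halving steps, plus one subtraction step
     for each 1 bit in the binary representation of b *)
  Printf.printf "calls: %d\n" !calls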

Again we want an expression for the running time in terms of n, the magnitude of the parameter b. We assume that the doubling and halving operations take constant time (they could be implemented with constant-time arithmetic shifts), as do the standard primitives. The recurrence relation for this problem is more complicated than the previous one:

T(n) = T(n-1) + c1   for n > 0 and n odd
T(n) = T(n/2) + c2   for n > 0 and n even
T(0) = c3

We somehow need to figure out how often the first versus the second branch of this recurrence relation will be taken. It's easy if n is a power of two, i.e. if n = 2^m for some integer m. In this case, the first branch is taken only when n = 1, because 2^m is even except when m = 0, i.e. when n = 1. Thus, for this special case we can use the expansion technique above to get a closed form in terms of m:

T(2^m) = T(2^(m-1)) + c2     for m > 0
       = T(2^(m-2)) + c2 + c2
       = T(2^(m-m)) + mc2
       = T(1) + mc2
       = T(0) + c1 + mc2
       = c3 + c1 + mc2

which, since m = lg n, is O(lg n). We can show that this is the case using induction.

Strong induction

Strong induction has the same steps as ordinary induction, but the induction hypothesis is a little different:

  1. State the proposition to be proved in terms of P(n)
  2. Base case: show P(n0) is true
  3. Induction hypothesis: Assume that P(m) is true for all m with n0 ≤ m ≤ n. This is different from ordinary induction, where we only get to assume that P(m) is true for m = n.
  4. Induction step: Using the induction hypothesis, prove P(n+1) is true.
  5. Conclusion: P(n) is true for all n ≥ n0

We can formalize this process as an inference rule that gives us a way to prove a proposition ∀n.P(n) for some predicate P. Here are the rules for weak (ordinary) and strong induction respectively:
P(0)     ∀n. P(n) ⇒ P(n+1)
──────────────────────────   (weak induction)
         ∀n. P(n)

∀n. (∀n'. n' < n ⇒ P(n')) ⇒ P(n)
────────────────────────────────   (strong induction)
         ∀n. P(n)

It is often easier to prove asymptotic complexity bounds using strong induction than it is using ordinary induction, because you have a stronger induction hypothesis to work with: ∀n'.n'<n⇒P(n') rather than P(n).

Running time of times2

Let's return to analyzing times2. For the case where n is a power of 2 we have the following recurrence, writing c4 for the constant c1 + c3 incurred when n reaches 1:

T(n) = T(n/2) + c2   for n > 0 and n even
T(1) = c4

We show that T(n) = c2 lg n + c4 is the solution to this recurrence using strong induction.

Base case: T(1) = c2 lg 1 + c4 = c4.

Induction hypothesis: T(n') = c2 lg n' + c4 for all n' < n.

Inductive step:

T(n) = T(n/2) + c2
     = c2 lg(n/2) + c4 + c2     by the induction hypothesis
     = c2 (lg n - lg 2) + c4 + c2
     = c2 lg n - c2 + c4 + c2
     = c2 lg n + c4

Thus in the case of an input of size n which is a power of 2, the running time of times2 is O(lg n).

In the general case, the running time of times2 is also O(lg n), but we need a slightly different recurrence to make this easier to show. Note that it is not possible for the parameter b to be odd on two successive recursive calls of times2, because when it is odd, one is subtracted making it even. Thus we can rewrite the original recurrence as:

T(n) = T((n-1)/2) + c1 + c2   for n > 0 and n odd
T(n) = T(n/2) + c2   for n > 0 and n even
T(1) = c4

Now using the techniques from above we can guess a closed form, and prove by strong induction that it holds. This is left as an exercise.
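
One candidate guess for the exercise (not given in the lecture, but it can be verified to satisfy this recurrence) is the upper bound T(n) ≤ (c1 + c2) lg n + c4 for n ≥ 1; the strong-induction argument mirrors the power-of-two case above, handling the odd and even branches separately.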

Note that this new recurrence is analogous to rewriting the code so that both recursive calls divide the input size in half:

let rec times3(a, b) =
  if (b = 0) then 0
  else if even(b) then times3(a*2, b/2)
  else a + times3(a*2, (b-1)/2)
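
As a quick sanity check (a sketch, not part of the lecture; it assumes times3 and even as defined above), we can compare times3 against the built-in multiplication on a few inputs:

let () =
  List.iter
    (fun (a, b) -> assert (times3(a, b) = a * b))
    [ (0, 0); (3, 7); (5, 16); (12, 1023) ]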