Recitation 23: Recurrences

Asymptotic complexity has limitations, but it is still a useful tool analyzing and improving program performance. The use of big-O notation simplifies the task of analyzing performance. For OCaml, we assume that all the reductions performed during evaluation take constant time, including all arithmetic operations and pattern matching.

This assumption may be surprising if you think in terms of the substitution model. Substitution may require a lot of work, because a substituted variable may appear at many different places in the code. However, the OCaml implementation is closer to the environment model of evaluation, in which the actual substitutions are never done, but bindings are recorded in a separate environment for later lookup. Environments can be implemented so that all environment operations, including lookup, can be performed in constant time.

The following multiplication routine for nonnegative integers is correct, but not very efficient:

let rec times1 a b =
  if b = 0 then 0
  else a + times1 a (b - 1)

What is the order of growth of the time required by times1 as a function of n, where n is the magnitude of the parameter b? Note that the "size" of a number can be measured either in terms of its magnitude or in terms of the number of digits (the space it takes to write the number down). Often the number of digits is used, but here we use the magnitude. Note that it takes only about log₁₀ x digits to write down a number of magnitude x, thus these two measures are very different.

We assume that all the primitive operations in the times1 function (if, +, =, and -) and the overhead for function calls take constant time. Thus if n=0, the routine takes some constant time c₂. If n>0, the time taken on an input of magnitude n is at most some constant c₁ plus the time taken by the recursive call on n-1. In other words, there are constants c₁ and c₂ such that T(n) satisfies:

T(n) = T(n-1) + c₁     for n > 0
T(0) = c₂

This is a recurrence relation (or simply recurrence) defining a function T(n). It simply states that the time to multiply a number a by another number b of size n > 0 is the time required to multiply a by a number of size n-1 plus a constant amount of work (the primitive operations performed).

The recurrence relation is an inductive definition of a function. This particular recurrence relation has a unique closed-form solution that defines T(n) without any recursion. In order to determine the running time of recursive functions we will write recurrences that describe the running time, find closed form solutions to them, and express those solutions in terms of Big-O notation.

Expanding out this particular recurrence gives some idea of how to convert it to closed form:

T(n) = T(n-1) + c₁     for n > 0
     = T(n-2) + c₁ + c₁
     = T(n-3) + c₁ + c₁ + c₁
     = ...
     = T(n-k) + kc₁    for 0 ≤ k ≤ n
     = ...
     = T(0) + nc₁      for n ≥ 0
     = c₂ + c₁n        for n ≥ 0

which is O(n).

More generally, we guess a closed form solution for a recurrence, and then prove by induction that it holds. Sometimes expanding out the recurrence for a few steps can help with the guessing.

Now consider a different procedure for multiplying two numbers:

let rec times2 a b =
  if b = 0 then 0
  else if even b then times2 (a*2) (b/2)
  else a + times2 a (b-1)

Again we want an expression for the running time in terms of n, the magnitude of the parameter b. We assume that double and half operations are constant time (these could be done in constant time using arithmetic shift) as well as the standard primitives. The recurrence relation for this problem is more complicated than the previous one:

T(n) = T(n-1) + c₁   for n > 0 and n odd
T(n) = T(n/2) + c₂   for n > 0 and n even
T(0) = c₃

We somehow need to figure out how often the first versus the second branch of this recurrence relation will be taken. It's easy if n is a power of two, i.e. if n = 2^m for some integer m. In this case, the second branch of will only get taken when n = 1, because 2^m is even except when m = 0, i.e. when n = 1. Thus, for this special case we can use the expansion technique above to get a closed form in terms of m:

T(2^m) = T(2^m-1) + c₂     for m > 0
     = T(2^m-2) + c₂ + c₂
     = ...
     = T(2^m-m) + mc₂    
     = T(1) + mc₂    
     = T(0) + c₁ + mc₂   
     = c₃ + c₁ + mc₂

which is O(log n). We can show that this is the case using induction.

Strong Induction

Strong induction has the same steps as ordinary induction, which you will recall from CS 2800, but the induction hypothesis is a little different:

State the proposition to be proved in terms of P(n)
Base case: show P(n₀) is true
Induction hypothesis: Assume that P(m) is true for all n₀ ≤ m ≤ n. This is different from ordinary induction where we only assume that P(m) is true for m = n.
Induction step: Using the induction hypothesis, prove P(n+1) is true.
Conclusion: P(n) is true for all n ≥ n₀.

We can formalize this process as an inference rule that gives us a way to prove a proposition ∀n.P(n) for some predicate P. Here are the rules for weak (ordinary) and strong induction respectively:

Weak induction:

P(0) ∀n (P(n) ⇒ P(n+1))

∀n P(n)
Strong induction:

∀n (∀m (m < n ⇒ P(m)) ⇒ P(n))
∀n P(n)

Note that the base case does not occur explicitly in the strong induction rule. However, it is still there implicitly: for n=0, you must prove ∀m (m < 0 ⇒ P(m)) ⇒ P(0), which reduces to P(0) since ∀m (m < 0 ⇒ P(m)) is vacuously true, as there are no m < 0.

In practice, it is often easier to prove asymptotic complexity bounds with strong induction than with weak induction, because you have a stronger induction hypothesis to work with, namely ∀m m < n ⇒ P(m) rather than P(n). However, the two rules are logically equivalent: one can prove the validity of the strong rule using the weak rule with the induction hypothesis P'(n) = ∀m (m < n ⇒ P(m)).

Running Time of times2

Let's return to the analysis of times2. When n is a power of 2, we have the following recurrence:

T(n) = T(n/2) + c₂   for n > 1 and n a power of 2
T(1) = c₄            where c₄ = c₃ + c₁

We show T(n) = c₂ log n + c₄ is the solution to this recurrence using strong induction.

Base case: T(1) = c₄ = c₂ log 1 + c₄.

Induction hypothesis: T(m) = c₂ log m + c₄ for all 1 ≤ m < n.

Inductive step:

T(n) = T(n/2) + c₂

= c₂ log(n/2) + c₄ + c₂ by I.H.

= c₂(log n − log 2) + c₄ + c₂

= c₂ log n − c₂ + c₄ + c₂

= c₂ log n + c₄.

Thus in the case of an input of size n which is a power of 2, the running time of times2 is O(log n).

In the general case, the running time of times2 is also O(log n), but we need a slightly different recurrence to make this easier to show. Note that it is not possible for the parameter b to be odd on two successive recursive calls of times2, because when it is odd, one is subtracted making it even. Thus we can rewrite the original recurrence as:

T(n) = T((n−1)/2) + c₁ + c₂  for n > 1 and n odd
T(n) = T(n/2) + c₂           for n > 1 and n even
T(1) = c₄
T(0) = c₃

Note that this new occurrence is analogous to rewriting the code so that both recursive calls divide the input size in half:

let rec times3 a b =
  if b = 0 then 0
  else if even b then times3 (a*2) (b/2)
  else a + times3 (a*2) ((b-1)/2)

It is a little difficult to come up with a closed form solution to this recurrence. However, using strong induction, we can show that whatever it is, it is bounded asymptotically by a function of the form d log n + e for sufficiently large constants d and e, and that is good enough to show that the solution is O(log n). So we want to show by strong induction that T(n) ≤ d log n + e for some (as yet to be determined) constants d and e. To find suitable d and e, we'll just jump in and see what we need to make the proof work.

Base case: We want T(1) = c₄ ≤ d log 1 + e. This will be true provided we choose e ≥ c₄.

Induction hypothesis: T(m) ≤ d log m + e for all 1 ≤ m < n.

Inductive step, n even:

T(n) = T(n/2) + c₂

≤ d log(n/2) + e + c₂ by I.H.

= d log n − d + e + c₂

≤ d log n + e, provided we choose d ≥ c₂.

Inductive step, n odd:

T(n) = T((n−1)/2) + c₁ + c₂

≤ d log((n−1)/2) + e + c₁ + c₂ by I.H.

= d log(n−1) − d + e + c₁ + c₂

≤ d log n − d + e + c₁ + c₂

≤ d log n + e, provided we choose d ≥ c₁ + c₂.

Choosing d and e just large enough to satisfy all these conditions, we see that T(n) ≤ (c₁ + c₂) log n + c₄ for all n ≥ 1, therefore T(n) is O(log n).