CS 3110 Recitation 21
Amortized Analysis Examples

(See: Introduction to Algorithms by Cormen, Leiserson, Rivest, and Stein, Chapter 17 of the 2nd edition.)

Amortized analysis refers to determining the time-averaged running time for a sequence of operations.  It is different from what is commonly referred to as average case analysis, because amortized analysis does not make any assumption about the distribution of the data values, whereas average case analysis assumes the data are not "bad" (e.g., some sorting algorithms do well on "average" over all input orderings but very badly on certain input orderings).  That is, amortized analysis is a worst case analysis, but for a sequence of operations, rather than for individual operations.  It uses the fact that we are analyzing a sequence to "spread out" the costs (think of insurance where everyone can pay a relatively modest amount despite some catastrophic costs).

The motivation for amortized analysis is to better understand the running time of certain techniques, where standard worst case analysis provides an overly pessimistic bound.  Amortized analysis generally applies to a method that consists of a sequence of operations, where the vast majority of the operations are cheap but some are expensive.  If we can show that the expensive operations are sufficiently rare, we can "charge" their cost to the cheap operations, and then bounding the (slightly inflated) cost of the cheap operations bounds the cost of the whole sequence.

The general approach is to assign an artificial cost to each operation in the sequence, such that the total of the artificial costs for the sequence of operations bounds the total of the real costs for the sequence.  This artificial cost is called the amortized cost of an operation.  The amortized cost is thus a correct way of bounding the overall running time of the sequence - but note that a particular operation can still take much longer than its amortized cost, so it is not a way of bounding the running time of any individual operation in the sequence.

Three approaches are commonly used for amortized analysis, and have been termed the aggregate method, the banker's method, and the physicist's method (potential functions).

Physicist's method on dynamically resizable arrays

In the last lecture we saw the aggregate method and the banker's method applied to dynamically resizable arrays. Here, let us have a look at the physicist's method on that same problem.

If we start from an empty array and add an element n times, the total time is O(n), provided that we resize the array by doubling its size whenever it is full.

To see this we need to evaluate the amortized complexity of adding an element to the array. This formalizes the reasoning we used in the aggregate and banker's methods before. We define a potential function Φ (read "Phi") that measures the precharged time for a given state of the data structure. The potential function saves up time that can be used by later operations.

We then define the amortized time taken by a single operation that changes the data structure from h to h' as the actual time plus the change in potential, Φ(h') − Φ(h). Now consider a sequence of n operations, taking actual times t1, t2, t3, ..., tn and producing data structures h1, h2, ..., hn from a starting structure h0. The total amortized time for these operations is the sum of the actual times plus the sum of the changes in potential: t1 + t2 + ... + tn + (Φ(h1)−Φ(h0)) + (Φ(h2) − Φ(h1)) + ... + (Φ(hn) − Φ(hn-1)) = t1 + t2 + ... + tn + Φ(hn) − Φ(h0), because the intermediate terms telescope. Therefore the total amortized time differs from the total actual time by Φ(hn) − Φ(h0), the net change in potential over the whole sequence; it can underestimate the actual time only if the potential ends up lower than it started. If we can arrange that the potential never drops below its initial value, the total amortized time is always an upper bound on the actual time, which is what we want.

The key to amortized analysis is to define the right potential function. The potential function needs to save up enough time to be used later when it is needed. But it cannot save so much time that it causes the amortized time of the current operation to be too high.

Let us do an amortized analysis of dynamically resizable arrays with resizing by doubling. We define the potential function so that it stores up time as the load factor n/m grows beyond 1/2; by the time the array is full, there will be enough stored-up time to pay for the resize. If we call n the number of elements in the array and m the number of slots in the array, the potential function is:

Φ(h) = 2(n - m/2)

(Because in this setting we only ever add elements to the array, and we double whenever it fills, it always holds that n ≥ m/2, so Φ(h) is never negative.)

Now we just have to consider all the cases that can occur for each possible operation. In this simple setting, adding an element is the only operation. Suppose we add an element to an array that currently holds n elements in m slots, so that afterward it holds n+1 elements. Two cases are to be considered:

Case 1: the array is not full (n < m). We simply store the new element, at an actual cost of 1. The potential increases by 2 (n goes up by one while m stays the same), so the amortized time is 1 + 2 = 3.

Case 2: the array is full (n = m). We allocate a new array with 2m slots, copy the m existing elements into it, and then store the new element, at an actual cost of m + 1. Just before the operation the potential is 2(m - m/2) = m; just after it is 2((m+1) - (2m)/2) = 2. The potential therefore changes by 2 - m, and the amortized time is (m + 1) + (2 - m) = 3. (The very first add, when m = 0, is even cheaper: we allocate a single slot at cost 1 and the potential rises to 1, for an amortized time of 2.)

In each case, the amortized time is at most 3, which is O(1). If we start our dynamic array as empty, with zero elements and zero slots, then its initial potential is zero. Since n ≥ m/2 always holds, the potential never becomes negative, so it never drops below its initial value, and the total amortized time is an upper bound on the total actual time. Therefore a sequence of n add operations takes O(n) time.
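
To make this concrete, here is a minimal OCaml sketch of a dynamically resizable array with resizing by doubling (the type and function names, and the choice to start with zero slots, are our own illustration, not code from the course):

  (* A dynamically resizable array of ints. [size] is the number of
     elements in use (n in the analysis above), and the length of
     [slots] is the number of slots (m). We start with zero slots. *)
  type darray = {
    mutable size : int;
    mutable slots : int array;
  }

  let create () = { size = 0; slots = [||] }

  (* Add [x] at the end. Usually this is a single write (cost 1); when
     the array is full we allocate a backing array of twice the size
     (one slot if it was empty) and copy everything over, which costs
     O(n) but happens rarely enough that the amortized cost is O(1). *)
  let add a x =
    let m = Array.length a.slots in
    if a.size = m then begin
      let bigger = Array.make (max 1 (2 * m)) x in
      Array.blit a.slots 0 bigger 0 a.size;
      a.slots <- bigger
    end;
    a.slots.(a.size) <- x;
    a.size <- a.size + 1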

Binary Counter

Consider the problem of storing a very large binary counter.  Say we decide to use an array, where each entry A[i] stores the i-th bit. 

We will analyze the running time of the operation of counting using this representation, so the sequence of operations is a sequence of increments of the counter.

We will use the standard way of incrementing the counter, which is to toggle the lowest order bit.  If that bit switches to a 0 we toggle the next higher order bit, and so forth until the bit that we toggle switches to a 1 at which point we stop.
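
A minimal OCaml sketch of this increment, for a counter stored as a bit array, might look as follows (the function name and the no-overflow assumption are ours).  The table below then traces the counter through the first few increments, along with the cost (number of bits toggled) of each:

  (* Increment a binary counter stored as a bit array [a], where a.(i)
     holds the i-th bit and a.(0) is the lowest-order bit. We flip
     trailing 1-bits to 0 until we can flip a 0-bit to 1. Assumes the
     counter does not overflow, i.e., some entry of [a] is still 0. *)
  let increment (a : int array) : unit =
    let i = ref 0 in
    while a.(!i) = 1 do
      a.(!i) <- 0;          (* toggle a 1 to a 0, keep going *)
      incr i
    done;
    a.(!i) <- 1             (* toggle a 0 to a 1 and stop *)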

  A[m]  A[m-1] ...    A[3]  A[2]  A[1]  A[0]      cost
--------------------------------------------      ----
    0     0             0     0     0     0
                                                    
    0     0             0     0     0     1        1
                                                     
    0     0             0     0     1     0        2
                                                     
    0     0             0     0     1     1        1
                                                     
    0     0             0     1     0     0        3
                                                     
    0     0             0     1     0     1        1
                                                     
    0     0             0     1     1     0        2

When the result of the increment operation is n, the number of bits that change is at most 1 + floor(lg n), that is, the number of bits in the binary representation of n.

Thus in a traditional worst case analysis, the cost of counting up to n, which is a sequence of n increments, is O(n log n).

But do you really think this is taking that much time?  Does each increment in the sequence really cost O(log n)?  How do we show it's less?  This is the goal of amortized analysis.

Aggregate method.  Let's consider how often each individual bit is flipped and sum those counts to bound the total, rather than taking a worst case bound for each increment separately.  How often is A[0] toggled in counting up to n?  How often are A[1], A[2], etc. toggled?  Every time, every other time, every fourth time, ...

    total number of toggles  =  n + floor(n/2) + floor(n/4) + ...
                             <= n + n/2 + n/4 + ...
                             <= 2n

So the amortized cost of an increment is 2, and the time to count to n (n increments) is 2n=O(n).
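
If you want to convince yourself empirically, here is a small self-contained OCaml check (the function name and the particular choices of counter width and n are ours) that counts the total number of bit toggles performed over n increments and compares it with the 2n bound:

  (* Increment a [w]-bit counter [n] times, counting every bit toggle,
     and report the total next to the 2n bound from the aggregate
     analysis. Requires n < 2^w so the counter never overflows. *)
  let total_toggles w n =
    let a = Array.make w 0 in
    let toggles = ref 0 in
    for _ = 1 to n do
      let i = ref 0 in
      while a.(!i) = 1 do
        a.(!i) <- 0; incr toggles; incr i
      done;
      a.(!i) <- 1; incr toggles
    done;
    !toggles

  let () =
    let n = 1000 in
    Printf.printf "total toggles = %d, 2n = %d\n" (total_toggles 20 n) (2 * n)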

Banker's method. This approach is more akin to an insurance view, or an accounting view.  The idea is to see how much each individual operation needs to "pay" in order for there to be enough "money" to do all the work.  The only work here is toggling a bit, so let's say that it costs $1 to toggle a bit.  The question is, how much money do we need to allocate to each increment to have banked up enough to pay for all the toggles as they occur.

Of course, having just used the aggregate method, $2 is a good guess (though not a guarantee: we might run out of money partway through the sequence even though there is enough in total by the end).

What happens on an increment?  Zero or more bits change from a 1 to a 0, and then one bit changes from a 0 to a 1.  Let's say we want to leave $1 at each 1-bit so that when it eventually needs to turn back to 0 we can pay for that work (there may be many such bits in a single increment, so we need to have paid in advance).  Then a simple scheme, costing $2 per increment, is to use up the stored $1 at each 1-bit we flip to a 0, and then pay $1 to flip the lowest-order 0-bit to a 1 and leave another $1 with that bit for later use.

This shows that the amortized cost of the increment is 2, and the overall time is O(n) for n increments.

Queue

Earlier in the semester we saw a way of implementing a queue (FIFO) using two stacks (LIFO).  Say that our stack has three operations, push, pop and empty, each with cost 1.

We saw that a queue can be implemented as follows: enqueue pushes the new element onto stack1; dequeue pops an element off stack2, except that when stack2 is empty it first pops every element off stack1 and pushes each one onto stack2 (reversing their order), and then pops from stack2.
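
Here is a minimal OCaml sketch of this two-stack queue, using lists as the stacks (the type and function names are our own, not necessarily those used earlier in the course):

  (* A FIFO queue built from two LIFO stacks, each represented as a
     list with its top at the head. stack1 receives newly enqueued
     elements; stack2 holds elements in dequeue order. *)
  type 'a queue = {
    mutable stack1 : 'a list;
    mutable stack2 : 'a list;
  }

  let create () = { stack1 = []; stack2 = [] }

  (* enqueue: a single push onto stack1 *)
  let enqueue q x = q.stack1 <- x :: q.stack1

  (* dequeue: if stack2 is empty, pop every element off stack1 and push
     it onto stack2 (reversing the order), then pop from stack2.
     Returns None if the queue is empty. *)
  let dequeue q =
    if q.stack2 = [] then begin
      List.iter (fun x -> q.stack2 <- x :: q.stack2) q.stack1;
      q.stack1 <- []
    end;
    match q.stack2 with
    | [] -> None
    | x :: rest -> q.stack2 <- rest; Some x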

We've seen earlier that this algorithm is correct; now we will consider the running time in more detail.

A conventional worst case analysis would establish that dequeue takes O(n) time but this is clearly a weak bound for a sequence of operations because very few dequeues can take that long.  That is, O(n^2) is not a very accurate characterization of how long a sequence of n enqueue and dequeue operations will take, even though in the worst case an individual dequeue can certainly take O(n) time.

To simplify the amortized analysis we will consider only the cost of the push and pop operations and not of checking whether stack2 is empty, as that operation clearly costs 1 per dequeue.

Aggregate method.  Each element that is enqueued is pushed onto stack1, and possibly later popped off stack1 and pushed onto stack2: at most three operations, whether or not it is ever dequeued.  An element that is dequeued additionally costs one pop off stack2.  So the total cost of a sequence is at most 3 per enqueue plus 1 per dequeue; that is, the amortized cost of each enqueue is 3 and of each dequeue is 1.

Banker's method. Allocate $2 to each element when it is enqueued.  This will cover the cost of popping it off stack1 and pushing it onto stack2, if that ever needs to be done.  There is an additional cost of $1 for the initial push onto stack1, for a total enqueue cost of $3.  A dequeue costs $1, which pays for the pop from stack2.

Note that the analysis in both cases seems to charge more for storing than removing, even though in the code it is the other way around.  Amortized analysis bounds the overall sequence, which in this case depends on how much stuff is stored in the data structure.  It does not bound the individual operations.

Potential Functions (Physicist's Method)

This is a more formal version of the banker's method.  Instead of distributing money across items in the data structure, we will keep money in a piggy bank, as what's really important is that we have enough money to pay for operations as they happen.  We will call the amount of money in the bank the potential function, Φ, for the problem.

For each of the problems considered above we can write down a potential function.

For the counter problem:

     Φ(counter) = # of 1 bits in the counter

For the queue implemented in terms of a stack:

     Φ(queue) = 2 * size of stack 1

If a data structure changes from state S to state S' as the result of doing some operation, we define the amortized cost of the operation using Φ as

     amortized cost = actual cost + Φ(S') - Φ(S)

which is simply the amount of additional money that we need in order to pay for the work while keeping our piggy bank balanced.
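
As a quick sanity check (this worked example is ours, though it follows directly from the definitions above), these potential functions reproduce the amortized costs computed earlier.  For the counter, an increment that flips k 1-bits to 0 and then one 0-bit to 1 has actual cost k + 1, and the number of 1 bits changes by 1 - k, so its amortized cost is

     (k + 1) + (1 - k) = 2

For the queue, an enqueue is a single push onto stack1 and increases Φ by 2, so its amortized cost is 1 + 2 = 3; a dequeue that first moves k elements from stack1 to stack2 costs 2k + 1 (k pops, k pushes, and one final pop) and decreases Φ by 2k, so its amortized cost is (2k + 1) - 2k = 1.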

Note that if the potential is non-negative and starts at zero, then the sum of the actual costs is <= the sum of the amortized costs.