Recitation 16: Amortized Analysis Examples

Amortized analysis refers to determining the time-averaged running time for a sequence of operations. It is different from what is commonly referred to as average case analysis, because amortized analysis does not make any assumption about the distribution of the data values, whereas average case analysis assumes the data are not "bad" (e.g., some sorting algorithms do well on "average" over all input orderings but very badly on certain input orderings). That is, amortized analysis is a worst case analysis, but for a sequence of operations, rather than for individual operations. It uses the fact that we are analyzing a sequence to "spread out" the costs (think of insurance where everyone can pay a relatively modest amount despite some catastrophic costs).

The motivation for amortized analysis is to better understand the running time of certain techniques, where standard worst case analysis provides an overly pessimistic bound. Amortized analysis generally applies to a method that consists of a sequence of operations, where the vast majority of the operations are cheap, but some of the operations are expensive. If we can show that the expensive operations are particularly rare we can "charge them" to the cheap operations, and only bound the cheap operations.

The general approach is to assign an artificial cost to each operation in the sequence, such that the total of the artificial costs for the sequence of operations bounds total of the real costs for the sequence. This artificial cost is called the amortized cost of an operation. In order to analyze the running time, the amortized cost thus is a correct way of understanding the overall running time - but note that particular operations can still take longer so it is not a way of bounding the running time of any individual operation in the sequence.

Three approaches are commonly used for amortized analysis, and have been termed:

the aggregate method,
the banker's method (tokens),
the physicist's method (potential functions).

Binary Counter

Consider the problem of storing a very large binary counter. Say we decide to use an array, where each entry A[i] stores the i-th bit.

We will analyze the running time of the operation of counting using this representation, so the sequence of operations is a sequence of increments of the counter.

We will use the standard way of incrementing the counter, which is to toggle the lowest order bit. If that bit switches to a 0 we toggle the next higher order bit, and so forth until the bit that we toggle switches to a 1 at which point we stop.

  A[m]  A[m-1] ...    A[3]  A[2]  A[1]  A[0]      cost
--------------------------------------------      ----
    0     0             0     0     0     0
                                                    
    0     0             0     0     0     1        1
                                                     
    0     0             0     0     1     0        2
                                                     
    0     0             0     0     1     1        1
                                                     
    0     0             0     1     0     0        3
                                                     
    0     0             0     1     0     1        1
                                                     
    0     0             0     1     1     0        2

When the result of the increment operation is n, the number of bits that change is at most 1+floor(lg n), that is the number of bits in the binary representation of n.

Thus in a traditional worst case analysis, the cost of counting up to n, which a sequence of n increments, is O(n log n).

But do you really think this is taking that much time? Does each increment in the sequence really cost O(log n)? How do we show its less? This is the goal of amortized analysis.

Aggregate Method. Let's consider how often we flip each individual bit, and sum those up to bound the total, rather than individually obtaining a worst case bound for each bit. How often is A[0] toggled in counting up to n? How often is A[1] toggled, A[2] toggled, etc? Every time, every other time, every fourth time, ...

            n + floor(n/2) + floor(n/4) + ...
    <=   n + n/2 + n/4 + ....
    <=   2n

So the amortized cost of an increment is 2, and the time to count to n (n increments) is 2n=O(n).

Banker's method. This approach is more akin to an insurance view, or an accounting view. The idea is to see how much each individual operation needs to "pay" in order for there to be enough "money" to do all the work. The only work here is toggling a bit, so lets say that it costs $1 to toggle a bit. The question is, how much money do we need to allocate to each increment to have banked up enough to pay for all the toggles as they occur.

Of course having just used the aggregate method, $2 would be a good guess (though not a guarantee, maybe we run out along the way at some point even though there is enough money later).

What happens on an increment? Zero or more bits change from a 1 to a 0, and then one bit changes from a 0 to a 1. Let's say we want to leave $1 at each 1-bit so that when it needs to turn to 0 we can pay for the work (because there may be many of those so we need to pay up in advance). Then a simple scheme, costing $2 per increment, is to use up the $1 to flip each 1-bit to a 0, and then pay $1 to flip the high-order 0-bit to a 1 and another $1 to leave there for later use.

This shows that the amortized cost of the increment is 2, and the overall time is O(n) for n increments.

Queue

Earlier in the semester we saw a way of implementing a queue (FIFO) using two stacks (LIFO). Say that our stack has three operations, push, pop and empty, each with cost 1.

We saw that a queue can be implemented as

enqueue(x): push x onto stack1
dequeue(): if stack2 is empty then pop the entire contents of stack1 pushing each element in turn onto stack 2. Now pop from stack2 and return the result.

We've seen earlier that this algorithm is correct, now we will consider the running time in more detail.

A conventional worst case analysis would establish that dequeue takes O(n) time but this is clearly a weak bound for a sequence of operations because very few dequeues can take that long. That is, O(n^2) is not a very accurate characterization of how long a sequence of n enqueue and dequeue operations will take, even though in the worst case an individual dequeue can certainly take O(n) time.

To simplify the amortized analysis we will consider only the cost of the push and pop operations and not of checking whether stack2 is empty, as that operation clearly costs 1 per dequeue.

Aggregate method. As each element is processed, it is clearly pushed at most twice and popped at most twice. However, if an element is enqueued and never dequeued then it is pushed at most twice and popped at most once. Thus the cost of each enqueue is 3 and of each dequeue is 1.

Banker's method. Allocate $2 to each element stored in the queue structure. This will cover the cost of popping it and pushing it from stack1 to stack2 if that ever needs to be done. There is an additional cost of $1 for the initial push onto stack1, for a cost of $3. The dequeue operations cost $1 to pop from stack2.

Note that the analysis in both cases seems to charge more for storing than removing, even though in the code it is the other way around. Amortized analysis bounds the overall sequence, which in this case depends on how much stuff is stored in the data structure. It does not bound the individual operations.

Potential Functions (Physicist's Method)

This is a more formal version of the banker's method. Instead of distributing money across items in the data structure, we will keep money in a piggy bank, as what's really important is that we have enough money to pay for operations as they happen. We will call the amount of money in the bank the potential function, Φ, for the problem.

For each of the problems considered above we can write down a potential function.

For the counter problem:

Φ(counter) = # of 1 bits in the counter

For the queue implemented in terms of a stack:

Φ(queue) = 2 * size of stack 1

Say a data structure changes from state S to state S' as the result of doing some operation, we can define the amortized cost of the operation using Φ as

amortized cost = actual cost + Φ(S') - Φ(S)

Which is simply the amount of additional money that we need to maintain our piggy bank and pay for the work.

Note that if the potential is non-negative and starts at zero, then the sum of the actual costs is <= the sum of the amortized costs.