CS410, Summer 1998
Lecture 18 Outline
Dan Grossman

Goals:
* Improvements to Merge Sort
* Quicksort

Improvements to Merge Sort

Yesterday we discussed that switching to insertion sort for small subproblems can improve the overall running time by a constant factor. Today we will discuss two more constant-factor improvements, neither of which is completely obvious at first.

The first: when copying the sorted halves into the auxiliary array, put the right half in reverse order. For example, if the two halves are 1 2 5 7 and 3 4 8 9, copy so that the auxiliary array looks like 1 2 5 7 9 8 4 3. This has the effect of putting the maximum element in the middle of the array (since it is either the last element of the first half or the first element of the second half). As a result, this element serves as a sentinel for whichever pointer would otherwise have "fallen off the end" of its half during the merge, so we never need to check for that case. The new merge looks like:

  int j = l; // scans the left half, moving right
  int k = r; // scans the reversed right half, moving left
  for (int i = l; i <= r; i++) {
    if (aux[k] < aux[j]) {
      a[i] = aux[k];
      k--;
    } else {
      a[i] = aux[j];
      j++;
    }
  }

The second improvement eliminates half of the copying -- surely a large constant-factor improvement. We modify mergesort so that it takes two arrays and sorts one into the other. That is, it does not put the answer where the input is. The whole thing looks like:

  mergesort(array a, array b, int l, int r):
    // sort a from l to r, putting the answer in b from l to r
    if (l < r) {
      m = (l+r)/2;
      mergesort(b, a, l, m);   // notice we flip the roles of a and b
      mergesort(b, a, m+1, r); // notice we flip the roles of a and b
      merge(a, b, l, r);
    }

  merge(source, dest, l, r):
    same as the basic version, but don't copy back -- just merge into dest

  sort(array a):
    copy a into b
    call mergesort(b, a, 0, a.length-1)

We went through a long example in class to get some intuition that this actually works. Just think inductively -- the flipped recursive calls leave the two sorted halves in a, and the merge then puts the sorted answer in b. (This relies on a and b starting with the same contents, which the initial copy guarantees.)
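The sentinel merge above can be fleshed out into runnable Java. This is a minimal sketch under our own naming (the class and the explicit reversed-copy loops are ours; the lecture leaves the copying step implicit):

```java
import java.util.Arrays;

public class SentinelMerge {
    // Merge a[l..m] and a[m+1..r], both sorted ascending, using the
    // reversed-right-half trick so neither inner loop needs a bounds check.
    static void merge(int[] a, int[] aux, int l, int m, int r) {
        // Copy the left half in order...
        for (int i = l; i <= m; i++) aux[i] = a[i];
        // ...and the right half in REVERSE order, so the larger of the two
        // halves' maxima sits in the middle and acts as a sentinel.
        for (int i = m + 1; i <= r; i++) aux[i] = a[r - (i - (m + 1))];
        int j = l;  // scans the left half, moving right
        int k = r;  // scans the reversed right half, moving left
        for (int i = l; i <= r; i++) {
            if (aux[k] < aux[j]) a[i] = aux[k--];
            else                 a[i] = aux[j++];
        }
    }

    public static void main(String[] args) {
        // The lecture's example: halves 1 2 5 7 and 3 4 8 9.
        int[] a = {1, 2, 5, 7, 3, 4, 8, 9};
        merge(a, new int[a.length], 0, 3, a.length - 1);
        System.out.println(Arrays.toString(a)); // [1, 2, 3, 4, 5, 7, 8, 9]
    }
}
```

When both pointers reach the sentinel, `aux[k] < aux[j]` compares the maximum with itself and fails, so the `else` branch consumes it exactly once and neither pointer escapes its region.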
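The copy-avoiding scheme can likewise be made concrete. A sketch under our own naming, with the role-flipping recursion from the pseudocode and an ordinary (non-sentinel) merge:

```java
import java.util.Arrays;

public class NoCopyMergeSort {
    // Sort a[l..r], leaving the answer in b[l..r]. Requires that a and b
    // hold identical contents on [l..r] beforehand; the recursive calls
    // flip the roles of a and b, so no copy-back is ever needed.
    static void mergesort(int[] a, int[] b, int l, int r) {
        if (l < r) {
            int m = (l + r) / 2;
            mergesort(b, a, l, m);      // roles flipped
            mergesort(b, a, m + 1, r);  // roles flipped
            merge(a, b, l, m, r);       // merge a's sorted halves into b
        }
    }

    // Plain merge of a[l..m] and a[m+1..r] into b[l..r]; with no sentinel
    // we must check for either pointer running off the end of its half.
    static void merge(int[] a, int[] b, int l, int m, int r) {
        int j = l, k = m + 1;
        for (int i = l; i <= r; i++) {
            if (j > m)            b[i] = a[k++];
            else if (k > r)       b[i] = a[j++];
            else if (a[k] < a[j]) b[i] = a[k++];
            else                  b[i] = a[j++];
        }
    }

    static void sort(int[] a) {
        int[] b = Arrays.copyOf(a, a.length); // the one unavoidable copy
        mergesort(b, a, 0, a.length - 1);     // sorts b, answer lands in a
    }

    public static void main(String[] args) {
        int[] a = {5, 3, 8, 1, 9, 2, 7};
        sort(a);
        System.out.println(Arrays.toString(a)); // [1, 2, 3, 5, 7, 8, 9]
    }
}
```

Note the precondition: each call assumes its two arrays agree on the segment being sorted, which the initial copy in `sort` establishes and the recursion preserves.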
Using both improvements together is a bit tricky.

Quicksort

Let partition be a function that returns an integer i and guarantees:
* all elements to the left of array[i] are less than array[i]
* all elements to the right of array[i] are greater than array[i]

Then array[i] holds the correct element, and to sort the whole array we just need to sort the parts to its left and right. This is the whole idea behind quicksort:

  quicksort(a, l, r):
    if (l < r) {
      i = partition(a, l, r);
      quicksort(a, l, i-1);
      quicksort(a, i+1, r);
    }

We can implement partition in O(n) time, where n is r-l. The idea is to have two pointers moving inward from the ends. When the one on the left points to something too big and the one on the right points to something too small, we swap and continue:

  partition(a, l, r):
    int p = pickPivot(l, r);
    swap(a, l, p); // put the pivot element in the first position
    int i = l+1;
    int j = r;
    while (true) {
      while (i <= r && a[i] < a[l])
        i++;
      while (a[j] > a[l]) // needn't check j > l b/c j == l makes the
        j--;              // test false anyway
      if (j <= i) break;
      swap(a, i, j);
    }
    swap(a, l, j); // put the pivot in the correct place
    return j;

For example, for the array 8 6 7 5 3 0 9 and p of 2, the array ends up looking like 3 6 0 5 7 8 9 and the return value is 4.

Quicksort is in-place, unstable, and list-unfriendly. In practice, it is generally faster than all of the other methods we have learned.

The running time of quicksort on an array of size n can be expressed by the recurrence:

  T(n) = T(left side of partition) + T(right side of partition) + O(n)

The best case is when the partition lands in the middle every time:

  T(n) = T(n/2) + T(n/2) + O(n) = 2T(n/2) + O(n) = O(n log n)

The worst case is when the partition falls all the way to one side every time:

  T(n) = T(n-1) + O(1) + O(n) = O(n^2)

We will show that we can expect behavior much closer to the best case than the worst.
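The quicksort and partition pseudocode translate to Java directly. A sketch with names of our choosing: partition takes the pivot index as a parameter so the lecture's example (p = 2) can be reproduced, pickPivot just takes the midpoint for reproducibility (the analysis below will argue for a random choice), and we add an `i++; j--;` after the interior swap, a small safeguard beyond the lecture's version that keeps the loop from stalling when keys are equal:

```java
import java.util.Arrays;

public class QuickSort {
    // Deterministic stand-in; the analysis prefers a random index in [l, r].
    static int pickPivot(int l, int r) {
        return (l + r) / 2;
    }

    static void swap(int[] a, int i, int j) {
        int t = a[i]; a[i] = a[j]; a[j] = t;
    }

    // Partition a[l..r] around the element at index p; returns the
    // pivot's final resting index.
    static int partition(int[] a, int l, int r, int p) {
        swap(a, l, p);              // put the pivot in the first position
        int i = l + 1;
        int j = r;
        while (true) {
            while (i <= r && a[i] < a[l]) i++; // find something too big
            while (a[j] > a[l]) j--;           // find something too small
            if (j <= i) break;
            swap(a, i, j);
            i++; j--;   // step past the swapped pair (also prevents an
                        // infinite loop when both elements equal the pivot)
        }
        swap(a, l, j);              // put the pivot in the correct place
        return j;
    }

    static void quicksort(int[] a, int l, int r) {
        if (l < r) {
            int i = partition(a, l, r, pickPivot(l, r));
            quicksort(a, l, i - 1);
            quicksort(a, i + 1, r);
        }
    }

    public static void main(String[] args) {
        // The lecture's example: pivot index 2 (the element 7).
        int[] a = {8, 6, 7, 5, 3, 0, 9};
        int i = partition(a, 0, a.length - 1, 2);
        System.out.println(i + " " + Arrays.toString(a)); // 4 [3, 6, 0, 5, 7, 8, 9]

        int[] b = {5, 3, 8, 1, 9, 2, 7};
        quicksort(b, 0, b.length - 1);
        System.out.println(Arrays.toString(b)); // [1, 2, 3, 5, 7, 8, 9]
    }
}
```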
For intuition, realize that we don't have to split right at the middle to get O(n log n) behavior. Even if the split were always 9/10 on one side and 1/10 on the other, we would be fine:

  T(n) = T(9n/10) + T(n/10) + O(n) = O(n log n)

In fact, any _constant-fraction_ split gives O(n log n). It is only when the split is expressed in terms of n (for example, "3 and (n-3)") that the asymptotic time changes.

In fact, however, we do not get the same split every time. This complicates the analysis but actually improves the expected behavior. The intuition is that a few bad splits won't hurt us as long as they are rare.

All of this leads to an important conclusion about how to implement pickPivot: choose a number between l and r AT RANDOM. Since most positions do not produce bad splits, if pivot elements are chosen at random, we expect O(n log n) behavior. This is proven rigorously in the text; we sketch the argument here.

Claim: Assuming every call to partition chooses the pivot element uniformly at random, the expected running time of quicksort is O(n log n).

Proof Sketch:

  Expected T(n) = (1/n)(sum from q=1 to n-1 of (T(q) + T(n-q))) + O(n)

This is just the recurrence rewritten so that each possible split occurs with probability 1/n. Now notice that the summation actually contains two T(i) terms for every i between 1 and n-1, so we can simplify to:

  Expected T(n) = (2/n)(sum from q=1 to n-1 of T(q)) + O(n)

Now we guess T(n) <= an log n + b and prove it by induction, writing the O(n) term as cn. The base case is automatic, as in all recurrence relations. The inductive case begins:

  T(n)  = (2/n)(sum from q=1 to n-1 of T(q)) + cn
       <= (2/n)(sum from q=1 to n-1 of (aq log q + b)) + cn    by I.H.
        = (2a/n)(sum from q=1 to n-1 of q log q) + (2b/n)(n-1) + cn

At this point it suffices to prove that the summation of q log q is at most (1/2)n^2 log n - (1/8)n^2: substituting that bound gives T(n) <= an log n - (a/4)n + 2b + cn, which is at most an log n + b once a is chosen large enough relative to b and c. The details are in the text.
It is easy to see that the summation is O(n^2 log n), since each log q term is less than log n and the sum of the first n integers is O(n^2). But this is not sufficient for the inductive proof, because there we must use the same constants a, b, and c throughout (see Lecture 1).
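The sharper bound the proof needs can at least be sanity-checked numerically. A throwaway sketch (class name ours), taking log to be base 2:

```java
public class SumBoundCheck {
    static double lg(double x) { return Math.log(x) / Math.log(2); }

    public static void main(String[] args) {
        // Check sum_{q=1}^{n-1} q lg q <= (1/2) n^2 lg n - (1/8) n^2
        // for a range of n. (A check, not a proof -- the proof is in the text.)
        for (int n = 2; n <= 1000; n++) {
            double sum = 0;
            for (int q = 1; q <= n - 1; q++) sum += q * lg(q);
            double bound = 0.5 * n * n * lg(n) - n * (double) n / 8;
            if (sum > bound) {
                System.out.println("bound fails at n = " + n);
                return;
            }
        }
        System.out.println("bound holds for all n in [2, 1000]");
    }
}
```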