Sorting

Reading: CLR [1,7], Sedgewick [6,8,9].

Today: comparison sorts
  * Selection sort
  * Bubblesort
  * Insertion sort
  * Mergesort
  * Heapsort
  * Quicksort

Issues:
* Is the method stable? Stable = any two elements with equal keys
  occur in the same order in the input and in the output.
* Is the method "list-friendly", or is the best way to sort a list
  with the method to convert the list to an array first?
* Is the method "in-place", or do we need more than O(1) additional
  space?
* What are the best-case/worst-case/expected running times? What
  input data exhibits the best case and the worst case?
* How do the constant factors hidden by big-O compare between
  methods?


Selection sort

Invariant: after i stages, a[0],...,a[i-1] are sorted and
a[j] <= a[k] for all j < i and k >= i.

i-th stage: find the smallest element of a[i],...,a[n-1] and swap it
with a[i].

    void selectionSort(int[] a) {
        for (int i = 0; i < a.length-1; i++) {
            int min = i;
            for (int j = i+1; j < a.length; j++) {
                if (a[j] < a[min]) min = j;
            }
            swap(a, i, min);
        }
    }

-- best case O(n^2)
-- the array version above is not stable, even though we take the
   first occurrence of the minimum key: the swap can carry an element
   past an equal-keyed element (2a 2b 1 becomes 1 2b 2a). A list
   version that removes the minimum node and appends it to the output
   is stable.
-- in-place and list-friendly


Bubblesort

Invariant: after i stages, a[0],...,a[i-1] are sorted and
a[j] <= a[k] for all j < i and k >= i (same as selection sort).

i-th stage: starting at the end of the array, "bubble" small elements
down toward a[i].

    void bubbleSort(int[] a) {
        for (int i = 0; i < a.length-1; i++) {
            for (int j = a.length-1; j > i; j--) {
                if (a[j] < a[j-1]) swap(a, j, j-1);
            }
        }
    }

-- stable, list-friendly, in-place
-- best case O(n^2)
-- constant factor worse than selection sort, since it does more
   swaps


Insertion sort

Invariant: after i stages, a[0],...,a[i-1] are sorted.

i-th stage: swap a[i] down until it settles into its correct place
among a[0],...,a[i-1].

    void insertionSort(int[] a) {
        for (int i = 1; i < a.length; i++) {
            if (a[i] < a[i-1]) {
                int temp = a[i];
                int j;
                for (j = i-1; j >= 0 && a[j] > temp; j--) {
                    a[j+1] = a[j];
                }
                a[j+1] = temp;
            }
        }
    }

-- stable, because we use the strict inequality a[j] > temp
-- list-friendly, in-place
-- best case O(n), which occurs when the array is already sorted; the
   inner loop is never executed
-- worst case O(n^2), which occurs when the array is reverse-sorted
-- expected time on a random permutation is O(n^2), since we expect
   the i-th iteration to execute the inner loop i/2 times


Heapsort

We actually already saw a way to sort when we studied heaps:
* Build a heap using heapify -- O(n)
* Delete-min until the heap is empty -- O(n log n)

When we call delete-min for the i-th time, we can put the deleted
element in a[n-i], so the method is in-place. (With a min-heap this
fills the array in decreasing order; use a max-heap and delete-max if
you want increasing order.)

This is not list-friendly -- we cannot efficiently follow the
implicit tree pointers of the heap using a list.

This is not stable. Here is an example (where the letters reveal the
original order of elements with equal keys):

         1                       2a
        / \                     /  \
      2a   2b      ===>       2c    2b
     /  \                    /
   2c    2d                2d

We needed 2a to be the new root, so our policy for resolving ties
must have been "resolve in favor of the left child". But now we're
stuck with that policy (our heap can't change the policy based on
what it needs, because it doesn't know the letters!). The next
delete-min will then cause

                     2c
        ===>        /  \
                  2d    2b

So 2c will appear before 2b -- the sort is not stable.
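To make the array version concrete, here is a minimal sketch of
in-place heapsort (an illustration, not the code from the heaps
lecture). It uses a max-heap rather than the min-heap above, so that
repeated delete-max fills the array from the back and the result
comes out in increasing order. siftDown is a hypothetical helper;
swap(a,i,j) is the same helper used throughout these notes.

    // sketch: in-place heapsort with a max-heap, ascending result
    void heapsort(int[] a) {
        int n = a.length;
        // heapify: sift down every internal node, O(n) total
        for (int i = n/2 - 1; i >= 0; i--) siftDown(a, i, n);
        // repeatedly delete the max; each deleted element lands just
        // past the shrinking heap, so no extra space is needed
        for (int end = n - 1; end > 0; end--) {
            swap(a, 0, end);       // delete-max: max moves to a[end]
            siftDown(a, 0, end);   // restore the heap on a[0..end-1]
        }
    }

    // restore the max-heap property for the subtree rooted at i,
    // within the heap a[0..size-1]
    void siftDown(int[] a, int i, int size) {
        while (2*i + 1 < size) {
            int c = 2*i + 1;                         // left child
            if (c + 1 < size && a[c+1] > a[c]) c++;  // pick larger child
            if (a[i] >= a[c]) break;                 // heap order holds
            swap(a, i, c);
            i = c;
        }
    }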
Mergesort

Classic divide-and-conquer: divide the input in half, recursively
sort each half, then merge the sorted subarrays in linear time.

    // requires 0 <= left <= right <= a.length
    // sorts the subarray a[left],...,a[right-1]
    void mergesort(int[] a, int left, int right) {
        if (right - left < 2) return;   // 0 or 1 elements: already sorted
        int mid = (left + right) / 2;
        mergesort(a, left, mid);
        mergesort(a, mid, right);
        merge(a, left, mid, right);
    }

The merge operation runs in linear time O(n), but it uses an
auxiliary array: we walk down the two sorted subarrays, comparing
their front elements to see which goes next into the temp array, and
when we're done we copy everything back into the original array. So
the running time for mergesort on an array of size n is

    T(n) = 2T(n/2) + O(n),

which is O(n log n).

This method is stable, but it is not in-place. It is sort of
list-friendly: the only hard part is finding the middle of the list,
which costs linear time per call, matching the merge itself, so the
asymptotic complexity does not change.
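For reference, here is a sketch of the basic merge just described,
under the same half-open conventions. The auxiliary array aux is an
assumption of this sketch: scratch space at least a.length long,
allocated once by the caller rather than on every call.

    int[] aux;   // assumed scratch space, at least a.length long

    // merge the sorted runs a[left..mid-1] and a[mid..right-1]
    void merge(int[] a, int left, int mid, int right) {
        for (int i = left; i < right; i++)     // copy both runs into aux
            aux[i] = a[i];
        int j = left;                          // head of the left run
        int k = mid;                           // head of the right run
        for (int i = left; i < right; i++) {
            if (j >= mid)             a[i] = aux[k++];  // left run exhausted
            else if (k >= right)      a[i] = aux[j++];  // right run exhausted
            else if (aux[k] < aux[j]) a[i] = aux[k++];  // right head smaller
            else                      a[i] = aux[j++];  // ties take from the left
        }
    }

Taking from the left run on ties is exactly what makes the sort
stable.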
Improvements to mergesort

There is no reason the recursive calls on smaller subarrays have to
use the same sorting method; all we care about is that the smaller
subarray ends up sorted. It turns out that for small n (say, less
than ten or so), insertion sort is faster than mergesort: n^2 grows
faster than n log n, but for small n the hidden constant factors make
the difference. So we could change mergesort to begin

    if (right - left <= 10) {
        insertionSort(a, left, right);
    } else {
        // do the usual mergesort as above
    }

assuming insertionSort is appropriately modified to handle subarrays.
Note that this eliminates roughly 15/16ths of all the calls to
mergesort, at the expense of doing roughly n/10 insertion sorts on
small arrays. This is based on the observation that almost all of the
recursive calls are on arrays of small size. (Draw the recursion tree
to see why!) This is generally a win in practice.

Here are two more constant-factor improvements, neither of which is
completely obvious at first.

First: when copying the sorted halves into the auxiliary array, put
the right half in reverse order. For example, if the two halves are
1 2 5 7 and 3 4 8 9, copy them so that the auxiliary array looks like
1 2 5 7 9 8 4 3. This puts the maximum element in the middle of the
auxiliary array (it is either the last element of the first half or
the first element of the second half). As a result, the maximum
serves as a sentinel for whichever pointer would otherwise have
"fallen off the end" of its half during the merge, so we never need
to check for that case. The new merge looks like:

    int j = l;       // scans the left half, moving right
    int k = r;       // scans the reversed right half, moving left
    for (int i = l; i <= r; i++) {
        if (aux[k] < aux[j]) {
            a[i] = aux[k]; k--;
        } else {
            a[i] = aux[j]; j++;
        }
    }

(One caveat: this version is no longer stable. Once one pointer
crosses into the other half, equal keys can come out in reverse
order -- try it with a duplicated maximum key in the right half.)

The second improvement eliminates half of the copying -- surely a
large constant-factor improvement. We modify mergesort so that it
takes two arrays and sorts one into the other; that is, it does not
put the answer where the input is. The whole thing looks like:

    mergesort(array a, array b, int l, int r):
        // sort a from l to r inclusive, putting the answer in b
        if (l < r) {
            m = (l+r)/2;
            mergesort(b, a, l, m);     // notice we flip the roles of a and b
            mergesort(b, a, m+1, r);   // notice we flip the roles of a and b
            merge(a, b, l, m, r);      // merge a[l..m] and a[m+1..r] into b
        }

    merge(source, dest, l, m, r):
        // same as the basic version, but with no copying --
        // just merge the two halves of source into dest

    sort(array a):
        copy a into a new array b
        mergesort(b, a, 0, a.length-1)

(The initial copy is what makes the base case work: a one-element
subarray is already "sorted into" the other array because the two
arrays start out identical, and the flipped recursive calls preserve
this.) Using both improvements together is a bit tricky.


Quicksort

* worst case O(n^2); expected O(n log n)
* very small constants
* in-place

Idea: Say we want to sort the subrange of vector v between p and r,
inclusive. Pick some element of the subrange and call it the pivot.
Move all elements of the subrange that are <= the pivot to the
beginning of the subrange, say between p and q, and move all elements
> the pivot to the end of the subrange, say between q+1 and r.
Recursively sort the subranges between p and q and between q+1 and r.

Let partition be a function that returns an integer i and guarantees:
* all elements to the left of array[i] are <= array[i]
* all elements to the right of array[i] are >= array[i]

Then array[i] holds the correct element, and to sort the whole array
we just need to sort the parts to its left and right. This is the
whole idea behind quicksort:

    void quicksort(int[] a, int l, int r) {
        if (l < r) {
            int i = partition(a, l, r);
            quicksort(a, l, i-1);
            quicksort(a, i+1, r);
        }
    }

We can implement partition in O(n) time, where n = r - l. The idea is
to have two pointers moving inward from the two ends. When the one on
the left points to something too big and the one on the right points
to something too small, we swap the two elements and continue:

    int partition(int[] a, int l, int r) {
        int p = pickPivot(l, r);
        swap(a, l, p);                // put the pivot in the first position
        int i = l+1;
        int j = r;
        while (true) {
            while (i <= r && a[i] < a[l]) i++;
            while (a[j] > a[l]) j--;  // needn't check j > l, since j == l
                                      // makes the test false anyway
            if (j <= i) break;
            swap(a, i, j);
            i++;                      // step past the swapped pair; without
            j--;                      // this, keys equal to the pivot could
                                      // make the loop run forever
        }
        swap(a, l, j);                // put the pivot in its correct place
        return j;
    }

Quicksort is in-place, unstable, and list-unfriendly. In practice, it
is generally faster than all of the other methods we have learned.

Example. (This trace follows a symmetric variant of partition in
which the pivot element itself may move; the recursion is then on the
subranges [p..j] and [j+1..r].) Sort

    3 5 4 7 0 8 2 1 9 6
    ^                 ^
    p                 r

Pick v[p]=3 as the pivot. Now start at i=p (respectively, j=r) and
count up (respectively, down) until we find an element that is >= the
pivot (respectively, <= the pivot).

    3 5 4 7 0 8 2 1 9 6
    ^             ^
    i             j

Now we exchange those two elements in place.

    1 5 4 7 0 8 2 3 9 6
    ^             ^
    i             j

Now we increment i and decrement j

    1 5 4 7 0 8 2 3 9 6
      ^         ^
      i         j

and repeat: find the first element at or right of i that is >= the
pivot, and the first element at or left of j that is <= the pivot

    1 5 4 7 0 8 2 3 9 6
      ^         ^
      i         j

and exchange them.

    1 2 4 7 0 8 5 3 9 6
      ^         ^
      i         j

Repeat.

    1 2 4 7 0 8 5 3 9 6
        ^   ^
        i   j

    1 2 0 7 4 8 5 3 9 6
        ^   ^
        i   j

    1 2 0 7 4 8 5 3 9 6
        ^ ^
        j i

We're done when j is to the left of i. Everything from j leftward is
<= the pivot, and everything from j+1 rightward is >= the pivot.

    1 2 0 7 4 8 5 3 9 6
        ^ ^
        j j+1

We recursively sort these two subranges.

    0 1 2 3 4 5 6 7 8 9
    ^   ^ ^           ^
    p   j j+1         r

The running time depends on how balanced the partitions are.

BEST CASE: the pivot is always the median -- we cut the array in half
at each level:

    T(n) = 2T(n/2) + O(n) = O(n log n)

WORST CASE: the pivot is always the smallest element, so the split is
1 : n-1:

    T(n) = T(n-1) + O(n) = O(n^2)

These are the extreme cases. What's in between? Suppose the partition
always produces a 9:1 split -- 90% on one side, 10% on the other.
That sounds pretty unbalanced, but it is actually still O(n log n):

    T(n) = T(0.9n) + T(0.1n) + O(n) = O(n log n)

Detailed analysis next time.
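One loose end: pickPivot was never defined above. A common choice --
an assumption here, not something these notes prescribe -- is a
uniformly random index, which makes the expected O(n log n) bound
hold for every input, not just for random ones:

    import java.util.Random;

    Random rng = new Random();   // one generator, reused across calls

    // sketch of an assumed pivot rule: a uniformly random index in
    // [l, r]. The pivot's rank lands in the middle 80% of the
    // subrange with probability 4/5, so splits of 9:1 or better are
    // the common case.
    int pickPivot(int l, int r) {
        return l + rng.nextInt(r - l + 1);
    }

Another standard choice is the median of the first, middle, and last
elements of the subrange, which avoids the worst case on
already-sorted input.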