Single-source Shortest Path
  - BFS for unweighted
  - Dijkstra's algorithm for weighted

Let G be a directed graph and s be a vertex in G. We want to find the shortest path from s to every other vertex in G. This is the single-source shortest path problem. For the unweighted problem, "shortest" means fewest edges. For the weighted problem, we assume the edges have weights (distances), and "shortest" means the sum of the weights along the path is minimum. The problem is ill-defined if there are negative-weight cycles in the graph. We will assume all weights are positive. The text has an algorithm that works even with negative weights.

The unweighted problem can be solved in linear time with BFS (see CLR 23.2 pp. 469ff.). The weighted problem is a little harder. The unweighted problem reduces to the weighted problem by assigning all edges weight 1.

Note several similar problems:

* single-destination shortest path: we can solve this by solving single-source shortest path on the transpose of G.
* single-source single-destination shortest path: nobody knows how to solve this any faster than single-source shortest path in the worst case.
* all-pairs shortest path: next lecture.

Lemma: Subpaths of shortest paths are shortest paths.

Proof: Suppose the shortest path from u to v contains a subpath from w_1 to w_2. If that subpath is not a shortest w_1 --> w_2 path, then replacing it with a shortest one yields a shorter u --> v path, contradicting the assumption that we had a shortest u --> v path.

Definition: Let delta(v) be the total weight of a shortest path from s to v.

Lemma: delta(v) <= delta(u) + weight(u,v).

The proof is as straightforward as the previous one. This is a kind of triangle inequality.

We're going to generalize the idea that BFS solves our problem when every edge weight is 1. Instead of pulling the next edge out of a queue, we will maintain every vertex's shortest /known/ path from s and decrease these known values as appropriate.
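As a warm-up, here is a sketch of the unweighted BFS solution in Python (illustrative only; the adjacency-list representation and the function name are my choices, not from the notes):

```python
from collections import deque

def bfs_shortest_paths(adj, s):
    """Unweighted single-source shortest paths: dist[v] is the fewest
    edges on any path from s to v. adj maps each vertex to a list of
    its out-neighbors."""
    dist = {s: 0}
    pred = {s: None}
    q = deque([s])
    while q:
        u = q.popleft()
        for v in adj[u]:
            if v not in dist:        # first visit is shortest: every edge "costs" 1
                dist[v] = dist[u] + 1
                pred[v] = u
                q.append(v)
    return dist, pred

# Example: edges 0->1, 0->2, 1->3, 2->3, with source 0.
adj = {0: [1, 2], 1: [3], 2: [3], 3: []}
dist, pred = bfs_shortest_paths(adj, 0)
# dist == {0: 0, 1: 1, 2: 1, 3: 2}
```

Each vertex is enqueued at most once and each edge examined once, giving the linear O(n + m) time claimed above.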
We put the known values in an array dist[] and call the decreasing "relaxation". We also keep a pred[] array of predecessors:

    relax(u,v):
        if (dist[v] > dist[u] + weight(u,v)) {
            dist[v] = dist[u] + weight(u,v);
            pred[v] = u;
        }

Lemma: If we initialize dist[s] = 0 and every other dist[v] to weight(s,v) if there is an edge from s to v, infinity otherwise, then we can do any number of relaxations in any order and we will always have dist[v] >= delta(v) for all v.

Proof: By induction on the number of relaxations.

Basis: dist[s] = 0 = delta(s), and for every other v, dist[v] is either weight(s,v) or infinity, both of which are >= delta(v).

Induction step: Suppose we relax(u,v). If dist[v] <= dist[u] + weight(u,v), then nothing happens, so the claim is still true. Otherwise we assign dist[v] = dist[u] + weight(u,v). But by the inductive hypothesis on u, dist[u] + weight(u,v) >= delta(u) + weight(u,v), and this is >= delta(v) by the previous lemma.

So we can solve the problem by doing relaxations until every dist[v] equals delta(v). If we do the relaxations in the right order, each edge needs to be relaxed only once. Dijkstra's algorithm chooses that order:

    Dijkstra:
        initialize the dist[] array as described in the last lemma;
        put all vertices other than s on a priority queue prioritized by dist[] value;
        while (priority queue is nonempty) {
            u = extract-min from the priority queue;
            for each edge (u,v)
                relax(u,v);
        }

Note that the relax(u,v) in the last line may entail a decrease-key in the priority queue.

Theorem: Dijkstra's algorithm is correct. We must show that when a vertex u has the smallest dist[] value of all those not already extracted (i.e. when u is extracted), delta(u) = dist[u]. This is sufficient since everything gets extracted eventually and dist[] values never increase.

Proof: By induction on while-loop iterations, with inductive hypothesis: every vertex v already extracted had delta(v) = dist[v] when it was extracted.

Basis: Trivial. At the beginning, no vertex has been extracted.

Induction step: Suppose for contradiction we extract u but dist[u] > delta(u).
Then the current path to u (as maintained by the pred[] array) is not a shortest path. On a shortest path to u, there must be an edge x --> y where x has been extracted, y has not, and y != u. (s is extracted and u is not, so the path must cross from extracted to not-extracted somewhere. If y = u, then when x was extracted we had delta(x) = dist[x] and we called relax(x,u), which would have made dist[u] = delta(u) since this is a shortest path; but we are assuming dist[u] > delta(u).) By the same argument, when x was extracted, relax(x,y) made dist[y] = delta(y) (the shortest path to y is this subpath, by the first lemma). So

    dist[y] = delta(y)   by the preceding argument
            < delta(u)   because the shortest path to u goes through y  (*)
            <= dist[u]   by the relaxation lemma

So we would have extracted y, not u: contradiction.

(*) This is where the correctness of Dijkstra's algorithm depends on positive edge weights!

Complexity of Dijkstra's algorithm: we have n extract-mins and m relaxations. If we use a binary heap keyed on the dist[] values, we have

    O(n) for the build-heap
    + O(n log n) for the extract-mins
    + O(m log n) for the decrease-keys
    = O(m log n) in all.

We can do better by using Fibonacci heaps. They do extract-min in O(log n) but decrease-key in O(1) (amortized). This makes Dijkstra O(m + n log n), which is an asymptotic improvement.

If the graph is dense (m on the order of n^2), we can do better with something simpler: instead of a heap, just scan the dist[] array each time to find the minimum:

    O(n^2) for the extract-mins (n of them in O(n) each)
    + O(m) for the decrease-keys (m of them in O(1) each)
    = O(n^2 + m) = O(n^2) in all.

So the question is m log n vs. n^2. Which is better depends on your graph.
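The binary-heap version of the pseudocode above can be sketched in Python. One caveat: Python's heapq module has no decrease-key operation, so this sketch uses the standard workaround of pushing a fresh entry on every successful relaxation and skipping stale entries when they surface; the asymptotics become O(m log m) rather than O(m log n), which is the same up to constants since log m <= 2 log n.

```python
import heapq

def dijkstra(adj, s):
    """Single-source shortest paths with a binary heap.
    adj maps each vertex to a list of (neighbor, weight) pairs,
    with all weights positive. Returns dist and pred maps covering
    the vertices reachable from s."""
    dist = {s: 0}
    pred = {s: None}
    extracted = set()
    heap = [(0, s)]                    # (dist, vertex) entries
    while heap:
        d, u = heapq.heappop(heap)     # extract-min
        if u in extracted:
            continue                   # stale entry; skip instead of decrease-key
        extracted.add(u)
        for v, w in adj[u]:
            if v not in dist or dist[v] > d + w:   # relax(u, v)
                dist[v] = d + w
                pred[v] = u
                heapq.heappush(heap, (dist[v], v))
    return dist, pred

# Example: the two-hop route 0 -> 1 -> 2 (weight 3) beats the
# direct edge 0 -> 2 (weight 4).
adj = {0: [(1, 1), (2, 4)], 1: [(2, 2)], 2: []}
dist, pred = dijkstra(adj, 0)
# dist == {0: 0, 1: 1, 2: 3} and pred[2] == 1
```

Initializing the heap with only s (instead of every vertex at its initial dist[] value) is equivalent: unreachable vertices simply never enter the heap.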
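For the dense case, the O(n^2) array-scan variant might look like this (again a sketch; the adjacency-matrix representation with float('inf') for missing edges is my choice):

```python
def dijkstra_dense(w, s):
    """O(n^2) Dijkstra for dense graphs. w is an n x n matrix of
    positive edge weights, with float('inf') where there is no edge.
    Returns the list of shortest-path distances from s."""
    n = len(w)
    INF = float('inf')
    dist = [INF] * n
    dist[s] = 0
    extracted = [False] * n
    for _ in range(n):
        # extract-min: an O(n) scan instead of a heap operation
        u = min((v for v in range(n) if not extracted[v]),
                key=lambda v: dist[v])
        if dist[u] == INF:
            break                      # remaining vertices are unreachable
        extracted[u] = True
        for v in range(n):             # relax every edge out of u: O(n)
            if dist[v] > dist[u] + w[u][v]:
                dist[v] = dist[u] + w[u][v]
    return dist

INF = float('inf')
w = [[INF, 1,   4],
     [INF, INF, 2],
     [INF, INF, INF]]
# dijkstra_dense(w, 0) == [0, 1, 3]
```

Here each "decrease-key" is just an array write, so the m relaxations cost O(1) each and all the time goes into the n linear-scan extract-mins, matching the O(n^2 + m) = O(n^2) count above.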