Recitation 21: Shortest Paths

Floyd-Warshall details aren't here yet... we'll put this up soon.

So far this week, we've seen two important graph representations (adjacency list and adjacency matrix), two important graph traversals (breadth-first search and depth-first search), and a simple application of DFS (topological sorting). Today we're going to discuss the shortest path problem.

Single-Source Shortest Path on Unweighted Graphs

Suppose we have an undirected graph of a wide-area network. Vertices represent computers, and edges represent network links. The graph is unweighted; every link has the same latency. And we want to route a packet from some node s to some node t as quickly as possible.

This problem is asking for the shortest path between two given nodes. Since the graph is unweighted, we only need to find the path with the minimum number of edges. Do we know an algorithm for determining this? [BFS] What is the running time of that algorithm? [O(V+E)] Is there any faster algorithm for computing this? [No: In general we must traverse every edge and vertex.]

Observe that BFS finds the shortest path from s to each other node, not just node t.

Single-Source Shortest Path on Weighted Graphs

Now let's consider the more general problem of computing the shortest path between two nodes in a weighted graph. This is the more general shortest path problem. In our network example, we'll assign a weight to each edge equal to the latency of the corresponding network link. Fortunately latency always is nonnegative; notice that the minimum-weight path is not well-defined if the graph has a negative-weight cycle. Notice also that we cannot handle negative-weight edges simply by adding a constant weight to every edge, since that increases the weight of long paths more than short paths.

The shortest path problem is much more general than it first appears. Many computer science problems that don't appear to contain graphs can be reduced to the shortest path problem. For example, in speech recognition, you need to be able to distinguish between words that sound alike. A solution is to construct a graph with a vertex for each possible word and a directed edge between possible neighboring words. The weight of each edge indicates how often you would expect to find those words adjacent in normal text. The best interpretation of the sentence then is given by the shortest path across the graph.

How can we solve the shortest path problem?

The most important algorithm for finding the shortest path is Dijkstra's algorithm, which, like BFS, finds the shortest paths from a vertex s to every other vertex. The main idea is that after iteration i-1 of the algorithm, we have determined the shortest paths from u to each of a set S of the closest i-1 vertices in the graph. Then during iteration i we find the vertex vnext (not in S) that is closest to a vertex v in S.

Go through an example for a moderately-sized graph. Spend a minute to convince yourself that the shortest path to vnext must be path through v to vnext: What if it went through some other vertex in S instead? What if it went through some vertex not in S?

What is the running time of this algorithm as we've specified it so far? Assume the graph is represented with adjacency lists and we have an array of boolean values to indicate (in constant time) whether a vertex is in S. We clearly have |V| iterations, and during each one we must examine O(E) edges to find the shortest one between a vertex in S and a vertex not in S. Since |E| is potentially |V|2, we have an O(V3) algorithm. We'd like to do better.

How can we implement this efficiently? We'll always have O(V) iterations. The key difficulty appears to be finding the vertex vnext closest to a vertex v in S; we'll need to find a data structure that allows us to do this in less than O(E) time for each iteration.

We'll create an array dist to hold the minimum known distance to each vertex. On each iteration, we find the vertex nearest to S by scanning the array. Whenever we add a vertex to S, we'll update the minimum distance of every vertex adjacent to it; this is called relaxation. We can do this by running down the adjacency list for that vertex. Since we do this once for each vertex, we only spend a total of O(E) time updating distances. So we can implement our algorithm in O(V2+E)=O(V2) time. Here's some pseudocode for the algorithm:

initialize known[] so only s is known
initialize dist[] to infinities
set dist[v]=w(s,v) for each edge (s,v)
while unknown vertices remain
  select an unknown vertex vnext minimizing dist[vnext]
  set dist[x] = min(dist[x], dist[vnext]+w(vnext,x)) for each edge (vnext,x)
  set known[vnext]=true

Of course, if we're only interested in the shortest path from s to t then we can stop as soon as we mark t as belonging to S, but that doesn't change the asymptotic running time of the algorithm.