CS312 Lecture 14: Graphs

Administrivia

Prelim tonight, 7:30-9:00, B17 Upson: Bring a pen/pencil, come a bit early (10 minutes).
PS#4 will be out after the prelim.
Hint: be sure you are comfortable with tail recursion.
Look at the following definition of factorial:
```
(* Assumes n >= 0; testing n <= 1 also covers fact(0) = 1. *)
fun fact (n:int):int =
  if n <= 1 then 1 else n * fact (n-1)
```
If we try to evaluate fact(10), we get:
```
eval(fact (10))
⇒
eval(10 * eval(fact (9)))
⇒
eval(10 * eval(9 * eval(fact (8))))
...
```
As you may notice, the evaluation needs the result of the recursive call before it can produce a final result. This happens because we manipulate the result of the recursive call by multiplying it by n (n*fact(n-1)), so every pending multiplication must be remembered until the innermost call returns. This type of recursion is very elegant, but under our substitution model the expression being evaluated grows with n. If we write the same function with tail recursion, it looks as follows:
```
fun fact (n:int):int =
  let
    (* ans accumulates the product so far; the call to iter is in tail position *)
    fun iter (ans:int, m:int):int =
      if m <= 1 then ans else iter (m*ans, m-1)
  in
    iter (1, n)
  end
```
Now the evaluation of fact(10) will be:
```
eval(fact (10))
⇒
eval(iter (1,10))
⇒
eval(iter (10*1,9))
⇒
eval(iter (9*10,8))
...
```
Because in this case we do not manipulate the result of the recursive call, no pending work accumulates, and the evaluation is simpler and faster.
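The same accumulator idea carries over to other recursive functions. As a minimal sketch (the functions sum and sumIter are illustrative names, not part of the lecture or of PS#4), here is a list sum written both ways:

```sml
(* Not tail recursive: the addition happens after the recursive call returns. *)
fun sum (xs: int list): int =
  case xs of
    [] => 0
  | x :: rest => x + sum rest

(* Tail recursive: the running total travels in the accumulator acc,
   so nothing is left to do once the recursive call returns. *)
fun sumIter (xs: int list): int =
  let
    fun iter (acc: int, ys: int list): int =
      case ys of
        [] => acc
      | y :: rest => iter (acc + y, rest)
  in
    iter (0, xs)
  end
```

Both functions return 10 on [1, 2, 3, 4]; only the shape of the evaluation differs.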

Graphs

Graphs are extremely powerful and useful data structures. Examples: networks (roads, internet sites, web pages); social relations; more obscure ones (speech, vision).

A graph has nodes (vertices) and links (edges). We will use the terms interchangeably.

A graph can be directed or undirected. In a directed graph, an edge goes from one vertex to another (an ordered pair); in an undirected graph, it simply connects the two vertices (an unordered pair). You can turn an undirected graph into a directed one by replacing each undirected edge with two directed edges, one in each direction, which doubles the number of edges. A particularly interesting type of graph is a DAG (directed acyclic graph). A DAG in which each vertex is pointed to by at most one edge, and exactly one vertex (the root) is pointed to by none, is called a tree.
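To make the undirected-to-directed conversion concrete, here is a small sketch. It assumes vertices are numbered with ints and a graph is given simply as a list of edges; the names edge, toDirected, and triangle are made up for this illustration.

```sml
(* A directed edge is an ordered pair (source, destination). *)
type edge = int * int

(* Replace every undirected edge {u, v} by the two directed edges
   (u, v) and (v, u). Assumes there are no self-loops. *)
fun toDirected (undirected: edge list): edge list =
  List.concat (List.map (fn (u, v) => [(u, v), (v, u)]) undirected)

(* An undirected triangle on vertices 1, 2, 3. *)
val triangle = [(1, 2), (2, 3), (1, 3)]

(* The directed version has twice as many edges:
   [(1,2), (2,1), (2,3), (3,2), (1,3), (3,1)] *)
val directedTriangle = toDirected triangle
```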

Many graphs are weighted, which means that each edge carries a number, usually a non-negative integer. A weight can represent distance (as in a road network), cost (toll roads), time (the fastest route to get somewhere), or anything else.

Graph algorithms

There are numerous interesting algorithmic questions that one can ask about a graph. Here are a few fun ones:

Can the graph be drawn in the plane without the edges crossing? How can you tell this fast? (Hopcroft and Tarjan gave a linear-time algorithm.)
Can the graph be colored with a limited number of colors, such that no two adjacent nodes have the same color?
Consider a map. Make the regions into vertices, connect them if they are neighbors. How many different colors are needed? Do four colors suffice?
Consider two graphs: are they isomorphic? I.e., can we rename the vertices in one to be those of the other?
What is the shortest path from one point to another (MapQuest)? Or between each pair of points?
Suppose that the weights are capacity constraints (pipelines, or roads with congestion). How much "stuff" (widgets) can we send from one node (the factory) to another (the store)? What is the optimal route? Note that we need to assume conservation of stuff except at the source and the sink.

For many of these questions, we need some basic algorithms for dealing with graphs. The most important graph algorithms involve traversals.

Traversals

The most basic graph operation is to traverse a graph. The key idea in graph traversal is to mark the vertices as we visit them and to keep track of what we have not yet explored. We will consider each vertex to be in one of the states undiscovered, discovered, or completely explored. We'll start with only a single discovered vertex, and we'll consider each incident edge. If an edge connects to an undiscovered vertex, we'll mark the vertex as discovered and add it to our set of vertices to process. After we've looked at all the edges of a vertex, we'll grab another vertex from the set of discovered vertices. The order in which we explore the vertices depends on how we maintain the collection of discovered vertices.

If we use a queue (FIFO), we explore the oldest unexplored vertices first. This is known as a breadth-first search, or BFS. Each vertex (except the starting vertex) is discovered during the processing of one other vertex, so this defines the breadth-first search tree. Notice that a shortest path (one with the fewest edges) from the root to any other vertex is a path in this tree. And for undirected graphs, a nontree edge can only join two vertices on the same level of the tree or on adjacent levels. Why?
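As a rough sketch of breadth-first search (not the course's reference implementation), the code below assumes a graph is an adjacency list of int vertices and uses a plain list as the queue, which is simple but makes each enqueue linear in the queue length. The names graph, neighbors, bfs, and example are chosen for this illustration. It returns each reachable vertex paired with its level in the BFS tree, that is, the number of edges on a shortest path from the start.

```sml
(* Adjacency list: each vertex is paired with the list of its neighbors. *)
type graph = (int * int list) list

fun neighbors (g: graph, v: int): int list =
  case List.find (fn (u, _) => u = v) g of
    SOME (_, ns) => ns
  | NONE => []

(* Returns (vertex, level) pairs in the order the vertices are discovered. *)
fun bfs (g: graph, start: int): (int * int) list =
  let
    fun seenBefore (x, seen: (int * int) list) =
      List.exists (fn (y, _) => y = x) seen
    (* queue: discovered but not yet processed, oldest first.
       seen:  everything discovered so far, newest first. *)
    fun loop (queue: (int * int) list, seen: (int * int) list) =
      case queue of
        [] => List.rev seen
      | (v, d) :: rest =>
          let
            (* Mark an undiscovered neighbor and add it to the back of the queue. *)
            fun visit (n, (q, s)) =
              if seenBefore (n, s) then (q, s)
              else (q @ [(n, d + 1)], (n, d + 1) :: s)
            val (queue', seen') =
              List.foldl visit (rest, seen) (neighbors (g, v))
          in
            loop (queue', seen')
          end
  in
    loop ([(start, 0)], [(start, 0)])
  end

(* Undirected example: every edge appears in both adjacency lists. *)
val example: graph =
  [(1, [2, 3]), (2, [1, 4]), (3, [1, 4]), (4, [2, 3, 5]), (5, [4])]

(* bfs (example, 1) evaluates to [(1,0), (2,1), (3,1), (4,2), (5,3)] *)
val levels = bfs (example, 1)
```

Note that the edge between 3 and 4 is a nontree edge here (4 is discovered from 2), and it joins adjacent levels 1 and 2.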

If we use a stack (LIFO) to store the discovered vertices instead, we venture along a single path away from the starting vertex until there are no more undiscovered vertices in front of us. This is known as a depth-first search, or DFS. Just as with breadth-first search, we can define a depth-first search tree that results from our traversal. In a depth-first search of an undirected graph, every edge is either a tree edge or a back edge; there are no forward edges or cross edges. Why?
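A stack-based depth-first search can be sketched in much the same way. This version assumes the same illustrative adjacency-list representation (graph, neighbors, and member are names made up here), and it marks a vertex when it is popped from the stack rather than when it is pushed, a small departure from the description above that makes the visit order match the usual recursive DFS.

```sml
(* Adjacency list: each vertex is paired with the list of its neighbors. *)
type graph = (int * int list) list

fun neighbors (g: graph, v: int): int list =
  case List.find (fn (u, _) => u = v) g of
    SOME (_, ns) => ns
  | NONE => []

fun member (x: int, xs: int list): bool = List.exists (fn y => y = x) xs

(* Returns the vertices reachable from start in depth-first order. *)
fun dfs (g: graph, start: int): int list =
  let
    (* stack:   vertices waiting to be explored, most recent on top.
       visited: vertices already explored, newest first. *)
    fun loop (stack: int list, visited: int list): int list =
      case stack of
        [] => List.rev visited
      | v :: rest =>
          if member (v, visited) then loop (rest, visited)
          else loop (neighbors (g, v) @ rest, v :: visited)
  in
    loop ([start], [])
  end

(* Undirected example: every edge appears in both adjacency lists. *)
val example: graph =
  [(1, [2, 3]), (2, [1, 4]), (3, [1, 4]), (4, [2, 3, 5]), (5, [4])]

(* dfs (example, 1) evaluates to [1, 2, 4, 3, 5] *)
val order = dfs (example, 1)
```

The search runs 1, 2, 4 along one path before it reaches 3, whereas a breadth-first search from 1 would reach both 2 and 3 before 4.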

Of course, such a traversal will only traverse a connected component; if the graph has multiple components, we need to be careful to do a DFS (or BFS) of each component. Finding connected components is another important problem. Example: OCR.
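One way to find all connected components is to start a fresh traversal from every vertex that has not yet been placed in a component. The sketch below assumes an undirected graph stored as an adjacency list in which every vertex has an entry; the names graph, reachable, components, and twoPieces are illustrative.

```sml
(* Adjacency list: each vertex is paired with the list of its neighbors. *)
type graph = (int * int list) list

fun neighbors (g: graph, v: int): int list =
  case List.find (fn (u, _) => u = v) g of
    SOME (_, ns) => ns
  | NONE => []

fun member (x: int, xs: int list): bool = List.exists (fn y => y = x) xs

(* All vertices reachable from start, found with a stack-based DFS. *)
fun reachable (g: graph, start: int): int list =
  let
    fun loop ([], visited) = List.rev visited
      | loop (v :: rest, visited) =
          if member (v, visited) then loop (rest, visited)
          else loop (neighbors (g, v) @ rest, v :: visited)
  in
    loop ([start], [])
  end

(* Walk the vertex list; traverse from each vertex not yet in a component. *)
fun components (g: graph): int list list =
  let
    fun loop ([], comps) = List.rev comps
      | loop ((v, _) :: rest, comps) =
          if List.exists (fn c => member (v, c)) comps
          then loop (rest, comps)
          else loop (rest, reachable (g, v) :: comps)
  in
    loop (g, [])
  end

(* Three components: {1, 2}, {3, 4}, and the isolated vertex 5. *)
val twoPieces: graph = [(1, [2]), (2, [1]), (3, [4]), (4, [3]), (5, [])]

(* components twoPieces evaluates to [[1, 2], [3, 4], [5]] *)
val pieces = components twoPieces
```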

CS312 © 2002 Cornell University Computer Science