CS410, Summer 1998
Lecture 23 Outline
Dan Grossman

Goals:
   * BFS Correctness
   * DFS

Recall our bfs distance algorithm.  Today for simplicity, let's forget
about the actual path:

mark each node unvisited (generally use a separate array)
initialize dist of each node to infinity // will hold shortest distance
q = new Queue();

visited[s] = true;
dist[s] = 0;
q.enqueue(s);

while (!q.isempty())
  t = q.dequeue();
  for all edges out of t and into v
        if (!visited[v])
           visited[v] = true;
           dist[v] = dist[t] + 1;
           q.enqueue(v);

Running time: O(n+m) with adjacency list rep, where n is |V| and m is
|E|.  Proof: body of for is constant time.  Each node is enqueued at
most once, so it is dequeued at most once.  Hence each edge is followed
at most once, when its source node is dequeued.  (O(n) to init the
arrays.  Often assume m > n anyway.)
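
The pseudocode above can be sketched in runnable form as follows.  This
is only an illustration: the function name bfs_distances, the use of
Python's collections.deque as the queue, and the example graph are
assumptions, not part of the lecture.

```python
from collections import deque
import math

def bfs_distances(adj, s):
    """Return dist, where dist[v] is the BFS distance from s to v.

    adj is an adjacency list: adj[t] lists every v with an edge (t, v).
    Nodes that are never reached keep distance math.inf.
    """
    n = len(adj)
    visited = [False] * n
    dist = [math.inf] * n
    q = deque()

    visited[s] = True
    dist[s] = 0
    q.append(s)

    while q:                      # loop until the queue is empty
        t = q.popleft()           # dequeue
        for v in adj[t]:          # each edge out of t and into v
            if not visited[v]:
                visited[v] = True
                dist[v] = dist[t] + 1
                q.append(v)       # enqueue
    return dist

# Hypothetical example: edges 0->1, 0->2, 1->3, 2->3, 3->4; node 5 is isolated.
example = [[1, 2], [3], [3], [4], [], []]
print(bfs_distances(example, 0))  # [0, 1, 1, 2, 3, inf]
```

Each adjacency list is scanned once, when its node is dequeued, which
is where the O(n+m) bound comes from.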

As for correctness: we need to show that dist[v] is the length of some
path from s to v and that no path is shorter (or that there is no path
and dist[v] is infinity).

Let delta(s,v) be the shortest-path distance from s to v (infinity if
v is unreachable from s).

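Lemma: For any edge (u,v), delta(s,v) <= delta(s,u) + 1

Proof easy: a shortest path from s to u followed by the edge (u,v) is
a path to v of length delta(s,u) + 1.  If u is unreachable, the right
side is infinity.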

Lemma 1: delta(s,v) <= dist[v]

Proof by induction on number of times while loop executed (at most
|V|) with invariant (IH) that dist[v] >= delta(s,v).

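Base: infinity >= delta(s,v) for every v other than s
      dist[s] = 0 = delta(s,s)
Inductive:
      dist changes only when some v is marked visited, while following
      an edge (t,v).  By IH the invariant is true for t.
      Then by the previous lemma it is true for v:
      dist[v] = dist[t] + 1 >= delta(s,t) + 1 >= delta(s,v).
      This is the only time dist[v] is changed, since v is now visited.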

Lemma 2: The dist values of the elements in the queue differ by at
most one and go from lower to higher (front to back, nondecreasing).

Proof by induction on number of while loop executions.

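Base: only one thing in queue

Inductive: True after dequeue easily (removing the front keeps both
           properties).
           True after enqueue by using IH: everything left in the queue
           has dist of dist[t] or dist[t]+1, and we add to the back a
           node with dist of dist[t]+1.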
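
One way to convince yourself of Lemma 2 is to assert it while BFS
runs.  The sketch below is an assumption-laden illustration: the name
bfs_checking_queue is made up, and it uses dist[v] == infinity in
place of a separate visited array (the two agree, since dist is set
exactly when a node is marked).

```python
from collections import deque
import math

def bfs_checking_queue(adj, s):
    """BFS from s that asserts the Lemma 2 invariant after every queue change."""
    dist = [math.inf] * len(adj)
    dist[s] = 0
    q = deque([s])

    def check():
        ds = [dist[x] for x in q]
        # dist values never decrease from front to back ...
        assert all(a <= b for a, b in zip(ds, ds[1:]))
        # ... and the back is at most one more than the front.
        assert not ds or ds[-1] - ds[0] <= 1

    check()                            # base case: only s in the queue
    while q:
        t = q.popleft()
        check()                        # still true after dequeue
        for v in adj[t]:
            if dist[v] == math.inf:    # v not yet visited
                dist[v] = dist[t] + 1
                q.append(v)
                check()                # still true after enqueue
    return dist

# Hypothetical example graph, same shape as an adjacency list rep.
print(bfs_checking_queue([[1, 2], [3], [3], [4], [], []], 0))
```

A failed assertion would mean the invariant broke; on a correct BFS
none should ever fire.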

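Lemma 3 (about graphs):
  If delta(s,v) = k < infinity, then
  either v = s and k = 0
  or there exists a node t such that delta(s,t) = k-1 and (t,v) in E.

Proof: It's trivial.  For k > 0, let t be the node just before v on a
shortest path from s to v.  The prefix of that path shows
delta(s,t) <= k-1, and the first lemma shows delta(s,t) >= k-1.
(The converse only gives delta(s,v) <= k; we do not need it.)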

Theorem: dist[v] = delta(s,v)

Proof: 
  Unreachable nodes are trivial (follows from Lemma 1 too).

  For reachable nodes, induct on k.  The IH is that all nodes with
  delta of k are
        * enqueued exactly once, before any nodes with larger delta
        * have their dist set to delta when they are enqueued

  base: k=0: s is enqueued first, with dist set correctly to 0
  ind: Consider v with delta of k+1.  By Lemma 2, if v is enqueued, it is
       after all nodes of delta = k.  By Lemma 3, there is a node t with
       delta(s,t) = k and an edge (t,v).  By IH t is enqueued with the
       correct dist.  It must be dequeued.  Hence v is enqueued with dist
       at most k+1.  By Lemma 1 we get equality.
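
The theorem can also be spot-checked by comparing BFS against an
independent computation of delta.  In this sketch the reference is
repeated edge relaxation (Bellman-Ford with unit weights), which is a
swapped-in technique, not something from the lecture; all function
names and the random test graphs are assumptions.

```python
from collections import deque
import math
import random

def bfs_distances(adj, s):
    """dist[] as computed by the BFS distance algorithm."""
    dist = [math.inf] * len(adj)
    dist[s] = 0
    q = deque([s])
    while q:
        t = q.popleft()
        for v in adj[t]:
            if dist[v] == math.inf:   # unvisited
                dist[v] = dist[t] + 1
                q.append(v)
    return dist

def delta_by_relaxation(adj, s):
    """Reference delta(s, .): relax every edge n-1 times."""
    n = len(adj)
    delta = [math.inf] * n
    delta[s] = 0
    for _ in range(n - 1):
        for u in range(n):
            for v in adj[u]:
                if delta[u] + 1 < delta[v]:
                    delta[v] = delta[u] + 1
    return delta

def random_digraph(n, m, rng):
    """n nodes and m random directed edges (self-loops and repeats allowed)."""
    adj = [[] for _ in range(n)]
    for _ in range(m):
        adj[rng.randrange(n)].append(rng.randrange(n))
    return adj

rng = random.Random(410)              # fixed seed so runs are repeatable
for _ in range(200):
    g = random_digraph(8, 12, rng)
    assert bfs_distances(g, 0) == delta_by_relaxation(g, 0)
```

Passing tests are evidence, not a proof; the proof is the induction
above.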

Bottom Line: There's nothing magical going on here.  You just have to
think carefully about invariants when you write non-trivial
algorithms.

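Note: edges used form a breadth-first spanning tree of the connected
component.  So all edges in tree go to next level.  No other edge can
skip a level: every edge (u,v) has dist[v] <= dist[u] + 1.  (Else not
bfs).  In an undirected graph this means non-tree edges join nodes on
the same level or on adjacent levels.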

Similarly, if we pick some unmarked vertex and continue BFS, then we
end up with a BFS forest.
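
A minimal sketch of building a BFS forest, assuming an undirected
graph stored with each edge in both adjacency lists.  The names
bfs_forest, parent, and level are illustrative assumptions.

```python
from collections import deque

def bfs_forest(adj):
    """Run BFS from every still-unmarked vertex.

    Returns (roots, parent, level): parent[v] is v's tree parent
    (None for a root) and level[v] is v's depth in its own tree.
    """
    n = len(adj)
    parent = [None] * n
    level = [None] * n                # None doubles as "unmarked"
    roots = []
    for s in range(n):
        if level[s] is not None:
            continue                  # already in an earlier tree
        roots.append(s)
        level[s] = 0
        q = deque([s])
        while q:
            t = q.popleft()
            for v in adj[t]:
                if level[v] is None:
                    level[v] = level[t] + 1
                    parent[v] = t     # (t, v) is a tree edge
                    q.append(v)
    return roots, parent, level

# Hypothetical undirected graph: the 4-cycle 0-1-2-3 plus the edge 4-5.
undirected = [[1, 3], [0, 2], [1, 3], [2, 0], [5], [4]]
roots, parent, level = bfs_forest(undirected)
print(roots, parent, level)

# In an undirected graph no edge joins levels more than one apart.
for u in range(len(undirected)):
    for v in undirected[u]:
        assert abs(level[u] - level[v]) <= 1
```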

Depth-First Search

Visit children before siblings (whereas bfs was siblings before
children).  We will not get dist this way, but we will get arrive and
leave (d and f in your book) which will be useful in ways we will see.

initialize all to unvisited
time = 0

visit(s)

visit (t) :
   mark t visited
   arrive[t] = time++
   for each edge out of t and into v:
        if v unvisited
           visit(v)
   leave[t] = time++

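Builds a DFS tree.  For forest, call visit on the next unvisited node
until none remain.

O(n+m) because visit is called once per node and each edge is only
done once, when its source node is visited.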
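
The DFS pseudocode might look like this when made runnable.  This is a
sketch under assumptions: the name dfs_times and the example graph are
made up, and the recursion relies on Python's default recursion limit,
so it suits small graphs only.

```python
def dfs_times(adj):
    """Return (arrive, leave) for a DFS forest over all nodes of adj."""
    n = len(adj)
    visited = [False] * n
    arrive = [None] * n
    leave = [None] * n
    time = 0

    def visit(t):
        nonlocal time
        visited[t] = True
        arrive[t] = time
        time += 1
        for v in adj[t]:              # each edge out of t and into v
            if not visited[v]:
                visit(v)
        leave[t] = time
        time += 1

    for s in range(n):                # for a forest, go to next unvisited
        if not visited[s]:
            visit(s)
    return arrive, leave

# Hypothetical example: edges 0->1, 0->2, 1->3, 2->3, 4->0.
example = [[1, 2], [3], [3], [], [0]]
arrive, leave = dfs_times(example)
print(arrive)  # [0, 1, 5, 2, 8]
print(leave)   # [7, 4, 6, 3, 9]
```

Node 4 starts a second tree because nothing reachable from 0 leads to
it.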

Parenthesis property: For any two distinct vertices u,v, either
   arrive[u] < arrive[v] < leave[v] < leave[u]
or 
   arrive[v] < arrive[u] < leave[u] < leave[v]
or
   arrive[u] < leave[u] < arrive[v] < leave[v]
or
   arrive[v] < leave[v] < arrive[u] < leave[u]
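
In other words, the intervals [arrive, leave] of two vertices are
either nested or disjoint, never partly overlapping.  A small checker
for this, with assumed names dfs_times and intervals_nest_or_disjoint:

```python
def dfs_times(adj):
    """Return (arrive, leave) for a DFS forest over all nodes of adj."""
    n = len(adj)
    visited = [False] * n
    arrive = [None] * n
    leave = [None] * n
    time = 0

    def visit(t):
        nonlocal time
        visited[t] = True
        arrive[t] = time
        time += 1
        for v in adj[t]:
            if not visited[v]:
                visit(v)
        leave[t] = time
        time += 1

    for s in range(n):
        if not visited[s]:
            visit(s)
    return arrive, leave

def intervals_nest_or_disjoint(arrive, leave):
    """True iff every pair of vertices satisfies exactly one of the four cases."""
    n = len(arrive)
    for u in range(n):
        for v in range(u + 1, n):
            cases = [
                arrive[u] < arrive[v] < leave[v] < leave[u],  # v inside u
                arrive[v] < arrive[u] < leave[u] < leave[v],  # u inside v
                arrive[u] < leave[u] < arrive[v] < leave[v],  # u before v
                arrive[v] < leave[v] < arrive[u] < leave[u],  # v before u
            ]
            if cases.count(True) != 1:
                return False
    return True

# Hypothetical example: edges 0->1, 0->2, 1->3, 2->3, 4->0.
example = [[1, 2], [3], [3], [], [0]]
print(intervals_nest_or_disjoint(*dfs_times(example)))  # True
```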

For any DFS forest, we can put every directed edge in the graph in one
of four categories:
   * tree edge: used in the forest
   * back edge: to ancestor in same tree (including self-loops)
   * forward edge: non-tree edge to descendant in same tree
   * cross edge: all others: same tree but not ancestor/descendant,
     or different tree
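
The four categories can be read off from arrive, leave, and the tree
parents.  The sketch below assumes a directed graph with no parallel
edges (a second copy of a tree edge would also be labeled "tree");
the name classify_edges and the example graph are illustrative.

```python
def classify_edges(adj):
    """Map each directed edge (u, v) to 'tree', 'back', 'forward', or 'cross'."""
    n = len(adj)
    visited = [False] * n
    arrive = [0] * n
    leave = [0] * n
    parent = [None] * n
    time = 0

    def visit(t):
        nonlocal time
        visited[t] = True
        arrive[t] = time
        time += 1
        for v in adj[t]:
            if not visited[v]:
                parent[v] = t         # (t, v) is used in the forest
                visit(v)
        leave[t] = time
        time += 1

    for s in range(n):
        if not visited[s]:
            visit(s)

    kinds = {}
    for u in range(n):
        for v in adj[u]:
            if parent[v] == u:
                kinds[(u, v)] = "tree"
            elif arrive[v] <= arrive[u] and leave[u] <= leave[v]:
                kinds[(u, v)] = "back"      # v is u or an ancestor of u
            elif arrive[u] < arrive[v] and leave[v] < leave[u]:
                kinds[(u, v)] = "forward"   # v is a proper descendant of u
            else:
                kinds[(u, v)] = "cross"     # intervals are disjoint
    return kinds

# Hypothetical example: edges 0->1, 0->2, 1->2, 2->0, 3->1, 3->3.
example = [[1, 2], [2], [0], [1, 3]]
print(classify_edges(example))
```

On this example 0->1 and 1->2 are tree edges, 2->0 and the self-loop
3->3 are back edges, 0->2 is a forward edge, and 3->1 is a cross edge
into an earlier tree.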

[Note: I completely screwed up in class and did not realize that we
are really talking about DIRECTED graphs here!]

Theorem: For a DFS forest of an undirected graph, there are no cross edges.
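
For an undirected graph (each edge stored in both adjacency lists),
"no cross edges" means every edge joins a vertex to one of its
ancestors or descendants.  A sketch that checks this on one graph and
contrasts it with a directed graph, where cross edges can occur; the
names dfs_times and has_cross_edge and both graphs are assumptions.

```python
def dfs_times(adj):
    """Return (arrive, leave) for a DFS forest over all nodes of adj."""
    n = len(adj)
    visited = [False] * n
    arrive = [None] * n
    leave = [None] * n
    time = 0

    def visit(t):
        nonlocal time
        visited[t] = True
        arrive[t] = time
        time += 1
        for v in adj[t]:
            if not visited[v]:
                visit(v)
        leave[t] = time
        time += 1

    for s in range(n):
        if not visited[s]:
            visit(s)
    return arrive, leave

def has_cross_edge(adj):
    """True iff some edge joins two vertices with disjoint DFS intervals."""
    arrive, leave = dfs_times(adj)
    for u in range(len(adj)):
        for v in adj[u]:
            u_is_ancestor = arrive[u] <= arrive[v] and leave[v] <= leave[u]
            v_is_ancestor = arrive[v] <= arrive[u] and leave[u] <= leave[v]
            if not (u_is_ancestor or v_is_ancestor):
                return True
    return False

# Undirected: edges 0-1, 0-2, 1-2, 1-3, 4-5, each listed in both directions.
undirected = [[1, 2], [0, 2, 3], [0, 1], [1], [5], [4]]
print(has_cross_edge(undirected))  # False

# Directed: edges 0->1, 0->2, 2->1.  Here 2->1 is a cross edge.
directed = [[1, 2], [], [1]]
print(has_cross_edge(directed))    # True
```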