CS410, Summer 1998
Lecture 23 Outline
Dan Grossman

Goals:
  * BFS Correctness
  * DFS

Recall our bfs distance algorithm. Today for simplicity, let's forget
about the actual path:

  mark each node unvisited (generally use a separate array)
  initialize dist of each node to infinity  // will hold shortest distance
  q = new Queue();
  visited[s] = true;
  dist[s] = 0;
  q.enqueue(s);
  while (!q.isempty())
    t = q.dequeue();
    for all edges out of t and into v
      if (!visited[v])
        visited[v] = true;
        dist[v] = dist[t] + 1;
        q.enqueue(v);

Running time: O(n+m) with adjacency list rep, where n is |V| and m is |E|.
Proof: body of for is constant time. Each node is enqueued at most once,
so it is dequeued at most once. Hence each edge is followed at most once,
when its source node is dequeued. (O(n) for init arrays. Often assume
m > n anyway.)

As for correctness: we need that if dist[v] is finite there is a path of
that length and there is none shorter, and that if there is no path then
dist[v] is infinity.

Let delta(s,v) be the shortest-path distance.

Lemma: For any edge (u,v), delta(s,v) <= delta(s,u) + 1
Proof easy: a shortest path to u followed by (u,v) is a path to v.

Lemma 1: delta(s,v) <= dist[v]
Proof by induction on number of times while loop executed (at most |V|)
with invariant (IH) that dist[v] >= delta(s,v) for every v.
  Base: every dist is infinity >= delta(s,v), except dist[s] = 0 = delta(s,s).
  Inductive: when v is marked visited from t, the IH holds for t, so
    dist[v] = dist[t] + 1 >= delta(s,t) + 1 >= delta(s,v)
  by the previous lemma. This is the only time dist[v] is changed.

Lemma 2: Elements in queue differ in dist by at most one and go from
lower to higher. That is, if the queue holds v1, ..., vr from front to
back, then dist[v1] <= ... <= dist[vr] <= dist[v1] + 1.
Proof by induction on number of while loop executions.
  Base: only one thing in queue.
  Inductive: True after dequeue easily. True after enqueue because the
  new element has dist[t] + 1, where t is the node just dequeued from the
  front, and by IH everything still in the queue has dist between dist[t]
  and dist[t] + 1.

Lemma 3 (about graphs): delta(s,v) = k < infinity iff either v = s and
k = 0, or there exists a node t such that delta(s,t) = k-1 and (t,v) in E.
Proof by induction on k. It's trivial.

Theorem: dist[v] = delta(s,v)
Proof: Unreachable nodes are trivial (follows from Lemma 1 too).
For reachable nodes, induct on k = delta(s,v). The inductive hypothesis
is that all nodes with delta of k
  * are enqueued exactly once, before any nodes with larger delta
  * have their dist set to delta when they are enqueued

base: k=0. s is enqueued first, with dist[s] = 0 set correctly.

ind: Consider v with delta(s,v) = k+1. By Lemma 3, there is a node t with
delta(s,t) = k and (t,v) in E. By IH, t is enqueued with the correct
dist, so it must be dequeued. When t is dequeued, either v is enqueued
then, or v was already enqueued from an earlier node, whose dist is at
most k by Lemma 2. Hence v is enqueued with dist at most k+1. By Lemma 1
we get equality. By Lemma 2, v is enqueued after all nodes of delta = k,
and the visited mark means it is enqueued only once. Nodes with larger
delta have larger dist by Lemma 1, so by Lemma 2 they are enqueued later.

Bottom Line: There's nothing magical going on here. You just have to
think carefully about invariants when you write non-trivial algorithms.

Note: edges used form a breadth-first spanning tree of the connected
component. So all edges in tree go to next level. No edge skips a level
going forward (else not bfs). In an undirected graph, every other edge
stays within a level or joins adjacent levels. Similarly, if we pick some
unmarked vertex and continue BFS, then we end up with a BFS forest.

Depth-First Search

Visit children before siblings (whereas bfs was siblings before
children). We will not get dist this way, but we will get arrive and
leave (d and f in your book) which will be useful in ways we will see.

  initialize all to unvisited
  time = 0
  visit(s)

  visit(t):
    mark t visited
    arrive[t] = time++
    for each edge out of t and into v:
      if v unvisited
        visit(v)
    leave[t] = time++

Builds a DFS tree. For forest, go to next unvisited.
O(n+m) because each node is visited once and each edge only done once.
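The DFS outline above can be sketched in Python as follows. This is only an illustration under some assumptions that are not in the notes: the graph is a dict from each node to a list of out-neighbors, a node counts as visited once it has an arrive time, and the names dfs_times, parent, and graph are made up for the example.

```python
def dfs_times(adj):
    """Return (arrive, leave, parent) for a DFS forest of adj.

    adj is assumed to map every node to a list of its out-neighbors.
    """
    arrive, leave, parent = {}, {}, {}
    time = 0

    def visit(t):
        nonlocal time
        arrive[t] = time          # having an arrive time doubles as "visited"
        time += 1
        for v in adj[t]:
            if v not in arrive:
                parent[v] = t     # (t, v) becomes a tree edge
                visit(v)
        leave[t] = time
        time += 1

    # For a forest, restart from each node that is still unvisited.
    for s in adj:
        if s not in arrive:
            parent[s] = None      # s is the root of a new tree
            visit(s)
    return arrive, leave, parent


# Small hypothetical graph: d cannot be reached from s, so it roots a second tree.
graph = {"s": ["a", "b"], "a": ["c"], "b": ["a"], "c": [], "d": ["s"]}
arrive, leave, parent = dfs_times(graph)
print(arrive)  # {'s': 0, 'a': 1, 'c': 2, 'b': 5, 'd': 8}
print(leave)   # {'c': 3, 'a': 4, 'b': 6, 's': 7, 'd': 9}
```

The recursion mirrors visit(t) directly. On very deep graphs Python's recursion limit would get in the way, and an explicit stack would be the usual workaround.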
Parenthesis property: For any two distinct vertices u,v, exactly one of
these holds:
     arrive[u] < arrive[v] < leave[v] < leave[u]   (v is a descendant of u)
  or arrive[v] < arrive[u] < leave[u] < leave[v]   (u is a descendant of v)
  or arrive[u] < leave[u] < arrive[v] < leave[v]   (neither is a descendant)
  or arrive[v] < leave[v] < arrive[u] < leave[u]   (neither is a descendant)

For any DFS forest, we can put every directed edge in the graph in one of
four categories:
  * tree edge: used in the forest
  * back edge: to ancestor in same tree (including self-loops)
  * forward edge: to descendant in same tree, but not a tree edge
  * cross edge: all others: same tree but not ancestor/descendant, or
    different tree

[Note: I completely screwed up in class and did not realize that we are
really talking about DIRECTED graphs here!]

Theorem: For a DFS forest of an undirected graph, there are no cross edges.
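One way to see the four edge categories is to label each edge at the moment DFS looks at it. The sketch below is an illustration, not something from the lecture. It assumes a dict-of-lists graph with no parallel edges, and the names classify_edges, kinds, graph, and undirected are made up for the example. The trick is that an edge into a node that has arrived but not yet left must be a back edge. An edge into a finished node is forward if that node arrived after the current one, and cross otherwise.

```python
def classify_edges(adj):
    """Label each directed edge (t, v) of adj as tree, back, forward, or cross.

    adj is assumed to map every node to a list of out-neighbors,
    with no parallel edges (labels are keyed by the pair (t, v)).
    """
    arrive, leave, kinds = {}, {}, {}
    time = 0

    def visit(t):
        nonlocal time
        arrive[t] = time
        time += 1
        for v in adj[t]:
            if v not in arrive:
                kinds[(t, v)] = "tree"      # first time v is seen
                visit(v)
            elif v not in leave:
                kinds[(t, v)] = "back"      # v still open: an ancestor of t, or t itself
            elif arrive[t] < arrive[v]:
                kinds[(t, v)] = "forward"   # v arrived and left inside t's interval
            else:
                kinds[(t, v)] = "cross"     # v left before t arrived
        leave[t] = time
        time += 1

    for s in adj:                           # restart to cover the whole forest
        if s not in arrive:
            visit(s)
    return kinds


# Hypothetical directed graph that exhibits all four categories.
graph = {"s": ["a", "c", "b"], "a": ["c"], "c": ["s"], "b": ["a"], "d": ["d", "s"]}
kinds = classify_edges(graph)
for edge, kind in kinds.items():
    print(edge, kind)

# Hypothetical undirected graph, stored with every edge in both directions.
undirected = {1: [2, 3], 2: [1, 3], 3: [1, 2, 4], 4: [3]}
undirected_kinds = classify_edges(undirected)
```

On the directed example, (b, a) and (d, s) come out as cross edges, the first within one tree and the second between trees. On the graph stored in both directions, no edge is labeled cross.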