CS410, Summer 1998
Lecture 22 Outline
Dan Grossman

Goals: Intro. to Graphs
CLR Chapter 5.4, 23

Lots and lots of definitions:

* A graph is a set of vertices (nodes) and edges. We usually take the
  vertices to be the numbers 0, 1, ..., n-1. The edges are pairs of
  vertices. We write G=(V,E) where E is a subset of V x V.

* Directed and undirected -- in an undirected graph, if (u,v) is an edge
  then (v,u) is an edge. In a directed graph, this isn't necessarily
  true. In a directed graph the edge (u,v) means there is an edge _from_
  u _to_ v.

* u is adjacent to v if (u,v) is an edge or (v,u) is an edge.

* The in-degree of a node is the number of edges going into it. The
  out-degree is the number of edges coming out of it. For directed
  graphs, the degree is the in-degree plus the out-degree. For
  undirected graphs, in-degree, out-degree, and degree are all the same
  thing.

* A path from u to v is a sequence of vertices v_1, v_2, ..., v_(k-1)
  such that (u, v_1), (v_1, v_2), ..., (v_(k-1), v) are all edges. If
  such a path exists then v is reachable from u. If no vertex is
  repeated in the sequence, then the path is simple. The length of the
  path is the number of edges on it (here k), so a single edge (u,v) is
  a path of length 1.
  Claim: If v is reachable from u, then there is a simple path from u
  to v.

* A cycle is a path from u to u. If the path without the second u is a
  simple path, then the cycle is a simple cycle. (That is, a simple
  cycle has no cycles "inside of it".) A graph with no cycles is
  acyclic. A graph with cycles is cyclic. A directed acyclic graph is
  called a dag.

* The connected components of an undirected graph are the sets of
  vertices such that:
    * every vertex in the set is reachable from every other
    * no superset of the set has this reachability property.
  For directed graphs, these are called the strongly connected
  components. A graph with one (strongly) connected component is called
  connected.

* In a weighted graph, every edge also has a real number assigned to it.
  Sometimes we will stipulate that the numbers are non-negative, but not
  always.

* A forest is a collection of trees.

* A directed graph is bipartite if no node has both incoming and
  outgoing edges.

We can compare many of the data structures we have learned. Let i -> j
mean "if something is an i then it is a j":

                   bipartite graphs
                                   \
                                    V
lists -> trees -> connected dags -> dags -> graphs -> weighted graphs
            \          _____________/^
             V        /
              forests

Graphs can be used to represent all sorts of things. Here's a small list:
* street maps
* computer network connections
* which functions call which other functions in a program
* which objects act on which others in a physical simulation
* mathematical relations
* the diagram above (expressing "all i's are j's") is a dag!
* much, much more

Unfortunately, computer memory is not made out of nodes and edges, so we
need representations for graphs:

* Adjacency Lists
  Keep an array of lists. At index i, store a linked list of all
  vertices j for which there exists an edge from i to j. (For undirected
  graphs, put the edge in both lists.)

* Adjacency Matrix
  Keep a 2-dimensional array. At row i column j put a 1 if (i,j) is an
  edge else put a 0. For an undirected graph, half the 2-D array is
  redundant and need not be stored.

Let n be the number of vertices and m the number of edges. Clearly,
m <= n^2. Also let d_i be the out-degree of vertex i. Let's compare the
representations:

                                Adjacency List    Adjacency Matrix
  Space                         O(n+m)            O(n^2)
  Time for "is (i,j) an edge?"  O(d_i)            O(1)
  Time for all edges out of i   O(d_i)            O(n)
  Insert an edge                O(1)              O(1)

Generally an adjacency matrix takes more space unless the graph is dense
(most possible edges are there), in which case it takes less because
each entry is one bit rather than a linked list node. The trade-off
between the second and third rows above must be resolved per
application. If necessary, we could maintain both representations and
get the best-of-both-worlds time bounds in exchange for the increased
space.
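
To make the comparison concrete, here is a minimal Python sketch of both
representations for a directed graph on vertices 0..n-1. It is only an
illustration, not code from the course: the class and method names
(AdjacencyList, AdjacencyMatrix, add_edge, has_edge, out_edges) are made
up for this note, and Python lists stand in for both the linked lists
and the bit matrix.

```python
class AdjacencyList:
    """Array of lists: adj[i] holds every j with an edge (i, j)."""

    def __init__(self, n):
        self.adj = [[] for _ in range(n)]

    def add_edge(self, i, j):
        # Constant-time append; assumes the caller never adds duplicates.
        self.adj[i].append(j)

    def has_edge(self, i, j):
        # Scans the list stored at index i, so O(d_i).
        return j in self.adj[i]

    def out_edges(self, i):
        # O(d_i): the stored list is already the answer.
        return list(self.adj[i])


class AdjacencyMatrix:
    """n x n table: m[i][j] is 1 if (i, j) is an edge, else 0."""

    def __init__(self, n):
        self.m = [[0] * n for _ in range(n)]

    def add_edge(self, i, j):
        self.m[i][j] = 1  # O(1)

    def has_edge(self, i, j):
        return self.m[i][j] == 1  # O(1)

    def out_edges(self, i):
        # Must look at the whole row, so O(n) even when d_i is small.
        return [j for j, bit in enumerate(self.m[i]) if bit]


# Hypothetical example graph with edges 0->1, 0->2, 2->3.
edges = [(0, 1), (0, 2), (2, 3)]
al, am = AdjacencyList(4), AdjacencyMatrix(4)
for u, v in edges:
    al.add_edge(u, v)
    am.add_edge(u, v)
```

For an undirected graph one would call add_edge(u, v) and add_edge(v, u),
which matches "put the edge in both lists" above.
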
The bottom line is that for our algorithms we will assume we have
whichever representation is more convenient for our present purpose. It
is straightforward to convert from one representation to the other in
time O(n^2).

Question: In an unweighted graph, what is the (shortest) distance from
node s to every other node, and what is some path with that distance?

Answer: Use breadth-first search (BFS)

  mark each node unvisited (generally use a separate array)
  initialize dist of each node to infinity // will hold shortest distance
  initialize pred of each node to null     // will point to node before it
                                           // in some shortest path
  q = new Queue();
  visited[s] = true;
  dist[s] = 0;
  q.enqueue(s);
  while (!q.isempty())
    t = q.dequeue();
    for all edges out of t and into v
      if (!visited[v])
        visited[v] = true;
        dist[v] = dist[t] + 1;
        pred[v] = t;
        q.enqueue(v);

We worked through an example in class. Next time we will examine the
running time and prove that the algorithm is correct.
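
For readers who want to run the search, here is one possible Python
rendering of the pseudocode above. It is a sketch under a few
assumptions that are not in the notes: the graph is given as an
adjacency list adj (a list of lists), collections.deque plays the role
of the queue, and the helper path_to and the example graph are
hypothetical additions that show how the pred pointers give back a
shortest path.

```python
from collections import deque
import math


def bfs(adj, s):
    """adj[t] lists every v with an edge (t, v). Returns (dist, pred)."""
    n = len(adj)
    visited = [False] * n
    dist = [math.inf] * n  # shortest distance from s, counted in edges
    pred = [None] * n      # node before v on some shortest path from s
    q = deque()
    visited[s] = True
    dist[s] = 0
    q.append(s)
    while q:
        t = q.popleft()
        for v in adj[t]:
            if not visited[v]:
                visited[v] = True
                dist[v] = dist[t] + 1
                pred[v] = t
                q.append(v)
    return dist, pred


def path_to(pred, s, v):
    """Follow pred pointers back from v to s; None if v is unreachable."""
    path = [v]
    while path[-1] != s:
        p = pred[path[-1]]
        if p is None:
            return None
        path.append(p)
    path.reverse()
    return path


# Hypothetical example: edges 0->1, 0->2, 1->3, 2->3, 3->4; vertex 5 is isolated.
adj = [[1, 2], [3], [3], [4], [], []]
dist, pred = bfs(adj, 0)
print(dist)                # distances from vertex 0
print(path_to(pred, 0, 4)) # one shortest path from 0 to 4
```

Vertices that are never reached keep dist infinity and pred None, which
is how path_to detects that no path exists.
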