Graphs

Reading: CLR sec. 5.4 pp. 86-90, sec. 23.1 pp. 465-469.

G = (V,E)
V = vertices
E = edges

directed graphs: edges are ordered pairs of vertices
if edge (u,v) in E, draw u--->v
undirected graphs: edges are unordered pairs of vertices
draw u---v

path: sequence of vertices u_0,u_1,...,u_n
such that (u_i,u_{i+1}) in E, 0 <= i < n
length of path: number of edges
-length of u_0,u_1,...,u_n is n
-a single vertex is a path of length 0 from that vertex to itself
simple path: no repeated vertices
cycle: path u_0,u_1,...,u_n of nonzero length such that u_0=u_n
simple cycle: no repeated vertices except u_0=u_n
dag = directed acyclic graph (directed graph with no cycles)
tree = dag such that exactly one vertex has indegree 0
and every other vertex has indegree 1
forest = dag such that every vertex has indegree <= 1
(i.e., a bunch of trees)
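
The path and cycle definitions above can be checked mechanically.
A minimal Python sketch, assuming a directed graph given as a set
of ordered pairs and a candidate path given as a list of vertices
(the function names are illustrative, not from CLR; the vertex set
itself is not checked):

```python
def is_path(edges, seq):
    """Nonempty sequence whose consecutive vertices are joined by edges."""
    return len(seq) > 0 and all(
        (seq[i], seq[i + 1]) in edges for i in range(len(seq) - 1)
    )

def path_length(seq):
    """Length = number of edges = number of vertices minus one."""
    return len(seq) - 1

def is_simple_path(edges, seq):
    """A path with no repeated vertices."""
    return is_path(edges, seq) and len(set(seq)) == len(seq)

def is_cycle(edges, seq):
    """A path of nonzero length that ends where it starts."""
    return is_path(edges, seq) and len(seq) > 1 and seq[0] == seq[-1]

def is_simple_cycle(edges, seq):
    """A cycle with no repeated vertices except u_0 = u_n."""
    return is_cycle(edges, seq) and len(set(seq[:-1])) == len(seq) - 1

E = {(1, 2), (2, 3), (3, 1), (3, 4)}
print(is_path(E, [1, 2, 3, 4]), path_length([1, 2, 3, 4]))
print(is_cycle(E, [1, 2, 3, 1]), is_cycle(E, [1]))
```

Note that [1] counts as a path of length 0 but not as a cycle,
matching the definitions.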

Complexity bounds are usually given in terms
of n and m, where n is the number of vertices
and m is the number of edges.  E.g., linear
time is O(n+m).  CLR uses V for n and E for m.
Note m is O(n^2), since there are at most n^2
ordered pairs of vertices.

2 main data structures used for graphs:
-adjacency list
-adjacency matrix

Adjacency Lists

CLR version:
-array of singly linked lists, one list for each vertex
-the list for vertex u contains v if (u,v) is an edge
space usage: O(n+m)
insert an edge: O(1)
delete edge (u,v): O(outdegree of u) [to search u's list]
test whether (u,v) is an edge: O(outdegree of u)
find all edges out of u: O(outdegree of u)
find all edges into u: O(m)
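
A minimal sketch of these operations in Python.  Assumptions:
vertices are numbered 0..n-1, and a Python list per vertex stands
in for the singly linked list (so insertion appends at the end,
in amortized O(1)).  The class and method names are illustrative.

```python
class AdjacencyList:
    def __init__(self, n):
        # one list per vertex; total space O(n+m)
        self.out = [[] for _ in range(n)]

    def insert(self, u, v):
        # O(1) amortized; does not check for duplicates
        self.out[u].append(v)

    def delete(self, u, v):
        # O(outdegree of u): searches u's list
        # (raises ValueError if the edge is absent)
        self.out[u].remove(v)

    def has_edge(self, u, v):
        # O(outdegree of u)
        return v in self.out[u]

    def edges_out(self, u):
        # O(outdegree of u)
        return [(u, v) for v in self.out[u]]

    def edges_into(self, u):
        # must scan every list, hence every edge
        return [(w, u) for w, lst in enumerate(self.out)
                for v in lst if v == u]

g = AdjacencyList(4)
for (u, v) in [(0, 1), (0, 2), (2, 1), (3, 1)]:
    g.insert(u, v)
print(g.edges_out(0))
print(g.edges_into(1))
```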

The last can be improved to O(indegree of u) by maintaining
2 lists for each vertex u:
-outgoing list: contains all vertices v such that (u,v) is an edge
-incoming list: contains all vertices v such that (v,u) is an edge.
Each edge (u,v) is represented exactly twice, once on the outgoing
list of u and once on the incoming list of v, and these two list
elements are linked.  The outgoing and incoming lists of each
vertex are doubly linked for easy deletion.
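
A sketch of the two-list scheme in Python.  As a simplification,
the two linked list elements for an edge are merged into a single
record that sits on both doubly linked lists; this is an assumed
variant, not necessarily the layout intended above.  Vertices are
0..n-1 and all names are illustrative.

```python
class EdgeRecord:
    """Edge (u,v), threaded on u's outgoing and v's incoming list."""
    def __init__(self, u, v):
        self.u, self.v = u, v
        self.out_prev = self.out_next = None  # links in u's outgoing list
        self.in_prev = self.in_next = None    # links in v's incoming list

class TwoListGraph:
    def __init__(self, n):
        self.out_head = [None] * n
        self.in_head = [None] * n

    def insert(self, u, v):
        # O(1): push the record on the front of both lists
        e = EdgeRecord(u, v)
        e.out_next = self.out_head[u]
        if e.out_next is not None:
            e.out_next.out_prev = e
        self.out_head[u] = e
        e.in_next = self.in_head[v]
        if e.in_next is not None:
            e.in_next.in_prev = e
        self.in_head[v] = e
        return e

    def delete(self, e):
        # O(1) given the record: unlink from both lists
        if e.out_prev is None:
            self.out_head[e.u] = e.out_next
        else:
            e.out_prev.out_next = e.out_next
        if e.out_next is not None:
            e.out_next.out_prev = e.out_prev
        if e.in_prev is None:
            self.in_head[e.v] = e.in_next
        else:
            e.in_prev.in_next = e.in_next
        if e.in_next is not None:
            e.in_next.in_prev = e.in_prev

    def edges_out(self, u):
        # O(outdegree of u)
        e, result = self.out_head[u], []
        while e is not None:
            result.append((e.u, e.v))
            e = e.out_next
        return result

    def edges_into(self, v):
        # O(indegree of v)
        e, result = self.in_head[v], []
        while e is not None:
            result.append((e.u, e.v))
            e = e.in_next
        return result
```

Finding the record for a given (u,v) still costs a search of
u's outgoing list; only the unlinking step is constant time.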

Adjacency Matrix

n x n boolean matrix indexed by vertices
has 1 in position u,v iff (u,v) is an edge, 0 otherwise
space usage: O(n^2) always
insert an edge: O(1)
delete an edge: O(1)
test whether (u,v) is an edge: O(1)
find all edges out of u: O(n) [search across u-th row]
find all edges into u: O(n) [search down u-th column]
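
The same operations on a matrix, as a minimal Python sketch.
Assumptions: vertices are 0..n-1 and the matrix is a list of
lists of booleans; the names are illustrative.

```python
class AdjacencyMatrix:
    def __init__(self, n):
        self.n = n
        # n x n entries, all False: O(n^2) space regardless of m
        self.a = [[False] * n for _ in range(n)]

    def insert(self, u, v):
        self.a[u][v] = True          # O(1)

    def delete(self, u, v):
        self.a[u][v] = False         # O(1)

    def has_edge(self, u, v):
        return self.a[u][v]          # O(1)

    def edges_out(self, u):
        # O(n): scan across row u
        return [(u, v) for v in range(self.n) if self.a[u][v]]

    def edges_into(self, u):
        # O(n): scan down column u
        return [(w, u) for w in range(self.n) if self.a[w][u]]

g = AdjacencyMatrix(4)
for (u, v) in [(0, 1), (0, 2), (2, 1), (3, 1)]:
    g.insert(u, v)
print(g.edges_out(0))
print(g.edges_into(1))
```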

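Thus adjacency lists use less space, and find the edges
out of a vertex faster, when m is o(n^2), i.e. if the
graph is sparse.  The matrix still wins on O(1) edge
tests.  Note trees are sparse (m = n-1).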

Strongly Connected Components

A directed graph is /strongly connected/ if for every pair
of vertices u,v there is a path from u to v and a path
from v to u; for u != v, this says that u and v lie on
a common cycle.

Even if a graph is not strongly connected, we can define
a relation == by: u == v iff there is a path from u to v
and a path from v to u (for u != v, iff u and v lie on a
common cycle).  This is an equivalence relation, i.e.
it is reflexive (by the path of length 0), symmetric, and
transitive (by concatenating paths), so it partitions
the set of vertices into equivalence classes called the
/strongly connected components/ (or just /strong components/).

Thus a graph is strongly connected iff it has exactly one
strong component.
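
The strong components can be computed straight from the
definition: two vertices are equivalent iff each is reachable
from the other.  A minimal Python sketch, assuming the graph is
a dict mapping every vertex to a list of its out-neighbors (the
names are illustrative).  It takes O(n(n+m)) time and is meant
only to mirror the definition, not to be efficient.

```python
def reachable(adj, s):
    """Vertices reachable from s, including s (path of length 0)."""
    seen, stack = {s}, [s]
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen

def strong_components(adj):
    """List of equivalence classes of mutual reachability."""
    reach = {u: reachable(adj, u) for u in adj}
    components, assigned = [], set()
    for u in adj:
        if u not in assigned:
            # v is in u's class iff u reaches v and v reaches u
            comp = {v for v in reach[u] if u in reach[v]}
            components.append(comp)
            assigned |= comp
    return components

def is_strongly_connected(adj):
    return len(strong_components(adj)) == 1

adj = {1: [2], 2: [3], 3: [1, 4], 4: [5], 5: [4], 6: []}
print(strong_components(adj))
```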

Given a graph G = (V,E), define a new graph
G' = (V',E') obtained by collapsing the strong
components into single vertices.  Formally,

V' = {strongly connected components of G}
E' = {(A,B) | A != B and there exist u in A and v in B such that (u,v) in E}

Theorem: G' is a dag.
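
A sketch of the collapsing step in Python, with a check of the
theorem on one example.  Assumptions: G is a set of ordered
pairs, the strong components are supplied as a list of sets, and
components are named by their index in that list.  Edges inside
a single component are dropped, since a self-loop (A,A) would
itself be a cycle.  Acyclicity is checked by repeatedly removing
vertices of indegree 0, a technique chosen here for illustration.
All names are illustrative.

```python
def condensation(edges, components):
    """Return (V', E') with components numbered 0..k-1."""
    comp_of = {v: i for i, comp in enumerate(components) for v in comp}
    v_prime = list(range(len(components)))
    e_prime = {(comp_of[u], comp_of[v]) for (u, v) in edges
               if comp_of[u] != comp_of[v]}
    return v_prime, e_prime

def is_dag(vertices, edges):
    """True iff every vertex can be removed in indegree-0 order."""
    indeg = {v: 0 for v in vertices}
    succ = {v: [] for v in vertices}
    for (a, b) in edges:
        succ[a].append(b)
        indeg[b] += 1
    ready = [v for v in vertices if indeg[v] == 0]
    removed = 0
    while ready:
        a = ready.pop()
        removed += 1
        for b in succ[a]:
            indeg[b] -= 1
            if indeg[b] == 0:
                ready.append(b)
    # vertices left over lie on or beyond some cycle
    return removed == len(vertices)

E = {(1, 2), (2, 3), (3, 1), (3, 4), (4, 5), (5, 4)}
components = [{1, 2, 3}, {4, 5}, {6}]
v_prime, e_prime = condensation(E, components)
print(e_prime, is_dag(v_prime, e_prime))
```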