Lecture 7: Equivalence classes

reading: MCS 10.10
define equivalence classes
talk about well-defined functions on equivalence classes

Drawing binary relations

We can draw a binary relation \(R\) on \(A\) as a graph, with a vertex for each element of \(A\) and an arrow for each pair in \(R\).

For example, the following diagram represents the relation \(\{(a,b), (b,e), (b,f), (c,d), (g,h), (h,g), (g,g)\}\):

[Figure: diagram of \(R\)]

Using these diagrams, we can describe the three equivalence relation properties visually:

reflexive (\(∀ x, x R x\)): every node should have a self-loop. The above relation is not reflexive, because (for example) there is no edge from \(a\) to \(a\).
symmetric (\(∀ x, y\) if \(xRy\) then \(yRx\)): every edge should have a reverse edge as well. The above relation is not symmetric, because (for example) there is an edge from \(b\) to \(f\) but not from \(f\) to \(b\).
transitive (\(∀ x,y,z\), if \(xRy\) and \(yRz\) then \(xRz\)): if there is a path from \(x\) to \(z\) then there should be an edge directly from \(x\) to \(z\). The above relation is not transitive, because (for example) there is a path from \(a\) to \(f\) (through \(b\)) but no edge from \(a\) to \(f\).
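The three properties can also be checked mechanically. The following is a minimal sketch, assuming the relation is stored as a Python set of pairs and that the underlying set is \(A = \{a, \dots, h\}\); the function names are illustrative, not part of the lecture. The transitivity check only looks at two-step paths, which is enough: if every two-step path has a direct edge, then so does every longer path.

```python
# The example relation from the text, on the assumed set A = {a, ..., h}.
A = set("abcdefgh")
R = {("a", "b"), ("b", "e"), ("b", "f"), ("c", "d"),
     ("g", "h"), ("h", "g"), ("g", "g")}

def is_reflexive(A, R):
    # Every element needs a self-loop.
    return all((x, x) in R for x in A)

def is_symmetric(R):
    # Every edge needs a reverse edge.
    return all((y, x) in R for (x, y) in R)

def is_transitive(R):
    # Whenever x -> y and y -> z are edges, x -> z must be an edge too.
    return all((x, z) in R
               for (x, y) in R
               for (y2, z) in R
               if y == y2)

print(is_reflexive(A, R), is_symmetric(R), is_transitive(R))
# prints: False False False
```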

If we have a relation that we know is an equivalence relation, we can leave out the directions of the arrows (since we know it is symmetric, all the arrows go both directions), and the self loops (since we know it is reflexive, so there is a self loop on every vertex).

Closure

If we have a relation \(R\) that doesn't satisfy a property \(P\) (such as reflexivity or symmetry), we can add edges until it does, adding as few as possible. The resulting relation is called the \(P\) closure of \(R\). Formally:

Definition: if \(P\) is a property of relations, the \(P\) closure of \(R\) is the smallest relation containing \(R\) that satisfies property \(P\).

For example, to take the reflexive closure of the above relation, we need to add self loops to every vertex (this makes it reflexive) and nothing else (this makes it the smallest reflexive relation). Here is a picture of the reflexive closure:

[Figure: reflexive closure of \(R\)]

Similarly for the symmetric and transitive closures:

[Figure: symmetric closure of \(R\)]

[Figure: transitive closure of \(R\)]

We can take the reflexive symmetric transitive closure of \(R\) to turn \(R\) into an equivalence relation:

[Figure: reflexive symmetric transitive closure of \(R\)]
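The closures can be computed by adding exactly the missing edges. Here is a minimal sketch, again assuming relations are Python sets of pairs over \(A = \{a, \dots, h\}\); the function names are hypothetical, and the transitive closure uses a simple repeat-until-nothing-changes loop rather than a more efficient algorithm.

```python
A = set("abcdefgh")
R = {("a", "b"), ("b", "e"), ("b", "f"), ("c", "d"),
     ("g", "h"), ("h", "g"), ("g", "g")}

def reflexive_closure(A, R):
    # Add a self-loop on every vertex and nothing else.
    return set(R) | {(x, x) for x in A}

def symmetric_closure(R):
    # Add the reverse of every edge.
    return set(R) | {(y, x) for (x, y) in R}

def transitive_closure(R):
    # Keep adding shortcut edges for two-step paths until none are missing.
    closure = set(R)
    while True:
        shortcuts = {(x, z)
                     for (x, y) in closure
                     for (y2, z) in closure
                     if y == y2}
        if shortcuts <= closure:
            return closure
        closure |= shortcuts

def equivalence_closure(A, R):
    # Symmetrize before taking the transitive closure: adding reverse
    # edges afterwards could break transitivity again.
    return transitive_closure(symmetric_closure(reflexive_closure(A, R)))

E = equivalence_closure(A, R)
print(("a", "f") in E, ("f", "a") in E, ("a", "c") in E)
# prints: True True False
```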

Equivalence classes

Definitions

Definition: If \(R\) is an equivalence relation on \(A\) and \(x \in A\), then the equivalence class of \(x\), denoted \([x]_R\), is the set of all elements of \(A\) that are related to \(x\), i.e. \([x]_R = \{y \in A \mid x R y\}\). If \(R\) is clear from context, we leave it out.

In the example above, \([a] = [b] = [e] = [f] = \{a,b,e,f\}\), while \([c] = [d] = \{c,d\}\) and \([g] = [h] = \{g,h\}\). The equivalence classes are easy to see in the diagram:

[Figure: equivalence classes]

Definition: The set of all equivalence classes of \(R\) on \(A\) is denoted \(A / R\) (pronounced "\(A\) modulo \(R\)" or "\(A\) mod \(R\)"). Notationally, \(A/R = \{[x] \mid x \in A\}\).

In the example above, \(A/R = \{[a], [c], [g]\}\).

Definition: If \(c \in A/R\) and \(x \in c\), then \(x\) is called a representative of \(c\).

In the example above, \(a\) is a representative of \([b]\), and \(d\) is a representative of \(\{c,d\}\).
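These definitions translate directly into code. The sketch below assumes the equivalence relation from the running example, built here from the three groups \(\{a,b,e,f\}\), \(\{c,d\}\), \(\{g,h\}\) stated in the text; the names `equivalence_class` and `quotient` are illustrative. Classes are stored as frozensets so that they can themselves be elements of a set.

```python
A = set("abcdefgh")

# The equivalence relation of the running example: two elements are
# related exactly when they lie in the same group.
groups = ["abef", "cd", "gh"]
E = {(x, y) for g in groups for x in g for y in g}

def equivalence_class(x, A, E):
    # [x] = {y in A | x E y}
    return frozenset(y for y in A if (x, y) in E)

def quotient(A, E):
    # A/E = {[x] | x in A}; duplicates collapse because this is a set.
    return {equivalence_class(x, A, E) for x in A}

classes = quotient(A, E)
print(len(classes))                                                   # prints: 3
print(equivalence_class("a", A, E) == equivalence_class("b", A, E))  # prints: True
print("d" in equivalence_class("c", A, E))                            # prints: True
```

Although \(A\) has eight elements, the quotient has only three: different representatives of the same class produce the same set.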

Examples

Equivalence classes let us think of groups of related objects as objects in themselves. For example:

if \(A\) is the set of people, and \(R\) is the "is a relative of" relation, then \(A/R\) is the set of families
if \(A\) is the set of hash tables, and \(R\) is the "has the same entries as" relation, then \(A/R\) is the set of functions with a finite domain.

Foreshadowing

We'll see equivalence classes in several places in the remainder of the course:

(cardinality) if \(A\) is the set of all sets (this is actually problematic, but we'll pretend it makes sense), and \(R\) is the "has the same cardinality as" relation, then \(A/R\) is the set of cardinal numbers, which is how you would define \(|X|\).
(combinatorics) we'll see later that if \(A\) is the set of sequences of length \(n\) with no repeated elements, and \(R\) is the "can be rearranged to" relation, then \(A/R\) is the set of subsets of size \(n\).
(automata) we'll take \(A\) to be the set of states of a machine, and \(R\) to be the "behaves the same as" relation, and then \(A/R\) will be the states of an optimized machine.
(number theory) if \(A\) is the set of integers, and \(R\) is the "has the same remainder when divided by \(n\) as" relation, then \(A/R\) will be the modular numbers.
(graphs) if \(A\) is the set of vertices, and \(R\) is the "is reachable from" relation, then \(A/R\) is the set of connected components of the graph.
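Several of these examples have the form "\(x R y\) if \(x\) and \(y\) have the same value of some key", and for such relations the classes can be computed by grouping. The following sketch illustrates the number theory example on a finite range of integers (the full set of integers is infinite, so this is only an illustration); `classes_by_key` is a hypothetical helper name.

```python
from collections import defaultdict

def classes_by_key(elements, key):
    # Group elements by key(x); x and y land in the same class
    # exactly when key(x) == key(y).
    buckets = defaultdict(set)
    for x in elements:
        buckets[key(x)].add(x)
    return {frozenset(bucket) for bucket in buckets.values()}

n = 3
# In Python, x % n is in {0, ..., n-1} for positive n, even when x is negative.
classes = classes_by_key(range(-6, 7), key=lambda x: x % n)

for c in sorted(classes, key=min):
    print(sorted(c))
```

Each printed list is one class: the integers in the range that share a remainder when divided by \(n\).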

Properties

Claim: if \(R\) is an equivalence relation on \(A\), then the equivalence classes of \(R\) form a partition of \(A\). That is, every element \(x\) of \(A\) is in some equivalence class, and no two different equivalence classes overlap.

Proof sketch: (you could fill in the details as an exercise)

first part: every \(x\) is in \([x]\) because \(R\) is reflexive.
second part: if \([a]\) and \([b]\) overlap, then there is some \(c\) in the intersection. Then we can use symmetry and transitivity to show that every element of \([a]\) is related to \(c\), and thus to \(b\), and is thus in \([b]\); likewise, every element of \([b]\) is in \([a]\), so \([a]\) and \([b]\) are the same.
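The claim can be sanity-checked on a small example (this is a check of one instance, not a proof). The sketch below assumes the running example's equivalence relation, built from the groups \(\{a,b,e,f\}\), \(\{c,d\}\), \(\{g,h\}\); `is_partition` is a hypothetical helper that tests the two parts of the claim plus non-emptiness of the blocks.

```python
from itertools import combinations

A = set("abcdefgh")
groups = ["abef", "cd", "gh"]
E = {(x, y) for g in groups for x in g for y in g}

# The distinct equivalence classes [x] for x in A.
classes = {frozenset(y for y in A if (x, y) in E) for x in A}

def is_partition(A, blocks):
    blocks = list(blocks)
    # First part: every element of A is in some block.
    covers = set().union(*blocks) == set(A)
    # Second part: no two different blocks overlap.
    disjoint = all(b1.isdisjoint(b2) for b1, b2 in combinations(blocks, 2))
    nonempty = all(len(b) > 0 for b in blocks)
    return covers and disjoint and nonempty

print(is_partition(A, classes))
# prints: True
```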

Functions

We ended lecture with the following question. Let \(A\) be a set of people, and let \(R\) be the "is related to" relation (where everyone is assumed to be related to themselves).

Suppose I wrote down the following rule:

Let \(f : A/R → A\) be defined by letting \(f([a])\) be \(a\)'s oldest living relative

Is \(f\) a function?

To see why it might or might not be, compare it with the following rule:

Let \(g : A/R → \mathbb{N}\) be defined by letting \(g([a])\) be \(a\)'s age.

\(f\) is a function, but not obviously so. \(g\) is not a function. To see why, suppose that \(a\)'s age is 13 and \(b\)'s age is 25. Then \(g([a]) = 13\) and \(g([b]) = 25\). But \([a] = [b]\), so we have a single input giving multiple outputs, depending on how we write it down.

\(f\) is a function, because if \([a] = [b]\) and if \(a\)'s oldest living relative is \(c\), then \(b\)'s oldest living relative must also be \(c\). So choosing different representatives of the input leads to the same value; the function is well-defined.
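For finite examples, well-definedness can be tested directly: a rule stated in terms of representatives gives a function on \(A/R\) exactly when it returns the same value for every representative of each class. The sketch below uses made-up people and ages (all names and numbers are hypothetical, and ages are assumed distinct so that "oldest" is unambiguous).

```python
# Hypothetical data: two families and everyone's age.
age = {"alice": 13, "bob": 25, "carol": 70, "dave": 40, "erin": 8}
families = [frozenset({"alice", "bob", "carol"}), frozenset({"dave", "erin"})]

def family_of(person):
    # The equivalence class [person] under "is related to".
    return next(c for c in families if person in c)

def oldest_relative(person):
    # The rule for f: depends only on the person's family.
    return max(family_of(person), key=lambda p: age[p])

def own_age(person):
    # The rule for g: depends on which representative was chosen.
    return age[person]

def is_well_defined(classes, rule):
    # The rule must give one single value across each class.
    return all(len({rule(x) for x in c}) == 1 for c in classes)

print(is_well_defined(families, oldest_relative))  # prints: True
print(is_well_defined(families, own_age))          # prints: False
```

Here `own_age` fails for the same reason as \(g\) in the text: alice and bob are in the same class but have ages 13 and 25.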