Lecture 33: Euler's theorem

Modular exponentiation

The exponentiation function \(\mathbb{Z}_m \times \mathbb{Z}_m → \mathbb{Z}_m\) given by \([a]^{[b]} ::= [a^b]\) is not well defined. For example, if \(m = 5\), we can check that \([2^3] = [8] = [3]\) but \([2^8] = [256] = [1] \neq [3]\), even though \([3] = [8]\).
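The counterexample above can be checked numerically. This is a minimal Python sketch; the helper name `residue_of_power` is hypothetical and introduced only for illustration.

```python
def residue_of_power(a: int, b: int, m: int) -> int:
    """Return the representative in 0..m-1 of the class [a^b] mod m."""
    return pow(a, b, m)

m = 5
# 3 and 8 represent the same class mod 5 ...
assert 3 % m == 8 % m
# ... but using them as exponents of 2 gives different classes.
print(residue_of_power(2, 3, m))  # 3
print(residue_of_power(2, 8, m))  # 1
```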

We can raise an equivalence class to an integer power; we know multiplication is well-defined, and we define \([a]^n ::= \underbrace{[a][a]\cdots[a]}_{n\text{ times}}\).
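As a sketch of this definition, the function below computes \([a]^n\) by multiplying by \([a]\) exactly \(n\) times, reducing mod \(m\) after each step. The name `class_power` is an assumption made for this example, and we take the empty product (\(n = 0\)) to be \([1]\).

```python
def class_power(a: int, n: int, m: int) -> int:
    """Compute [a]^n in Z_m by n repeated multiplications by [a]."""
    result = 1 % m  # the empty product is [1]
    for _ in range(n):
        result = (result * a) % m
    return result

# The answer does not depend on which representative of [a] we pick:
# 2 and 7 are the same class mod 5.
print(class_power(2, 3, 5))  # 3
print(class_power(7, 3, 5))  # 3
```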

Euler's theorem

Claim (Euler's theorem v1): Raising a unit mod \(m\) to the power of an equivalence class mod \(φ(m)\) is well defined. In other words, if \([a]_m\) is a unit and \([a]_m = [a']_m\) and \([b]_{φ(m)} = [b']_{φ(m)}\) then \([a^b]_m = [a'^{b'}]_m\).

This is equivalent to the following statement:

Claim (Euler's theorem v2): If \([a]\) is a unit mod \(m\), then \([a]_m^{φ(m)} = [1]_m\).

Proof of v1 from v2: It suffices to show that if \([b]_{φ(m)} = [b']_{φ(m)}\) then \([a^b]_m = [a^{b'}]_m\) (review exercise: why? Hint: \([a]^n\) is well defined).

Suppose \([a]_m = [a']_m\) and \([b]_{φ(m)} = [b']_{φ(m)}\). Since \([b]_{φ(m)} = [b']_{φ(m)}\), we have that \(b' = b + kφ(m)\) for some \(k\). Then

\[ \begin{aligned}{} [a^{b'}]_m &= [a^{b + kφ(m)}]_m && \text{by assumption} \\ &= [a^b \cdot (a^k)^{φ(m)}] = [a]^b[a^k]^{φ(m)} && \text{by algebra and definition of $[a]^n$} \\ &= [a]^b[1] = [a]^b && \text{by version 2 of Euler's theorem} \\ \end{aligned} \]

This is what we were trying to prove. One can also check that v1 implies v2 by noting that \([φ(m)]_{φ(m)} = [0]_{φ(m)}\), so that v1 gives \([a^{φ(m)}]_m = [a^0]_m = [1]_m\).
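To see v1 in action, here is a small Python sketch that reduces exponents mod \(φ(m)\) before exponentiating. It assumes \(φ(m)\) is computed by brute force, counting the \(a\) in \(0, \dots, m-1\) with \(\gcd(a, m) = 1\); the helper name `phi` is ours. The last lines illustrate that the unit hypothesis cannot be dropped.

```python
from math import gcd

def phi(m: int) -> int:
    """Brute-force count of the a in 0..m-1 with gcd(a, m) == 1."""
    return sum(1 for a in range(m) if gcd(a, m) == 1)

m = 9
units = [a for a in range(m) if gcd(a, m) == 1]
for a in units:
    for b in range(20):
        # The exponent only matters mod phi(m) when [a] is a unit.
        assert pow(a, b, m) == pow(a, b % phi(m), m)

# [3] is not a unit mod 9, and 1 and 7 agree mod phi(9) = 6,
# yet the two powers differ:
print(pow(3, 1, m), pow(3, 7, m))  # 3 0
```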

Proof of v2

Summary: We wish to show that \([a]^{\phi(m)} = [1]\). We will draw a picture of what happens when you multiply all the units by \([a]\). We'll find that the picture always forms \(n\) loops of the same size (\(ℓ\)), so that there are \(nℓ\) elements (i.e. \(\phi(m) = nℓ\)). To find \([a]^x\), you start at \([1]\) and multiply by \([a]\) \(x\) times, which traverses \(x\) arcs. If \(x = \phi(m)\) then this goes around a loop exactly \(n\) times, ending on \([1]\).

Working through an example (with fantastic ascii art)

Let's consider \(m = 9\). Then (using the fact that \([a]\) is a unit if \(a\) and \(m\) share no common factors, which we prove below):

\((\mathbb{Z}_m)^* = \{[1], [2], [4], [5], [7], [8]\}\), \(\phi(9) = 6\).

We draw an arc from \(x\) to \(y\) if \([a]x = y\). For example, if \(a = 2\), then we have

 [1] --> [2] --> [4] --> [8] --> [7] --> [5]
  |                                       |                multiplication by [2]
  |                                       |              ------------------------>
  \-------------------<<<-----------------/

(note that \([8][2] = [16] = [7]\) and \([7][2] = [14] = [5]\) mod \(9\)). We get a nice loop that goes through all of the units. To compute \([2]^6\), we start at \([1]\) and multiply by \([2]\) 6 times, i.e. we start at \([1]\) and follow 6 edges. This brings us back to \([1]\), so \([2]^6 = [1]\).

We get a different picture if \(a = 4\):

[1] --> [4] --> [7]    [2] --> [8] --> [5]
 |               |      |               |                  multiplication by [4]
 \-----<<<-------/      \------<<<------/                ------------------------>

Here we have two loops, both of size 3. To compute \([4]^6\), we start at \([1]\) and follow \(6\) edges. This goes around the first loop twice, ending up at \([1]\). Thus \([4]^6 = [1]\).

We get a third picture if \(a = 1\):

  [1]    [2]    [4]    [5]    [7]    [8]
 /   \  /   \  /   \  /   \  /   \  /   \                  multiplication by [1]
 \-<-/  \-<-/  \-<-/  \-<-/  \-<-/  \-<-/                ------------------------>

Here we have 6 loops, each of size 1. We compute \([1]^6\) by starting at \([1]\) and following \(6\) edges. This loops \(6\) times around the loop of size \(1\), ending at \([1]\), so \([1]^6 = [1]\).
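The three pictures above can be reproduced with a short Python sketch. The function name `multiplication_loops` is hypothetical; it assumes \([a]\) is a unit mod \(m\), so that following arcs from any unit eventually returns to the starting point.

```python
from math import gcd

def multiplication_loops(a: int, m: int) -> list[list[int]]:
    """Split the units mod m into loops under x -> a*x mod m.

    Assumes gcd(a, m) == 1, so every walk closes up on its start.
    """
    units = [u for u in range(m) if gcd(u, m) == 1]
    seen = set()
    loops = []
    for start in units:
        if start in seen:
            continue
        loop = []
        x = start
        while x not in seen:  # follow arcs until the loop closes
            seen.add(x)
            loop.append(x)
            x = (a * x) % m
        loops.append(loop)
    return loops

print(multiplication_loops(2, 9))  # [[1, 2, 4, 8, 7, 5]]
print(multiplication_loops(4, 9))  # [[1, 4, 7], [2, 8, 5]]
print(multiplication_loops(1, 9))  # [[1], [2], [4], [5], [7], [8]]
```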

Generalizing

In general, the picture will always break up into \(n\) loops, all of the same length. There are several things you may need to check to convince yourself of this. They all have very simple (one or two line) proofs, which are all left as review exercises.

Thus all of the loops are the same size (call it \(ℓ\)), and every element is in exactly one of the loops. So the loops split up \((\mathbb{Z}_m)^*\) into \(n\) groups of \(ℓ\) elements each, so \((\mathbb{Z}_m)^*\) has \(nℓ\) elements. In other words, \(\phi(m) = nℓ\).

Now, to compute \([a]^{\phi(m)}\), we start at \([1]\) and multiply by \([a]\) \(\phi(m)\) times. In other words, we traverse \(nℓ\) arcs. This means we go around the loop containing \([1]\) exactly \(n\) times, so we end up back at \([1]\). Thus \([a]^{\phi(m)} = [1]\).
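The argument can be spot-checked numerically. The sketch below is not a proof; it measures the length \(ℓ\) of the loop through \([1]\) and confirms, for small moduli, that \(ℓ\) divides \(φ(m)\) and that \([a]^{φ(m)} = [1]\). The helper name `loop_length` is an assumption of this example, and units are found with the gcd test.

```python
from math import gcd

def loop_length(a: int, m: int) -> int:
    """Number of arcs in the loop through [1] under multiplication by [a].

    Assumes gcd(a, m) == 1 and m >= 2; otherwise the walk never returns to [1].
    """
    x, length = a % m, 1
    while x != 1:
        x = (x * a) % m
        length += 1
    return length

for m in range(2, 50):
    units = [u for u in range(m) if gcd(u, m) == 1]
    phi_m = len(units)
    for a in units:
        ell = loop_length(a, m)
        assert phi_m % ell == 0       # phi(m) = n * ell for some n
        assert pow(a, phi_m, m) == 1  # Euler's theorem v2

print(loop_length(2, 9), loop_length(4, 9), loop_length(1, 9))  # 6 3 1
```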

Computing \(φ(m)\)

To determine \(φ(m)\), we rely on the following fact:

Claim: \([a]_m\) is a unit if and only if \(\gcd(a,m) = 1\), that is, if and only if \(a\) and \(m\) have no common factors other than \(1\).

Proof: If \(\gcd(a,m) = 1\), we can use Bézout's identity (proved on the homework) to write \(1 = sa + tm\) for some integers \(s\) and \(t\). Reducing this equation mod \(m\), we have \[[1] = [sa + tm] = [s][a] + [t][m] = [s][a] + [t][0] = [s][a]\] This shows that \([a]\) has an inverse, namely \([s]\).

The other direction is left as a review exercise.
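The proof is constructive once we have the Bézout coefficients. One standard way to find them is the extended Euclidean algorithm, sketched below; this is our choice of method for illustration, not necessarily the one used on the homework, and the names `bezout` and `inverse_mod` are hypothetical.

```python
def bezout(a: int, m: int) -> tuple[int, int, int]:
    """Return (g, s, t) with g = gcd(a, m) and g == s*a + t*m."""
    old_r, r = a, m
    old_s, s = 1, 0
    old_t, t = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_s, s = s, old_s - q * s
        old_t, t = t, old_t - q * t
    return old_r, old_s, old_t

def inverse_mod(a: int, m: int) -> int:
    """Return a representative s in 0..m-1 with [s][a] = [1] mod m."""
    g, s, _ = bezout(a, m)
    if g != 1:
        raise ValueError("[a] is not a unit: gcd(a, m) != 1")
    return s % m

print(bezout(7, 9))       # (1, 4, -3), since 4*7 - 3*9 = 1
print(inverse_mod(7, 9))  # 4, since 7 * 4 = 28 = 3*9 + 1
```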

We can use this to compute \(φ(m)\) easily in some special cases. For example, if \(p\) is prime, then no number strictly between \(0\) and \(p\) has a factor other than \(1\) in common with \(p\). That means all elements of \(\mathbb{Z}_p\) are units, except \([0]\). Therefore, \(\mathbb{Z}_p\) has \((p-1)\) units, so \(φ(p) = p-1\).

If \(m\) is a product of two different primes \(p\) and \(q\), then there are \(pq\) numbers from \(0\) to \(m - 1\). Of these, there are \(p\) multiples of \(q\) (namely \(0, q, 2q, \dots, (p-1)q\)) and \(q\) multiples of \(p\) (namely \(0, p, 2p, \dots, (q-1)p\)). The only number in this range that is both a multiple of \(p\) and of \(q\) is \(0\). So we take the \(pq\) numbers, subtract off the \(p\) multiples of \(q\) and the \(q\) multiples of \(p\), and add one because we subtracted \(0\) twice. This gives \(φ(pq) = pq - p - q + 1 = (p-1)(q-1)\).
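Both formulas can be checked against a brute-force count. The sketch below assumes the gcd characterization of units from the claim above; the helper name `phi` is ours, and the primes tested are arbitrary small examples.

```python
from math import gcd

def phi(m: int) -> int:
    """Brute-force count of the a in 0..m-1 with gcd(a, m) == 1."""
    return sum(1 for a in range(m) if gcd(a, m) == 1)

# phi(p) = p - 1 for primes p
for p in [2, 3, 5, 7, 11, 13]:
    assert phi(p) == p - 1

# phi(pq) = (p - 1)(q - 1) for distinct primes p and q
for p, q in [(2, 3), (3, 5), (5, 7), (7, 11)]:
    assert phi(p * q) == (p - 1) * (q - 1)

print(phi(9), phi(15))  # 6 8
```

Note that \(9 = 3 \cdot 3\) is not a product of two different primes, and indeed \(φ(9) = 6 \neq (3-1)(3-1)\).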