We defined probability spaces, proved a few simple results, and introduced proof by induction.

A **probability space** is a set *S* and a function *P*: 2^{*S*} → ℝ satisfying the following properties:

- for all *E* ⊆ *S*, *P*(*E*) ≥ 0;
- *P*(*S*) = 1;
- if *E*_{1} and *E*_{2} are disjoint (that is, if *E*_{1} ∩ *E*_{2} = ∅), then *P*(*E*_{1} ∪ *E*_{2}) = *P*(*E*_{1}) + *P*(*E*_{2}).

*S* is called the **sample space**. The elements of *S* are called **outcomes**. The subsets of *S* are called **events** (so that an event is a collection of outcomes). If *E* is an event, *P*(*E*) is called the **probability** of *E*. The three properties listed above are called **Kolmogorov's Axioms**.
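For a small finite sample space, the three axioms can be checked by brute force. The following is a minimal sketch, assuming a fair six-sided die as the sample space; the helper names `powerset`, `make_measure`, and `satisfies_axioms` are hypothetical and chosen only for illustration.

```python
from fractions import Fraction
from itertools import chain, combinations


def powerset(sample_space):
    """Return every subset (event) of a finite sample space."""
    items = list(sample_space)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return [frozenset(s) for s in subsets]


def make_measure(weights):
    """Build P from per-outcome weights: P(E) is the sum of the weights in E."""
    def P(event):
        return sum((weights[o] for o in event), Fraction(0))
    return P


def satisfies_axioms(sample_space, P):
    """Check the three axioms by enumerating all events (finite S only)."""
    events = powerset(sample_space)
    nonnegative = all(P(E) >= 0 for E in events)        # axiom 1
    total_is_one = P(frozenset(sample_space)) == 1      # axiom 2
    additive = all(                                     # axiom 3
        P(E1 | E2) == P(E1) + P(E2)
        for E1 in events
        for E2 in events
        if not (E1 & E2)  # only disjoint pairs are constrained
    )
    return nonnegative and total_is_one and additive


# Assumed example: one roll of a fair die, each outcome with weight 1/6.
S = frozenset(range(1, 7))
P = make_measure({outcome: Fraction(1, 6) for outcome in S})
```

With these definitions, `satisfies_axioms(S, P)` holds for the fair die, while a function built from weights that do not sum to 1 fails axiom 2. Exact fractions are used instead of floats so that the equality checks are not affected by rounding.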

In this section, we assume that (*S*, *P*) is a probability space.

**Claim:** the probability that "nothing happens" is 0. That is, *P*(∅) = 0.

**Proof sketch:** Write *S* = *S* ∪ ∅. Apply axiom 3 (don't forget to check for disjointness!).

**Claim:** if *E* is an event, then the probability that *E* "doesn't happen" (that is, *P*(*S* \ *E*)) is 1 − *P*(*E*).

**Proof sketch:** Write *S* = *E* ∪ (*S* \ *E*). Apply axiom 3.
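Both proof sketches can be traced on a concrete example. The snippet below is an illustrative sketch, assuming a uniform probability on a fair die; the event `E` and the variable names are hypothetical. It performs the disjointness checks that the sketches call for and then rearranges axiom 3 exactly as the proofs do.

```python
from fractions import Fraction

# Assumed example space: one roll of a fair six-sided die.
S = frozenset(range(1, 7))


def P(event):
    """Uniform probability: each outcome has weight 1/6."""
    return Fraction(len(event), len(S))


empty = frozenset()
E = frozenset({1, 2})  # hypothetical event: "the roll is 1 or 2"

# First claim: S and the empty set are disjoint and S ∪ ∅ = S,
# so axiom 3 gives P(S) = P(S) + P(∅), which forces P(∅) = 0.
assert S | empty == S and not (S & empty)
p_empty = P(S | empty) - P(S)

# Second claim: E and S \ E are disjoint and their union is S,
# so axiom 3 gives 1 = P(S) = P(E) + P(S \ E).
complement = S - E
assert E | complement == S and not (E & complement)
p_complement = P(S) - P(E)
```

Here `p_empty` agrees with `P(empty)`, and `p_complement` agrees with `P(complement)`, as the two claims predict.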

Consider the following claims and proofs:

**Claim 2:** If *E*_{1} and *E*_{2} are mutually disjoint, then *P*(*E*_{1} ∪ *E*_{2}) = *P*(*E*_{1}) + *P*(*E*_{2})

**Claim 3:** If *E*_{1}, *E*_{2}, and *E*_{3} are mutually disjoint, then *P*(*E*_{1} ∪ *E*_{2} ∪ *E*_{3}) = *P*(*E*_{1}) + *P*(*E*_{2}) + *P*(*E*_{3})

**Claim 4:** If *E*_{1}, *E*_{2}, *E*_{3}, and *E*_{4} are mutually disjoint, then *P*(*E*_{1} ∪ *E*_{2} ∪ *E*_{3} ∪ *E*_{4}) = *P*(*E*_{1}) + *P*(*E*_{2}) + *P*(*E*_{3}) + *P*(*E*_{4})

**Claim 5:** If *E*_{1}, ..., *E*_{5} are mutually disjoint, then *P*(*E*_{1} ∪ ⋯ ∪ *E*_{5}) = *P*(*E*_{1}) + ⋯ + *P*(*E*_{5})

**Proof 2:** This is just axiom 3.

**Proof 3:** We can add parentheses to see that *E*_{1} ∪ *E*_{2} ∪ *E*_{3} = (*E*_{1} ∪ *E*_{2}) ∪ *E*_{3}. Since *E*_{3} is disjoint from *E*_{1} and *E*_{2}, it must be disjoint from their union. Thus we can apply axiom 3 to conclude *P*(*E*_{1} ∪ *E*_{2} ∪ *E*_{3}) = *P*(*E*_{1} ∪ *E*_{2}) + *P*(*E*_{3})

By claim 2, *P*(*E*_{1} ∪ *E*_{2}) = *P*(*E*_{1}) + *P*(*E*_{2}), so we see that the right hand side is just *P*(*E*_{1}) + *P*(*E*_{2}) + *P*(*E*_{3}), as required.

**Proof 4:** We can add parentheses to see that *E*_{1} ∪ *E*_{2} ∪ *E*_{3} ∪ *E*_{4} = (*E*_{1} ∪ *E*_{2} ∪ *E*_{3}) ∪ *E*_{4}. Since *E*_{4} is disjoint from *E*_{1}, *E*_{2}, and *E*_{3}, it must be disjoint from their union. Thus we can apply axiom 3 to conclude *P*(*E*_{1} ∪ *E*_{2} ∪ *E*_{3} ∪ *E*_{4}) = *P*(*E*_{1} ∪ *E*_{2} ∪ *E*_{3}) + *P*(*E*_{4})

By claim 3, *P*(*E*_{1} ∪ *E*_{2} ∪ *E*_{3}) = *P*(*E*_{1}) + *P*(*E*_{2}) + *P*(*E*_{3}), so we see that the right hand side is just *P*(*E*_{1}) + *P*(*E*_{2}) + *P*(*E*_{3}) + *P*(*E*_{4}), as required.

**Proof 5:** We can add parentheses to see that *E*_{1} ∪ ⋯ ∪ *E*_{5} = (*E*_{1} ∪ ⋯ ∪ *E*_{4}) ∪ *E*_{5}. Since *E*_{5} is disjoint from each of *E*_{1} ... *E*_{4}, it must be disjoint from their union. Thus we can apply axiom 3 to conclude *P*(*E*_{1} ∪ ⋯ ∪ *E*_{5}) = *P*(*E*_{1} ∪ ⋯ ∪ *E*_{4}) + *P*(*E*_{5})

By claim 4, *P*(*E*_{1} ∪ ⋯ ∪ *E*_{4}) = *P*(*E*_{1}) + ⋯ + *P*(*E*_{4}), so we see that the right hand side is just *P*(*E*_{1}) + ⋯ + *P*(*E*_{5}), as required.

You have probably noticed that the proofs of claims 3, 4, and 5 are almost identical. You have probably concluded that for any *n*, you could copy and modify one of these proofs to produce a proof of claim *n*. The proof of claim *n* would rely on claim *n* − 1, but that's not a problem because by the time you get around to proving claim *n* you will already have proven claim *n* − 1.

In fact, your proof of claim *n* might look like this:

**Claim n:** If *E*_{1}, ..., *E*_{n} are mutually disjoint, then *P*(*E*_{1} ∪ ⋯ ∪ *E*_{n}) = *P*(*E*_{1}) + ⋯ + *P*(*E*_{n})

**Proof n:** Assume that claim n-1 holds. We can add parentheses to see that *E*_{1} ∪ ⋯ ∪ *E*_{n} = (*E*_{1} ∪ ⋯ ∪ *E*_{n − 1}) ∪ *E*_{n}. Since *E*_{n} is disjoint from each of *E*_{1} ... *E*_{n − 1}, it must be disjoint from their union. Thus we can apply axiom 3 to conclude *P*(*E*_{1} ∪ ⋯ ∪ *E*_{n}) = *P*(*E*_{1} ∪ ⋯ ∪ *E*_{n − 1}) + *P*(*E*_{n})

By claim n-1, *P*(*E*_{1} ∪ ⋯ ∪ *E*_{n − 1}) = *P*(*E*_{1}) + ⋯ + *P*(*E*_{n − 1}), so we see that the right hand side is just *P*(*E*_{1}) + ⋯ + *P*(*E*_{n}), as required.
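The structure of proof *n* maps directly onto a recursive function: the call for *n* events peels off the last event and relies on the call for *n* − 1 events, bottoming out at two events. The following is a minimal sketch, assuming a uniform probability on an eight-element sample space; the function name `sum_by_peeling` and the example events are hypothetical.

```python
from fractions import Fraction

# Assumed example space: eight equally likely outcomes.
S = frozenset(range(1, 9))


def P(event):
    """Uniform probability on S."""
    return Fraction(len(event), len(S))


def sum_by_peeling(events):
    """Return P(E_1) + ... + P(E_n) for mutually disjoint events,
    grouping them as (E_1 ∪ ... ∪ E_{n-1}) ∪ E_n as in proof n."""
    if len(events) < 2:
        raise ValueError("need at least two events")
    if len(events) == 2:
        first, second = events
        assert not (first & second), "events must be disjoint"
        return P(first) + P(second)  # claim 2: this is axiom 3
    *rest, last = events
    union_of_rest = frozenset().union(*rest)
    # The last event is disjoint from each earlier event,
    # so it is disjoint from their union.
    assert not (union_of_rest & last), "events must be disjoint"
    # Rely on the result for n - 1 events, then add the last term.
    return sum_by_peeling(rest) + P(last)


events = [
    frozenset({1}),
    frozenset({2, 3}),
    frozenset({4}),
    frozenset({5, 6, 7}),
]
total = sum_by_peeling(events)
union = frozenset().union(*events)
```

For these four events, `total` equals `P(union)`, which is what claim 4 asserts. The recursion terminates for the same reason the chain of proofs is valid: each call depends only on a strictly smaller case.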

This is the core idea behind the technique of **proof by induction**. The principle of induction says that if you want to prove a statement that says "for all *n* ≥ *n*_{0}, *P*(*n*) holds" (where *P*(*n*) here stands for a statement about *n*, not a probability), then you need only prove two things:

- *P*(*n*_{0}) holds, and
- for an arbitrary *n* > *n*_{0}, *if* *P*(*n* − 1) is assumed, *then* *P*(*n*) holds.

The first of these two proofs is often referred to as the **base case**; the second of the two proofs is often called the **inductive step**. The assumption that *P*(*n* − 1) holds is often called the **inductive hypothesis**.
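One way to see why these two proofs suffice is to unroll them: any particular claim is reached by starting at the base case and applying the inductive step repeatedly. The sketch below only lists that chain of justifications; the function name `justification_chain` is hypothetical, and the default `n0=2` is an assumption matching the disjoint-union example.

```python
def justification_chain(n, n0=2):
    """List the steps that establish claim n, starting from the base case n0."""
    if n < n0:
        raise ValueError("induction only covers n >= n0")
    steps = [f"claim {n0}: base case"]
    for k in range(n0 + 1, n + 1):
        # Each later claim is justified by the inductive step,
        # which is allowed to assume the previous claim.
        steps.append(f"claim {k}: inductive step, using claim {k - 1}")
    return steps


chain_for_4 = justification_chain(4)
```

For *n* = 4 the chain has three entries: the base case for claim 2, then one application of the inductive step each for claims 3 and 4.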

Here is a complete proof by induction of the example fact discussed above.

**Claim:** For any *n* ≥ 2, if *E*_{1}, ... *E*_{n} are all mutually disjoint, then *P*(*E*_{1} ∪ ⋯ ∪ *E*_{n}) = *P*(*E*_{1}) + ⋯ + *P*(*E*_{n}).

**Proof:** We will prove this claim by induction on *n*. In the base case (when *n* = 2), this statement is the same as axiom 3, and is thus true since (*S*, *P*) is a probability space.

For the inductive step, choose an arbitrary *n* > 2 and assume the inductive hypothesis: whenever *E*_{1}, ..., *E*_{n − 1} are mutually disjoint, *P*(*E*_{1} ∪ ⋯ ∪ *E*_{n − 1}) = *P*(*E*_{1}) + ⋯ + *P*(*E*_{n − 1}).

We wish to show that, for the chosen *n*, if *E*_{1}, ..., *E*_{n} are mutually disjoint, then *P*(*E*_{1} ∪ ⋯ ∪ *E*_{n}) = *P*(*E*_{1}) + ⋯ + *P*(*E*_{n}).

We can add parentheses to see that *E*_{1} ∪ ⋯ ∪ *E*_{n} = (*E*_{1} ∪ ⋯ ∪ *E*_{n − 1}) ∪ *E*_{n}. Since *E*_{n} is disjoint from each of *E*_{1}, ..., *E*_{n − 1}, it must be disjoint from their union. Thus we can apply axiom 3 to conclude *P*(*E*_{1} ∪ ⋯ ∪ *E*_{n}) = *P*(*E*_{1} ∪ ⋯ ∪ *E*_{n − 1}) + *P*(*E*_{n}).

By the inductive hypothesis, *P*(*E*_{1} ∪ ⋯ ∪ *E*_{n − 1}) = *P*(*E*_{1}) + ⋯ + *P*(*E*_{n − 1}), so we see that the right hand side is just *P*(*E*_{1}) + ⋯ + *P*(*E*_{n}), as required.