Intro to probability

We defined probability spaces, proved a few simple results, and introduced proof by induction.

Probability space

A probability space is a set S and a function P: 2^S → ℝ satisfying the following properties:

  1. for all E ⊆ S, P(E) ≥ 0
  2. P(S) = 1
  3. If E1 and E2 are disjoint (that is, if E1 ∩ E2 = ∅), then P(E1 ∪ E2) = P(E1) + P(E2).

S is called the sample space. The elements of S are called outcomes. The subsets of S are called events (so that an event is a collection of outcomes). If E is an event, P(E) is called the probability of E. The three properties listed above are called Kolmogorov's Axioms.
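To make the definition concrete, here is a minimal sketch of a finite probability space in Python. The fair six-sided die, the uniform measure, and the helper names `all_events` and `satisfies_axioms` are assumptions chosen for illustration; they are not part of the definition above. The sketch enumerates every subset of S and checks the three axioms by brute force, which is only feasible for a small sample space.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical example: a fair six-sided die.
S = frozenset(range(1, 7))

def P(event):
    """Probability of an event (a subset of S) under the uniform measure."""
    event = frozenset(event)
    if not event <= S:
        raise ValueError("an event must be a subset of the sample space")
    return Fraction(len(event), len(S))

def all_events(sample_space):
    """Yield every subset of the sample space, i.e. every element of 2^S."""
    items = sorted(sample_space)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)

def satisfies_axioms(prob, sample_space):
    """Brute-force check of the three axioms on a finite sample space."""
    events = list(all_events(sample_space))
    nonnegative = all(prob(e) >= 0 for e in events)      # axiom 1
    normalized = prob(sample_space) == 1                 # axiom 2
    additive = all(                                      # axiom 3
        prob(e1 | e2) == prob(e1) + prob(e2)
        for e1 in events
        for e2 in events
        if not (e1 & e2)  # only disjoint pairs are constrained by axiom 3
    )
    return nonnegative and normalized and additive

print(satisfies_axioms(P, S))  # True
```

Exact fractions are used instead of floating-point numbers so that the equality in axiom 3 can be tested without rounding error.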

Simple facts about probability spaces

In this section, we assume that (S, P) is a probability space.

Claim: the probability that "nothing happens" is 0. That is, P(∅) = 0.

Proof sketch: Write S = S ∪ ∅. Apply axiom 3 (don't forget to check for disjointness!).
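Written out in full, the sketch above becomes the following short derivation. It uses only the set identities S = S ∪ ∅ and S ∩ ∅ = ∅, together with axiom 3.

```latex
\begin{align*}
P(S) &= P(S \cup \emptyset)   && \text{since } S = S \cup \emptyset \\
     &= P(S) + P(\emptyset)   && \text{by axiom 3, since } S \cap \emptyset = \emptyset
\end{align*}
Subtracting $P(S)$ from both sides gives $P(\emptyset) = 0$.
```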

Claim: if E is an event, then the probability that E "doesn't happen" (that is, P(S \ E)) is 1 − P(E).

Proof sketch: Write S = E ∪ (S \ E). Apply axiom 3.
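The two claims above can be checked numerically on a small example. The sketch below assumes a hypothetical loaded spinner with three outcomes; the outcome names and weights are made up for illustration and only need to be nonnegative and sum to 1.

```python
from fractions import Fraction

# Hypothetical example: a loaded three-outcome spinner.
# The weights are invented for illustration.
WEIGHTS = {
    "red": Fraction(1, 2),
    "green": Fraction(1, 3),
    "blue": Fraction(1, 6),
}
S = frozenset(WEIGHTS)

def P(event):
    """Probability of an event: the sum of the weights of its outcomes."""
    return sum((WEIGHTS[outcome] for outcome in event), Fraction(0))

E = frozenset({"red", "blue"})
complement = S - E  # the event that E "doesn't happen"

print(P(frozenset()))  # probability that "nothing happens"
print(P(complement))   # P(S \ E)
print(1 - P(E))        # 1 - P(E)
```

The first line printed is 0, and the last two lines agree (both are 1/3), as the claims predict. A check like this is an illustration on one example, not a substitute for the proofs.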

Induction

Consider the following claims and proofs:

Claim 2: If E1 and E2 are mutually disjoint, then
P(E1 ∪ E2) = P(E1) + P(E2)

Claim 3: If E1, E2, and E3 are mutually disjoint, then
P(E1 ∪ E2 ∪ E3) = P(E1) + P(E2) + P(E3)

Claim 4: If E1, E2, E3, and E4 are mutually disjoint, then
P(E1 ∪ E2 ∪ E3 ∪ E4) = P(E1) + P(E2) + P(E3) + P(E4)

Claim 5: If E1, ..., E5 are mutually disjoint, then
P(E1 ∪ ⋯ ∪ E5) = P(E1) + ⋯ + P(E5)

Proof 2: This is just axiom 3.

Proof 3: We can add parentheses to see that E1 ∪ E2 ∪ E3 = (E1 ∪ E2) ∪ E3. Since E3 is disjoint from E1 and E2, it must be disjoint from their union. Thus we can apply axiom 3 to conclude
P(E1 ∪ E2 ∪ E3) = P(E1 ∪ E2) + P(E3)
By claim 2, P(E1 ∪ E2) = P(E1) + P(E2), so we see that the right hand side is just P(E1) + P(E2) + P(E3), as required.

Proof 4: We can add parentheses to see that E1 ∪ E2 ∪ E3 ∪ E4 = (E1 ∪ E2 ∪ E3) ∪ E4. Since E4 is disjoint from E1, E2, and E3, it must be disjoint from their union. Thus we can apply axiom 3 to conclude
P(E1 ∪ E2 ∪ E3 ∪ E4) = P(E1 ∪ E2 ∪ E3) + P(E4)
By claim 3, P(E1 ∪ E2 ∪ E3) = P(E1) + P(E2) + P(E3), so we see that the right hand side is just P(E1) + P(E2) + P(E3) + P(E4), as required.

Proof 5: We can add parentheses to see that E1 ∪ ⋯ ∪ E5 = (E1 ∪ ⋯ ∪ E4) ∪ E5. Since E5 is disjoint from each of E1, ..., E4, it must be disjoint from their union. Thus we can apply axiom 3 to conclude
P(E1 ∪ ⋯ ∪ E5) = P(E1 ∪ ⋯ ∪ E4) + P(E5)
By claim 4, P(E1 ∪ ⋯ ∪ E4) = P(E1) + ⋯ + P(E4), so we see that the right hand side is just P(E1) + ⋯ + P(E5), as required.

You have probably noticed that the proofs of claims 3, 4, and 5 are almost identical. You have probably concluded that for any n, you could copy and modify one of these proofs to produce a proof of claim n. The proof of claim n would rely on claim n − 1, but that's not a problem because by the time you get around to proving claim n you will already have proven claim n − 1.

In fact, your proof of claim n might look like this:

Claim n: If E1, ..., En are mutually disjoint, then
P(E1 ∪ ⋯ ∪ En) = P(E1) + ⋯ + P(En)

Proof n: Assume that claim n − 1 holds. We can add parentheses to see that E1 ∪ ⋯ ∪ En = (E1 ∪ ⋯ ∪ E_{n−1}) ∪ En. Since En is disjoint from each of E1, ..., E_{n−1}, it must be disjoint from their union. Thus we can apply axiom 3 to conclude
P(E1 ∪ ⋯ ∪ En) = P(E1 ∪ ⋯ ∪ E_{n−1}) + P(En)

By claim n − 1, P(E1 ∪ ⋯ ∪ E_{n−1}) = P(E1) + ⋯ + P(E_{n−1}), so we see that the right hand side is just P(E1) + ⋯ + P(En), as required.

This is the core idea behind the technique of proof by induction. The principle of induction says that if you want to prove a statement of the form "for all n ≥ n0, P(n) holds" (where P(n) is some statement about n, not a probability), then you need only prove two things:

  1. P(n0) holds, and
  2. for an arbitrary n > n0, if P(n − 1) is assumed, then P(n) holds.

The first of these two proofs is often referred to as the base case; the second of the two proofs is often called the inductive step. The assumption that P(n − 1) holds is often called the inductive hypothesis.
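A proof by induction has the same shape as a recursive function: the base case corresponds to the branch that returns without recursing, and the inductive step corresponds to the branch that calls the function on a smaller input. The sketch below illustrates this with the disjoint-union claims from earlier. The fair die and the function name `prob_of_disjoint_union` are assumptions made for illustration.

```python
from fractions import Fraction

# Hypothetical example: a fair six-sided die.
S = frozenset(range(1, 7))

def P(event):
    """Uniform probability of an event (a subset of S)."""
    return Fraction(len(frozenset(event)), len(S))

def prob_of_disjoint_union(events):
    """Return P(E1) + ... + P(En) for mutually disjoint events (n >= 2),
    splitting off the last event the same way the proofs of claims 3-5 do."""
    events = [frozenset(e) for e in events]
    if len(events) < 2:
        raise ValueError("need at least two events")
    *earlier, last = events
    # The last event must be disjoint from the union of the earlier ones.
    if frozenset().union(*earlier) & last:
        raise ValueError("events must be mutually disjoint")
    if len(events) == 2:
        # Base case (n = 2): this is exactly axiom 3.
        return P(earlier[0]) + P(last)
    # Inductive step: handle E1, ..., E_{n-1} recursively, then add P(En).
    return prob_of_disjoint_union(earlier) + P(last)

events = [{1}, {2, 3}, {4}, {5, 6}]
total = prob_of_disjoint_union(events)
print(total == P(frozenset().union(*events)))  # True
```

Each recursive call plays the role of "by claim n − 1" in the proofs above. The recursion terminates because every call shortens the list until it reaches the base case of two events.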

Here is a complete proof by induction of the example fact discussed above.

Claim: For any n ≥ 2, if E1, ..., En are mutually disjoint, then P(E1 ∪ ⋯ ∪ En) = P(E1) + ⋯ + P(En).

Proof: We will prove this claim by induction on n. In the base case (when n = 2), this statement is the same as axiom 3, and is thus true since (S, P) is a probability space.

For the inductive step, choose an arbitrary n > 2 and assume the inductive hypothesis: whenever E1, ..., E_{n−1} are mutually disjoint, P(E1 ∪ ⋯ ∪ E_{n−1}) = P(E1) + ⋯ + P(E_{n−1}).

We wish to show that, for the chosen n, if E1, ..., En are mutually disjoint, then P(E1 ∪ ⋯ ∪ En) = P(E1) + ⋯ + P(En).

We can add parentheses to see that E1 ∪ ⋯ ∪ En = (E1 ∪ ⋯ ∪ E_{n−1}) ∪ En. Since En is disjoint from each of E1, ..., E_{n−1}, it must be disjoint from their union. Thus we can apply axiom 3 to conclude
P(E1 ∪ ⋯ ∪ En) = P(E1 ∪ ⋯ ∪ E_{n−1}) + P(En)
By the inductive hypothesis,
P(E1 ∪ ⋯ ∪ E_{n−1}) = P(E1) + ⋯ + P(E_{n−1})
so we see that the right hand side is just P(E1) + ⋯ + P(En), as required.