Lecture 6: independence

Example using Bayes's rule and the law of total probability

Suppose we are given a test for a condition (say, a disease). Let \(A\) be the event that a patient has the condition, and let \(B\) be the event that the test comes back positive.

The probability that a patient has the condition is \(Pr(A) = 1/10000\). The test has a false positive rate of \(Pr(B | \bar{A}) = 1/100\) (a false positive is when the test says "yes" despite the fact that the patient does not have the disease), and a false negative rate of \(Pr(\bar{B} | A) = 5/100\).

Suppose a patient tests positive. What is the probability that they have the disease? In other words, what is \(Pr(A|B)\)?

Bayes's rule tells us \(Pr(A|B) = \frac{Pr(B|A)Pr(A)}{Pr(B)}\). We can find \(Pr(B|A)\) using the fact from last lecture: \(Pr(B|A) = 1 - Pr(\bar{B}|A) = 95/100\). \(Pr(A)\) is given. We can use the law of total probability to find \(Pr(B)\): \(Pr(B) = Pr(B|A)Pr(A) + Pr(B|\bar{A})Pr(\bar{A})\), where \(Pr(\bar{A}) = 1 - Pr(A) = 9999/10000\).

Plugging everything in, we have

\[ \begin{aligned} Pr(A|B) &= \frac{Pr(B|A)Pr(A)}{Pr(B|A)Pr(A) + Pr(B|\bar{A})Pr(\bar{A})} \\ &= \frac{(95/100)(1/10000)}{(95/100)(1/10000) + (1/100)(9999/10000)} \\ &= \frac{95}{95+9999} \approx 1/100 \\ \end{aligned} \]

This is a surprising result: we take a test that errs at most 5% of the time, and it says we have the disease, yet we have only about a 1% chance of having the disease.

However, note that our chances have grown from \(0.0001\) to \(0.01\), so we did learn quite a bit from the test.
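
The arithmetic above can be double-checked with exact fractions. The following is only a small sketch: the function name `posterior` and its parameter names are my own choices for illustration, not notation from the lecture.

```python
from fractions import Fraction


def posterior(prior, false_positive_rate, false_negative_rate):
    """Return Pr(A|B) using Bayes's rule and the law of total probability.

    prior               = Pr(A)
    false_positive_rate = Pr(B | not A)
    false_negative_rate = Pr(not B | A)
    """
    pr_b_given_a = 1 - false_negative_rate  # Pr(B|A)
    # Law of total probability: Pr(B) = Pr(B|A)Pr(A) + Pr(B|not A)Pr(not A)
    pr_b = pr_b_given_a * prior + false_positive_rate * (1 - prior)
    # Bayes's rule
    return pr_b_given_a * prior / pr_b


result = posterior(Fraction(1, 10000), Fraction(1, 100), Fraction(5, 100))
print(result)         # exact value, 95/(95 + 9999)
print(float(result))  # roughly 0.01
```

Using `Fraction` rather than floating point keeps the result exact, so it can be compared directly with the hand computation \(95/(95+9999)\).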


Continuing the above example, we might want to try a second test to gain more information about whether we have the disease. However, it might be the case that the test fails deterministically: that getting a positive first test guarantees that you get a positive second test. In that case, the second test tells you nothing new.

What you would like is to perform an "independent" test: you would like knowing the results of the first test to tell you nothing about the results of the second test. This suggests the following definition:

Definition 1: We say that events \(A\) and \(B\) are independent if \(Pr(A|B) = Pr(A)\).

Informally, if I tell you that \(B\) happens, it doesn't change your estimate of how likely \(A\) is.

There is another definition that has some advantages over definition 1:

Definition 2: We say that events \(A\) and \(B\) are independent if \(Pr(A \cap B) = Pr(A)Pr(B)\).

I've given two definitions, so we'd better check that the meaning is unambiguous:

Claim: \(A\) and \(B\) are independent according to definition 1 if and only if \(A\) and \(B\) are independent according to definition 2.

Proof: We must prove that events satisfying definition 1 also satisfy definition 2, and vice-versa.

Suppose that \(A\) and \(B\) satisfy definition 1; that is, assume \(Pr(A|B) = Pr(A)\). Then, by the definition of conditional probability, \(Pr(A \cap B)/Pr(B) = Pr(A)\). Multiplying both sides by \(Pr(B)\) yields \(Pr(A \cap B) = Pr(A)Pr(B)\), as desired.

Conversely, suppose \(A\) and \(B\) satisfy definition 2, so that \(Pr(A \cap B) = Pr(A)Pr(B)\). Then [details left as an exercise] and therefore \(Pr(A|B) = Pr(A)\).

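Note: the definitions are actually slightly different, because \(Pr(A|B)\) is not defined when \(Pr(B) = 0\). Definition 2 still makes sense in that case, and it holds trivially, since both sides are 0.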

There are advantages to both definitions. The conditional probability definition arguably makes the intuition clearer: knowing that \(B\) happened doesn't tell you anything about whether \(A\) happened. The product definition makes it very clear that independence is symmetric (\(A\) and \(B\) are independent if and only if \(B\) and \(A\) are independent). It is also sometimes more useful in computations or proofs.
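
Both definitions can be tried out on a small finite sample space. The sketch below assumes two fair six-sided dice with all 36 ordered outcomes equally likely; the dice example, the event names, and the helper functions (`pr`, `independent_def1`, `independent_def2`) are my own illustrations and do not come from the lecture.

```python
from fractions import Fraction
from itertools import product

# Assumed sample space: ordered rolls of two fair dice, all equally likely.
OMEGA = set(product(range(1, 7), repeat=2))


def pr(event):
    """Probability of an event (a set of outcomes) under equal likelihood."""
    return Fraction(len(event & OMEGA), len(OMEGA))


def independent_def1(a, b):
    """Definition 1: Pr(A|B) = Pr(A). Returns None if Pr(B) = 0 (undefined)."""
    if pr(b) == 0:
        return None
    return pr(a & b) / pr(b) == pr(a)


def independent_def2(a, b):
    """Definition 2: Pr(A and B) = Pr(A)Pr(B)."""
    return pr(a & b) == pr(a) * pr(b)


first_even = {w for w in OMEGA if w[0] % 2 == 0}  # first die is even
sum_is_7 = {w for w in OMEGA if sum(w) == 7}
sum_is_8 = {w for w in OMEGA if sum(w) == 8}

print(independent_def1(first_even, sum_is_7), independent_def2(first_even, sum_is_7))
print(independent_def1(first_even, sum_is_8), independent_def2(first_even, sum_is_8))
print(independent_def1(first_even, set()), independent_def2(first_even, set()))
```

In this sketch, "first die is even" is independent of "the sum is 7" but not of "the sum is 8", and the two definitions agree in both cases. The last line uses the empty event for \(B\), where definition 1 is undefined while definition 2 still holds.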

IMPORTANT: Don't assume things are independent unless you have a good reason to!