Lecture 17: Chebychev's inequality and the weak law of large numbers

Chebychev's inequality

Claim (Chebychev's inequality): For any random variable \(X\) and any \(a \gt 0\), \[Pr(|X - E(X)| \geq a) \leq \frac{Var(X)}{a^2}\]

Proof: Note that \(|X - E(X)| \geq a\) if and only if \((X - E(X))^2 \geq a^2\). Therefore \(Pr(|X - E(X)| \geq a) = Pr((X - E(X))^2 \geq a^2)\). Applying Markov's inequality to the variable \((X - E(X))^2\) gives

\[ \begin{aligned} Pr(|X - E(X)| \geq a) &= Pr((X - E(X))^2 \geq a^2) \\ &\leq \frac{E((X - E(X))^2)}{a^2} \\ &= \frac{Var(X)}{a^2} \end{aligned} \]

by the definition of variance.
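As a sanity check on the claim, the following sketch computes both sides of Chebychev's inequality exactly for a small discrete distribution. The fair six-sided die and the helper name `chebyshev_check` are illustrative assumptions, not part of the lecture.

```python
from fractions import Fraction

def chebyshev_check(pmf, a):
    """Return (Pr(|X - E(X)| >= a), Var(X) / a^2) for a finite distribution.

    pmf maps each value of X to its probability (hypothetical helper).
    """
    mean = sum(x * p for x, p in pmf.items())
    var = sum((x - mean) ** 2 * p for x, p in pmf.items())
    # Exact probability of the event |X - E(X)| >= a.
    tail = sum(p for x, p in pmf.items() if abs(x - mean) >= a)
    return tail, var / a ** 2

# Assumed example: a fair six-sided die, E(X) = 7/2 and Var(X) = 35/12.
die = {face: Fraction(1, 6) for face in range(1, 7)}
tail, bound = chebyshev_check(die, a=2)
print(tail, bound)  # prints "1/3 35/48"
```

Here the true probability \(1/3\) is well below the bound \(35/48\); Chebychev's inequality is a guarantee, not an estimate, so it is often far from tight.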

Example: Last time we used Markov's inequality and the fact that the average height is 5.5 feet to show that if a door is 55 feet high, then we are guaranteed that at least 90% of people can fit through it.

If we also know that the standard deviation of height is \(σ = 0.2\) feet, we can use Chebychev's inequality to build a smaller door. Let \(X\) be the height random variable. \(Var(X) = σ^2 = 0.04\).

If \(x - E(X) \geq a\) then \(|x - E(X)| \geq a\). Therefore, the event \((X - E(X) \geq a)\) is a subset of the event \((|X - E(X)| \geq a)\), and thus \(Pr(X - E(X) \geq a) \leq Pr(|X - E(X)| \geq a)\). This lets us apply Chebychev's inequality to conclude \(Pr(X - E(X) \geq a) \leq \frac{Var(X)}{a^2}\).

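We want the bound \(\frac{Var(X)}{a^2} = \frac{0.04}{a^2}\) to be at most \(0.10\). Solving for \(a\), we see that if \(a \geq \sqrt{0.4} \approx 0.632\), then \(Pr(X - E(X) \geq a) \leq 0.10\). This in turn gives us \(Pr(X \lt a + E(X)) = Pr(X - E(X) \lt a) \geq 0.9\). Thus, if the door is at least \(6.14\) feet tall (rounding up), then at least 90% of the people can fit through.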
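The door calculation can be reproduced numerically. The sketch below is only an illustration: the function names are hypothetical, and it simply solves each bound for the smallest height that meets a target failure probability.

```python
import math

def markov_door_height(mean, max_fail_prob):
    """Smallest h with mean / h <= max_fail_prob (Markov's bound on Pr(X >= h))."""
    return mean / max_fail_prob

def chebyshev_door_height(mean, std, max_fail_prob):
    """Smallest h = mean + a with std^2 / a^2 <= max_fail_prob (Chebychev's bound)."""
    a = std / math.sqrt(max_fail_prob)
    return mean + a

# Values from the example: mean 5.5 feet, standard deviation 0.2 feet, 10% failure.
markov_h = markov_door_height(5.5, 0.10)
chebyshev_h = chebyshev_door_height(5.5, 0.2, 0.10)
print(f"Markov: {markov_h:.2f} feet, Chebychev: {chebyshev_h:.3f} feet")
```

Knowing the variance shrinks the guaranteed door height from tens of feet to a little over six feet.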

Weak law of large numbers

Suppose we wish to estimate the average value of the height of a population by sampling \(n\) people from the population and averaging their height. The weak law of large numbers says that this will give us a good estimate of the "real" average.

Formally, we can model this experiment by letting our outcomes be sequences of \(n\) people. We can define several random variables: \(X_1\) is the height of the first person sampled, \(X_2\) is the height of the second person sampled, \(X_3\) is the height of the third, and so forth.

Since these are all measures of height, \(E(X_1) = E(X_2) = \cdots = E(X_n)\) (let's call this value \(\mu\)) and \(Var(X_1) = \cdots = Var(X_n)\) (let's call this value \(\sigma^2\)). The result of our sampled average is given by the random variable \((X_1 + X_2 + \cdots + X_n)/n\). The weak law of large numbers says that this variable is likely to be close to the real expected value:

Claim (weak law of large numbers): If \(X_1, X_2, \dots, X_n\) are independent random variables with the same expected value \(\mu\) and the same variance \(σ^2\), then for any \(a \gt 0\), \[Pr\left(\left|\frac{X_1 + X_2 + \cdots + X_n}{n} - μ\right| \geq a\right) \leq \frac{σ^2}{na^2}\]

Proof: By Chebychev's inequality, we have \[Pr\left(\left|\sum X_i/n - E(\sum X_i/n)\right| \geq a\right) \leq \frac{Var(\sum X_i/n)}{a^2}\]

Now, by linearity of the expectation, we have \[E(\sum X_i/n) = \sum E(X_i)/n = nμ/n = μ\]

As was shown in homework 5, \(Var(cX) = c^2Var(X)\), and we also know that if \(X\) and \(Y\) are independent, then \(Var(X + Y) = Var(X) + Var(Y)\). Since the \(X_i\) are independent, we therefore have \[Var(\sum X_i/n) = \sum Var(X_i)/n^2 = nσ^2/n^2 = σ^2/n\]
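The identity \(Var(\sum X_i/n) = σ^2/n\) can be checked exactly on a small case. The sketch below assumes, purely for illustration, that each \(X_i\) is an independent fair die roll, and enumerates every outcome; `variance_of_sample_mean` is a hypothetical helper name.

```python
from fractions import Fraction
from itertools import product

def variance_of_sample_mean(values, n):
    """Exact variance of the average of n independent draws, each uniform on values."""
    outcomes = list(product(values, repeat=n))  # all equally likely sequences
    p = Fraction(1, len(outcomes))
    means = [Fraction(sum(outcome), n) for outcome in outcomes]
    expected = sum(m * p for m in means)
    return sum((m - expected) ** 2 * p for m in means)

faces = range(1, 7)
single = variance_of_sample_mean(faces, 1)  # variance of one die roll
for n in (1, 2, 3):
    # The variance of the average should be exactly single / n.
    print(n, variance_of_sample_mean(faces, n), single / n)
```

Each printed line shows the same fraction twice, matching the formula for this assumed distribution.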

Plugging these into the result from Chebychev's inequality, we have \[Pr\left(\left|\sum X_i/n - μ\right| \geq a\right) \leq \frac{σ^2}{na^2}\]

which is what we were trying to show.
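To see the weak law in action, here is a small simulation sketch. It assumes, for illustration only, that each \(X_i\) is uniform on \([0, 1)\), so \(μ = 1/2\) and \(σ^2 = 1/12\); the function names and parameter choices are hypothetical.

```python
import random

def wlln_bound(variance, n, a):
    """Right-hand side of the weak law: variance / (n * a^2)."""
    return variance / (n * a ** 2)

def deviation_frequency(n, a, trials, rng):
    """Fraction of trials in which the average of n uniform draws is at least a from 1/2."""
    mu = 0.5
    hits = 0
    for _ in range(trials):
        sample_mean = sum(rng.random() for _ in range(n)) / n
        if abs(sample_mean - mu) >= a:
            hits += 1
    return hits / trials

rng = random.Random(0)  # fixed seed so the run is repeatable
n, a = 100, 0.1
bound = wlln_bound(1 / 12, n, a)
freq = deviation_frequency(n, a, trials=2000, rng=rng)
print(f"bound = {bound:.4f}, observed frequency = {freq:.4f}")
```

The observed frequency should fall below the bound, typically by a wide margin, since the weak law only gives an upper bound. Increasing `n` shrinks the bound in proportion to \(1/n\).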