Lecture 15: properties of random variables

Combining random variables

Given random variables \(X\) and \(Y\) on a sample space \(S\), we can apply any of the usual operations on real numbers to \(X\) and \(Y\) by performing them pointwise on the outputs of \(X\) and \(Y\). For example, we can define \(X + Y : S → \mathbb{R}\) by \((X+Y)(k) ::= X(k) + Y(k)\). Similarly, we can define \(X^2 : S → \mathbb{R}\) by \((X^2)(k) ::= \left(X(k)\right)^2\).

We can also consider a real number \(c\) as a random variable by defining \(C : S → \mathbb{R}\) by \(C(k) ::= c\). We will use the same variable for both the constant random variable and for the number itself; it should be clear from context which we are referring to.
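
For a small illustrative example (with the sample space and values chosen just for this note), suppose \(S = \{a, b\}\) with \(X(a) = 1\), \(X(b) = 3\), \(Y(a) = 2\), and \(Y(b) = 5\). Then
\[(X+Y)(a) = 1 + 2 = 3, \qquad (X+Y)(b) = 3 + 5 = 8, \qquad (X^2)(a) = 1, \qquad (X^2)(b) = 9,\]
and the constant random variable \(c = 7\) takes the value 7 on both outcomes.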

Linearity of expectation

Claim: If \(X\) and \(Y\) are random variables, then \(E(X+Y) = E(X) + E(Y)\).

Proof: \[ \begin{aligned} E(X+Y) &= \sum_{k \in S} (X+Y)(k)Pr(\{k\}) && \text{by definition of $E$} \\ &= \sum_{k \in S} (X(k) + Y(k))Pr(\{k\}) && \text{by definition of $X+Y$} \\ &= \left(\sum_{k \in S} X(k)Pr(\{k\})\right) + \left(\sum_{k \in S} Y(k)Pr(\{k\})\right) && \text{algebra} \\ &= E(X) + E(Y) && \text{definition of $E$} \end{aligned} \]
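
As one concrete check (a small example constructed for illustration): flip two fair coins, so \(S = \{HH, HT, TH, TT\}\) with each outcome having probability \(1/4\); let \(X\) be 1 if the first flip is heads and 0 otherwise, and let \(Y\) be 1 if the second flip is heads and 0 otherwise. Then \(E(X) = E(Y) = 1/2\), and
\[E(X+Y) = 2 \cdot \frac{1}{4} + 1 \cdot \frac{1}{4} + 1 \cdot \frac{1}{4} + 0 \cdot \frac{1}{4} = 1 = E(X) + E(Y).\]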

Claim: If \(c \in \mathbb{R}\) and \(X : S → \mathbb{R}\) then \(E(cX) = cE(X)\).

Note: we are using the number \(c\) as a random variable as discussed above.

Proof: \[E(cX) = \sum_{k \in S} (cX)(k)Pr(\{k\}) = \sum_{k \in S} c(X(k))Pr(\{k\}) = c\sum_{k \in S} X(k)Pr(\{k\}) = cE(X)\]

These two properties (\(E(X + Y) = E(X) + E(Y)\) and \(E(cX) = cE(X)\)) are summarized by saying "expectation is linear".
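
Combining the two properties, for any constants \(a, b \in \mathbb{R}\) we get \(E(aX + bY) = aE(X) + bE(Y)\). For instance, if \(E(X) = 2\) and \(E(Y) = 7\) (values picked just for illustration), then
\[E(3X + Y) = 3E(X) + E(Y) = 3 \cdot 2 + 7 = 13.\]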

Joint PMF of two random variables

We can summarize the probability distribution of two random variables \(X\) and \(Y\) using a "joint PMF". The joint PMF of \(X\) and \(Y\) is a function \(\mathbb{R} \times \mathbb{R} → \mathbb{R}\) that gives, for any \(x\) and \(y\), the probability that \(X = x\) and \(Y = y\). It is often useful to draw a table:

\[\begin{array}{c|cc} Pr & y = 1 & y = 10 \\ \hline x = 1 & 1/3 & 1/6 \\ x = 10 & 1/6 & 1/3 \end{array}\]

Note that the sum of the entries in the table must be one (Exercise: prove this). You can also check that summing across each row gives the PMF of \(X\), while summing down each column gives the PMF of \(Y\).
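
For example, reading off the table above,
\[Pr(X = 1) = \frac{1}{3} + \frac{1}{6} = \frac{1}{2}, \qquad Pr(Y = 10) = \frac{1}{6} + \frac{1}{3} = \frac{1}{2}.\]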

Independence of random variables

Recall that events \(A\) and \(B\) are independent if \(Pr(A\cap B) = Pr(A)Pr(B)\). We say that random variables \(X\) and \(Y\) are independent if for all \(x, y \in \mathbb{R}\), the events \((X = x)\) and \((Y = y)\) are independent.

Example: Variables \(X\) and \(Y\) with the joint PMF given in the above table are not independent. For example, \(Pr(X = 1 \cap Y = 1) = 1/3 \neq Pr(X = 1)Pr(Y = 1) = (1/2) \cdot (1/2)\).

Informally, you can think of independence as indicating that knowing the value of one of the variables does not give any information about the value of the other. For example, height and weight are not independent, because knowing that someone is taller increases the likelihood that they are heavier.

Example: variables \(X\) and \(Y\) with the following joint PMF are independent:
\[\begin{array}{c|cc} Pr & y = 1 & y = 10 \\ \hline x = 1 & 1/4 & 1/4 \\ x = 10 & 1/4 & 1/4 \end{array}\]

Given any \(x, y \in \{1, 10\}\), \(Pr(X = x \cap Y = y) = 1/4\). Moreover, \(Pr(X = x) = 1/4 + 1/4 = 1/2\) and \(Pr(Y = y) = 1/4 + 1/4 = 1/2\). Therefore \(Pr(X = x)Pr(Y = y) = 1/4 = Pr(X = x \cap Y = y)\). (For any other values of \(x\) or \(y\), both sides of the independence condition are 0.)

Expectation of product

Unlike with sums, the expectation of a product \(XY\) is not in general equal to \(E(X)E(Y)\). However, if \(X\) and \(Y\) are independent, the two are equal.
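
For a concrete counterexample, take \(X\) and \(Y\) with the first joint PMF above (the dependent one). There \(E(X) = E(Y) = 1 \cdot \frac{1}{2} + 10 \cdot \frac{1}{2} = 5.5\), so \(E(X)E(Y) = 30.25\), but
\[E(XY) = 1 \cdot \frac{1}{3} + 10 \cdot \frac{1}{6} + 10 \cdot \frac{1}{6} + 100 \cdot \frac{1}{3} = 37.\]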

Claim: If \(X\) and \(Y\) are independent random variables, then \(E(XY) = E(X)E(Y)\).

Proof: We will use the alternative definition of expectation given in the last lecture because it makes the proof a bit easier. In lecture, I started by simplifying the left-hand side to show it is equal to the RHS; here I simplify the right-hand side.

\[ \begin{aligned} E(X)E(Y) &= \left(\sum_{x \in ℝ} x Pr(X = x)\right)\left(\sum_{y \in ℝ} y Pr(Y = y)\right) && \text{by alternative definition of $E$} \\ &= \sum_{x \in ℝ} \sum_{y \in ℝ} xy Pr(X = x) Pr(Y = y) && \text{algebra} \\ &= \sum_{x \in ℝ} \sum_{y \in ℝ} xy Pr(X = x \cap Y = y) && \text{independence of $X$ and $Y$} \\ \end{aligned} \]

We can group together the terms of this sum that have the same product \(xy\). For example, suppose the joint PMF of \(X\) and \(Y\) were the following:

\[\begin{array}{c|cccc} Pr(X = x \cap Y = y) & y = 1 & y = 2 & y = 3 & y = 4 \\ \hline x = 1 & \cdots & \cdots & \cdots & 2/5 \\ x = 2 & \cdots & 1/8 & \cdots & \cdots \\ x = 3 & \cdots & \cdots & \cdots & \cdots \\ x = 4 & 1/4 & \cdots & \cdots & \cdots \end{array}\]

There would be three terms in the sum with \(xy = 4\), namely the \(x = 1, y = 4\) term, the \(x = 2, y = 2\) term, and the \(x = 4, y = 1\) term:

\[ \begin{aligned} \sum_{x \in ℝ} \sum_{y \in ℝ} xy Pr(X = x \cap Y = y) = &\cdots + 1 \cdot 4 \cdot Pr(X = 1 \cap Y = 4) + {} \\ &\cdots + 2 \cdot 2 \cdot Pr(X = 2 \cap Y = 2) + {} \\ &\cdots + 4 \cdot 1 \cdot Pr(X = 4 \cap Y = 1) + \cdots \\ = &\cdots + 1 \cdot 4 \cdot (2/5) + 2 \cdot 2 \cdot (1/8) + 4 \cdot 1 \cdot (1/4) + \cdots \\ = &\cdots + 4 \cdot \left(\frac{2}{5} + \frac{1}{8} + \frac{1}{4}\right) + \cdots \\ = &\cdots + 4 \cdot Pr(XY = 4) \end{aligned} \]

We can combine these into a single term: \(4 \cdot Pr(XY = 4)\). By grouping all of the terms with the same product, we convert the above expression to

\[ \begin{aligned} E(X)E(Y) &= \sum_{x \in ℝ} \sum_{y \in ℝ} xy Pr(X = x \cap Y = y) && \text{from above} \\ &= \sum_{a \in ℝ} a Pr(XY = a) && \text{combining terms as just described} \\ &= E(XY) && \text{alternative definition of $E$} \end{aligned} \]

which is what we were trying to prove.
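
As a quick numerical check using the independent \(X\) and \(Y\) from the second table above: \(E(X) = E(Y) = 5.5\), and
\[E(XY) = 1 \cdot \frac{1}{4} + 10 \cdot \frac{1}{4} + 10 \cdot \frac{1}{4} + 100 \cdot \frac{1}{4} = \frac{121}{4} = 30.25 = E(X)E(Y).\]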

Motivation and bad definition of variance

Another useful property of a random variable is its variance. The variance of \(X\) is a measure of how "spread out" the distribution is. If you select an outcome \(k\) at random and compute the distance from \(X(k)\) to \(E(X)\), you are likely to get a large number if the distribution is very spread out, and a small number if it is not.

This suggests the following definition (which is wrong):

\[Var(X) \stackrel{?}{=} E(X - E(X))\]
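
One way to see that something is off, using linearity and the fact that \(E(c) = c\) when \(c\) is a constant: \(E(X)\) is itself a constant, so
\[E(X - E(X)) = E(X) - E(E(X)) = E(X) - E(X) = 0\]
no matter how spread out \(X\) is.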

Next lecture, we will repair this definition.