Lecture 8: Random variables

Reading: Cameron 3.1–3.2, 3.4, MCS 19.1
Last semester's notes
definitions: random variable, PMF, joint PMF, sum/product/etc of RVs, indicator variable, expectation

Random variables

Definition: A (real-valued) random variable $X$ on a sample space $S$ is just a function $X : S → \mathbb{R}$.

Example: Suppose I roll a fair 6-sided die. On an even roll, I win $10. On an odd roll, I lose however much money is shown. We can model the experiment (rolling a die) using the sample space $S = \{1,2,3,4,5,6\}$ and an equiprobable measure. The result of the experiment is given by the random variable $X : S → \mathbb{R}$ given by $X(1) ::= -1$, $X(2) ::= 10$, $X(3) ::= -3$, $X(4) ::= 10$, $X(5) ::= -5$, and $X(6) ::= 10$.
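The die example can be written out directly as a function on the sample space. This is only a minimal sketch: the names `S`, `Pr`, and `X` mirror the notation above, while the choice of a Python list, dictionary, and exact `Fraction` arithmetic is an assumption made for illustration.

```python
from fractions import Fraction

# Sample space for one roll of a fair six-sided die.
S = [1, 2, 3, 4, 5, 6]

# Equiprobable measure: every outcome has probability 1/6.
Pr = {k: Fraction(1, len(S)) for k in S}

def X(k):
    """Winnings in dollars: +10 on an even roll, minus the face value on an odd roll."""
    return 10 if k % 2 == 0 else -k

# Tabulate X over the whole sample space.
values = {k: X(k) for k in S}
```

Here `values` lists the same six assignments as the example, one per outcome.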

Definition: Given a random variable $X$ and a real number $x$, the poorly-named event $(X = x)$ is defined by $(X = x) ::= \{k \in S \mid X(k) = x\}$.

This definition is useful because it allows us to ask "what is the probability that $X = x$?"

Definition: The probability mass function (PMF) of $X$ is the function $PMF_X : \mathbb{R} → \mathbb{R}$ given by $PMF_X(x) = Pr(X = x)$.
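To make the event $(X = x)$ and the PMF concrete, here is a small sketch for the die example. The helper names `event` and `pmf` are hypothetical, and $X$ is stored as a dictionary from outcomes to values purely for convenience.

```python
from fractions import Fraction

S = [1, 2, 3, 4, 5, 6]
Pr = {k: Fraction(1, 6) for k in S}               # equiprobable measure
X = {1: -1, 2: 10, 3: -3, 4: 10, 5: -5, 6: 10}    # the die example

def event(X, x):
    """The event (X = x): the set of outcomes k with X(k) = x."""
    return {k for k in X if X[k] == x}

def pmf(X, Pr, x):
    """PMF_X(x) = Pr(X = x), the total probability of the event (X = x)."""
    return sum((Pr[k] for k in event(X, x)), Fraction(0))
```

For instance, `event(X, 10)` is the set of even rolls, so `pmf(X, Pr, 10)` is 1/2, while `pmf(X, Pr, 7)` is 0 because no outcome maps to 7.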

Combining random variables

Given random variables $X$ and $Y$ on a sample space $S$, we can apply any of the usual operations on real numbers to $X$ and $Y$ by performing them pointwise on the outputs of $X$ and $Y$. For example, we can define $X + Y : S → \mathbb{R}$ by $(X+Y)(k) ::= X(k) + Y(k)$. Similarly, we can define $X^2 : S → \mathbb{R}$ by $(X^2)(k) ::= \left(X(k)\right)^2$.

We can also consider a real number $c$ as a random variable by defining $C : S → \mathbb{R}$ by $C(k) ::= c$. We will use the same variable for both the constant random variable and for the number itself; it should be clear from context which we are referring to.
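Since random variables are just functions, pointwise operations can be sketched as functions that build new functions. In the sketch below, `X` is the die example from earlier, `Y` is a hypothetical second variable (the face value of the roll), and the helpers `add`, `square`, and `constant` are illustrative names, not standard ones.

```python
S = [1, 2, 3, 4, 5, 6]

def X(k):
    # The die example: +10 on an even roll, minus the face value on an odd roll.
    return 10 if k % 2 == 0 else -k

def Y(k):
    # Hypothetical second random variable: the face value itself.
    return k

def add(X, Y):
    """(X + Y)(k) = X(k) + Y(k)."""
    return lambda k: X(k) + Y(k)

def square(X):
    """(X^2)(k) = (X(k))^2."""
    return lambda k: X(k) ** 2

def constant(c):
    """The constant random variable C(k) = c."""
    return lambda k: c

X_plus_Y = add(X, Y)
X_squared = square(X)
X_plus_3 = add(X, constant(3))   # the number 3 treated as a random variable
```

Each result is again a function on $S$, so it can be evaluated at any outcome, for example `X_plus_Y(2)` or `X_squared(5)`.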

Indicator variables

We often want to count how many times something happens in an experiment.

Example: Suppose I flip a coin 100 times. The sample space would consist of sequences of 100 flips, and I might define the variable $N$ to be the number of heads. For example, $N(H,H,H,H,\dots,H) = 100$, while $N(H,T,H,T,\dots) = 50$.

A useful tool for counting is an indicator variable:

Definition: The indicator variable for an event $A$ is a variable having value 1 if $A$ happens, and 0 otherwise.

The number of times something happens can be written as a sum of indicator variables.

In the coin example, we could define an indicator variable $I_1$ which is 1 if the first coin is a head, and 0 otherwise (e.g. $I_1(H,H,H,\dots) = I_1(H,T,H,T,\dots) = 1$). We could define a variable $I_2$ that only looks at the second toss, and so on. Then $N$ as defined above can be written as $N = \sum_{i=1}^{100} I_i$. This is useful because (as we'll see when we talk about expectation) it is often easier to reason about a sum of simple variables (like $I_i$) than it is to reason about a complex variable like $N$.
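The decomposition of $N$ into indicators can be checked on individual outcomes. The sketch below represents an outcome as a tuple of `"H"`/`"T"` strings (an assumed encoding) and compares a direct count of heads with the sum of the indicators; `indicator` and `sum_of_indicators` are hypothetical helper names.

```python
n = 100  # number of flips, as in the example

def N(outcome):
    """Number of heads in a sequence of flips."""
    return outcome.count("H")

def indicator(i):
    """I_i: 1 if the i-th flip (1-indexed) is a head, 0 otherwise."""
    return lambda outcome: 1 if outcome[i - 1] == "H" else 0

I = [indicator(i) for i in range(1, n + 1)]

def sum_of_indicators(outcome):
    """Evaluate I_1 + I_2 + ... + I_n pointwise on one outcome."""
    return sum(I_i(outcome) for I_i in I)

all_heads = ("H",) * n
alternating = ("H", "T") * (n // 2)
```

On both sample outcomes, `N` and `sum_of_indicators` agree. Checking every outcome this way is not feasible, since there are $2^{100}$ of them, which is part of why reasoning about the $I_i$ symbolically is attractive.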

Joint PMF of two random variables

We can summarize the probability distribution of two random variables $X$ and $Y$ using a "joint PMF". The joint PMF of $X$ and $Y$ is a function $\mathbb{R} \times \mathbb{R} → \mathbb{R}$ that gives, for any $x$ and $y$, the probability that $X = x$ and $Y = y$. It is often useful to draw a table:

$Pr(X = x, Y = y)$	$y = 1$	$y = 10$
$x = 1$	1/3	1/6
$x = 10$	1/6	1/3

Note that the sum of the entries in the table must be one (Exercise: prove this). You can also check that adding the rows together (that is, summing down each column) gives the PMF of $Y$, while adding the columns together (summing across each row) gives the PMF of $X$.
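These checks are easy to carry out mechanically. The sketch below stores the joint PMF from the table as a dictionary keyed by $(x, y)$ pairs and recovers the PMFs of $X$ and $Y$ from it; the function names `marginal_X` and `marginal_Y` are illustrative.

```python
from fractions import Fraction

# Joint PMF from the table, keyed by (x, y).
joint = {
    (1, 1): Fraction(1, 3),  (1, 10): Fraction(1, 6),
    (10, 1): Fraction(1, 6), (10, 10): Fraction(1, 3),
}

def marginal_X(joint):
    """PMF of X: for each x, add up the joint probabilities over all y."""
    pmf = {}
    for (x, y), p in joint.items():
        pmf[x] = pmf.get(x, Fraction(0)) + p
    return pmf

def marginal_Y(joint):
    """PMF of Y: for each y, add up the joint probabilities over all x."""
    pmf = {}
    for (x, y), p in joint.items():
        pmf[y] = pmf.get(y, Fraction(0)) + p
    return pmf

total = sum(joint.values())
```

In this table both $X$ and $Y$ take the values 1 and 10 with probability 1/2 each, and `total` is 1.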

Expectation

The "expected value" is an estimate of the "likely outcome" of a random variable. It is the weighted average of all of the possible values of the RV, weighted by the probability of seeing those outcomes. Formally:

Definition: The expected value of $X$, written $E(X)$, is given by \[E(X) ::= \sum_{k \in S} X(k)Pr(\{k\})\]

Claim: (alternate definition of $E(X)$) \[E(X) = \sum_{x \in \mathbb{R}} x\cdot Pr(X=x)\]

Proof sketch: this is just grouping together the terms in the original definition for the outcomes with the same $X$ value.
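The grouping argument can be seen numerically on the die example from earlier in these notes. The sketch below computes $E(X)$ once by summing over outcomes and once by summing over values; the variable names `E_outcomes` and `E_values` are hypothetical.

```python
from collections import defaultdict
from fractions import Fraction

S = [1, 2, 3, 4, 5, 6]
Pr = {k: Fraction(1, 6) for k in S}               # equiprobable measure
X = {1: -1, 2: 10, 3: -3, 4: 10, 5: -5, 6: 10}    # the die example

# Original definition: sum over outcomes k in S.
E_outcomes = sum(X[k] * Pr[k] for k in S)

# Alternate definition: group outcomes with the same X value first,
# which builds Pr(X = x) for each value x that actually occurs.
pmf = defaultdict(Fraction)   # missing keys default to Fraction(0)
for k in S:
    pmf[X[k]] += Pr[k]
E_values = sum(x * p for x, p in pmf.items())
```

Both sums come to 7/2: the three even rolls contribute $10 \cdot \frac{1}{2}$ as a single grouped term in the second sum, where the first sum has three separate terms $10 \cdot \frac{1}{6}$.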

Note: You may be concerned about "$\sum_{x \in \mathbb{R}}$". In discrete examples, $Pr(X = x) = 0$ for all but finitely or countably many values of $x$, so this sum reduces to a finite or at least countable sum. In non-discrete examples, this summation can be replaced by an integral. Measure theory is a branch of mathematics that puts this distinction on firmer theoretical footing by replacing both the summation and the integral with the so-called "Lebesgue integral". In this course, we will simply use "$\sum$" with the understanding that it becomes an integral when the random variable is continuous.