- Reading: Cameron 3.1–3.2, 3.4, MCS 19.1
- Definitions: random variable, PMF, joint PMF, sum/product/etc. of RVs, indicator variable, expectation

**Definition:** A (real-valued) **random variable** \(X\) is just a function \(X : S → ℝ\).

**Example:** Suppose I roll a fair 6-sided die. On an even roll, I win $10. On an odd roll, I lose however much money is shown. We can model the experiment (rolling a die) using the sample space \(S = \{1,2,3,4,5,6\}\) and an equiprobable measure. The result of the experiment is given by the random variable \(X : S → \mathbb{R}\) given by \(X(1) ::= -1\), \(X(2) ::= 10\), \(X(3) ::= -3\), \(X(4) ::= 10\), \(X(5) ::= -5\), and \(X(6) ::= 10\).

**Definition:** Given a random variable \(X\) and a real number \(x\), the poorly-named event \((X = x)\) is defined by \((X = x) ::= \{k \in S \mid X(k) = x\}\).

This definition is useful because it allows us to ask "what is the probability that \(X = x\)?"

**Definition:** The **probability mass function (PMF)** of \(X\) is the function \(PMF_X : \mathbb{R} → \mathbb{R}\) given by \(PMF_X(x) = Pr(X = x)\).
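To make the definitions of the event \((X = x)\) and of \(PMF_X\) concrete, here is a minimal Python sketch of the die example. The helper names `event` and `pmf` are illustrative choices, not notation from the readings, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction

# Sample space and equiprobable measure for one roll of a fair die.
S = [1, 2, 3, 4, 5, 6]
prob = {k: Fraction(1, 6) for k in S}

def X(k):
    """The random variable from the example: win $10 on even, lose the face value on odd."""
    return 10 if k % 2 == 0 else -k

def event(rv, x):
    """The event (rv = x): the set of outcomes k in S with rv(k) == x."""
    return {k for k in S if rv(k) == x}

def pmf(rv, x):
    """PMF_rv(x) = Pr(rv = x): the total probability of the event (rv = x)."""
    return sum((prob[k] for k in event(rv, x)), Fraction(0))

print(sorted(event(X, 10)))  # [2, 4, 6]
print(pmf(X, 10))            # 1/2
print(pmf(X, -3))            # 1/6
print(pmf(X, 7))             # 0 (no outcome maps to 7)
```

Note that `pmf` is defined for every real input, but it is nonzero only on the four values that \(X\) actually takes.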

Given random variables \(X\) and \(Y\) on a sample space \(S\), we can apply any of the normal operations on real numbers to \(X\) and \(Y\) by performing them pointwise on the outputs of \(X\) and \(Y\). For example, we can define \(X + Y : S → \mathbb{R}\) by \((X+Y)(k) ::= X(k) + Y(k)\). Similarly, we can define \(X^2 : S → \mathbb{R}\) by \((X^2)(k) ::= \left(X(k)\right)^2\).

We can also consider a real number \(c\) as a random variable by defining \(C : S → \mathbb{R}\) by \(C(k) ::= c\). We will use the same variable for both the constant random variable and for the number itself; it should be clear from context which we are referring to.
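Since random variables are just functions on \(S\), the pointwise operations above can be sketched as functions that build new functions. In the sketch below, \(X\) is the die variable from the earlier example; the second variable `Y` (the face shown) and the helper names `add`, `square`, and `const` are assumptions made for illustration.

```python
S = [1, 2, 3, 4, 5, 6]

def X(k):
    """Die example: win $10 on even, lose the face value on odd."""
    return 10 if k % 2 == 0 else -k

def Y(k):
    """Hypothetical second random variable: the face shown."""
    return k

def add(A, B):
    """Pointwise sum: (A + B)(k) = A(k) + B(k)."""
    return lambda k: A(k) + B(k)

def square(A):
    """Pointwise square: (A^2)(k) = (A(k))^2."""
    return lambda k: A(k) ** 2

def const(c):
    """The constant random variable C(k) = c."""
    return lambda k: c

X_plus_Y = add(X, Y)
X_squared = square(X)
X_plus_3 = add(X, const(3))

print([X_plus_Y(k) for k in S])   # [0, 12, 0, 14, 0, 16]
print([X_squared(k) for k in S])  # [1, 100, 9, 100, 25, 100]
print([X_plus_3(k) for k in S])   # [2, 13, 0, 13, -2, 13]
```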

We often want to count how many times something happens in an experiment.

**Example:** Suppose I flip a coin 100 times. The sample space would consist of sequences of 100 flips, and I might define the variable \(N\) to be the number of heads. For example, \(N(H,H,H,H,\dots,H) = 100\), while \(N(H,T,H,T,\dots) = 50\).

A useful tool for counting is an **indicator variable**:

**Definition:** The indicator variable for an event \(A\) is a random variable that has value 1 if \(A\) happens, and 0 otherwise.

The number of times something happens can be written as a sum of indicator variables.

In the coin example, we could define an indicator variable \(I_1\) which is 1 if the first coin is a head, and 0 otherwise (e.g. \(I_1(H,H,H,\dots) = I_1(H,T,H,T,\dots) = 1\)). We could define a variable \(I_2\) that only looks at the second toss, and so on. Then \(N\) as defined above can be written as \(N = \sum_{i=1}^{100} I_i\). This is useful because (as we'll see when we talk about expectation) it is often easier to reason about a sum of simple variables (like \(I_i\)) than it is to reason about a complex variable like \(N\).
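The identity \(N = \sum I_i\) can be checked by brute force on a small version of the experiment. The sketch below assumes 4 flips rather than 100 so that the whole sample space can be enumerated; the function names `N` and `I` mirror the notation above but are otherwise illustrative.

```python
from itertools import product

n = 4  # assumption: 4 flips instead of 100, so S is small enough to enumerate
S = list(product("HT", repeat=n))  # all sequences of n flips, as tuples

def N(outcome):
    """Number of heads in the sequence."""
    return outcome.count("H")

def I(i):
    """Indicator variable for the event 'flip i is a head' (i counts from 1)."""
    return lambda outcome: 1 if outcome[i - 1] == "H" else 0

indicators = [I(i) for i in range(1, n + 1)]

# N agrees with the sum of the indicators on every outcome in S.
assert all(N(s) == sum(ind(s) for ind in indicators) for s in S)

print(len(S))                    # 16
print(N(("H", "T", "H", "T")))   # 2
print(I(1)(("H", "T", "H", "T")), I(2)(("H", "T", "H", "T")))  # 1 0
```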

We can summarize the probability distribution of two random variables \(X\) and \(Y\) using a "joint PMF". The **joint PMF** of \(X\) and \(Y\) is a function \(\mathbb{R} \times \mathbb{R} → \mathbb{R}\) that gives, for any \(x\) and \(y\), the probability that \(X = x\) and \(Y = y\). It is often useful to draw a table:

| \(Pr\) | \(y = 1\) | \(y = 10\) |
|---|---|---|
| \(x = 1\) | 1/3 | 1/6 |
| \(x = 10\) | 1/6 | 1/3 |

Note that the sum of the entries in the table must be one (**Exercise:** prove this). You can also check that adding the rows together (that is, totaling each column) gives the PMF of \(Y\), while adding the columns together (totaling each row) gives the PMF of \(X\).
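The checks on the table can be carried out mechanically. The following sketch stores the joint PMF from the table as a dictionary keyed by \((x, y)\) pairs and recovers each individual PMF by summing out the other variable; the helper name `marginal` is an illustrative choice.

```python
from fractions import Fraction

# Joint PMF from the table: keys are (x, y), values are Pr(X = x and Y = y).
joint = {
    (1, 1): Fraction(1, 3),  (1, 10): Fraction(1, 6),
    (10, 1): Fraction(1, 6), (10, 10): Fraction(1, 3),
}

def marginal(joint, axis):
    """Sum out the other variable: axis=0 gives the PMF of X, axis=1 the PMF of Y."""
    out = {}
    for pair, p in joint.items():
        out[pair[axis]] = out.get(pair[axis], Fraction(0)) + p
    return out

pmf_X = marginal(joint, 0)  # total of each row of the table
pmf_Y = marginal(joint, 1)  # total of each column of the table

print(sum(joint.values()))     # 1
print(pmf_X[1], pmf_X[10])     # 1/2 1/2
print(pmf_Y[1], pmf_Y[10])     # 1/2 1/2
```

In this particular table the two individual PMFs happen to coincide because the table is symmetric; in general they differ.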

The "expected value" is an estimate of the "likely outcome" of a random variable. It is the weighted average of all of the possible values of the RV, weighted by the probability of seeing those outcomes. Formally:

**Definition:** The **expected value** of \(X\), written \(E(X)\), is given by \[E(X) ::= \sum_{k \in S} X(k)Pr(\{k\})\]

**Claim:** (alternate definition of \(E(X)\)) \[E(X) = \sum_{x \in \mathbb{R}} x\cdot Pr(X=x)\]

**Proof sketch:** this is just grouping together the terms in the original definition for the outcomes with the same \(X\) value.
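As a sanity check on the claim, the sketch below computes \(E(X)\) for the die example both ways: once by summing over outcomes \(k \in S\), and once by first grouping outcomes with the same \(X\) value into a PMF and then summing over values. The variable names are illustrative.

```python
from fractions import Fraction

S = [1, 2, 3, 4, 5, 6]
prob = {k: Fraction(1, 6) for k in S}  # equiprobable measure

def X(k):
    """Die example: win $10 on even, lose the face value on odd."""
    return 10 if k % 2 == 0 else -k

# Definition: sum over outcomes k of X(k) * Pr({k}).
E_by_outcome = sum(X(k) * prob[k] for k in S)

# Alternate form: group outcomes by their X value to get the PMF,
# then sum x * Pr(X = x) over the values x that X actually takes.
pmf = {}
for k in S:
    pmf[X(k)] = pmf.get(X(k), Fraction(0)) + prob[k]
E_by_value = sum(x * p for x, p in pmf.items())

print(E_by_outcome)  # 7/2
print(E_by_value)    # 7/2
```

Both sums contain the same terms; the second simply collects the three outcomes with \(X = 10\) into the single term \(10 \cdot \frac{1}{2}\).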

**Note:** You may be concerned about "\(\sum_{x \in \mathbb{R}}\)". In discrete examples, \(Pr(X = x) = 0\) for all but finitely or countably many \(x\), so this sum reduces to a finite or at least countable sum. In non-discrete examples, this summation can be replaced by an integral. Measure theory is a branch of mathematics that puts this distinction on firmer theoretical footing by replacing both the summation and the integral with the so-called "Lebesgue integral". In this course, we will simply use "\(\sum\)" with the understanding that it becomes an integral when the random variable is continuous.