3.11 Conditional Expectations

SW 2.3

It is useful to know about joint pmfs/pdfs/cdfs, but they are often hard to work with in practice. For example, if you have two random variables, visualizing their joint distribution would involve interpreting a 3D plot which is often challenging in practice. If you had more than two random variables, then fully visualizing their joint distribution would not be possible. Therefore, we will typically look at summaries of the joint distribution. Probably the most useful one is the conditional expectation that we study in this section; in fact, we will spend much of the semester trying to estimate conditional expectations.

For two random variables, \(Y\) and \(X\), the conditional expectation of \(Y\) given \(X=x\) is the mean value of \(Y\) conditional on \(X\) taking the particular value \(x\). In math, this is written

\[ \mathbb{E}[Y|X=x] \]

One useful way to think of a conditional expectation is as a function of \(x\). For example, suppose that \(Y\) is a person’s yearly income and \(X\) is a person’s years of education. Clearly, mean income can change for different values of education.

Conditional expectations will be a main focus of ours throughout the semester

An extremely useful property of conditional expectations is that they generalize from the case with two variables to the case with multiple variables. For example, suppose that we have four random variables \(Y\), \(X_1\), \(X_2\), and \(X_3\). It makes sense to think about \[ \mathbb{E}[Y|X_1=x_1, X_2=x_2, X_3=x_3] \] which is the expected value of \(Y\) conditional on \(X_1\) taking the particular value \(x_1\), \(X_2\) taking the particular value \(x_2\), and \(X_3\) taking the particular value \(x_3\). Just like before, you can think of this as a function, in the sense that changing any of \(x_1\), \(x_2\), and/or \(x_3\) gives a new conditional mean.