8.1 Potential Outcomes

SW 13.1

A powerful tool for thinking about causal effects is counterfactual reasoning. For individuals that participate in the treatment, we observe what their outcome is given that they participated in the treatment. But we don’t observe what their outcome would have been if they had not participated in the treatment. For example, among those that went to college, we don’t observe what their earnings would have been if they had not gone to college. For those that don’t participate in the treatment, we face the opposite problem — we don’t observe what their outcome would have been if they had participated in the treatment.

We’ll relate causal inference to the problem of figuring out what outcomes individuals that participated in the treatment would have experienced if they had not participated (at least on average), and/or what outcomes individuals that did not participate in the treatment would have experienced if they had participated.

To do this, let’s introduce somewhat more formal notation/terminology.

Treated potential outcome: \(Y_i(1)\), the outcome an individual would experience if they participated in the treatment

Untreated potential outcome: \(Y_i(0)\), the outcome an individual would experience if they did not participate in the treatment

For individuals that participate in the treatment, we observe \(Y_i(1)\) (but not \(Y_i(0)\)). For individuals that do not participate in the treatment, we observe \(Y_i(0)\) (but not \(Y_i(1)\)). Another way to write this uses a binary treatment indicator \(D_i\), which equals 1 if individual \(i\) participates in the treatment and 0 otherwise. The observed outcome, \(Y_i\), is then given by \[\begin{align*} Y_i = D_i Y_i(1) + (1-D_i) Y_i(0) \end{align*}\]
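To see how the observed-outcome equation works, here is a minimal Python sketch. The earnings figures, the treatment assignments, and the helper name `observed_outcome` are all made up for illustration; they are not data or code from the text.

```python
def observed_outcome(d, y1, y0):
    """Observed outcome for one individual: Y = D * Y(1) + (1 - D) * Y(0)."""
    return d * y1 + (1 - d) * y0


# Hypothetical potential earnings (in thousands of dollars) for four people.
y1 = [60.0, 45.0, 80.0, 52.0]  # Y_i(1): earnings if person i goes to college
y0 = [40.0, 42.0, 50.0, 55.0]  # Y_i(0): earnings if person i does not go
d = [1, 0, 1, 0]               # D_i: 1 if person i actually went to college

# The equation picks out Y_i(1) when D_i = 1 and Y_i(0) when D_i = 0.
y = [observed_outcome(di, a, b) for di, a, b in zip(d, y1, y0)]
print(y)  # [60.0, 42.0, 80.0, 55.0]
```

In this made-up example, the first and third individuals participate, so their observed outcomes are their treated potential outcomes; the other two contribute their untreated potential outcomes.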

We can think about the individual-level effect of participating in the treatment: \[\begin{align*} TE_i = Y_i(1) - Y_i(0) \end{align*}\]

Considering the difference between treated and untreated potential outcomes is a very natural (and, I think, helpful) way to think about causality. The causal effect of the treatment is the difference between the outcome that an individual would experience if they participated in the treatment and the outcome they would experience if they did not participate in the treatment.

This notation also makes it clear that we are allowing for treatment effect heterogeneity: the effect of participating in the treatment can vary across different individuals.
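As a small illustration of individual-level treatment effects and heterogeneity, the Python sketch below uses hypothetical potential outcomes. The numbers and the helper name `individual_effects` are assumptions made for this example only. In the sketch we get to see both potential outcomes for every individual, which is never possible with real data.

```python
def individual_effects(y1, y0):
    """TE_i = Y_i(1) - Y_i(0) for each individual."""
    return [a - b for a, b in zip(y1, y0)]


# Hypothetical potential earnings (in thousands of dollars) for four people.
y1 = [60.0, 45.0, 80.0, 52.0]  # Y_i(1): treated potential outcomes
y0 = [40.0, 42.0, 50.0, 55.0]  # Y_i(0): untreated potential outcomes
d = [1, 0, 1, 0]               # D_i: 1 if person i participated

te = individual_effects(y1, y0)
print(te)  # [20.0, 3.0, 30.0, -3.0]

# What a researcher actually sees: one potential outcome per person,
# with the other one missing (represented here by None).
seen = [(a if di == 1 else None, b if di == 0 else None)
        for di, a, b in zip(d, y1, y0)]
print(seen)  # [(60.0, None), (None, 42.0), (80.0, None), (None, 55.0)]
```

The effects differ across individuals (one is even negative), which is what treatment effect heterogeneity means. Because each row of `seen` is missing one of its two entries, \(TE_i\) cannot be computed for anyone from the observed data alone.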

That said, most researchers essentially give up on trying to figure out individual-level treatment effects. It is not that these are uninteresting; rather, they are very hard to figure out. Take, for example, going to college, and suppose we are interested in the causal effect of going to college on a person’s earnings. I went to college, so I know what my \(Y(1)\) is, but I don’t know what my \(Y(0)\) is, and I’d even have a hard time coming up with a good guess as to what it might be.