Topic 8 Causal Inference

For the remainder of the semester, we will talk about methods for understanding the causal effect of one variable on another.

Let’s start with an example. Suppose we were interested in understanding the causal effect of attending college on earnings. We know a lot about calculating averages, so let’s consider calculating average earnings of those who went to college and comparing that to the average earnings of those who didn’t go to college. Let’s use the data that we used all the way back in Chapter 2 to make this calculation.


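# Average earnings by college attendance (16 or more years of education)
col_earnings <- mean(subset(us_data, educ >= 16)$incwage)
non_col_earnings <- mean(subset(us_data, educ < 16)$incwage)
data.frame(col_earnings, non_col_earnings, diff = col_earnings - non_col_earnings)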

#>   col_earnings non_col_earnings     diff
#> 1     87306.33         40557.28 46749.05

This seems to be a huge difference. We are used to calling this difference a partial effect of going to college, but should we call it the causal effect, or an average causal effect across individuals? Probably not. The main reason is that the earnings of individuals who went to college might have differed from the earnings of individuals who didn’t, even if no one had gone to college. Another way to say this is that a number of other things besides college affect a person’s earnings (age, race, IQ, “ability”, “hardworking-ness”, luck, etc.). If these are also correlated with whether or not a person goes to college (and you would think that at least some of them are), then it is hard to interpret our simple difference in means as a causal effect.
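To make this concern concrete, here is a minimal simulation sketch. Everything in it is made up for illustration: the variable `ability`, the assumed true effect of 10,000, and the other coefficients are assumptions, not estimates from `us_data`. The point is only that when an unobserved variable raises both the chance of attending college and earnings, the simple difference in means can be well above the true causal effect.

```r
# Simulated data only; every number below is an assumption for illustration
set.seed(1)
n <- 10000

# Unobserved confounder that affects both college attendance and earnings
ability <- rnorm(n)

# Higher-ability individuals are more likely to attend college
college <- as.integer(ability + rnorm(n) > 0)

# Assumed true causal effect of college on earnings
true_effect <- 10000

# Earnings depend on college, on ability, and on noise
earnings <- 40000 + true_effect * college + 15000 * ability + rnorm(n, sd = 5000)

# Naive comparison of average earnings across the two groups
naive_diff <- mean(earnings[college == 1]) - mean(earnings[college == 0])

# Compare the naive difference with the assumed true effect
c(naive_diff = naive_diff, true_effect = true_effect)
```

In this setup, the naive difference picks up both the effect of college and the difference in average ability between the two groups, so it overstates the effect that was built into the simulation.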

Before we move on, let me give some more examples of causal questions that may be of interest to economists:

  • Causal effects of economic policies

    • What was the causal effect of a country raising its interest rate on GDP or employment?

    • What was the causal effect of a change in the minimum wage on employment?

    • What was the causal effect of changing voter ID laws on the number of ballots cast?

  • Causal effects of individual/firm choices

    • What was the causal effect of a price increase on quantity demanded?

    • What was the causal effect of changing a product attribute on some outcome of interest (e.g., changing the font type on clicks on Google ads)?

    • What was the causal effect of a new cholesterol medication on cholesterol levels?

    • What was the causal effect of a job training program on wages?

You could easily come up with many others too. A large fraction of the questions that researchers in economics try to address are ultimately about sorting out these sorts of causal effects.

For terminology, we’ll refer to the variable whose causal effect we are interested in (going to college in the earlier example) as the treatment. For simplicity, we will mostly focus on the case where the treatment is binary. We will use \(D_i\) to denote the treatment, so that \(D_i=1\) if individual \(i\) participates in the treatment and \(D_i=0\) if individual \(i\) does not participate in the treatment.
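As a small sketch of this notation in R, the code below builds a binary treatment indicator from years of education. The data frame `toy` and its values are hypothetical, and the cutoff of 16 years mirrors the college example above. It also checks that, with a binary treatment, regressing the outcome on `D` gives a slope equal to the difference in means. That is a descriptive comparison, not necessarily a causal effect.

```r
# Hypothetical toy data; the values are made up for illustration
toy <- data.frame(
  educ    = c(12, 16, 18, 10, 16, 14),
  incwage = c(30000, 60000, 90000, 25000, 70000, 35000)
)

# Binary treatment: D = 1 if the individual has 16+ years of education
toy$D <- as.integer(toy$educ >= 16)

# Average outcome among treated and untreated individuals
treated_mean   <- mean(toy$incwage[toy$D == 1])
untreated_mean <- mean(toy$incwage[toy$D == 0])
diff_means     <- treated_mean - untreated_mean

# With a binary regressor, the regression slope equals the difference in means
slope <- unname(coef(lm(incwage ~ D, data = toy))["D"])

c(diff_means = diff_means, slope = slope)
```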

Example: SW 13.3