Modern Approaches to Difference in Differences

class: center, middle, inverse, title-slide

# Modern Approaches to Difference in Differences
### Brantly Callaway, University of Georgia
### October 22, 2021 <br><br>Session 3: Pre-Testing, Event Studies

---

# Pre-Testing

`$$\newcommand{\E}{\mathbb{E}}$$`
`$$\newcommand{\P}{\mathrm{P}}$$`


Perhaps one of the main attractions of DID (or, more generally, panel data approaches to causal inference) is that the main identifying assumptions can be pre-tested.

- That is, one can "validate" the identification strategy by implementing it in pre-treatment periods and making sure that it does not spuriously deliver non-zero effects of the policy.

More precisely, if the policy is implemented in period `$t^*$` and parallel trends holds in all periods, then it should be the case that

`$$\E[\Delta Y_{t^*-1} | D=1] - \E[\Delta Y_{t^*-1}| D=0] = 0$$`
--

If there is variation in treatment timing, this should hold across all groups in their pre-treatment periods.
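
---

# Pre-Testing

As a rough illustration of the pre-test above, here is a minimal base R sketch on made-up data. Everything in it (`n`, `t_star`, `theta`, the treatment effect of 1) is a hypothetical choice for illustration, not something from the applications in these slides. Parallel trends holds by construction, so the pre-treatment DID should be close to 0.

```r
# Hypothetical two-group panel: periods 1..4, policy implemented in t_star = 4
set.seed(1)
n <- 2000
t_star <- 4
D <- rbinom(n, 1, 0.5)            # treated-group indicator
eta <- rnorm(n) + D               # unit effects; levels may differ by group
theta <- c(0, 0.5, 1.0, 1.5)      # common time effects => parallel trends holds

# n x 4 matrix of outcomes; column t is period t
Y <- sapply(1:4, function(t) theta[t] + eta + rnorm(n))
Y[, t_star] <- Y[, t_star] + 1 * D   # assumed treatment effect of 1 in period t_star

# pre-test: DID using the change from period t_star - 2 to t_star - 1
dY_pre <- Y[, t_star - 1] - Y[, t_star - 2]
pre_did <- mean(dY_pre[D == 1]) - mean(dY_pre[D == 0])
pre_se <- sqrt(var(dY_pre[D == 1]) / sum(D == 1) +
               var(dY_pre[D == 0]) / sum(D == 0))

# actual DID using the change from period t_star - 1 to t_star
dY_post <- Y[, t_star] - Y[, t_star - 1]
did <- mean(dY_post[D == 1]) - mean(dY_post[D == 0])

c(pre_did = pre_did, pre_se = pre_se, did = did)
```

Comparing `pre_did` to `pre_se` gives an informal check of whether the "placebo" effect is distinguishable from 0.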

---

# Event Studies

Event studies are often used to <span class="alert-blue">study dynamic effects of the treatment</span> and/or <span class="alert">"pre-test" the parallel trends assumption</span>

They are often implemented with the following event study regression:

`$$Y_{it} = \theta_t + \eta_i + \sum_{e=-(\mathcal{T}-1)}^{-2} D_{it}^e \beta_e + \sum_{e=0}^{\mathcal{T}-2} D_{it}^e \beta_e + v_{it}$$`

where `$D_{it}^e$` is a dummy variable that is equal to `$1$` in period `$t$` if individual `$i$` has been treated for exactly `$e$` periods.  For example,

- `$D_{it}^0$` equals 1 for units who become treated in period `$t$`, and 0 otherwise.

- `$D_{it}^2$` equals 1 for units who become treated in period `$t-2$`, and 0 otherwise.

- `$D_{it}^{-1}$` equals 1 for units who become treated in period `$t+1$`, and 0 otherwise.
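
---

# Event Studies

To make the dummies concrete, here is a small base R sketch that builds `$D_{it}^e$` for a toy panel. The panel, the `first_treat` vector, and the helper `make_dummy()` are all hypothetical names made up for this illustration.

```r
# Hypothetical balanced panel: 3 units observed in periods 1..4
panel <- expand.grid(id = 1:3, t = 1:4)

# period in which each unit becomes treated (Inf = never treated)
first_treat <- c(2, 3, Inf)
panel$g <- first_treat[panel$id]

# event time: periods since first treatment (-Inf for never-treated units)
panel$e <- panel$t - panel$g

# D^e_it = 1 if unit i became treated in period t - e, and 0 otherwise
make_dummy <- function(e) as.integer(panel$e == e)
panel$D_m1 <- make_dummy(-1)   # becomes treated in period t + 1
panel$D_0  <- make_dummy(0)    # becomes treated in period t
panel$D_2  <- make_dummy(2)    # became treated in period t - 2

panel[order(panel$id, panel$t), ]
```

The never-treated unit has all dummies equal to 0, since its event time never matches any finite `e`.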

---

# Event study regressions

Sun and Abraham (2021) study this sort of event study regression, and note several issues similar to the ones we pointed out earlier:

- `$\beta_e$` includes terms that involve treatment effects at different lengths of exposure

Two particularly problematic cases:

- Trying to estimate effects after all units become treated

- Heterogeneous effects across groups at the same length of exposure to the treatment

---

# ATT(g,t) version of event study

It is very easy to report an event study using `$ATT(g,t)$`'s.

`$$ATT^{ES}(e) = \sum_{g \in \mathcal{G}_e} ATT(g,g+e) \P(G=g|G \in \mathcal{G}_e)$$`

where `$\mathcal{G}_e = \{ g \in \mathcal{G} : g + e \leq \mathcal{T}\}$` (i.e., the set of groups that we observe to participate in the treatment for at least `$e$` periods).
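
---

# ATT(g,t) version of event study

Here is a minimal base R sketch of the aggregation above, just to show the arithmetic. The `$ATT(g,t)$` values and group probabilities are made-up numbers, and `att_es()` is a hypothetical helper, not the implementation used in the `did` package.

```r
# Hypothetical ATT(g,t) estimates: groups g = 2, 3, 4 with 4 time periods,
# so every listed cell satisfies g + e <= 4
att <- data.frame(g   = c(2, 2, 2, 3, 3, 4),
                  t   = c(2, 3, 4, 3, 4, 4),
                  att = c(1.0, 2.0, 3.0, 1.5, 2.5, 0.5))

# hypothetical group probabilities P(G = g) among treated groups
p_g <- c("2" = 0.5, "3" = 0.3, "4" = 0.2)

# ATT^ES(e): average ATT(g, g + e) over groups observed at exposure e,
# with weights P(G = g | G in G_e)
att_es <- function(e) {
  keep <- att$t == att$g + e
  w <- p_g[as.character(att$g[keep])]
  sum(att$att[keep] * w / sum(w))
}

es <- sapply(0:2, att_es)
names(es) <- paste0("e=", 0:2)
es
```

Note that the set of groups being averaged shrinks as `e` grows: only the earliest-treated group contributes at the longest length of exposure.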

---

# Example: Minimum Wage

```r
library(fixest)
library(did)
library(ggplot2)
load("mw_data2.RData")

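# create event time variable (periods since first treatment);
# never-treated counties get e = 0, which is harmless here because
# that constant dummy is absorbed by the county fixed effects
mw_data2$e <- ifelse(mw_data2$treat==1, 
                  mw_data2$year - mw_data2$first.treat,
                  0)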

# run event study regression
es_reg <- feols(lemp ~ i(e, ref=-1) | year + countyreal,
                data=mw_data2,
                cluster="countyreal")
```

---

# Example: Minimum Wage

```r
# plot event study regression
iplot(es_reg, ylim.add=c(-.5,.5))
```

---

# Example: Minimum Wage

```r
# callaway and sant'anna
cs_res <- att_gt(yname="lemp",
                 tname="year",
                 idname="countyreal",
                 gname="first.treat",
                 data=mw_data2)
cs_es <- aggte(cs_res, type="dynamic")
```

---

# Example: Minimum Wage

```r
ggdid(cs_es, ylim=c(-.25,.2))
```

---

# Example: Simulated Data

Recall: parallel trends holds in simulated data

```r
# same data as before
data <- readRDS("sim_data.RDS")

# run event study regression
es_reg <- feols(Y ~ i(e, ref=-1) | id + time.period,
                data=data,
                cluster="id")
```

---

# Example: Simulated Data

```r
# plot event study regression
iplot(es_reg)
```

---

# Example: Simulated Data

```r
# callaway and sant'anna
cs_res <- att_gt(yname="Y",
                 tname="time.period",
                 idname="id",
                 gname="G",
                 data=data,
                 control_group="notyettreated")

# aggregate into event study (dynamic effects)
cs_es <- aggte(cs_res, type="dynamic") 
```

---

# Example: Simulated Data

```r
ggdid(cs_es)
```

---

# Example: Simulated Data

CS does much better than the event study regression here.  The main reason is that there is no never-treated group in the simulated data.

Next, we'll talk about some extensions that (I think) are helpful in applications.