class: center, middle, inverse, title-slide

# Modern Approaches to Difference in Differences

### Brantly Callaway, University of Georgia

### October 22, 2021
Session 3: Pre-Testing, Event Studies

---

# Pre-Testing

`$$\newcommand{\E}{\mathbb{E}}$$`
`$$\newcommand{\P}{\mathrm{P}}$$`

<style type="text/css">
border-top: 80px solid #BA0C2F;
.inverse { background-color: #BA0C2F; }
.alert { font-weight:bold; color: red; }
.alert-blue { font-weight: bold; color: blue; }
.remark-slide-content { font-size: 23px; padding: 1em 4em 1em 4em; }
.highlight-red { background-color:red; padding:0.1em 0.2em; }
</style>

Perhaps one of the main attractions of DID (or, more generally, panel data approaches to causal inference) is that the main identifying assumptions can be pre-tested

--

- That is, one can "validate" the identification strategy by implementing it in pre-treatment periods and checking that it does not spuriously deliver non-zero effects of the policy

--

More precisely, if the policy is implemented in period `\(t^*\)` and parallel trends holds in all periods, then it should be the case that

`$$\E[\Delta Y_{t^*-1} | D=1] - \E[\Delta Y_{t^*-1}| D=0] = 0$$`

--

If there is variation in treatment timing, this should hold across all groups in their pre-treatment periods.

---

# Event Studies

Event studies are often used to <span class="alert-blue">study dynamic effects of the treatment</span> and/or <span class="alert">"pre-test" the parallel trends assumption</span>

--

They are often implemented with the following event study regression:

`$$Y_{it} = \theta_t + \eta_i + \sum_{e=-(\mathcal{T}-1)}^{-2} D_{it}^e \beta_e + \sum_{e=0}^{\mathcal{T}-2} D_{it}^e \beta_e + v_{it}$$`

where `\(D_{it}^e\)` is a dummy variable that is equal to `\(1\)` in period `\(t\)` if individual `\(i\)` first became treated exactly `\(e\)` periods before period `\(t\)` (negative values of `\(e\)` refer to pre-treatment periods, and `\(e=-1\)` is omitted as the reference period). For example,

--

- `\(D_{it}^0\)` equals 1 for units who become treated in period `\(t\)`, and 0 otherwise.
- `\(D_{it}^2\)` equals 1 for units who become treated in period `\(t-2\)`, and 0 otherwise
- `\(D_{it}^{-1}\)` equals 1 for units who become treated in period `\(t+1\)`, and 0 otherwise

---

# Event study regressions

Sun and Abraham (2021) study this sort of event study regression and note a number of issues similar to the ones we pointed out earlier:

--

- `\(\beta_e\)` includes terms that involve treatment effects at other lengths of exposure (not just `\(e\)`)

--

Two particularly problematic cases:

--

- Trying to estimate effects after all units become treated

--

- Heterogeneous effects across groups at the same length of exposure to the treatment

---

# ATT(g,t) version of event study

It is straightforward to report an event study by aggregating `\(ATT(g,t)\)`'s.

--

`$$ATT^{ES}(e) = \sum_{g \in \mathcal{G}_e} ATT(g,g+e) \P(G=g|G \in \mathcal{G}_e)$$`

where `\(\mathcal{G}_e = \{ g \in \mathcal{G} : g + e \leq \mathcal{T}\}\)` (i.e., the set of groups that we observe to participate in the treatment for at least `\(e\)` periods).

---

# Example: Minimum Wage

```r
library(fixest)
library(did)
library(ggplot2)
load("mw_data2.RData")

# create event time variable
# (never-treated counties get e = 0 in every period; this is constant within
#  county, so it is absorbed by the county fixed effects below)
mw_data2$e <- ifelse(mw_data2$treat==1, mw_data2$year - mw_data2$first.treat, 0)

# run event study regression (e = -1 is the reference period)
es_reg <- feols(lemp ~ i(e, ref=-1) | year + countyreal,
                data=mw_data2, cluster="countyreal")
```

---

# Example: Minimum Wage

```r
# plot event study regression
iplot(es_reg, ylim.add=c(-.5,.5))
```

<img src="data:image/png;base64,#modern_did_session3_files/figure-html/unnamed-chunk-3-1.png" style="display: block; margin: auto;" />

---

# Example: Minimum Wage

```r
# callaway and sant'anna
cs_res <- att_gt(yname="lemp", tname="year", idname="countyreal",
                 gname="first.treat", data=mw_data2)

# aggregate into event study (dynamic effects)
cs_es <- aggte(cs_res, type="dynamic")
```

---

# Example: Minimum Wage

```r
ggdid(cs_es, ylim=c(-.25,.2))
```

<img src="data:image/png;base64,#modern_did_session3_files/figure-html/unnamed-chunk-5-1.png" style="display: block; margin: auto;" />

---

# Example: Simulated Data

Recall: parallel trends holds in the simulated data

```r
# same data as before
data <- readRDS("sim_data.RDS")

# run event study regression
es_reg <- feols(Y ~ i(e, ref=-1) | id + time.period,
                data=data, cluster="id")
```

---

# Example: Simulated Data

```r
# plot event study regression
iplot(es_reg)
```

<img src="data:image/png;base64,#modern_did_session3_files/figure-html/unnamed-chunk-7-1.png" style="display: block; margin: auto;" />

---

# Example: Simulated Data

```r
# callaway and sant'anna
cs_res <- att_gt(yname="Y", tname="time.period", idname="id",
                 gname="G", data=data, control_group="notyettreated")

# aggregate into event study (dynamic effects)
cs_es <- aggte(cs_res, type="dynamic")
```

---

# Example: Simulated Data

```r
ggdid(cs_es)
```

<img src="data:image/png;base64,#modern_did_session3_files/figure-html/unnamed-chunk-9-1.png" style="display: block; margin: auto;" />

---

# Example: Simulated Data

The Callaway and Sant'Anna (CS) approach does much better than the event study regression here. The main reason is that there is no never-treated group in these data.

--

Next, we'll talk about some extensions that (I think) are helpful in applications.
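
---

# Appendix: `\(ATT^{ES}(e)\)` by hand

As a supplementary sketch of the `\(ATT^{ES}(e)\)` aggregation formula from this session, the block below computes the event study parameter by hand in base R. Everything in it is made up for illustration: the `\(ATT(g,t)\)` values in `att_gt_tab`, the group sizes in `n_g`, and the helper name `att_es` are hypothetical and do not come from the minimum wage or simulated data. It only covers `\(e \geq 0\)`.

```r
# Hypothetical setup: 4 time periods, groups first treated in periods 2, 3, 4
n_periods <- 4
groups <- c(2, 3, 4)

# Hypothetical number of units in each group;
# used to form the weights P(G = g | G in G_e)
n_g <- c("2" = 100, "3" = 50, "4" = 50)

# Hypothetical ATT(g,t) for post-treatment periods t >= g
att_gt_tab <- data.frame(
  g   = c(2, 2, 2, 3, 3, 4),
  t   = c(2, 3, 4, 3, 4, 4),
  att = c(1.0, 2.0, 3.0, 0.5, 1.5, 0.8)
)

# ATT^ES(e): average of ATT(g, g+e) over the groups with g + e <= n_periods,
# weighted by each group's share among those groups
att_es <- function(e) {
  g_e <- groups[groups + e <= n_periods]       # the set G_e
  size <- n_g[as.character(g_e)]
  w <- size / sum(size)                        # P(G = g | G in G_e)
  att <- sapply(g_e, function(g) {
    att_gt_tab$att[att_gt_tab$g == g & att_gt_tab$t == g + e]
  })
  sum(w * att)
}

es <- sapply(0:2, att_es)
names(es) <- paste0("e=", 0:2)
es
```

With these made-up inputs, `es` is approximately 0.825, 1.833, and 3 for `e = 0, 1, 2`. Note how the set of contributing groups shrinks as `e` grows: all three groups enter at `e = 0`, but only the earliest-treated group is observed at `e = 2`.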