Athey and Imbens (2006)

Brantly Callaway

University of Georgia

February 5, 2025

Introduction

\(\newcommand{\E}{\mathbb{E}} \newcommand{\var}{\mathrm{var}} \newcommand{\cov}{\mathrm{cov}} \newcommand{\Var}{\mathrm{var}} \newcommand{\Cov}{\mathrm{cov}} \newcommand{\Corr}{\mathrm{corr}} \newcommand{\corr}{\mathrm{corr}} \newcommand{\F}{\mathrm{F}} \newcommand{\L}{\mathrm{L}} \renewcommand{\P}{\mathrm{P}} \newcommand{\T}{\mathrm{T}} \newcommand{\independent}{{\perp\!\!\!\perp}} \newcommand{\indicator}[1]{ \mathbf{1}\{#1\} }\)

  • Two periods of panel data: \(t=1,2\)

  • No one treated at \(t=1\)

  • At period \(t=2\), some units become treated. \(D_i = 1\) if treated, \(0\) otherwise.

  • Potential outcomes in each time period: \(Y_{it}(1)\) and \(Y_{it}(0)\)

  • Observed outcomes: \(Y_{it=2} = D_i Y_{it=2}(1) + (1-D_i) Y_{it=2}(0)\) and \(Y_{it=1} = Y_{it=1}(0)\)

Target Parameters

  • Average treatment effect on the treated (ATT):

\[ ATT = \E[Y_{t=2}(1) - Y_{t=2}(0) | D=1] \]

  • Quantile treatment effect on the treated (QTT):

    \[ QTT(\tau) = Q_{Y_{t=2}(1) | D=1}(\tau) - Q_{Y_{t=2}(0) | D=1}(\tau) \]

    • Note: \(QTT(\tau)\) is identified if \(\F_{Y_{t=2}(0) | D=1}\) is identified.

Model for Untreated Potential Outcomes

They assume that:

\[Y_{it}(0) = h_t(U_{it})\]

where \(h_t\) is a nonparametric, time-varying function. This model generalizes the typical model that is used to rationalize difference-in-differences:

\[Y_{it}(0) = \theta_t + \underbrace{\eta_i + e_{it}}_{U_{it}}\]
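As a rough illustration of what the generalization buys (the second function is a hypothetical example chosen here, not one taken from the paper), the difference-in-differences model corresponds to an additive \(h_t\), while the general model also allows, for instance, shocks that enter multiplicatively:

\[ h_t(u) = \theta_t + u \quad \textrm{(difference-in-differences)}, \qquad h_t(u) = \exp(\theta_t + u) \quad \textrm{(one non-additive possibility)} \]

Both functions are strictly increasing in \(u\), but only the first implies that the time effect is the same constant shift at every level of the outcome.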

Additional Assumptions

  1. \(U_{t} \overset{d}{=} U_{t'} | G\). In words: the distribution of \(U_{t}\) does not change over time given a particular group. However, the distribution of \(U_{t}\) can vary across groups.

  2. \(U_{t}\) is scalar

  3. \(h_t\) is strictly monotonically increasing \(\implies\) we can invert it.

  4. Support condition: \(\mathcal{U}_1 \subseteq \mathcal{U}_0\) (support of \(U_{t}\) for the treated group is a subset of the support of \(U_{t}\) for the untreated group)

    • \(\implies \textrm{support}(Y_{t}(0))\) for the treated group is a subset of the support of untreated potential outcomes for the untreated group.
    • This condition is not required for DiD. (I think this is related to whether or not we can extrapolate: for DiD yes, for CiC no. Under DiD, the groups can be fundamentally different in terms of \(\eta_i\), but not here.)

Identification

We will show that

\[\E[Y_{t=2}(0) | D=1] = \E\left[ Q_{Y_{t=2}|D=0}\Big(\F_{Y_{t=1}|D=0}\big(Y_{t=1}\big)\Big) \Big| D=1 \right]\]

under the conditions mentioned above.

Preliminary Result 1

Notice that

\[ \begin{aligned} \F_{Y_{t=1}(0) | D=1}(y) &= \P\big( Y_{t=1}(0) < y \big| D=1 \big) \\ &= \P\big( h_{t=1}(U_{t=1}) < y \big| D=1 \big) \\ &= \P\big( U_{t=1} < h^{-1}_{t=1}(y) \big| D=1 \big) \\ &= \F_{U|D=1}\big( h^{-1}_{t=1}(y)\big) \end{aligned} \]

  • Notice that the same argument would apply for \(t=2\) or \(D=0\).

Preliminary Result 2

Define, for some \(\tau \in [0,1]\), \(y_\tau = Q_{Y_{t=2}(0)|D=1}(\tau)\). Then it follows from Preliminary Result 1 (applied at \(t=2\)) that

\[ \begin{aligned} \tau = \F_{Y_{t=2}(0) | D=1}(y_\tau) = \F_{U|D=1}\big(h^{-1}_{t=2}(y_\tau)\big) \end{aligned} \]

which further implies that

\[ \begin{aligned} y_\tau = h_{t=2}\big( \F_{U|D=1}^{-1}(\tau)\big) \end{aligned} \]

Combining Preliminary Result 1 and 2

Setting \(\tau = \F_{Y_{t=1}(0)|D=1}(y)\) and using Preliminary Result 2, we have that

\[ \begin{aligned} Q_{Y_{t=2}(0)|D=1}\big( \F_{Y_{t=1}(0)|D=1}(y) \big) &= h_{t=2}\Big( \F_{U|D=1}^{-1}\big( \F_{Y_{t=1}(0)|D=1}(y) \big)\Big) \\ &= h_{t=2}\left\{ \F_{U|D=1}^{-1}\Big( \F_{U|D=1}\big( h^{-1}_{t=1}(y)\big) \Big) \right\} \\ &= h_{t=2}\big( h^{-1}_{t=1}(y)\big) \end{aligned} \]

  • where the second equality uses preliminary result 1

  • Notice that this term doesn’t depend on \(D=1\), and we can use symmetric arguments to show that

    \[ Q_{Y_{t=2}(0)|D=1}\big( \F_{Y_{t=1}(0)|D=1}(y) \big) = h_{t=2}\big( h^{-1}_{t=1}(y)\big) = Q_{Y_{t=2}(0)|D=0}\big( \F_{Y_{t=1}(0)|D=0}(y) \big)\]

    which holds for any \(y\)

Proof of Main Result

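Notice that, conditional on \(D=1\), \(\F_{Y_{t=1}(0)|D=1}\big(Y_{t=1}(0)\big) \sim U[0,1]\), so applying \(Q_{Y_{t=2}(0)|D=1}\) to it gives a random variable with the same distribution as \(Y_{t=2}(0)\) given \(D=1\). Therefore

\[ \begin{aligned} \E[Y_{t=2}(0) | D=1] &= \E\left[ Q_{Y_{t=2}(0)|D=1}\Big( \F_{Y_{t=1}(0)|D=1}\big(Y_{t=1}(0)\big) \Big) \middle| D=1 \right] \\ &= \E\left[ Q_{Y_{t=2}(0)|D=0}\Big( \F_{Y_{t=1}(0)|D=0}\big(Y_{t=1}(0)\big) \Big) \middle| D=1 \right] \end{aligned} \]

where the second equality holds by our preliminary results. Every term in the last line is observed, because \(Y_{t=1} = Y_{t=1}(0)\) for all units and \(Y_{t=2} = Y_{t=2}(0)\) for the untreated group. This completes the proof \(\implies ATT\) is identified.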
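The identification result suggests a plug-in estimator: replace \(\F_{Y_{t=1}|D=0}\) and \(Q_{Y_{t=2}|D=0}\) with their empirical counterparts. The Python sketch below is one minimal way to do this, not the estimator as implemented in the paper or in any package. The function names (`ecdf`, `equantile`, `cic_att`, `simulate_att`) and the simulated design (\(h_{t=1}(u) = u\), \(h_{t=2}(u) = \exp(u/2)\), normally distributed \(U\), a constant treatment effect of 1) are assumptions made purely for illustration.

```python
import bisect
import math
import random
import statistics


def ecdf(sample):
    """Empirical CDF of `sample`: y -> share of observations <= y."""
    s = sorted(sample)
    return lambda y: bisect.bisect_right(s, y) / len(s)


def equantile(sample):
    """Empirical quantile function: tau -> smallest y with ecdf(y) >= tau."""
    s = sorted(sample)
    n = len(s)
    return lambda tau: s[min(max(math.ceil(tau * n) - 1, 0), n - 1)]


def cic_att(y1_treated, y2_treated, y1_untreated, y2_untreated):
    """Plug-in CiC estimate of the ATT from two periods of outcomes."""
    F1_0 = ecdf(y1_untreated)       # estimates F_{Y_{t=1} | D=0}
    Q2_0 = equantile(y2_untreated)  # estimates Q_{Y_{t=2} | D=0}
    # Map each treated unit's period-1 outcome to a counterfactual
    # untreated period-2 outcome.
    counterfactual = [Q2_0(F1_0(y)) for y in y1_treated]
    return statistics.fmean(y2_treated) - statistics.fmean(counterfactual)


def simulate_att(n=5000, seed=0):
    """Hypothetical design: h_1(u) = u, h_2(u) = exp(u / 2), true ATT = 1."""
    rng = random.Random(seed)
    # U is redrawn independently each period (a simplification); its
    # distribution differs across groups but not over time.
    y1_u = [rng.gauss(0.0, 1.0) for _ in range(n)]
    y2_u = [math.exp(rng.gauss(0.0, 1.0) / 2) for _ in range(n)]
    y1_t = [rng.gauss(0.5, 1.0) for _ in range(n)]
    y2_t = [math.exp(rng.gauss(0.5, 1.0) / 2) + 1.0 for _ in range(n)]
    return cic_att(y1_t, y2_t, y1_u, y2_u)


if __name__ == "__main__":
    print(simulate_att())  # should be close to 1 in large samples
```

Treated period-1 outcomes that fall outside the range of the untreated period-1 outcomes get rank 0 or 1 and are mapped to the smallest or largest untreated period-2 outcome, which is where the support condition matters in practice.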

Extensions

Unlike with DiD, essentially the same arguments can be used to recover the full distribution \(\P(Y_{t=2}(0) < y | D=1)\).

That distribution can be inverted to recover the counterfactual quantiles needed for quantile treatment effects:

\[Q_{Y_{t=2}(0)|D=1}(\tau) = Q_{Y_{t=2}(0)|D=0}\Big( \F_{Y_{t=1}(0)|D=0}\big( Q_{Y_{t=1}(0)|D=1}(\tau) \big) \Big) \]
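The quantile expression above can be turned into a plug-in estimate of \(QTT(\tau)\) in the same spirit. The following Python sketch is only an illustration under simplifying choices: the helper names (`ecdf`, `equantile`, `cic_qtt`) and the toy data are made up here, and the empirical quantile convention (smallest order statistic whose empirical CDF reaches \(\tau\)) is one of several reasonable options.

```python
import bisect
import math


def ecdf(sample):
    """Empirical CDF of `sample`: y -> share of observations <= y."""
    s = sorted(sample)
    return lambda y: bisect.bisect_right(s, y) / len(s)


def equantile(sample):
    """Empirical quantile function: tau -> smallest y with ecdf(y) >= tau."""
    s = sorted(sample)
    n = len(s)
    return lambda tau: s[min(max(math.ceil(tau * n) - 1, 0), n - 1)]


def cic_qtt(tau, y1_treated, y2_treated, y1_untreated, y2_untreated):
    """Plug-in CiC estimate of QTT(tau)."""
    y1_tau = equantile(y1_treated)(tau)             # Q_{Y_{t=1} | D=1}(tau)
    rank_in_untreated = ecdf(y1_untreated)(y1_tau)  # F_{Y_{t=1} | D=0}(.)
    counterfactual = equantile(y2_untreated)(rank_in_untreated)
    return equantile(y2_treated)(tau) - counterfactual


# Toy data (made up for illustration): untreated outcomes are multiplied by
# 10 between periods, and treated period-2 outcomes sit 5 above that path.
y1_u, y2_u = [1, 2, 3, 4], [10, 20, 30, 40]
y1_t, y2_t = [2, 3], [25, 35]
qtt_median = cic_qtt(0.5, y1_t, y2_t, y1_u, y2_u)
print(qtt_median)  # 5
```

Here the treated median at \(t=1\) is 2, which has rank 0.5 among untreated units at \(t=1\); the untreated quantile at that rank in \(t=2\) is 20, so the counterfactual median is 20 and the estimated effect is \(25 - 20 = 5\).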

Intuition

One way to view DiD is as a before-after comparison in which \(Y_{t=1}(0)\) is adjusted to account for time trends:

\[\E[Y_{t=2}(0) | D=1] = \E[Y_{t=1}(0) + \textrm{time adjustment} | D=1]\]

where \(\textrm{time adjustment} = \E[\Delta Y_{t=2}(0) | D=0]\).

I think it is fair to view CiC similarly, but adjusting for time in a different way

\[ \E[Y_{t=2}(0) | D=1] = \E\left[ \underbrace{Q_{Y_{t=2}(0)|D=0}\Big( \F_{Y_{t=1}(0)|D=0}\big(}_{\textrm{time adjustment}}Y_{t=1}(0)\big) \Big) \middle| D=1 \right] \]
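To see how the two adjustments relate, it may help to work through the additive special case \(h_t(u) = \theta_t + u\), writing \(\theta_1\) and \(\theta_2\) for the two time effects. This is a sketch that assumes \(U\) is continuously distributed, so that \(\F_{U|D=0}\) is invertible:

\[ \begin{aligned} \F_{Y_{t=1}(0)|D=0}(y) &= \F_{U|D=0}(y - \theta_1) \\ Q_{Y_{t=2}(0)|D=0}(\tau) &= \theta_2 + \F_{U|D=0}^{-1}(\tau) \\ Q_{Y_{t=2}(0)|D=0}\Big( \F_{Y_{t=1}(0)|D=0}(y) \Big) &= y + (\theta_2 - \theta_1) \end{aligned} \]

In this case the CiC adjustment shifts every \(y\) by the same constant \(\theta_2 - \theta_1\), which equals \(\E[\Delta Y_{t=2}(0) | D=0]\) because the distribution of \(U_t\) does not change over time within a group. With a nonlinear \(h_t\), the size of the adjustment can instead depend on \(y\).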

Quantile Difference-in-Differences

Assume:

\[Q_{Y_{t=2}(0)|D=1}(\tau) - Q_{Y_{t=1}(0)|D=1}(\tau) = Q_{Y_{t=2}(0)|D=0}(\tau) - Q_{Y_{t=1}(0)|D=0}(\tau)\]

They show that this is rationalized under a different model for untreated potential outcomes, effectively (I think) correlated random effects:

\[ Y_{it}(0) = \theta_t + \gamma_g + e_{it} \]

with \(e_{it} | G=g \sim F_e\)
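Rearranging the quantile DiD assumption gives the counterfactual quantile as the treated group's period-1 quantile plus the untreated group's change in that same quantile. The Python sketch below illustrates this plug-in calculation; the function names (`equantile`, `qdid_qtt`) and the toy data are hypothetical choices made here, not something taken from the paper.

```python
import math


def equantile(sample, tau):
    """Empirical quantile: smallest y whose empirical CDF is >= tau."""
    s = sorted(sample)
    n = len(s)
    return s[min(max(math.ceil(tau * n) - 1, 0), n - 1)]


def qdid_qtt(tau, y1_treated, y2_treated, y1_untreated, y2_untreated):
    """Plug-in quantile DiD estimate of QTT(tau)."""
    # Change over time in the tau-th quantile for the untreated group.
    quantile_trend = equantile(y2_untreated, tau) - equantile(y1_untreated, tau)
    counterfactual = equantile(y1_treated, tau) + quantile_trend
    return equantile(y2_treated, tau) - counterfactual


# Toy data (made up): untreated outcomes shift up by 1 between periods.
qtt_median = qdid_qtt(0.5, [5, 6], [8, 9], [0, 1, 2, 3], [1, 2, 3, 4])
print(qtt_median)  # 2
```

In contrast to CiC, the time adjustment here is taken at the same quantile \(\tau\) in both groups, rather than at the rank that the treated group's outcome would have in the untreated group's distribution.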

Rank Invariance

Do units need to hold their rank across time periods? This is not needed above, but it is interesting to think about (for example, what if potential outcomes are not fixed for each unit?).