I am going to add one or two additional questions on GMM, depending on the exact material that we cover next week.
Due: Please turn in at my office Amos B408 on May 2 at 9am.
Textbook Questions: Hansen 17.2
Additional Question 1:
For this problem, use the data job_displacement_clean2.RData posted here. We are interested in estimating the (causal) effect of job displacement on earnings. The key variables in this data are learn (the log of earnings, which is the outcome variable in this problem) and first.displaced (the time period when an individual becomes displaced); first.displaced is the variable used to form “groups” below. It is set equal to 0 for individuals who are not displaced from their job in any period. (As always, you are not allowed to use built-in R packages, such as plm, fixest, or did, to compute the results for these questions, though you are welcome to compare your results to output from those packages.)
Hint: The number of observations in this data is fairly large and, therefore, inverting some matrices can be quite time-consuming. I recommend that you use the Matrix package to deal with the matrices in this question. This package can handle “sparse” matrices efficiently, which is helpful for this problem.
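To illustrate the hint, here is a minimal sketch of sparse linear algebra with the Matrix package. It uses simulated data rather than the assignment data, and the names `n_units`, `n_periods`, `D_id`, and the data-generating process are all assumptions made up for the example. It regresses an outcome on one regressor plus a full set of individual dummies, and checks the result against the demeaned (within) regression.

```r
library(Matrix)

set.seed(1)
n_units <- 2000    # hypothetical number of individuals
n_periods <- 5     # hypothetical number of time periods
id <- rep(seq_len(n_units), each = n_periods)

# Sparse (n x n_units) matrix of individual dummies: one nonzero per row
D_id <- sparseMatrix(i = seq_along(id), j = id, x = 1,
                     dims = c(length(id), n_units))

# Simulated regressor and outcome with individual effects (true slope = 2)
eta <- rnorm(n_units)
x <- rnorm(length(id)) + 0.5 * eta[id]
y <- 2 * x + eta[id] + rnorm(length(id))

# Stack the regressor next to the dummies, keeping everything sparse
X <- cbind(Matrix(x, ncol = 1, sparse = TRUE), D_id)

# Solve the normal equations directly instead of forming an explicit inverse
XtX <- crossprod(X)
Xty <- crossprod(X, y)
beta_hat <- as.numeric(solve(XtX, Xty))
slope_dummies <- beta_hat[1]

# Same slope from demeaning x and y within individual
xd <- x - ave(x, id)
yd <- y - ave(y, id)
slope_within <- sum(xd * yd) / sum(xd^2)

c(dummies = slope_dummies, within = slope_within)
```

Storing the dummies densely would require a 10,000 by 2,000 matrix here; the sparse version stores only the 10,000 nonzero entries, which is why this approach scales to larger panels.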
(a) Use the within estimator to estimate the following regression model \[\begin{align*} Y_{it} = \theta_t + \eta_i + \alpha D_{it} + e_{it} \end{align*}\] Report an estimate of \(\alpha\) and its standard error.
(b) Compute difference-in-differences versions of group-time average treatment effects (where you can define group by the time period when an individual becomes displaced from their job). Use individuals who are never displaced from their job as the comparison group.
(c) Aggregate the group-time average treatment effects in part (b) into an overall treatment effect parameter. Use the bootstrap to compute a standard error for it. How do these results compare to the ones from part (a)?
(d) Compute the weights on underlying group-time average treatment effects that come from the two-way fixed effects regression in part (a). How do these compare to the weights from part (c)?
(e) Aggregate the group-time average treatment effects in part (b) into an event study. Use the bootstrap to compute standard errors, and plot the estimates along with a 95% confidence interval.
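As a rough illustration of the mechanics behind a single group-time average treatment effect and its bootstrap standard error, here is a sketch on simulated data. Everything in it is an assumption for the example: the three-period panel, the group labels, the true effect of -0.5, the choice of period g - 1 as the base period, and the helper name `att_gt`. It is not tied to the assignment data and is only one of several reasonable ways to set this up.

```r
set.seed(42)
n <- 1000
# Hypothetical groups: 0 = never treated, 2 or 3 = first treated period
G <- sample(c(0, 2, 3), n, replace = TRUE)
eta <- rnorm(n)

# Wide outcome matrix (rows = individuals, columns = periods 1..3);
# the simulated treatment effect is -0.5 in every post-treatment period
Y <- sapply(1:3, function(t) {
  eta + 0.1 * t - 0.5 * (G != 0 & t >= G) + rnorm(n)
})

# DiD estimate of ATT(g, t): change in outcomes from period g - 1 to t
# for group g, minus the same change for the never-treated group
att_gt <- function(Y, G, g, t) {
  dY <- Y[, t] - Y[, g - 1]
  mean(dY[G == g]) - mean(dY[G == 0])
}

att_22 <- att_gt(Y, G, g = 2, t = 2)

# Nonparametric bootstrap: resample individuals (whole rows) with replacement
B <- 500
boot <- replicate(B, {
  idx <- sample.int(n, n, replace = TRUE)
  att_gt(Y[idx, , drop = FALSE], G[idx], g = 2, t = 2)
})
se_22 <- sd(boot)

c(estimate = att_22, se = se_22)
```

Resampling whole rows keeps each individual's outcomes together across periods, so the bootstrap respects the within-individual dependence in the panel.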
Additional Question 2:
Suppose you are interested in the structural model \(Y = X'\beta + e\), but where \(\textrm{E}[Xe] \neq 0\). Suppose that you have access to an \(l \times 1\) vector of instruments \(Z\) (where \(l > k\) with \(k\) being the dimension of \(X\)) that satisfy \(\textrm{E}[Ze]=0\). Suppose that you estimate \(\beta\) using GMM. That is, you calculate \[\begin{align*} \hat{\beta}_{gmm} &= \underset{b}{\textrm{argmin}\ } \Big( \mathbf{Z}'\mathbf{Y}-\mathbf{Z}'\mathbf{X}b \Big)'\widehat{\mathbf{W}} \Big(\mathbf{Z}'\mathbf{Y}-\mathbf{Z}'\mathbf{X}b\Big) \end{align*}\] where \(\widehat{\mathbf{W}}\) is an \(l \times l\) weighting matrix that satisfies \(\widehat{\mathbf{W}} \xrightarrow{p} \mathbf{W}\) which is a positive definite matrix (also, here \(\mathbf{Y}\), \(\mathbf{X}\), and \(\mathbf{Z}\) are all data matrices). Derive an explicit expression for \(\hat{\beta}_{gmm}\) and show that \(\sqrt{n}(\hat{\beta}_{gmm} - \beta) \xrightarrow{d} \mathcal{N}(0,\mathbf{V})\), and provide an expression for \(\mathbf{V}\).
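One possible starting point, offered as a hint rather than a full solution and assuming additionally that \(\widehat{\mathbf{W}}\) is symmetric and that \(\mathbf{X}'\mathbf{Z}\widehat{\mathbf{W}}\mathbf{Z}'\mathbf{X}\) is invertible: differentiating the criterion with respect to \(b\) gives the first-order condition \[\begin{align*} -2\,\mathbf{X}'\mathbf{Z}\,\widehat{\mathbf{W}}\Big(\mathbf{Z}'\mathbf{Y}-\mathbf{Z}'\mathbf{X}\hat{\beta}_{gmm}\Big) = 0 \end{align*}\] which is linear in \(\hat{\beta}_{gmm}\). For the asymptotic distribution, it helps to substitute \(\mathbf{Z}'\mathbf{Y} = \mathbf{Z}'\mathbf{X}\beta + \mathbf{Z}'\mathbf{e}\), where \(\mathbf{e}\) denotes the vector of structural errors, and then scale the sample moments by \(n\) and \(\sqrt{n}\) as appropriate.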
Additional Question 3:
It is not required to turn this problem in, but I recommend giving it a try.
For this problem, we will consider a smaller scale version of the problem in Additional Question 1. In particular, this question will focus on estimating group-time average treatment effects using GMM.
To start with, after loading the same data as in Additional Question 1, for this question, please run the following code:
# limit time periods to 2001, 2003, 2005, 2007, and groups to 2003,
# 2005, 2007, and untreated
data <- subset(data, (year <= 2007))
data <- subset(data, first.displaced %in% c(0, 2003, 2005, 2007))
To keep the notation simple, below 2001 will be \(t=1\), 2003 will be \(t=2\), 2005 will be \(t=3\), and 2007 will be \(t=4\). Also, for simplicity, you can treat \(p_g := \textrm{P}(G=g)\) as being known. We will make the parallel trends assumption that, for all \(t=2,\ldots,4\) and for all groups \(g\), \(\textrm{E}[\Delta Y_t(0) | G=g] = \textrm{E}[\Delta Y_t(0)]\).
(a) Please state all the non-redundant moment conditions that are implied by the parallel trends assumption. As a hint, I will give you two of them \[\begin{align*} \textrm{E}\left[\left(\frac{\mathbf{1}\{G=2\} }{p_2} - \frac{U}{p_U}\right) \Delta Y_2 \right] - ATT(2,2) &= 0 \\ \textrm{E}\left[\left(\frac{\mathbf{1}\{G=3\} }{p_3} - \frac{U}{p_U}\right) \Delta Y_2 \right] &= 0 \end{align*}\] where \(U\) is an indicator for being in the untreated group and \(p_U := \textrm{P}(U=1)\). The first condition says that \(ATT(2,2)\) is equal to the mean path of outcomes for group 2 relative to the untreated group, and the second condition says that the mean path of outcomes in period 2 should be the same for group 3 and the untreated group (because this is a pre-treatment period for group 3).
(b) Using the weighting matrix \(\mathbf{W} = \mathbf{I}_l\), where \(l\) is the number of moment conditions from part (a), compute estimates of \(ATT(g,t)\) for all post-treatment periods for groups 2, 3, and 4. How do these estimates compare to the corresponding ones from Additional Question 1?
(c) Now, compute the efficient GMM estimator of all of the \(ATT(g,t)\)’s mentioned in part (b). To estimate the efficient weighting matrix, you can use the preliminary estimates of the \(ATT(g,t)\)’s from part (b). How do these estimates compare to the ones from part (b)?
(d) Compute a J-test for over-identification. How do you interpret the results?
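The identity-weighted step, the efficient step, and the J-test follow the usual two-step GMM recipe. The sketch below shows that recipe on a simulated over-identified linear IV model, not on the \(ATT(g,t)\) moment conditions of this question. The data-generating process, the helper name `gmm`, and the use of an uncentered estimate of the moment variance are all assumptions of the example.

```r
set.seed(7)
n <- 5000
Z <- matrix(rnorm(n * 3), n, 3)       # l = 3 instruments
u <- rnorm(n)                         # source of endogeneity
x <- as.numeric(Z %*% c(1, 0.5, 0.5)) + u + rnorm(n)
y <- 1.5 * x + u + rnorm(n)           # true coefficient is 1.5
X <- matrix(x, n, 1)                  # k = 1 regressor

# Linear GMM estimator for a given weighting matrix W
gmm <- function(W) {
  A <- t(X) %*% Z %*% W %*% t(Z)
  solve(A %*% X, A %*% y)
}

# Step 1: identity weighting matrix gives preliminary estimates
b1 <- gmm(diag(ncol(Z)))
e1 <- as.numeric(y - X %*% b1)

# Step 2: estimate the moment variance E[ZZ'e^2] with step-1 residuals
# (uncentered version) and re-estimate with its inverse as the weight
Omega <- crossprod(Z * e1) / n
b2 <- as.numeric(gmm(solve(Omega)))

# J statistic: n * gbar' Omega^{-1} gbar, with gbar the sample moments
gbar <- crossprod(Z, y - X %*% b2) / n
J <- as.numeric(n * t(gbar) %*% solve(Omega) %*% gbar)
df <- ncol(Z) - ncol(X)               # number of over-identifying restrictions
p_value <- pchisq(J, df = df, lower.tail = FALSE)

c(b2 = b2, J = J, p_value = p_value)
```

Because the instruments are valid by construction in this simulation, the J statistic should behave like a draw from a chi-squared distribution with `df` degrees of freedom; a small p-value in real data would instead cast doubt on the moment conditions.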