Homework 5
\(\newcommand{\E}{\mathbb{E}} \newcommand{\var}{\mathrm{var}} \newcommand{\cov}{\mathrm{cov}} \newcommand{\Var}{\mathrm{var}} \newcommand{\Cov}{\mathrm{cov}} \newcommand{\Corr}{\mathrm{corr}} \newcommand{\corr}{\mathrm{corr}} \newcommand{\L}{\mathrm{L}} \renewcommand{\P}{\mathrm{P}} \newcommand{\T}{\mathrm{T}} \newcommand{\independent}{{\perp\!\!\!\perp}} \newcommand{\indicator}[1]{ \mathbf{1}\{#1\} } \newcommand{\N}{\mathcal{N}}\)
Due: At the start of class on Monday, April 14. Please turn in a hard copy.
Question 1: Hansen 25.15. For this question, you need to code the probit estimator yourself (compute both the parameter estimates and their standard errors); i.e., you cannot use built-in R functions such as glm. Hint: you can use R’s optimization routines such as optim. In addition to what is asked in 25.15:
Compare the estimates that you get to those coming from the glm function.
Calculate and report average partial effects for each regressor.
Derive the asymptotic variance for the average partial effects.
Hint: It is helpful to notice that \[\begin{align*} \sqrt{n}(\widehat{APE} - APE) &= \sqrt{n}\left( \frac{1}{n} \sum_{i=1}^n \phi(X_i'\hat{\beta}) \hat{\beta} - \E[\phi(X'\beta) \beta] \right) \\ &= \sqrt{n}\left( \frac{1}{n} \sum_{i=1}^n \phi(X_i'\hat{\beta}) \hat{\beta} - \frac{1}{n} \sum_{i=1}^n \phi(X_i'\hat{\beta}) \beta \right) \\ & + \sqrt{n} \left(\frac{1}{n} \sum_{i=1}^n \phi(X_i'\hat{\beta}) \beta - \frac{1}{n} \sum_{i=1}^n \phi(X_i'\beta) \beta\right) \\ & + \sqrt{n} \left( \frac{1}{n} \sum_{i=1}^n \phi(X_i'\beta) \beta - \E[\phi(X'\beta) \beta] \right) \end{align*}\] which holds just by adding and subtracting some terms, and, just to be clear, the notation above is for the entire vector of average partial effects for all regressors. Figuring out the asymptotic distributions for the first and last lines is not too hard, but the middle expression requires using some kind of mean value theorem type of argument (as we have done before in the context of the delta method).
Based on your result in part (c), compute the standard error of each average partial effect (again, use your own code to compute these).
Use the bootstrap to calculate standard errors for the average partial effects that you calculated in part (b). Compare these standard errors to the analytical standard errors that you calculated in part (d).
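As a starting point, here is a minimal sketch of how the probit log-likelihood can be maximized with optim. The data here are simulated with illustrative names; this is not the assignment's dataset, and the standard errors shown use only the inverse Hessian (the asymptotic variance of the average partial effects in part (c) still needs to be derived).

```r
# Minimal probit-by-hand sketch on simulated data (illustrative, not the
# assignment's dataset).
set.seed(1)
n <- 1000
x <- cbind(1, rnorm(n))               # design matrix: intercept + one regressor
beta_true <- c(0.5, -1)
y <- as.numeric(drop(x %*% beta_true) + rnorm(n) > 0)

# negative log-likelihood; pnorm(..., log.p = TRUE) avoids log(0) underflow
neg_loglik <- function(b) {
  xb <- drop(x %*% b)
  -sum(y * pnorm(xb, log.p = TRUE) + (1 - y) * pnorm(-xb, log.p = TRUE))
}

fit  <- optim(c(0, 0), neg_loglik, method = "BFGS", hessian = TRUE)
bhat <- fit$par
se   <- sqrt(diag(solve(fit$hessian)))  # SEs from the inverse numerical Hessian

# average partial effects: sample mean of phi(x'bhat), scaled by bhat
ape <- mean(dnorm(drop(x %*% bhat))) * bhat
```

To check the comparison asked for above, the same model can be fit with `glm(y ~ x - 1, family = binomial(link = "probit"))`; the coefficients should agree to several decimal places.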
Question 2: For this problem, we will be interested in computing an estimate of the \(ATT\) of a job training program. You can download the data here and download a description here. The outcome of interest is re78, the treatment is train, and suppose that unconfoundedness holds after conditioning on age, educ, black, hisp, married, re75, and unem75.
Given our discussion in class, we know that if we believe unconfoundedness and that untreated potential outcomes are linear in covariates, then we have that
\[\begin{align*} ATT = \E[Y|D=1] - \E[X'|D=1]\beta_0 \end{align*}\]
where \(\beta_0\) can be estimated from the regression of \(Y\) on \(X\) using untreated observations. Estimate \(ATT\) based on the above expression for it and report your result.
Show that \(\sqrt{n}(\widehat{ATT} - ATT) \xrightarrow{d} \N(0,V)\) and provide an expression for \(V\). Based on this result, provide standard errors for your estimate of \(ATT\).
Use the bootstrap to compute standard errors for your estimate of \(ATT\). How do these compare to the standard errors that you reported previously?
Run a regression of \(Y\) on \(D\) and \(X\) (where \(X\) includes the same additional variables as above). Compare the coefficient on \(D\) to the estimate of \(ATT\). How similar are they? What about their standard errors? Do you have any comment on these results?
Calculate an estimate of \(ATT\) using propensity score re-weighting (as we discussed in class). You can estimate the propensity score model using logit (it is ok to use the glm function for this). Compute standard errors using the bootstrap and compare your results to the previous estimates.
Calculate an estimate of \(ATT\) using the doubly robust AIPW estimator that we discussed in class. Again, you can estimate the propensity score using logit from the glm function. Compute standard errors using the bootstrap and compare your results to the previous estimates.
Finally, calculate an estimate of \(ATT\) using machine learning, via the algorithm we discussed in class. In particular, I’d like for you to use the ranger package to estimate the propensity score and the outcome regression model. Compute standard errors using the bootstrap and compare your results to the previous estimates.
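To fix ideas, here is a minimal sketch of the re-weighting and AIPW estimators of the \(ATT\) on simulated data. All variable names are illustrative (this is not the job training dataset), and the propensity score is fit by logit via glm, as the question allows; the ranger-based version replaces the glm and lm fits with random forest predictions.

```r
# Minimal IPW and doubly robust (AIPW) ATT sketch on simulated data
# (illustrative names, not the job training dataset).
set.seed(2)
n <- 2000
x <- rnorm(n)
d <- rbinom(n, 1, plogis(0.5 * x))   # treatment; true propensity is plogis(0.5x)
y <- 1 + d + x + rnorm(n)            # outcome; true ATT = 1
dat <- data.frame(y, d, x)

# estimated propensity score from a logit
ps <- fitted(glm(d ~ x, data = dat, family = binomial))

# IPW estimator of the ATT: re-weight untreated units by ps/(1 - ps)
w <- ps * (1 - d) / (1 - ps)
att_ipw <- mean(d * y) / mean(d) - sum(w * y) / sum(w)

# outcome regression fit on untreated units only, predicted for everyone
mu0 <- predict(lm(y ~ x, data = subset(dat, d == 0)), newdata = dat)

# AIPW estimator of the ATT: regression adjustment plus re-weighted residuals
att_aipw <- sum((d - w) * (y - mu0)) / sum(d)
```

For the bootstrap standard errors asked for above, resample rows of `dat` with replacement, recompute the estimator on each resample, and take the standard deviation across resamples.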
Question 3: Prove the following result from the course notes about interpreting \(\alpha\) from the regression \(Y = \alpha D + X'\beta + e\). You can use the decompositions that we discussed in class as a starting point. Prove the result for each case mentioned below separately, and also show the result about the weights in case (i).
Suppose that unconfoundedness and overlap both hold. In addition, suppose that either (i) \(p(X) = \L(D|X)\) or (ii) \(\E[Y|X,D=0] = \L_0(Y|X)\); then \[\begin{align*} \alpha &= \E\left[w(D,X) CATE(X) \right] \end{align*}\] where the weights \(w(D,X)\) are as defined in the course notes and have mean 1. In addition, if condition (i) holds (that \(p(X) = \L(D|X)\)), then the weights are non-negative.