Homework 5 Solutions
Chapter 18, Coding Question 1
load("rand_hie.RData")
rand_hie_subset <- subset(rand_hie, plan_type %in% c("Catastrophic", "Free"))
rand_hie_subset$free <- 1 * (rand_hie_subset$plan_type == "Free")
Part (a)
Yes, you can estimate the ATT for the Free plan relative to Catastrophic just by running a regression of total_med_expenditure on free. This works for two reasons. First, plan type is randomly assigned in this problem, so a simple comparison of average outcomes across the two groups is not confounded. Second, in a regression of an outcome on a single binary variable, the coefficient on that variable equals the difference in average outcomes between units with free=1 and units with free=0, which is exactly \(\widehat{ATT}\).
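The claim that the coefficient on a single binary regressor equals the difference in group means can be checked numerically. The sketch below uses simulated data only (the sample size, the 400 and 150 values, and the variable names are made up for illustration and are not the RAND HIE data):

```r
# Simulated data only; these are NOT the RAND HIE numbers.
set.seed(1)
n <- 1000
free <- rbinom(n, 1, 0.5)                     # randomly assigned binary treatment
y <- 400 + 150 * free + rnorm(n, sd = 100)    # hypothetical outcome

# Regression of the outcome on the binary variable
reg <- lm(y ~ free)
slope <- unname(coef(reg)["free"])
intercept <- unname(coef(reg)["(Intercept)"])

# Difference in average outcomes between the two groups
diff_means <- mean(y[free == 1]) - mean(y[free == 0])

# The slope matches the difference in means, and the intercept
# matches the average outcome in the free == 0 group
all.equal(slope, diff_means)
all.equal(intercept, mean(y[free == 0]))
```

Both comparisons should return TRUE (up to numerical precision), which is why the regression in part (b) can be read as a simple comparison of group averages.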
Part (b)
exp_reg <- lm(total_med_expenditure ~ free, data = rand_hie_subset)
summary(exp_reg)
Call:
lm(formula = total_med_expenditure ~ free, data = rand_hie_subset)
Residuals:
Min 1Q Median 3Q Max
-532.9 -392.8 -299.4 38.4 17987.6
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 392.77 40.19 9.773 <2e-16 ***
free 140.12 49.74 2.817 0.0049 **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 993.4 on 1758 degrees of freedom
Multiple R-squared: 0.004493, Adjusted R-squared: 0.003927
F-statistic: 7.935 on 1 and 1758 DF, p-value: 0.004903
Relative to having only “catastrophic” insurance coverage, total medical expenditure (note that this includes both what the person paid themselves and what their insurance paid) is estimated to be substantially higher, on average, for individuals assigned to “free” insurance (i.e., those who paid nothing for medical care). In particular, we estimate that individuals with “free” insurance have about $140 more in total medical expenditure, on average, than those with only catastrophic coverage. In my view, this difference is large in magnitude: average expenditure is $393 for those with catastrophic coverage, which implies that those with free coverage have about 36% higher total medical expenditures. Since individuals were randomly assigned to a type of plan, it seems reasonable to interpret this result as the causal effect of plan type on total medical spending.
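As a quick check of the 36% figure, divide the coefficient on free by the intercept (the catastrophic-plan average) from the output above. Here \(\hat{\beta}_0\) and \(\hat{\beta}_1\) are just labels I am using for the intercept and slope estimates:
\[ \frac{\hat{\beta}_1}{\hat{\beta}_0} = \frac{140.12}{392.77} \approx 0.357 \]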
Part (c)
visits_reg <- lm(face_to_face_visits ~ plan_type, data = rand_hie_subset)
summary(visits_reg)
Call:
lm(formula = face_to_face_visits ~ plan_type, data = rand_hie_subset)
Residuals:
Min 1Q Median 3Q Max
-4.928 -3.192 -1.792 0.808 91.672
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 3.1917 0.2562 12.457 < 2e-16 ***
plan_typeFree 1.7361 0.3171 5.475 5.01e-08 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 6.333 on 1758 degrees of freedom
Multiple R-squared: 0.01676, Adjusted R-squared: 0.01621
F-statistic: 29.98 on 1 and 1758 DF, p-value: 5.007e-08
These results are broadly similar to the ones before. Individuals assigned to the “free” insurance plan had, on average, 1.7 more face-to-face visits with doctors. This is 54% more than individuals randomly assigned to the “catastrophic” insurance plan. As in part (b), it seems reasonable to interpret these as causal effects due to the random assignment.
Part (d)
health_reg <- lm(health_index ~ plan_type, data = rand_hie_subset)
summary(health_reg)
Call:
lm(formula = health_index ~ plan_type, data = rand_hie_subset)
Residuals:
Min 1Q Median 3Q Max
-60.525 -9.784 1.516 10.616 32.216
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 68.5247 0.6190 110.698 <2e-16 ***
plan_typeFree -0.7407 0.7661 -0.967 0.334
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 15.3 on 1758 degrees of freedom
Multiple R-squared: 0.0005315, Adjusted R-squared: -3.708e-05
F-statistic: 0.9348 on 1 and 1758 DF, p-value: 0.3338
These results are different from the previous ones. Although individuals assigned to the “free” insurance plan appear to be utilizing more medical care, it does not appear to be improving their health (at least according to this measure of an individual’s health). The estimate here is not statistically significant and is quantitatively small; for example, we estimate that individuals in the “free” insurance plan have about a 1% lower health index, on average, than those in the “catastrophic” plan.
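To get a sense of the range of effects consistent with the data, an approximate 95% confidence interval for the coefficient on plan_typeFree can be backed out from the reported estimate and standard error. The critical value 1.96 is a normal approximation that I am supplying; it is not printed in the output above:
\[ -0.7407 \pm 1.96 \times 0.7661 \approx [-2.24, \ 0.76] \]
Relative to the catastrophic-plan average of 68.5, this interval runs from roughly 3.3% lower to 1.1% higher, so even the endpoints correspond to modest changes in the health index.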
Part (e)
Parts (b)-(d) seem to suggest that “free” insurance increased medical care usage without much of an effect on health (at least in the way that we were able to measure health).
Chapter 18, Coding Question 2
Part (a)
load("Fertility.RData")
head(Fertility)
morekids boy1st boy2nd samesex agem1 black hispan othrace weeksm1
1 0 1 0 0 27 0 0 0 0
2 0 0 1 0 30 0 0 0 30
3 0 1 0 0 27 0 0 0 0
4 0 1 0 0 35 1 0 0 0
5 0 0 0 1 30 0 0 0 22
6 0 1 0 0 26 0 0 0 40
weeks_reg <- lm(weeksm1 ~ morekids, data = Fertility)
summary(weeks_reg)
Call:
lm(formula = weeksm1 ~ morekids, data = Fertility)
Residuals:
Min 1Q Median 3Q Max
-21.07 -21.07 -13.68 24.93 36.32
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 21.06843 0.05466 385.4 <2e-16 ***
morekids -5.38700 0.08861 -60.8 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 21.71 on 254652 degrees of freedom
Multiple R-squared: 0.01431, Adjusted R-squared: 0.0143
F-statistic: 3696 on 1 and 254652 DF, p-value: < 2.2e-16
For this regression to deliver an estimate of the causal effect of having more than two kids on weeks worked per year, we would need the following unconfoundedness assumption to hold: \[ \big( Y(1), Y(0) \big) \perp \!\!\! \perp D \]
where \(Y\) is the number of weeks worked per year and \(D\) indicates whether or not a mother has more than two kids. However, there are probably many reasons why this assumption could fail. Perhaps the most likely one is that mothers' decisions about having more children are related to unobserved factors that also affect their labor supply decisions. For example, mothers with a stronger preference for working may choose to have fewer children. Alternatively, there could be unobserved factors related to family support networks, where mothers with more family support may be more likely to have more children and also more likely to work more weeks per year. Overall, it seems quite likely to me that this unconfoundedness assumption does not hold.
If this assumption were to hold, we would interpret the result as indicating that having more than two children causes mothers to work about 5.4 fewer weeks per year, on average, than they would have worked if they had only two children.
Part (b)
Relevance - Relevance requires that the instrument (the first two children having the same sex) actually affects the probability of mothers being treated (having more than two children). If we maintain the independence assumption, we can test this assumption by just checking whether women whose first two children have the same sex are actually more likely to have more children:
rel_reg <- lm(morekids ~ samesex, data=Fertility)
summary(rel_reg)
Call:
lm(formula = morekids ~ samesex, data = Fertility)
Residuals:
Min 1Q Median 3Q Max
-0.4139 -0.4139 -0.3464 0.5860 0.6536
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.346425 0.001365 253.79 <2e-16 ***
samesex 0.067525 0.001920 35.17 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.4844 on 254652 degrees of freedom
Multiple R-squared: 0.004835, Adjusted R-squared: 0.004831
F-statistic: 1237 on 1 and 254652 DF, p-value: < 2.2e-16
which indeed suggests that having the first two children with the same sex increases the probability of having more kids by about 6.8 percentage points (from about 34.6% to about 41.4%).
Independence - Independence requires that the instrument (the first two children having the same sex) be independent of potential outcomes and potential treatment. I am not an expert, but my understanding is the sex of a baby is essentially random, and this would imply that independence holds.
Exclusion Restriction - The exclusion restriction requires that the only way that the instrument (the first two children having the same sex) affects the outcome (weeks worked per year) is through the treatment (having more children). Perhaps it is possible to come up with some argument against the exclusion restriction, but, to me, it would seem surprising if the number of weeks that a woman works depended directly on the sex composition of her children.
Monotonicity - Monotonicity rules out there being any defiers. In the context of this problem, an always taker is a mother who would have more than two children regardless of the sex composition of her first two children; a never taker is a mother who is going to have exactly two children regardless of the sex composition of her first two children; a complier is a mother who will have more than two children if her first two children have the same sex but will have exactly two children if her first two children include a boy and a girl; and a defier is a mother who would have more than two children if she had a boy and a girl but would only have two children if her first two children have the same sex. In my view, this assumption is probably the one that is most up for debate. Monotonicity would rule out things like a family wanting to have a set of siblings with the same sex. While this is probably not a “typical” preference for mothers in the United States, monotonicity requires that no one has this type of preference (i.e., that there are no defiers at all), and it certainly seems possible that some people could have these types of preferences.
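If independence and monotonicity both hold, the relevance regression above also tells us the share of each type in the population. Writing \(Z\) for the samesex indicator (my notation, not used elsewhere in these solutions) and \(D\) for morekids, a standard result gives:
\[ P(\text{complier}) = \mathbb{E}[D|Z=1] - \mathbb{E}[D|Z=0] \approx 0.068 \]
\[ P(\text{always taker}) = \mathbb{E}[D|Z=0] \approx 0.346, \qquad P(\text{never taker}) = 1 - \mathbb{E}[D|Z=1] \approx 1 - 0.414 = 0.586 \]
So, under these assumptions, only about 6.8% of mothers are compliers, which is worth keeping in mind when interpreting any estimate that applies to compliers only.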
Part (c)
library(estimatr)
weeks_iv <- iv_robust(weeksm1 ~ morekids | samesex, data=Fertility)
summary(weeks_iv)
Call:
iv_robust(formula = weeksm1 ~ morekids | samesex, data = Fertility)
Standard error type: HC2
Coefficients:
Estimate Std. Error t value Pr(>|t|) CI Lower CI Upper DF
(Intercept) 21.421 0.4873 43.963 0.000e+00 20.466 22.376 254652
morekids -6.314 1.2747 -4.953 7.308e-07 -8.812 -3.815 254652
Multiple R-squared: 0.01388 , Adjusted R-squared: 0.01388
F-statistic: 24.53 on 1 and 254652 DF, p-value: 7.308e-07
When we use samesex as an instrument, we get fairly similar results to the ones coming from the regression. In particular, we estimate that having more than two kids reduces the number of weeks worked per year by 6.3, on average (this is a LATE, so it is the average treatment effect for compliers).
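With a single binary instrument, the IV estimate can be written as the reduced-form effect of the instrument on the outcome divided by the first-stage effect of the instrument on the treatment. The sketch below illustrates this with simulated data and base R only. Everything in it is hypothetical: the variable names, the strength of the instrument (much stronger than samesex), the constant effect of -6, and the unobserved confounder u are all assumptions made for illustration, not features of the Fertility data.

```r
# Simulated data only; NOT the Fertility data.
set.seed(2)
n <- 100000
z <- rbinom(n, 1, 0.5)                        # binary instrument (stand-in for samesex)
u <- rnorm(n)                                 # unobserved confounder
d <- as.numeric(z + u + rnorm(n) > 0.5)       # treatment: depends on z and on u
y <- 20 - 6 * d - 4 * u + rnorm(n, sd = 5)    # outcome: true effect of d is -6

# Naive regression is confounded because u drives both d and y
ols <- unname(coef(lm(y ~ d))["d"])

# IV as a ratio: reduced form divided by first stage
first_stage  <- unname(coef(lm(d ~ z))["z"])
reduced_form <- unname(coef(lm(y ~ z))["z"])
wald <- reduced_form / first_stage

# Equivalent covariance form of the same estimator
iv_cov <- cov(y, z) / cov(d, z)

c(ols = ols, wald = wald, iv_cov = iv_cov)
```

In this simulation the two IV calculations agree with each other and land near the true effect, while the naive regression is pulled away from it by the confounder. In the actual homework data, the regression and IV estimates turned out to be fairly close, so this sketch is about the mechanics rather than the empirical result.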
Chapter 18, Coding Question 4
load("house.RData")
# if Democrat won first election, then we have a Democrat incumbent
house$D_incumbent <- 1*(house$Dmargin1 > 0)
Part (a)
The incumbent variable is just an indicator for whether or not a Democrat won the previous election. But there are many reasons why a Democrat winning the previous election would be related to Democratic vote share in the current election that are distinct from effects of incumbency. Primarily, voter preferences for different political parties are likely very persistent, i.e., locations that voted for a Democrat in the past are probably more likely to vote for a Democrat in the current election, whether or not that candidate is an incumbent.
Here is what we get if we actually run this regression.
inc_reg <- lm(Dmargin2 ~ D_incumbent, data=house)
summary(inc_reg)
Call:
lm(formula = Dmargin2 ~ D_incumbent, data = house)
Residuals:
Min 1Q Median 3Q Max
-0.69788 -0.10061 -0.00360 0.09631 0.65348
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.346522 0.003201 108.25 <2e-16 ***
D_incumbent 0.351358 0.004195 83.75 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.1676 on 6556 degrees of freedom
Multiple R-squared: 0.5169, Adjusted R-squared: 0.5168
F-statistic: 7014 on 1 and 6556 DF, p-value: < 2.2e-16
If we interpret the coefficient on D_incumbent causally, it would say that being an incumbent increased the vote share by 35 percentage points. This is a huge effect and, as discussed above, this regression should almost certainly not be interpreted as providing an estimate of the causal effect of incumbency on vote share.
Part (b)
We could use this data to focus on elections that were close, where a Democrat either won by a small margin or lost by a small margin. Then, we can compare the outcome in elections with a Democratic incumbent who “just barely” won the previous election to the outcome in elections without a Democratic incumbent because the Democratic candidate “just barely” lost the previous election.
Part (c)
The continuity assumption says that \(\mathbb{E}[Y(1)|R=r]\) and \(\mathbb{E}[Y(0)|R=r]\) are both continuous at \(r=c\), where \(c\) is the cutoff. In this problem, \(Y(1)\) is the Democratic vote share in the current election if there is a Democratic incumbent, \(Y(0)\) is the Democratic vote share in the current election if there is not a Democratic incumbent, \(R\) is the Democratic margin of victory in the previous election, and \(c=0\). Thus, the continuity assumption requires that (1) the expected Democratic vote share in the current election if there is a Democratic incumbent does not jump (i.e., change discontinuously) at the cutoff, and (2) the expected Democratic vote share in the current election if there is not a Democratic incumbent does not jump at the cutoff. In my view, these assumptions seem plausible.
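To spell out why continuity is the key assumption, note that here the treatment is \(D = 1\{R > c\}\), so we observe \(Y = Y(1)\) just above the cutoff and \(Y = Y(0)\) just below it. Under continuity, the standard sharp regression discontinuity argument gives:
\[ \mathbb{E}[Y(1) - Y(0) | R=c] = \lim_{r \downarrow c} \mathbb{E}[Y|R=r] - \lim_{r \uparrow c} \mathbb{E}[Y|R=r] \]
The right-hand side only involves observed outcomes, so the jump in average outcomes at the cutoff identifies the average effect of incumbency for elections right at the cutoff.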
Part (d)
We can estimate the causal effect of incumbency using the following regression:
\[ Y = \beta_0 + \beta_1 Dincumbent + \beta_2 Dmargin1 + \beta_3 Dincumbent * Dmargin1 + U \]
where \(Y\) is the Democratic vote share in the current election, \(Dincumbent\) is an indicator for whether or not there is a Democratic incumbent, and \(Dmargin1\) is the Democratic margin of victory in the previous election, and where we will only use data from elections where the previous election was close (below I use a bandwidth of 10 percentage points, but you could make a different choice here). We will be interested in \(\beta_1\) from the regression, which captures the discontinuous jump in expected Democratic vote share at the cutoff for previous election margins of victory.
# bandwidth => use election margins within 10 percentage points
h <- 0.1
rd_data <- subset(house, abs(Dmargin1) <= h)
rd_reg <- lm(Dmargin2 ~ D_incumbent + D_incumbent*Dmargin1, data=rd_data)
summary(rd_reg)
Call:
lm(formula = Dmargin2 ~ D_incumbent + D_incumbent * Dmargin1,
data = rd_data)
Residuals:
Min 1Q Median 3Q Max
-0.56009 -0.05455 -0.00635 0.04262 0.47210
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.464015 0.009389 49.419 < 2e-16 ***
D_incumbent 0.060579 0.012994 4.662 3.48e-06 ***
Dmargin1 0.644045 0.163125 3.948 8.33e-05 ***
D_incumbent:Dmargin1 0.004893 0.225778 0.022 0.983
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.111 on 1205 degrees of freedom
Multiple R-squared: 0.2575, Adjusted R-squared: 0.2557
F-statistic: 139.3 on 3 and 1205 DF, p-value: < 2.2e-16
Using this approach, we estimate that having a Democratic incumbent increases the Democratic vote share by about 6 percentage points, on average, and that the effect is statistically significant. Notice also that this is a much smaller (and more plausible) effect than the one we got in part (a) without using the RDD approach.
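Since the bandwidth is a choice, it can be useful to see how the estimated jump changes as the bandwidth changes. The sketch below does this on simulated data using base R only. The data-generating process (a true jump of 0.08 and a linear relationship with the running variable), the variable names, and the particular bandwidths are all assumptions made for illustration; none of it comes from house.RData.

```r
# Simulated data only; NOT the house election data.
set.seed(3)
n <- 20000
margin1 <- runif(n, -0.5, 0.5)                # running variable, cutoff at 0
incumbent <- 1 * (margin1 > 0)                # treatment switches on at the cutoff
margin2 <- 0.45 + 0.08 * incumbent + 0.6 * margin1 + rnorm(n, sd = 0.1)
sim <- data.frame(margin1, incumbent, margin2)

# Re-estimate the same local linear regression for several bandwidths
bandwidths <- c(0.05, 0.10, 0.20)
rd_est <- sapply(bandwidths, function(h) {
  local_data <- sim[abs(sim$margin1) <= h, ]  # keep only close elections
  fit <- lm(margin2 ~ incumbent * margin1, data = local_data)
  unname(coef(fit)["incumbent"])              # estimated jump at the cutoff
})
names(rd_est) <- paste0("h=", bandwidths)
rd_est
```

If the estimates are reasonably stable across bandwidths, that is reassuring; narrower bandwidths use fewer observations and so tend to be noisier, while wider ones lean more heavily on the linear functional form.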