Exercise 1 Solutions

This exercise will involve estimating causal effect parameters using a difference-in-differences identification strategy that involves conditioning on covariates in the parallel trends assumption and possibly allows for anticipation effects.

In particular, we will use data from the National Longitudinal Study of Youth to learn about causal effects of job displacement (where job displacement roughly means “losing your job through no fault of your own” — a mass layoff is a main example).

To start with, load the data from the file job_displacement_data.RData by running

load("job_displacement_data.RData")

which will load a data.frame called job_displacement_data. This is what the data looks like

head(job_displacement_data)
       id year group income female white occ_score
1 7900002 1984     0  31130      1     1         4
2 7900002 1985     0  32200      1     1         3
3 7900002 1986     0  35520      1     1         4
4 7900002 1987     0  43600      1     1         4
5 7900002 1988     0  39900      1     1         4
6 7900002 1990     0  38200      1     1         4

You can see that the data contains the following columns:

  • id - an individual identifier
  • year - the year for this observation
  • group - the year that person lost his/her job. group=0 for those that do not lose a job in any period being considered.
  • income - a person’s wage and salary income in this year
  • female - 1 for females, 0 for males
  • white - 1 for white, 0 for non-white

For the results below, we will mainly use the did package which you can install using install.packages("did"), and you can load it using

library(did)

Question 1

We will start by computing group-time average treatment effects without including any covariates in the parallel trends assumption.

  1. Use the did package to compute all available group-time average treatment effects.
Solutions:
no_covs <- att_gt(yname="income",
                  tname="year",
                  idname="id",
                  gname="group",
                  data=job_displacement_data)
Warning in pre_process_did(yname = yname, tname = tname, idname = idname, :
Dropped 26 units that were already treated in the first period.
summary(no_covs)

Call:
att_gt(yname = "income", tname = "year", idname = "id", gname = "group", 
    data = job_displacement_data)

Reference: Callaway, Brantly and Pedro H.C. Sant'Anna.  "Difference-in-Differences with Multiple Time Periods." Journal of Econometrics, Vol. 225, No. 2, pp. 200-230, 2021. <https://doi.org/10.1016/j.jeconom.2020.12.001>, <https://arxiv.org/abs/1803.09015> 

Group-Time Average Treatment Effects:
 Group Time    ATT(g,t) Std. Error [95% Simult.  Conf. Band]  
  1985 1985  -9455.7583   3861.042   -19528.8115    617.2948  
  1985 1986 -14981.1547   4673.005   -27172.5348  -2789.7746 *
  1985 1987  -6129.2132   4200.985   -17089.1417   4830.7153  
  1985 1988  -4815.9179   4980.184   -17808.6958   8176.8600  
  1985 1990  -8011.9173   5466.432   -22273.2656   6249.4310  
  1985 1991  -8164.4924   6444.642   -24977.8874   8648.9026  
  1985 1992  -6325.8880   5779.045   -21402.8090   8751.0330  
  1985 1993  -9669.5840   5527.646   -24090.6329   4751.4649  
  1986 1985  -1801.9373   2599.738    -8584.3796   4980.5051  
  1986 1986  -1919.4474   3510.325   -11077.5154   7238.6207  
  1986 1987  -2596.8189   4846.428   -15240.6413  10047.0034  
  1986 1988  -2081.7535   6986.801   -20309.5851  16146.0782  
  1986 1990  -6064.0942   6633.956   -23371.3876  11243.1992  
  1986 1991  -5903.9636   6766.574   -23557.2437  11749.3164  
  1986 1992  -6804.4833   7425.492   -26176.8110  12567.8445  
  1986 1993  -1801.5755   7455.891   -21253.2138  17650.0628  
  1987 1985   4518.5745   5248.997    -9175.5067  18212.6557  
  1987 1986  -8012.4879   4547.337   -19876.0119   3851.0360  
  1987 1987   7048.8565   6692.480   -10411.1225  24508.8355  
  1987 1988   4489.4666   6807.248   -13269.9275  22248.8608  
  1987 1990   8004.1361   6737.479    -9573.2386  25581.5108  
  1987 1991   9475.0656   7284.194    -9528.6318  28478.7630  
  1987 1992   8533.5413  10080.963   -17766.6333  34833.7160  
  1987 1993   7881.3931   7330.973   -11244.3452  27007.1314  
  1988 1985  -8350.7706   4631.991   -20435.1498   3733.6085  
  1988 1986  -3420.8529   3637.104   -12909.6754   6067.9695  
  1988 1987  -3617.6742   3493.304   -12731.3380   5495.9897  
  1988 1988  -1173.8167   3174.614    -9456.0523   7108.4190  
  1988 1990    280.6263   5825.393   -14917.2132  15478.4658  
  1988 1991   6099.7271   4046.620    -4457.4799  16656.9341  
  1988 1992  13737.8166  12846.521   -19777.4078  47253.0410  
  1988 1993   1688.7819   8220.831   -19758.5022  23136.0659  
  1990 1985  -5281.5363   3433.800   -14239.9612   3676.8886  
  1990 1986   3654.1728   2560.200    -3025.1209  10333.4664  
  1990 1987   5934.8952   3040.891    -1998.4696  13868.2599  
  1990 1988   1034.1988   3161.435    -7213.6530   9282.0505  
  1990 1990  -4343.9488  12267.489   -36348.5390  27660.6414  
  1990 1991 -21910.2102   4908.635   -34716.3237  -9104.0966 *
  1990 1992 -15365.9271   3994.383   -25786.8532  -4945.0010 *
  1990 1993 -16411.1053   6244.878   -32703.3363   -118.8743 *
  1991 1985    891.2874   3590.076    -8474.8439  10257.4186  
  1991 1986  -2816.6357   3576.210   -12146.5924   6513.3211  
  1991 1987  -1340.0549   3135.896    -9521.2781   6841.1683  
  1991 1988  -7025.0387   3718.362   -16725.8557   2675.7782  
  1991 1990   2568.6223   6028.719   -13159.6720  18296.9167  
  1991 1991 -12150.6450   4287.933   -23337.4115   -963.8784 *
  1991 1992   1433.9979   4595.143   -10554.2472  13422.2430  
  1991 1993  -2679.8275   7112.789   -21236.3485  15876.6935  
  1992 1985 -12110.0572   6180.931   -28235.4586   4015.3442  
  1992 1986  -3287.5606   2537.390    -9907.3443   3332.2230  
  1992 1987   2300.0285   3528.032    -6904.2362  11504.2931  
  1992 1988  -7273.9345   2714.144   -14354.8502   -193.0189 *
  1992 1990   7351.4926   4801.529    -5175.1936  19878.1788  
  1992 1991 -10031.7028   7619.266   -29909.5691   9846.1636  
  1992 1992  -8990.8504   4258.882   -20101.8273   2120.1264  
  1992 1993  -8662.6119  14973.627   -47727.2335  30402.0097  
  1993 1985  -7424.6641   5468.588   -21691.6364   6842.3081  
  1993 1986    677.9060   3381.472    -8143.9982   9499.8102  
  1993 1987   1424.1385   3835.053    -8581.1117  11429.3886  
  1993 1988   4778.2556   1746.084      222.9054   9333.6057 *
  1993 1990  -3797.3928   4628.671   -15873.1102   8278.3246  
  1993 1991   3664.8825   6840.202   -14180.4872  21510.2523  
  1993 1992  -4108.9169   5801.940   -19245.5705  11027.7368  
  1993 1993 -22828.3617   6567.402   -39962.0233  -5694.7001 *
---
Signif. codes: `*' confidence band does not cover 0

P-value for pre-test of parallel trends assumption:  0
Control Group:  Never Treated,  Anticipation Periods:  0
Estimation Method:  Doubly Robust
  1. Bonus Question Try to manually calculate \(ATT(g=1992, t=1992)\). Can you calculate exactly the same number as in part (a)?
Solutions:
  mean(subset(job_displacement_data, group==1992 & year==1992)$income) -
    mean(subset(job_displacement_data, group==1992 & year==1991)$income) - 
    ( mean(subset(job_displacement_data, group==0 & year==1992)$income) -
    mean(subset(job_displacement_data, group==0 & year==1991)$income) )
[1] -8990.85
  1. Aggregate the group-time average treatment effects into an event study and plot the results. What do you notice? Is there evidence against parallel trends?
Solutions:
  no_covs_es <- aggte(no_covs, type="dynamic")
  ggdid(no_covs_es)

  1. Aggregate the group-time average treatment effects into a single overall treatment effect. How do you interpret the results?
Solutions:
  no_covs_overall <- aggte(no_covs, type="group")
  summary(no_covs_overall)

Call:
aggte(MP = no_covs, type = "group")

Reference: Callaway, Brantly and Pedro H.C. Sant'Anna.  "Difference-in-Differences with Multiple Time Periods." Journal of Econometrics, Vol. 225, No. 2, pp. 200-230, 2021. <https://doi.org/10.1016/j.jeconom.2020.12.001>, <https://arxiv.org/abs/1803.09015> 


Overall summary of ATT's based on group/cohort aggregation:  
       ATT    Std. Error     [ 95%  Conf. Int.]  
 -5631.049      2009.565  -9569.723   -1692.374 *


Group Effects:
 Group   Estimate Std. Error [95% Simult.  Conf. Band]  
  1985  -8444.241   4700.126    -19889.596    3001.115  
  1986  -3881.734   6126.311    -18800.018   11036.550  
  1987   7572.077   6619.391     -8546.914   23691.067  
  1988   4126.627   4890.516     -7782.349   16035.603  
  1990 -14507.798   4355.249    -25113.338   -3902.258 *
  1991  -4465.492   4715.478    -15948.230    7017.247  
  1992  -8826.731   8360.343    -29185.143   11531.681  
  1993 -22828.362   6462.085    -38564.294   -7092.429 *
---
Signif. codes: `*' confidence band does not cover 0

Control Group:  Never Treated,  Anticipation Periods:  0
Estimation Method:  Doubly Robust

Question 2

A major issue in the job displacement literature concerns a version of anticipation. In particular, there is some empirical evidence that earnings of displaced workers start to decline before they are actually displaced (a rough explanation is that firms where there are mass layoffs typically “struggle” in the time period before the mass layoff actually takes place and this can lead to slower income growth for workers at those firms).

  1. Is there evidence of anticipation in your results from Question 1?
Solutions:

There is a moderate amount of evidence for anticipation in the previous results. It hinges on the estimate for event-time equal to -1. It is negative which is in line with the discussion about anticipation above, but it is only marginally statistically significant.

  1. Repeat parts (a)-(d) of Question 1 allowing for one year of anticipation.
Solutions:
  # part a
  ant_res <- att_gt(yname="income",
                    tname="year",
                    idname="id",
                    gname="group",
                    data=job_displacement_data,
                    anticipation=1)
Warning in pre_process_did(yname = yname, tname = tname, idname = idname, :
Dropped 26 units that were already treated in the first period.
  summary(ant_res)

Call:
att_gt(yname = "income", tname = "year", idname = "id", gname = "group", 
    data = job_displacement_data, anticipation = 1)

Reference: Callaway, Brantly and Pedro H.C. Sant'Anna.  "Difference-in-Differences with Multiple Time Periods." Journal of Econometrics, Vol. 225, No. 2, pp. 200-230, 2021. <https://doi.org/10.1016/j.jeconom.2020.12.001>, <https://arxiv.org/abs/1803.09015> 

Group-Time Average Treatment Effects:
 Group Time    ATT(g,t) Std. Error [95% Simult.  Conf. Band]  
  1986 1985  -1801.9373   2637.305    -8447.6078   4843.7333  
  1986 1986  -3721.3846   3692.454   -13025.8967   5583.1275  
  1986 1987  -4398.7562   4035.808   -14568.4747   5770.9624  
  1986 1988  -3883.6907   6377.636   -19954.5162  12187.1347  
  1986 1990  -7866.0314   6149.536   -23362.0739   7630.0110  
  1986 1991  -7705.9009   6433.556   -23917.6395   8505.8378  
  1986 1992  -8606.4205   6922.425   -26050.0443   8837.2032  
  1986 1993  -3603.5128   6755.388   -20626.2250  13419.1995  
  1987 1985   4518.5745   5460.994    -9242.4313  18279.5803  
  1987 1986  -8012.4879   4701.940   -19860.7751   3835.7992  
  1987 1987   -963.6314   7276.588   -19299.7028  17372.4399  
  1987 1988  -3523.0213   7994.976   -23669.3383  16623.2957  
  1987 1990     -8.3518   6296.508   -15874.7459  15858.0423  
  1987 1991   1462.5776   7589.264   -17661.3964  20586.5516  
  1987 1992    521.0534   9526.404   -23484.2640  24526.3708  
  1987 1993   -131.0948   7403.803   -18787.7296  18525.5399  
  1988 1985  -8350.7706   4674.623   -20130.2233   3428.6820  
  1988 1986  -3420.8529   3463.723   -12148.9916   5307.2858  
  1988 1987  -3617.6742   3619.131   -12737.4217   5502.0734  
  1988 1988  -4791.4908   4401.680   -15883.1619   6300.1802  
  1988 1990  -3337.0478   8180.250   -23950.2306  17276.1349  
  1988 1991   2482.0529   6363.899   -13554.1573  18518.2632  
  1988 1992  10120.1424  14533.152   -26501.5387  46741.8236  
  1988 1993  -1928.8923   7476.284   -20768.1705  16910.3860  
  1990 1985  -5281.5363   3373.666   -13782.7425   3219.6699  
  1990 1986   3654.1728   2480.289    -2595.8377   9904.1833  
  1990 1987   5934.8952   2890.621    -1349.0994  13218.8897  
  1990 1988   1034.1988   3167.434    -6947.3291   9015.7266  
  1990 1990  -4343.9488  12209.279   -35109.7687  26421.8712  
  1990 1991 -21910.2102   4491.669   -33228.6418 -10591.7785 *
  1990 1992 -15365.9271   3899.003   -25190.9152  -5540.9391 *
  1990 1993 -16411.1053   6250.408   -32161.3325   -660.8781 *
  1991 1985    891.2874   3471.665    -7856.8633   9639.4380  
  1991 1986  -2816.6357   3552.765   -11769.1484   6135.8771  
  1991 1987  -1340.0549   3332.791    -9738.2616   7058.1518  
  1991 1988  -7025.0387   3873.773   -16786.4497   2736.3723  
  1991 1990   2568.6223   6363.298   -13466.0749  18603.3196  
  1991 1991  -9582.0227   9188.794   -32736.6064  13572.5611  
  1991 1992   4002.6202   8634.215   -17754.4959  25759.7363  
  1991 1993   -111.2052  10465.683   -26483.3858  26260.9755  
  1992 1985 -12110.0572   6789.117   -29217.7628   4997.6484  
  1992 1986  -3287.5606   2445.489    -9449.8803   2874.7591  
  1992 1987   2300.0285   3451.590    -6397.5352  10997.5921  
  1992 1988  -7273.9345   2662.708   -13983.6169   -564.2522 *
  1992 1990   7351.4926   4587.882    -4209.3820  18912.3673  
  1992 1991 -10031.7028   7928.273   -30009.9359   9946.5304  
  1992 1992 -19022.5532   7204.014   -37175.7451   -869.3614 *
  1992 1993 -18694.3146   8280.700   -39560.6188   2171.9895  
  1993 1985  -7424.6641   5005.794   -20038.6230   5189.2947  
  1993 1986    677.9060   3139.230    -7232.5509   8588.3629  
  1993 1987   1424.1385   3873.730    -8337.1656  11185.4425  
  1993 1988   4778.2556   1600.827      744.3767   8812.1344 *
  1993 1990  -3797.3928   4115.722   -14168.4854   6573.6998  
  1993 1991   3664.8825   6713.530   -13252.3528  20582.1179  
  1993 1992  -4108.9169   5524.881   -18030.9090   9813.0753  
  1993 1993 -26937.2785   5418.226   -40590.5140 -13284.0431 *
---
Signif. codes: `*' confidence band does not cover 0

P-value for pre-test of parallel trends assumption:  0
Control Group:  Never Treated,  Anticipation Periods:  1
Estimation Method:  Doubly Robust
  # part b
  mean(subset(job_displacement_data, group==1992 & year==1992)$income) -
    mean(subset(job_displacement_data, group==1992 & year==1990)$income) - 
    ( mean(subset(job_displacement_data, group==0 & year==1992)$income) -
    mean(subset(job_displacement_data, group==0 & year==1990)$income) )
[1] -19022.55
  # part c
  ant_es <- aggte(ant_res, type="dynamic")
  ggdid(ant_es)

  # part d
  ant_overall <- aggte(ant_res, type="group")
  summary(ant_overall)

Call:
aggte(MP = ant_res, type = "group")

Reference: Callaway, Brantly and Pedro H.C. Sant'Anna.  "Difference-in-Differences with Multiple Time Periods." Journal of Econometrics, Vol. 225, No. 2, pp. 200-230, 2021. <https://doi.org/10.1016/j.jeconom.2020.12.001>, <https://arxiv.org/abs/1803.09015> 


Overall summary of ATT's based on group/cohort aggregation:  
       ATT    Std. Error     [ 95%  Conf. Int.]  
 -7711.634      2327.567  -12273.58   -3149.686 *


Group Effects:
 Group    Estimate Std. Error [95% Simult.  Conf. Band]  
  1986  -5683.6710   5470.628     -17794.15    6426.809  
  1987   -440.4114   6497.956     -14825.12   13944.294  
  1988    508.9529   6006.259     -12787.27   13805.175  
  1990 -14507.7979   4197.887     -23800.78   -5214.819 *
  1991  -1896.8692   8323.324     -20322.44   16528.702  
  1992 -18858.4339   4474.770     -28764.36   -8952.513 *
  1993 -26937.2785   5812.733     -39805.09  -14069.471 *
---
Signif. codes: `*' confidence band does not cover 0

Control Group:  Never Treated,  Anticipation Periods:  1
Estimation Method:  Doubly Robust

Question 3

Now, let’s suppose that we think that parallel trends holds only after we condition on a person sex and race (in reality, you could think of including many other variables in the parallel trends assumption, but let’s just keep it simple). In my view, I think allowing for anticipation is desirable in this setting too, so let’s keep allowing for one year of anticipation.

  1. Answer parts (a), (c), and (d) of Question 1 but including sex and white as covariates.
Solutions:
  # part a
  covs_res <- att_gt(yname="income",
                    tname="year",
                    idname="id",
                    gname="group",
                    data=job_displacement_data,
                    anticipation=1,
                    xformla=~female + white)
Warning in pre_process_did(yname = yname, tname = tname, idname = idname, :
Dropped 26 units that were already treated in the first period.
Warning in pre_process_did(yname = yname, tname = tname, idname = idname, : Be aware that there are some small groups in your dataset.
  Check groups: 1992,1993.
  summary(covs_res)

Call:
att_gt(yname = "income", tname = "year", idname = "id", gname = "group", 
    xformla = ~female + white, data = job_displacement_data, 
    anticipation = 1)

Reference: Callaway, Brantly and Pedro H.C. Sant'Anna.  "Difference-in-Differences with Multiple Time Periods." Journal of Econometrics, Vol. 225, No. 2, pp. 200-230, 2021. <https://doi.org/10.1016/j.jeconom.2020.12.001>, <https://arxiv.org/abs/1803.09015> 

Group-Time Average Treatment Effects:
 Group Time    ATT(g,t) Std. Error [95% Simult.  Conf. Band]  
  1986 1985  -1724.0034   2493.960    -7957.3577   4509.3509  
  1986 1986  -4258.8672   3475.987   -12946.6795   4428.9451  
  1986 1987  -4861.6136   3901.328   -14612.5144   4889.2873  
  1986 1988  -4729.6121   6550.950   -21102.9257  11643.7016  
  1986 1990  -8685.9902   5919.644   -23481.4317   6109.4514  
  1986 1991  -8753.8554   6140.301   -24100.8026   6593.0917  
  1986 1992  -9530.3951   6819.824   -26575.7285   7514.9382  
  1986 1993  -4727.7652   6608.974   -21246.1040  11790.5736  
  1987 1985   4559.7049   5401.196    -8939.9374  18059.3472  
  1987 1986  -8337.6804   4628.947   -19907.1778   3231.8171  
  1987 1987  -1244.4854   7055.388   -18878.5822  16389.6114  
  1987 1988  -4009.1142   7547.463   -22873.0934  14854.8651  
  1987 1990   -483.2506   6510.734   -16756.0494  15789.5483  
  1987 1991    865.8558   7991.793   -19108.6731  20840.3847  
  1987 1992     -1.1369   9485.687   -23709.4748  23707.2010  
  1987 1993   -760.5834   7646.473   -19872.0265  18350.8596  
  1988 1985  -8427.9592   4483.846   -19634.7939   2778.8755  
  1988 1986  -3208.6634   3785.889   -12671.0400   6253.7133  
  1988 1987  -3540.3348   3667.076   -12705.7526   5625.0830  
  1988 1988  -4496.7178   4488.311   -15714.7134   6721.2778  
  1988 1990  -2886.2705   7805.516   -22395.2224  16622.6814  
  1988 1991   3026.1289   6141.168   -12322.9835  18375.2413  
  1988 1992  10422.7498  15019.310   -27116.2158  47961.7153  
  1988 1993  -1710.3233   7388.194   -20176.2295  16755.5830  
  1990 1985  -5423.4224   3557.666   -14315.3831   3468.5383  
  1990 1986   4124.3571   2819.216    -2921.9351  11170.6493  
  1990 1987   6034.5096   3012.839    -1495.7213  13564.7406  
  1990 1988   1473.8450   3277.915    -6718.9116   9666.6016  
  1990 1990  -4087.0904  12272.756   -34761.3721  26587.1913  
  1990 1991 -21451.7077   4889.978   -33673.6224  -9229.7929 *
  1990 1992 -15350.4684   3736.307   -24688.9203  -6012.0165 *
  1990 1993 -16489.8656   6266.352   -32151.8605   -827.8708 *
  1991 1985    787.4357   3522.377    -8016.3237   9591.1950  
  1991 1986  -2463.7125   3696.944   -11703.7810   6776.3560  
  1991 1987  -1271.9440   3335.578    -9608.8221   7064.9341  
  1991 1988  -6698.7830   3813.606   -16230.4339   2832.8679  
  1991 1990   2753.4298   6037.666   -12336.9920  17843.8516  
  1991 1991  -9246.2829   8636.010   -30830.9534  12338.3877  
  1991 1992   4013.8999   8088.141   -16201.4400  24229.2398  
  1991 1993   -162.2495  10086.002   -25371.0029  25046.5038  
  1992 1985 -12170.1207   6394.614   -28152.6913   3812.4499  
  1992 1986  -3584.4939   2410.058    -9608.1443   2439.1566  
  1992 1987   2598.5246   3653.567    -6533.1284  11730.1775  
  1992 1988  -7330.9148   2871.569   -14508.0580   -153.7715 *
  1992 1990   7649.2124   4740.277    -4198.5416  19496.9663  
  1992 1991 -10130.9141   7932.263   -29956.6541   9694.8259  
  1992 1992 -19327.7970   6701.003   -36076.1514  -2579.4426 *
  1992 1993 -19410.4421   8689.023   -41127.6126   2306.7284  
  1993 1985  -7391.9287   5580.167   -21338.8875   6555.0302  
  1993 1986     50.7636   3523.194    -8755.0376   8856.5648  
  1993 1987   1618.3041   3763.007    -7786.8800  11023.4883  
  1993 1988   4453.4544   1769.420       31.0011   8875.9077 *
  1993 1990  -3630.4984   3887.086   -13345.8049   6084.8080  
  1993 1991   3439.7874   6512.448   -12837.2945  19716.8692  
  1993 1992  -4123.7577   5953.866   -19004.7314  10757.2161  
  1993 1993 -27304.4090   5711.372   -41579.2990 -13029.5190 *
---
Signif. codes: `*' confidence band does not cover 0

P-value for pre-test of parallel trends assumption:  0
Control Group:  Never Treated,  Anticipation Periods:  1
Estimation Method:  Doubly Robust
  # part c
  covs_es <- aggte(covs_res, type="dynamic")
  ggdid(covs_es)

  # part d
  covs_overall <- aggte(covs_res, type="group")
  summary(covs_overall)

Call:
aggte(MP = covs_res, type = "group")

Reference: Callaway, Brantly and Pedro H.C. Sant'Anna.  "Difference-in-Differences with Multiple Time Periods." Journal of Econometrics, Vol. 225, No. 2, pp. 200-230, 2021. <https://doi.org/10.1016/j.jeconom.2020.12.001>, <https://arxiv.org/abs/1803.09015> 


Overall summary of ATT's based on group/cohort aggregation:  
       ATT    Std. Error     [ 95%  Conf. Int.]  
 -7931.965      2389.043   -12614.4   -3249.526 *


Group Effects:
 Group    Estimate Std. Error [95% Simult.  Conf. Band]  
  1986  -6506.8712   5271.801     -18120.53    5106.783  
  1987   -938.7858   6869.413     -16071.94   14194.369  
  1988    871.1134   6286.149     -12977.12   14719.352  
  1990 -14344.7830   4317.628     -23856.42   -4833.149 *
  1991  -1798.2108   8271.638     -20020.44   16424.014  
  1992 -19369.1196   4406.584     -29076.72   -9661.517 *
  1993 -27304.4090   5819.843     -40125.39  -14483.431 *
---
Signif. codes: `*' confidence band does not cover 0

Control Group:  Never Treated,  Anticipation Periods:  1
Estimation Method:  Doubly Robust
  1. By default, the did package uses the doubly robust approach that we discussed during our session. How do the results change if you use a regression approach or propensity score re-weighting?
Solutions:

For simplicity, I am just going to show the overall results when using the regression approach and the propensity score re-weighting approach.

  reg_res <- att_gt(yname="income",
                    tname="year",
                    idname="id",
                    gname="group",
                    data=job_displacement_data,
                    anticipation=1,
                    xformla=~female + white,
                    est_method="reg")
Warning in pre_process_did(yname = yname, tname = tname, idname = idname, :
Dropped 26 units that were already treated in the first period.
Warning in pre_process_did(yname = yname, tname = tname, idname = idname, : Be aware that there are some small groups in your dataset.
  Check groups: 1992,1993.
  summary(aggte(reg_res, type="group"))

Call:
aggte(MP = reg_res, type = "group")

Reference: Callaway, Brantly and Pedro H.C. Sant'Anna.  "Difference-in-Differences with Multiple Time Periods." Journal of Econometrics, Vol. 225, No. 2, pp. 200-230, 2021. <https://doi.org/10.1016/j.jeconom.2020.12.001>, <https://arxiv.org/abs/1803.09015> 


Overall summary of ATT's based on group/cohort aggregation:  
       ATT    Std. Error     [ 95%  Conf. Int.]  
 -7919.691      2241.563  -12313.07   -3526.308 *


Group Effects:
 Group    Estimate Std. Error [95% Simult.  Conf. Band]  
  1986  -6434.0559   5552.204     -18399.92    5531.812  
  1987   -912.7508   6601.460     -15139.93   13314.428  
  1988    862.1890   6513.785     -13176.04   14900.415  
  1990 -14343.8838   4120.677     -23224.59   -5463.180 *
  1991  -1796.2167   8395.139     -19889.05   16296.621  
  1992 -19441.0738   4325.936     -28764.14  -10118.005 *
  1993 -27302.1029   5815.805     -39836.07  -14768.134 *
---
Signif. codes: `*' confidence band does not cover 0

Control Group:  Never Treated,  Anticipation Periods:  1
Estimation Method:  Outcome Regression
  ipw_res <- att_gt(yname="income",
                    tname="year",
                    idname="id",
                    gname="group",
                    data=job_displacement_data,
                    anticipation=1,
                    xformla=~female + white,
                    est_method="ipw")
Warning in pre_process_did(yname = yname, tname = tname, idname = idname, : Dropped 26 units that were already treated in the first period.

Warning in pre_process_did(yname = yname, tname = tname, idname = idname, : Be aware that there are some small groups in your dataset.
  Check groups: 1992,1993.
  summary(aggte(ipw_res, type="group"))

Call:
aggte(MP = ipw_res, type = "group")

Reference: Callaway, Brantly and Pedro H.C. Sant'Anna.  "Difference-in-Differences with Multiple Time Periods." Journal of Econometrics, Vol. 225, No. 2, pp. 200-230, 2021. <https://doi.org/10.1016/j.jeconom.2020.12.001>, <https://arxiv.org/abs/1803.09015> 


Overall summary of ATT's based on group/cohort aggregation:  
      ATT    Std. Error     [ 95%  Conf. Int.]  
 -7931.69      2395.857  -12627.48   -3235.896 *


Group Effects:
 Group    Estimate Std. Error [95% Simult.  Conf. Band]  
  1986  -6506.2796   5366.182     -18329.69    5317.130  
  1987   -938.6980   7027.544     -16422.62   14545.222  
  1988    871.1522   5952.535     -12244.18   13986.484  
  1990 -14345.0498   4649.411     -24589.19   -4100.914 *
  1991  -1798.1771   8179.503     -19820.23   16223.875  
  1992 -19368.1395   4545.188     -29382.64   -9353.642 *
  1993 -27303.3746   5786.646     -40053.20  -14553.548 *
---
Signif. codes: `*' confidence band does not cover 0

Control Group:  Never Treated,  Anticipation Periods:  1
Estimation Method:  Inverse Probability Weighting

You can see that the results are very similar across estimation strategies in this example.

Question 4

Finally, the data that we have contains a variable called occ_score which is roughly a variable that measures the occupation “quality”. Suppose that we (i) are interested in including a person’s occupation in the parallel trends assumption, (ii) are satisfied that occ_score sufficiently summarizes a person’s occupation, but (iii) are worried that a person’s occupation is a “bad control” (in the sense that it could be affected by the treatment).

  1. Repeat parts (a), (c), and (d) of Question 1 but including occ_score in the parallel trends assumption. Continue to allow for 1 year of anticipation effects.
Solutions:
  # part a
  occ_res <- att_gt(yname="income",
                    tname="year",
                    idname="id",
                    gname="group",
                    data=job_displacement_data,
                    anticipation=1,
                    xformla=~female + white + occ_score)
Warning in pre_process_did(yname = yname, tname = tname, idname = idname, :
Dropped 26 units that were already treated in the first period.
Warning in pre_process_did(yname = yname, tname = tname, idname = idname, : Be aware that there are some small groups in your dataset.
  Check groups: 1992,1993.
  summary(occ_res)

Call:
att_gt(yname = "income", tname = "year", idname = "id", gname = "group", 
    xformla = ~female + white + occ_score, data = job_displacement_data, 
    anticipation = 1)

Reference: Callaway, Brantly and Pedro H.C. Sant'Anna.  "Difference-in-Differences with Multiple Time Periods." Journal of Econometrics, Vol. 225, No. 2, pp. 200-230, 2021. <https://doi.org/10.1016/j.jeconom.2020.12.001>, <https://arxiv.org/abs/1803.09015> 

Group-Time Average Treatment Effects:
 Group Time    ATT(g,t) Std. Error [95% Simult.  Conf. Band]  
  1986 1985  -2195.2252   2439.588     -8404.432   4013.9820  
  1986 1986  -4943.5726   3602.994    -14113.864   4226.7192  
  1986 1987  -5648.5248   4289.363    -16565.754   5268.7046  
  1986 1988  -5486.8776   6630.855    -22363.640  11389.8853  
  1986 1990  -9355.8673   5982.470    -24582.370   5870.6353  
  1986 1991  -9341.8431   6334.782    -25465.046   6781.3597  
  1986 1992 -10108.9739   6896.816    -27662.657   7444.7096  
  1986 1993  -5529.8989   6878.792    -23037.708  11977.9101  
  1987 1985   3820.8904   5398.740     -9919.911  17561.6916  
  1987 1986  -8340.0613   4527.388    -19863.109   3182.9868  
  1987 1987  -1140.5127   6506.136    -17699.843  15418.8177  
  1987 1988  -3872.3621   7761.166    -23625.978  15881.2537  
  1987 1990   -245.3064   6759.948    -17450.635  16960.0221  
  1987 1991   1163.8056   7799.982    -18688.606  21016.2171  
  1987 1992    357.4786  10043.690    -25205.589  25920.5459  
  1987 1993   -573.4507   7592.304    -19897.283  18750.3811  
  1988 1985  -9335.5672   4430.668    -20612.445   1941.3104  
  1988 1986  -3340.6154   3594.759    -12489.948   5808.7171  
  1988 1987  -3382.3715   3682.071    -12753.930   5989.1874  
  1988 1988  -4249.2023   4417.909    -15493.606   6995.2013  
  1988 1990  -2636.2457   7843.029    -22598.220  17325.7285  
  1988 1991   3600.9662   6373.120    -12619.815  19821.7473  
  1988 1992  10870.4646  14238.511    -25369.205  47110.1341  
  1988 1993  -1193.1813   7759.334    -20942.136  18555.7735  
  1990 1985  -6306.9131   3456.006    -15103.094   2489.2683  
  1990 1986   3619.3463   2639.107     -3097.673  10336.3654  
  1990 1987   6300.9857   3219.508     -1893.264  14495.2354  
  1990 1988   1669.2779   3526.601     -7306.581  10645.1366  
  1990 1990  -3975.3758  11948.397    -34386.276  26435.5247  
  1990 1991 -21181.3377   4652.110    -33021.827  -9340.8486 *
  1990 1992 -15120.4248   3939.587    -25147.410  -5093.4392 *
  1990 1993 -16136.7404   6346.583    -32289.979     16.4980  
  1991 1985    275.2798   3819.124     -9445.103   9995.6625  
  1991 1986  -2972.7479   3680.667    -12340.734   6395.2379  
  1991 1987  -1061.8712   3120.582     -9004.335   6880.5930  
  1991 1988  -6533.7425   4019.775    -16764.820   3697.3347  
  1991 1990   2973.9229   6405.612    -13329.555  19277.4010  
  1991 1991  -8630.5915   8646.878    -30638.511  13377.3280  
  1991 1992   4461.4852   8498.282    -17168.230  26091.2006  
  1991 1993    625.5103  10665.656    -26520.575  27771.5955  
  1992 1985 -11419.6688   6848.246    -28849.734   6010.3965  
  1992 1986  -3525.3900   2823.484    -10711.684   3660.9041  
  1992 1987   2689.5473   3771.550     -6909.751  12288.8460  
  1992 1988  -7336.2075   3054.516    -15110.521    438.1057  
  1992 1990   7673.1535   4706.013     -4304.528  19650.8347  
  1992 1991 -10337.3059   7310.414    -28943.675   8269.0627  
  1992 1992 -19895.1794   7162.624    -38125.395  -1664.9639 *
  1992 1993 -19597.7636   9015.071    -42542.801   3347.2743  
  1993 1985  -7566.2072   4808.026    -19803.530   4671.1158  
  1993 1986     50.1090   3584.093     -9072.077   9172.2953  
  1993 1987   1781.7444   3852.290     -8023.053  11586.5418  
  1993 1988   4377.3771   1839.394      -304.223   9058.9772  
  1993 1990  -3777.5137   4161.498    -14369.302   6814.2747  
  1993 1991   3464.8956   6455.712    -12966.098  19895.8893  
  1993 1992  -4041.1832   5761.972    -18706.477  10624.1110  
  1993 1993 -27091.4909   5916.956    -42151.249 -12031.7333 *
---
Signif. codes: `*' confidence band does not cover 0

P-value for pre-test of parallel trends assumption:  0
Control Group:  Never Treated,  Anticipation Periods:  1
Estimation Method:  Doubly Robust
  # part c
  occ_es <- aggte(occ_res, type="dynamic")
  ggdid(occ_es)

  # part d
  occ_overall <- aggte(occ_res, type="group")
  summary(occ_overall)

Call:
aggte(MP = occ_res, type = "group")

Reference: Callaway, Brantly and Pedro H.C. Sant'Anna.  "Difference-in-Differences with Multiple Time Periods." Journal of Econometrics, Vol. 225, No. 2, pp. 200-230, 2021. <https://doi.org/10.1016/j.jeconom.2020.12.001>, <https://arxiv.org/abs/1803.09015> 


Overall summary of ATT's based on group/cohort aggregation:  
       ATT    Std. Error     [ 95%  Conf. Int.]  
 -7873.709      2280.252  -12342.92   -3404.498 *


Group Effects:
 Group    Estimate Std. Error [95% Simult.  Conf. Band]  
  1986  -7202.2226   5471.742     -19171.48    4767.038  
  1987   -718.3913   6853.787     -15710.83   14274.050  
  1988   1278.5603   6195.673     -12274.28   14831.398  
  1990 -14103.4696   4109.816     -23093.56   -5113.379 *
  1991  -1181.1987   7776.333     -18191.68   15829.282  
  1992 -19746.4715   4435.389     -29448.74  -10044.200 *
  1993 -27091.4909   5931.060     -40065.49  -14117.487 *
---
Signif. codes: `*' confidence band does not cover 0

Control Group:  Never Treated,  Anticipation Periods:  1
Estimation Method:  Doubly Robust
  1. What additional assumptions (with respect to occupation) do you need to make in order to rationalize this approach?