An integrated moments test for the conditional parallel trends assumption holding in all pre-treatment time periods for all groups
conditional_did_pretest( yname, tname, idname = NULL, gname, xformla = NULL, data, panel = TRUE, allow_unbalanced_panel = FALSE, control_group = c("nevertreated", "notyettreated"), weightsname = NULL, alp = 0.05, bstrap = TRUE, cband = TRUE, biters = 1000, clustervars = NULL, est_method = "ipw", print_details = FALSE, pl = FALSE, cores = 1 )
The name of the outcome variable
The name of the column containing the time periods
The individual (cross-sectional unit) id name
The name of the variable in data that
contains the first period when a particular observation is treated.
This should be a positive number for all observations in treated groups.
It defines which "group" a unit belongs to. It should be 0 for units
in the untreated group.
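As a minimal sketch of the encoding described above, the group variable can be derived in base R from a 0/1 treatment indicator. The data frame `df`, its columns `id`, `year`, and `treated`, and the helper `first_treated` are hypothetical names used only for illustration; they are not part of the package.

```r
# Hypothetical long-format panel: unit 1 is first treated in 2005,
# unit 2 in 2006, and unit 3 is never treated.
df <- data.frame(
  id      = rep(1:3, each = 3),
  year    = rep(2004:2006, times = 3),
  treated = c(0, 1, 1,  0, 0, 1,  0, 0, 0)
)

# First period in which a unit is observed as treated; 0 if never treated.
first_treated <- function(year, treated) {
  if (any(treated == 1)) min(year[treated == 1]) else 0
}

# One value per unit, named by the unit id.
g_by_id <- sapply(split(df, df$id),
                  function(d) first_treated(d$year, d$treated))

# Attach the unit-level group value to every row of that unit.
df$first_treat <- unname(g_by_id[as.character(df$id)])
```

The resulting `first_treat` column is constant within a unit, positive for treated units, and 0 for never-treated units, which is the shape the description above asks for.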
A formula for the covariates to include in the
model. It should be of the form
~ X1 + X2. Default
is NULL which is equivalent to
xformla=~1. This is
used to create a matrix of covariates which is then passed
to the 2x2 DID estimator chosen in est_method.
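To illustrate how a one-sided formula becomes a covariate matrix, the sketch below uses base R's `model.matrix`. How the package builds the matrix internally is not shown in this page, so this is an assumption about the general idea only; the data frame `df` is hypothetical.

```r
# Hypothetical covariate data.
df <- data.frame(X1 = c(1.5, 2.0, 3.5), X2 = c(0, 1, 0))

# A formula of the form ~ X1 + X2 yields an intercept column plus one
# column per covariate.
covariates <- model.matrix(~ X1 + X2, data = df)

# The intercept-only formula ~ 1 yields a single column of ones.
intercept_only <- model.matrix(~ 1, data = df)
```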
The name of the data.frame that contains the data
Whether or not the data is a panel dataset.
The panel dataset should be provided in long format -- that
is, where each row corresponds to a unit observed at a
particular point in time. The default is TRUE. When
using a panel dataset, the variable idname must
be set. When
panel=FALSE, the data is treated
as repeated cross sections.
Whether or not the function should
"balance" the panel with respect to time and id. The default
is FALSE, which means that
att_gt() will drop
all units where data is not observed in all periods.
The advantage of this is that the computations are faster.
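The balancing step described above can be sketched in base R as follows. This is an illustration of the idea, not the package's own code. It assumes a hypothetical data frame `df` with columns `id`, `year`, and `y`, and at most one row per unit and period.

```r
# Hypothetical unbalanced panel: unit 2 is missing the 2005 observation.
df <- data.frame(
  id   = c(1, 1, 1, 2, 2, 3, 3, 3),
  year = c(2004, 2005, 2006, 2004, 2006, 2004, 2005, 2006),
  y    = c(1.0, 1.2, 1.5, 0.8, 1.1, 2.0, 2.1, 2.3)
)

# Number of distinct periods in the data.
n_periods <- length(unique(df$year))

# Count observations per unit and keep only fully observed units.
obs_per_id   <- table(df$id)
complete_ids <- names(obs_per_id)[obs_per_id == n_periods]
balanced     <- df[as.character(df$id) %in% complete_ids, ]
```

Here unit 2 is dropped because it is observed in only two of the three periods.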
Which units to use as the control group.
The default is "nevertreated" which sets the control group
to be the group of units that never participate in the
treatment. This group does not change across groups or
time periods. The other option is to set
control_group="notyettreated". In this case, the control group
is set to the group of units that have not yet participated
in the treatment in that time period. This includes all
never treated units, plus additional units that
eventually participate in the treatment but have not done so yet.
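The two control-group options can be sketched as a selection rule over units. The helper `control_ids` and the data frame `units` are hypothetical, and the rule is a simplification: it ignores details such as anticipation periods and excludes the group under study from its own controls, which is an assumption made here for illustration.

```r
# Unit-level data: g is the first treated period (0 = never treated).
units <- data.frame(id = 1:5, g = c(0, 0, 2005, 2006, 2007))

# Return the ids used as controls for group `g` in period `t`.
control_ids <- function(units, g, t,
                        control_group = c("nevertreated", "notyettreated")) {
  control_group <- match.arg(control_group)
  never <- units$g == 0
  if (control_group == "nevertreated") {
    keep <- never
  } else {
    # Never-treated units plus units first treated after period t,
    # leaving out the group under study.
    keep <- never | (units$g > t & units$g != g)
  }
  units$id[keep]
}

never_ctrl  <- control_ids(units, g = 2005, t = 2005, "nevertreated")
notyet_2005 <- control_ids(units, g = 2005, t = 2005, "notyettreated")
notyet_2006 <- control_ids(units, g = 2005, t = 2006, "notyettreated")
```

Under "nevertreated" the control set is the same for every group and period, while under "notyettreated" it shrinks as later groups become treated.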
The name of the column containing the sampling weights. If not set, all observations have the same weight.
The significance level. The default is 0.05.
Boolean for whether or not to compute standard errors using
the multiplier bootstrap. If standard errors are clustered, then one
must set bstrap=TRUE. The default is
TRUE (in addition, cband
is also by default
TRUE, indicating that uniform confidence bands
will be returned). If bstrap is
FALSE, then analytical
standard errors are reported.
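A minimal sketch of the multiplier bootstrap idea, for the simplest case of a sample mean: the estimator's influence function is perturbed by random multipliers and the spread of the perturbed draws gives the standard error. Rademacher multipliers and the plain standard deviation of the draws are assumptions made for this illustration; the package's exact choices are not stated in this page.

```r
set.seed(1)
n <- 200
x <- rnorm(n, mean = 1)

theta_hat <- mean(x)
inf_func  <- x - theta_hat      # influence function of the sample mean

biters <- 1000
boot_draws <- replicate(biters, {
  v <- sample(c(-1, 1), n, replace = TRUE)  # Rademacher multipliers
  mean(v * inf_func)                        # bootstrap draw of (estimate - theta_hat)
})

se_boot       <- sd(boot_draws)       # multiplier bootstrap standard error
se_analytical <- sd(x) / sqrt(n)      # analytical standard error
```

The two standard errors should be close in a sample of this size, since both estimate the same asymptotic variance.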
Boolean for whether or not to compute a uniform confidence
band that covers all of the group-time average treatment effects
with fixed probability
1-alp. In order to compute uniform confidence
bands, bstrap must also be set to
TRUE. The default is TRUE.
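The sketch below shows one common way a uniform band can be obtained from bootstrap draws: the critical value is the 1-alp quantile of the maximum absolute t-statistic across parameters. The simulated data, the Rademacher multipliers, and the variable names are assumptions for illustration and are not taken from the package.

```r
set.seed(2)
n <- 300
k <- 4
X <- matrix(rnorm(n * k), n, k)

est      <- colMeans(X)                  # k estimates (column means)
inf_func <- sweep(X, 2, est)             # n x k matrix of influence functions
se       <- apply(inf_func, 2, sd) / sqrt(n)

alp    <- 0.05
biters <- 1000
max_t <- replicate(biters, {
  v <- sample(c(-1, 1), n, replace = TRUE)   # one multiplier per observation
  max(abs(colMeans(v * inf_func)) / se)      # largest absolute t-statistic
})

crit_uniform   <- quantile(max_t, 1 - alp, names = FALSE)
crit_pointwise <- qnorm(1 - alp / 2)

# Uniform band: covers all k parameters jointly with probability about 1 - alp.
band <- cbind(lower = est - crit_uniform * se,
              upper = est + crit_uniform * se)
```

The uniform critical value exceeds the pointwise one, so the band is wider than a set of separate pointwise intervals.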
The number of bootstrap iterations to use. The default is 1000,
and this is only applicable if bstrap=TRUE.
A vector of variable names to cluster on. At most, there
can be two variables (otherwise an error is thrown) and one of these
must be the same as idname, which allows for clustering at the individual
level. By default, we cluster at the individual level (when bstrap=TRUE).
The method to compute group-time average treatment effects. The default is "ipw",
which uses inverse probability weighting. Other built-in methods
include "dr", which uses the doubly robust approach in the
DRDID package, and "reg" for
first step regression estimators. The user can also pass their
own function for estimating group time average treatment
effects. This should be a function f(Y1, Y0, treat, covariates), where
Y1 is an n x 1 vector of outcomes in the post-treatment period,
Y0 is an n x 1 vector of pre-treatment outcomes,
treat is a vector indicating
whether or not an individual participates in the treatment, and
covariates is an n x k matrix of
covariates. The function should return a list that includes
ATT (an estimated average treatment effect) and
inf.func (an n x 1 influence function).
The function can return other things as well, but these are
the only two that are required.
est_method is only used
if covariates are included.
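As a hedged sketch of a user-supplied estimator, the function below computes an unconditional difference-in-differences estimate and its influence function from `Y1`, `Y0`, and `treat`, ignoring `covariates`. The function name `did_unconditional` is hypothetical, and the return element name `inf.func` and the calling convention are assumptions; the arguments actually passed by the package may differ across versions and should be checked against the installed release.

```r
# Hypothetical custom estimator: unconditional 2x2 DID.
did_unconditional <- function(Y1, Y0, treat, covariates = NULL) {
  dY  <- Y1 - Y0                    # outcome change for each unit
  p   <- mean(treat)                # share of treated units
  mu1 <- mean(dY[treat == 1])       # mean change, treated
  mu0 <- mean(dY[treat == 0])       # mean change, comparison
  att <- mu1 - mu0

  # Influence function of the difference in means (averages to zero).
  inf_func <- treat / p * (dY - mu1) -
    (1 - treat) / (1 - p) * (dY - mu0)

  list(ATT = att, inf.func = inf_func)
}

# Simulated data with a true effect of 2 on the treated.
set.seed(3)
n     <- 500
treat <- rbinom(n, 1, 0.4)
Y0    <- rnorm(n)
Y1    <- Y0 + 0.5 + 2 * treat + rnorm(n)

res <- did_unconditional(Y1, Y0, treat)
se  <- sd(res$inf.func) / sqrt(n)   # standard error implied by the influence function
```

Returning the influence function alongside the point estimate is what lets bootstrap and clustering procedures reuse the estimator without re-estimating it.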
Whether or not to show details/progress of computations.
Whether or not to use parallel processing
The number of cores to use for parallel processing
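A usage sketch, assuming the did package is installed and using its bundled mpdta dataset (columns lemp, year, countyreal, first.treat, and lpop). The wrapper `run_pretest` and the list `pretest_args` are hypothetical helpers; the call is wrapped in a function so that nothing computationally heavy runs until it is invoked.

```r
# Arguments for the pre-test, following the usage signature above.
pretest_args <- list(
  yname         = "lemp",
  tname         = "year",
  idname        = "countyreal",
  gname         = "first.treat",
  xformla       = ~ lpop,
  control_group = "nevertreated",
  biters        = 1000
)

# Hypothetical wrapper: loads the example data and runs the test.
run_pretest <- function(args = pretest_args) {
  if (!requireNamespace("did", quietly = TRUE)) {
    stop("the 'did' package is required for this example")
  }
  utils::data("mpdta", package = "did", envir = environment())
  do.call(did::conditional_did_pretest, c(args, list(data = mpdta)))
}

# result <- run_pretest()   # may take a while to run
```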
Callaway, Brantly and Sant'Anna, Pedro H. C. "Difference-in-Differences with Multiple Time Periods and an Application on the Minimum Wage and Employment." Working Paper https://arxiv.org/abs/1803.09015v2 (2018).