Two-Stage DID and Its Stata Command
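Researchers typically do one of two things:
(1) either estimate a static two-way fixed effects (TWFE) model
$$y_{it} = \mu_i + \mu_t + \tau D_{it} + \varepsilon_{it}$$
where $\mu_i$ are unit fixed effects, $\mu_t$ are time fixed effects, and $D_{it}$ is the treatment dummy (equal to 1 when unit $i$ is treated in period $t$);
(2) or estimate an event-study TWFE model
$$y_{it} = \mu_i + \mu_t + \sum_{k=-L}^{-2} \tau^k D_{it}^k + \sum_{k=0}^{K} \tau^k D_{it}^k + \varepsilon_{it}$$
where $D_{it}^k$ are lead/lag dummies of treatment ($k$ periods away from the initial treatment period);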
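Sometimes researchers use variants of these models that bin or drop some of the leads and lags. However, whichever model is estimated by OLS, the resulting estimator does not in general recover an average treatment effect on the treated (ATT), and when treatment effects are heterogeneous the OLS estimates can be severely biased (Borusyak et al. (2021); Callaway and Sant’Anna (2020); de Chaisemartin and d’Haultfoeuille (2020); Goodman-Bacon (2020); Sun and Abraham (2020)).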
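Below we use the FWL (Frisch-Waugh-Lovell) theorem to look at this problem. When we partial out the unit and time fixed effects, we obtain a residualized outcome $\tilde{y}_{it}$, usually described as the outcome variable after removing time shocks and fixed unit characteristics. The same step, however, also produces a residualized treatment indicator $\tilde{D}_{it}$ or $\tilde{D}_{it}^k$. Put simply, it is this residualized treatment variable that makes $\tau$ or $\tau^k$ hard to interpret, especially when treatment effects are heterogeneous.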
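To make the FWL argument concrete, the static TWFE coefficient can be written as a weighted sum of treated-observation effects. The following is a sketch under stated assumptions, not a full derivation: the panel is assumed balanced (so that the double-demeaning formula for $\tilde{D}_{it}$ holds), outcomes are assumed to follow $y_{it} = \mu_i + \mu_t + \tau_{it} D_{it} + \varepsilon_{it}$, and the weight symbol $w_{it}$ is introduced here for illustration only.

```latex
\begin{aligned}
\hat{\tau}^{\mathrm{TWFE}}
  &= \frac{\sum_{i,t} \tilde{D}_{it}\, y_{it}}{\sum_{i,t} \tilde{D}_{it}^{2}},
  \qquad
  \tilde{D}_{it} = D_{it} - \bar{D}_{i\cdot} - \bar{D}_{\cdot t} + \bar{D}_{\cdot\cdot} \\[4pt]
E\!\left[\hat{\tau}^{\mathrm{TWFE}}\right]
  &= \sum_{(i,t):\, D_{it}=1} w_{it}\, \tau_{it},
  \qquad
  w_{it} = \frac{\tilde{D}_{it}}{\sum_{j,s} \tilde{D}_{js}^{2}},
  \qquad
  \sum_{(i,t):\, D_{it}=1} w_{it} = 1
\end{aligned}
```

The weights sum to one, but $\tilde{D}_{it}$ can be negative for some treated observations (typically late periods of early-treated units), so those $\tau_{it}$ enter with negative weight. With homogeneous effects this is harmless; with heterogeneous effects the weighted sum can be far from the ATT.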
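Gardner (2021) addresses exactly this problem. He estimates the unit and time fixed effects $\mu_i$ and $\mu_t$ separately, so the treatment variable never needs to be residualized. In the absence of treatment, the TWFE model reduces to a model for untreated outcomes
$$y_{it}(0) = \mu_i + \mu_t + \varepsilon_{it}$$
Therefore, if we can consistently estimate $y_{it}(0)$, we can impute the untreated outcome and subtract it from the observed outcome $y_{it}$. The difference $y_{it} - \hat{y}_{it}(0)$ should be close to 0 for control observations and close to $\tau_{it}$ for treated observations. Regressing $y_{it} - \hat{y}_{it}(0)$ on the treatment variable then yields an unbiased estimate of the treatment effect. Borusyak, Jaravel and Spiess (2021) rely on similar logic.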
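The reason the second-stage regression targets the ATT can be sketched in one line. Assuming the first stage recovers $y_{it}(0)$ well, the regression of the adjusted outcome on an unresidualized 0/1 treatment dummy $D_{it}$ (without a constant) is just the mean of the adjusted outcome over treated observations; $N_1$, the number of treated observations, is notation introduced here.

```latex
\hat{\tau}
  = \frac{\sum_{i,t} D_{it}\,\bigl(y_{it} - \hat{y}_{it}(0)\bigr)}{\sum_{i,t} D_{it}^{2}}
  = \frac{1}{N_1} \sum_{(i,t):\, D_{it}=1} \bigl(y_{it} - \hat{y}_{it}(0)\bigr)
  \;\approx\; \frac{1}{N_1} \sum_{(i,t):\, D_{it}=1} \tau_{it}
```

Every treated observation receives the same positive weight $1/N_1$, in contrast to the residualized weights that arise in one-step TWFE.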
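The two-stage estimator therefore proceeds as follows:
(1) First, estimate $\mu_i$ and $\mu_t$ using only untreated observations, i.e. the subsample with $D_{it} = 0$, and residualize the outcome: $\tilde{y}_{it} = y_{it} - \hat{\mu}_i - \hat{\mu}_t$;
(2) Then regress $\tilde{y}_{it}$ on $D_{it}$ or on the $D_{it}^k$ to estimate the treatment effect $\tau$ or the $\tau^k$.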
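The two steps above can be tried on simulated data. The following is a minimal sketch, not part of did2s: the panel design, variable names (id, t, g, D, tau_it, y_tilde) and all parameter values are made up for illustration, and the second-stage standard errors are not corrected for first-stage estimation.

```stata
* Hypothetical staggered-adoption panel: 40 units, 20 periods (all values are illustrative)
clear
set seed 12345
set obs 40
gen id = _n
* Units 1-15 adopt in period 8, units 16-30 in period 14, units 31-40 are never treated
gen g = cond(id <= 15, 8, cond(id <= 30, 14, .))
gen mu_i = rnormal()
expand 20
bysort id: gen t = _n
gen D = (t >= g) & !missing(g)
* Heterogeneous effect that grows with time since adoption
gen tau_it = cond(D, 1 + 0.2*(t - g), 0)
gen y = mu_i + 0.1*t + tau_it + rnormal(0, 0.5)

* Step 1: estimate unit and time effects on untreated observations only,
* then predict the untreated outcome for every observation
quietly regress y i.id i.t if D == 0
predict double y0_hat, xb
gen double y_tilde = y - y0_hat

* Step 2: regress the residualized outcome on the (unresidualized) treatment dummy
regress y_tilde D, noconstant vce(cluster id)

* Benchmarks: the true average effect among treated observations, and one-step TWFE
summarize tau_it if D == 1
regress y i.id i.t D, vce(cluster id)
```

Under this design the step-2 coefficient should land near the mean of tau_it among treated observations, while the one-step TWFE coefficient need not. For inference, use did2s rather than the manual second stage.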
Stata command
First, install did2s from GitHub:
net install did2s, replace from("https://raw.githubusercontent.com/kylebutts/did2s_stata/main/ado/")
* ssc install did2s
The syntax of did2s is:
did2s depvar [if] [in] [weight], first_stage(varlist) treat_formula(varlist) treat_var(varname) cluster(varname)
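first_stage: the first-stage formula; it may include fixed effects and covariates, but must not include the treatment variable;
treat_formula: the second-stage formula; this should contain the treatment variables, e.g. a single treatment dummy or lead/lag (event-study) dummies;
treat_var: the 0/1 treatment variable;
cluster: the variable on which standard errors are clustered.
Run help did2s in Stata to see the full documentation.
Stata examples:
(1) Static TWFE model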
****************************************************************************
* Static
****************************************************************************
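use data/df_het.dta, clear
* Manual two-step version (note: standard errors are off, they ignore first-stage estimation)
* Step 1: estimate state and year effects on untreated observations only
qui reg dep_var i.state i.year if treat == 0, nocons
* Residualize the outcome for all observations
predict adj, residuals
* Step 2: regress the residualized outcome on treatment
reg adj i.treat, cluster(state) nocons
* Same point estimate with did2s, which corrects the standard errors
did2s dep_var, first_stage(i.state i.year) treat_formula(i.treat) treat_var(treat) cluster(state)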
Linear regression Number of obs = 31,000
F(1, 39) = 2787.70
Prob > F = 0.0000
R-squared = 0.3776
Root MSE = 1.7506
(Std. Err. adjusted for 40 clusters in state)
------------------------------------------------------------------------------
| Robust
adj | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
1.treat | 2.380208 .0450809 52.80 0.000 2.289024 2.471393
------------------------------------------------------------------------------
(Std. Err. adjusted for clustering on state)
------------------------------------------------------------------------------
| Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
1.treat | 2.380208 .0614314 38.75 0.000 2.259805 2.500612
------------------------------------------------------------------------------
(2) Event-study TWFE model
We can also run an event study by changing treat_formula:
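use data/df_het.dta, clear
* Factor variables cannot take negative values, so shift rel_year by 20
* (level 20 then corresponds to the period of initial treatment)
gen rel_year_shift = rel_year + 20
* Observations with missing rel_year get level 100, used below as the base category
replace rel_year_shift = 100 if rel_year_shift == .
did2s dep_var, first_stage(i.state i.year) treat_formula(ib100.rel_year_shift) treat_var(treat) cluster(state)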
(11,408 missing values generated)
(11,408 real changes made)
(Std. Err. adjusted for clustering on state)
--------------------------------------------------------------------------------
| Coef. Std. Err. z P>|z| [95% Conf. Interval]
---------------+----------------------------------------------------------------
rel_year_shift |
0 | .0746601 .0839355 0.89 0.374 -.0898505 .2391707
1 | .155387 .0793007 1.96 0.050 -.0000395 .3108135
2 | .0433077 .0861965 0.50 0.615 -.1256343 .2122497
3 | .0801822 .0804671 1.00 0.319 -.0775303 .2378948
4 | .1027144 .088289 1.16 0.245 -.0703289 .2757576
5 | .2168214 .0947375 2.29 0.022 .0311394 .4025035
6 | .1711757 .0839522 2.04 0.041 .0066325 .3357189
7 | .0943497 .0806924 1.17 0.242 -.0638044 .2525039
8 | .0902389 .0839479 1.07 0.282 -.074296 .2547738
9 | .1980108 .0799579 2.48 0.013 .0412963 .3547253
10 | .1079317 .0650773 1.66 0.097 -.0196175 .235481
11 | .0512958 .0586111 0.88 0.381 -.0635799 .1661715
12 | .0877925 .0403538 2.18 0.030 .0087006 .1668845
13 | .1542725 .0439659 3.51 0.000 .0681009 .2404441
14 | .0221227 .0509763 0.43 0.664 -.077789 .1220343
15 | .0351602 .0489502 0.72 0.473 -.0607804 .1311009
16 | -.0508045 .0504791 -1.01 0.314 -.1497417 .0481326
17 | -.0093709 .049563 -0.19 0.850 -.1065126 .0877707
18 | .008913 .0564742 0.16 0.875 -.1017744 .1196004
19 | .1179371 .0515514 2.29 0.022 .0168982 .2189759
20 | 1.72709 .0827019 20.88 0.000 1.564997 1.889183
21 | 1.752237 .0798446 21.95 0.000 1.595744 1.908729
22 | 1.871322 .0929648 20.13 0.000 1.689114 2.053529
23 | 1.918404 .0755407 25.40 0.000 1.770347 2.066461
24 | 1.939901 .0841578 23.05 0.000 1.774955 2.104847
25 | 2.145896 .0846846 25.34 0.000 1.979917 2.311874
26 | 2.180405 .0920294 23.69 0.000 2.000031 2.360779
27 | 2.347653 .0818133 28.70 0.000 2.187302 2.508004
28 | 2.413051 .0764681 31.56 0.000 2.263176 2.562925
29 | 2.619696 .10755 24.36 0.000 2.408901 2.83049
30 | 2.681013 .0954122 28.10 0.000 2.494008 2.868017
31 | 2.712357 .1203332 22.54 0.000 2.476509 2.948206
32 | 2.671891 .1532795 17.43 0.000 2.371469 2.972314
33 | 2.65582 .1224423 21.69 0.000 2.415837 2.895802
34 | 2.754776 .1293034 21.30 0.000 2.501346 3.008206
35 | 2.823113 .1341072 21.05 0.000 2.560267 3.085958
36 | 2.693967 .1199888 22.45 0.000 2.458793 2.929141
37 | 2.896505 .1265275 22.89 0.000 2.648515 3.144494
38 | 3.130011 .1160092 26.98 0.000 2.902638 3.357385
39 | 3.23059 .1235021 26.16 0.000 2.98853 3.472649
40 | 3.307945 .1119849 29.54 0.000 3.088458 3.527431
--------------------------------------------------------------------------------
This approach can also include predetermined covariates in the first stage:
********************************************************************************
* Castle Doctrine
********************************************************************************
use https://github.com/scunning1975/mixtape/raw/master/castle.dta, clear
* Define Covariates
global demo blackm_15_24 whitem_15_24 blackm_25_44 whitem_25_44
* No Covariates
did2s l_homicide [aweight=popwt], first_stage(i.sid i.year) treat_formula(i.post) treat_var(post) cluster(sid)
* Covariates
did2s l_homicide [aweight=popwt], first_stage(i.sid i.year $demo) treat_formula(i.post) treat_var(post) cluster(sid)
(Std. Err. adjusted for clustering on sid)
------------------------------------------------------------------------------
| Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
1.post | .1454483 .1012669 1.44 0.151 -.0530313 .3439279
------------------------------------------------------------------------------
(Std. Err. adjusted for clustering on sid)
------------------------------------------------------------------------------
| Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
1.post | .0802279 .0540177 1.49 0.137 -.0256447 .1861006
------------------------------------------------------------------------------
References
Gardner, John. 2021. “Two-Stage Difference-in-Differences.” Working Paper. https://jrgcmu.github.io/2sdd_current.pdf
“Two-Stage Difference-in-Differences.” did2s package vignette. https://kylebutts.com/did2s/articles/Two-Stage-Difference-in-Differences.html
“Two Stage DiD and Taming the DiD Revolution.” https://causalinf.substack.com/p/two-stage-did-and-taming-the-did
For the R package, see https://kylebutts.com/did2s/articles/Two-Stage-Difference-in-Differences.html