Two-Stage DID and Its Stata Command / 四六文摘

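Researchers typically do one of two things:

(1) estimate a static two-way fixed effects (TWFE) model

$$y_{it} = \mu_i + \mu_t + \tau D_{it} + \varepsilon_{it},$$

where $\mu_i$ is the unit fixed effect, $\mu_t$ is the time fixed effect, and $D_{it}$ is a dummy equal to one when unit $i$ is treated in period $t$; or

(2) estimate an event-study TWFE model

$$y_{it} = \mu_i + \mu_t + \sum_{k=-L}^{-2} \tau^k D_{it}^k + \sum_{k=0}^{K} \tau^k D_{it}^k + \varepsilon_{it},$$

where $D_{it}^k$ are lead/lag treatment dummies (equal to one when unit $i$ is $k$ periods away from its initial treatment period), with one pre-period, conventionally $k = -1$, omitted as the reference.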

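Researchers sometimes use variants of these models in which leads and lags are binned or dropped. However, running OLS on either model generally does not recover the average treatment effect on the treated (ATT), and when treatment effects are heterogeneous the OLS estimates can be severely biased (Borusyak et al. (2021); Callaway and Sant’Anna (2020); de Chaisemartin and d’Haultfoeuille (2020); Goodman-Bacon (2020); Sun and Abraham (2020)).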

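One way to understand the problem is through the Frisch-Waugh-Lovell (FWL) theorem. Estimating the unit and time fixed effects produces a residualized outcome $\tilde y_{it}$, commonly described as the outcome after removing time shocks and fixed unit characteristics. But it also produces a residualized treatment variable $\tilde D_{it}$ or $\tilde D_{it}^k$. Put simply, it is this residualized treatment variable that makes $\tau$ or $\tau^k$ hard to interpret, especially when treatment effects are heterogeneous.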
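The FWL argument can be made concrete with a short derivation. The sketch below assumes a balanced panel and an outcome model with observation-specific effects, $y_{it} = \mu_i + \mu_t + \tau_{it} D_{it} + \varepsilon_{it}$; the weight notation $w_{it}$ is introduced here for illustration only and does not appear in the original text.

```latex
\begin{aligned}
\hat\tau^{\text{TWFE}}
  &= \frac{\sum_{i,t} \tilde D_{it}\, y_{it}}{\sum_{i,t} \tilde D_{it}^{2}},
  \qquad
  \tilde D_{it} = D_{it} - \bar D_{i\cdot} - \bar D_{\cdot t} + \bar D, \\[4pt]
  &= \sum_{(i,t):\, D_{it}=1} w_{it}\, \tau_{it}
     \;+\; \frac{\sum_{i,t} \tilde D_{it}\, \varepsilon_{it}}{\sum_{i,t} \tilde D_{it}^{2}},
  \qquad
  w_{it} = \frac{\tilde D_{it}}{\sum_{j,s} \tilde D_{js}^{2}} .
\end{aligned}
```

The fixed effects drop out because $\tilde D_{it}$ is orthogonal to the unit and time dummies. The weights $w_{it}$ sum to one over treated observations, but nothing forces them to be positive: a treated observation with $\tilde D_{it} < 0$ (for example, a late period of an early-treated unit) enters with a negative weight, so with heterogeneous $\tau_{it}$ the coefficient need not equal the ATT.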

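Gardner (2021) addresses exactly this problem. He estimates $\mu_i$ and $\mu_t$ separately, so the treatment variable is never residualized. In the absence of treatment, the TWFE model reduces to a model for untreated outcomes

$$y_{it}(0) = \mu_i + \mu_t + \varepsilon_{it}.$$

Therefore, if we can consistently estimate $y_{it}(0)$, we can impute the untreated outcome and subtract it from the observed outcome $y_{it}$. The difference $y_{it} - \hat y_{it}(0)$ should be close to 0 for control observations and close to $\tau_{it}$ for treated observations. Regressing $y_{it} - \hat y_{it}(0)$ on the treatment variables then yields an unbiased estimate of the treatment effect (static or event-study). The same logic underlies Borusyak, Jaravel, and Spiess (2021).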
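The imputation argument can be written out in a few lines. This is a sketch under an assumed outcome model $y_{it} = \mu_i + \mu_t + \tau_{it} D_{it} + \varepsilon_{it}$ with $E[\varepsilon_{it} \mid D_{it}] = 0$ (parallel trends); the observation-specific $\tau_{it}$ notation is an assumption consistent with the text, not a quotation from Gardner's paper.

```latex
\begin{aligned}
y_{it} - \hat y_{it}(0)
  &= (\mu_i - \hat\mu_i) + (\mu_t - \hat\mu_t) + \tau_{it} D_{it} + \varepsilon_{it} \\
  &\approx \tau_{it} D_{it} + \varepsilon_{it}, \\[4pt]
E\!\left[\, y_{it} - \hat y_{it}(0) \mid D_{it} = 1 \right]
  &\approx E\!\left[\tau_{it} \mid D_{it} = 1\right] = \text{ATT}.
\end{aligned}
```

The approximation holds to the extent that the fixed effects are estimated consistently from untreated observations. Because the treatment dummy enters the final regression without being residualized, every treated observation receives the same weight.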

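The two-stage estimator therefore proceeds as follows:

(1) Estimate $\mu_i$ and $\mu_t$ using only untreated observations, i.e. the subsample with $D_{it} = 0$, and residualize the outcome: $\tilde y_{it} = y_{it} - \hat\mu_i - \hat\mu_t$;

(2) Regress $\tilde y_{it}$ on $D_{it}$ or on the $D_{it}^k$ to estimate the treatment effect $\tau$ or $\tau^k$.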
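The two steps can be sketched by hand on simulated data. Everything in the block below is hypothetical: the variable names (unit, t, g, D, tau, y, ytilde), the cohort structure, and the data-generating process are assumptions chosen only to illustrate the mechanics with heterogeneous effects. It is not the implementation used by any package.

```stata
* Sketch on simulated data; all names and parameter values are hypothetical.
clear
set seed 20210601
set obs 40
gen unit = _n
* Assumed cohorts: units 1-15 adopt at t = 8, units 16-30 at t = 14, the rest never.
gen g = cond(unit <= 15, 8, cond(unit <= 30, 14, .))
gen mu_i = rnormal()                      // unit fixed effect
expand 20                                 // 20 periods per unit
bysort unit: gen t = _n
gen byte D = !missing(g) & t >= g         // 0/1 treatment indicator
* Heterogeneous effect that grows with time since adoption.
gen tau = cond(D == 1, 1 + 0.2 * (t - g), 0)
gen y = mu_i + 0.1 * t + tau + rnormal(0, 0.5)

* Stage 1: unit and time fixed effects from untreated observations only.
quietly regress y i.unit i.t if D == 0
* predict fills in residuals for all observations, treated ones included.
predict double ytilde, residuals

* Stage 2: regress the residualized outcome on the treatment dummy.
* These standard errors ignore the first-stage estimation error.
regress ytilde D, noconstant vce(cluster unit)

* Benchmarks: the true ATT in the simulated data and the conventional TWFE estimate.
summarize tau if D == 1
regress y D i.unit i.t, vce(cluster unit)
```

Comparing the stage-2 coefficient and the TWFE coefficient with the mean of tau among treated observations shows how each estimator behaves in this design. The manual second stage is useful for intuition only, since its standard errors do not account for the estimated first stage.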

Stata command

First, install did2s from GitHub:

net install did2s, from("https://raw.githubusercontent.com/kylebutts/did2s_stata/main/ado/")
* ssc install did2s

The syntax of did2s is:

did2s depvar [if] [in] [weight], first_stage(varlist) treat_formula(varlist) treat_var(varname) cluster(varname)

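first_stage: the first-stage formula. It may include fixed effects and covariates but must not include the treatment variable.

treat_formula: the second-stage formula. This should contain the treatment variables, e.g. a treatment dummy or lead/lag treatment dummies.

treat_var: the 0/1 treatment variable.

Run help did2s in Stata for the full documentation.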

Stata examples:

(1) Static TWFE model

****************************************************************************

* Static

****************************************************************************

use data/df_het.dta

* Manually (note standard errors are off)

qui reg dep_var i.state i.year if treat == 0, nocons

predict adj, residuals

reg adj i.treat, cluster(state) nocons

* With did2s standard error correction

did2s dep_var, first_stage(i.state i.year) treat_formula(i.treat) treat_var(treat) cluster(state)

Linear regression                               Number of obs     =     31,000
                                                F(1, 39)          =    2787.70
                                                Prob > F          =     0.0000
                                                R-squared         =     0.3776
                                                Root MSE          =     1.7506

                                 (Std. Err. adjusted for 40 clusters in state)
------------------------------------------------------------------------------
             |               Robust
         adj |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     1.treat |   2.380208   .0450809    52.80   0.000     2.289024    2.471393
------------------------------------------------------------------------------

                                  (Std. Err. adjusted for clustering on state)
------------------------------------------------------------------------------
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     1.treat |   2.380208   .0614314    38.75   0.000     2.259805    2.500612
------------------------------------------------------------------------------

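(2) Event-study TWFE model

An event study is obtained by changing treat_formula. Because Stata factor variables cannot take negative values, relative time is shifted by 20 in the code that follows, so rel_year_shift = 20 corresponds to the year of treatment. Observations with missing relative time are assigned the value 100, which serves as the reference category.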

use data/df_het.dta, clear

* can not have negatives in factor variable

gen rel_year_shift = rel_year + 20

replace rel_year_shift = 100 if rel_year_shift == .

did2s dep_var, first_stage(i.state i.year) treat_formula(ib100.rel_year_shift) treat_var(treat) cluster(state)

(11,408 missing values generated)

(11,408 real changes made)

                                    (Std. Err. adjusted for clustering on state)
--------------------------------------------------------------------------------
               |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
---------------+----------------------------------------------------------------
rel_year_shift |
            0  |   .0746601   .0839355     0.89   0.374    -.0898505    .2391707
            1  |    .155387   .0793007     1.96   0.050    -.0000395    .3108135
            2  |   .0433077   .0861965     0.50   0.615    -.1256343    .2122497
            3  |   .0801822   .0804671     1.00   0.319    -.0775303    .2378948
            4  |   .1027144    .088289     1.16   0.245    -.0703289    .2757576
            5  |   .2168214   .0947375     2.29   0.022     .0311394    .4025035
            6  |   .1711757   .0839522     2.04   0.041     .0066325    .3357189
            7  |   .0943497   .0806924     1.17   0.242    -.0638044    .2525039
            8  |   .0902389   .0839479     1.07   0.282     -.074296    .2547738
            9  |   .1980108   .0799579     2.48   0.013     .0412963    .3547253
           10  |   .1079317   .0650773     1.66   0.097    -.0196175     .235481
           11  |   .0512958   .0586111     0.88   0.381    -.0635799    .1661715
           12  |   .0877925   .0403538     2.18   0.030     .0087006    .1668845
           13  |   .1542725   .0439659     3.51   0.000     .0681009    .2404441
           14  |   .0221227   .0509763     0.43   0.664     -.077789    .1220343
           15  |   .0351602   .0489502     0.72   0.473    -.0607804    .1311009
           16  |  -.0508045   .0504791    -1.01   0.314    -.1497417    .0481326
           17  |  -.0093709    .049563    -0.19   0.850    -.1065126    .0877707
           18  |    .008913   .0564742     0.16   0.875    -.1017744    .1196004
           19  |   .1179371   .0515514     2.29   0.022     .0168982    .2189759
           20  |    1.72709   .0827019    20.88   0.000     1.564997    1.889183
           21  |   1.752237   .0798446    21.95   0.000     1.595744    1.908729
           22  |   1.871322   .0929648    20.13   0.000     1.689114    2.053529
           23  |   1.918404   .0755407    25.40   0.000     1.770347    2.066461
           24  |   1.939901   .0841578    23.05   0.000     1.774955    2.104847
           25  |   2.145896   .0846846    25.34   0.000     1.979917    2.311874
           26  |   2.180405   .0920294    23.69   0.000     2.000031    2.360779
           27  |   2.347653   .0818133    28.70   0.000     2.187302    2.508004
           28  |   2.413051   .0764681    31.56   0.000     2.263176    2.562925
           29  |   2.619696     .10755    24.36   0.000     2.408901     2.83049
           30  |   2.681013   .0954122    28.10   0.000     2.494008    2.868017
           31  |   2.712357   .1203332    22.54   0.000     2.476509    2.948206
           32  |   2.671891   .1532795    17.43   0.000     2.371469    2.972314
           33  |    2.65582   .1224423    21.69   0.000     2.415837    2.895802
           34  |   2.754776   .1293034    21.30   0.000     2.501346    3.008206
           35  |   2.823113   .1341072    21.05   0.000     2.560267    3.085958
           36  |   2.693967   .1199888    22.45   0.000     2.458793    2.929141
           37  |   2.896505   .1265275    22.89   0.000     2.648515    3.144494
           38  |   3.130011   .1160092    26.98   0.000     2.902638    3.357385
           39  |    3.23059   .1235021    26.16   0.000      2.98853    3.472649
           40  |   3.307945   .1119849    29.54   0.000     3.088458    3.527431
--------------------------------------------------------------------------------

The method also accommodates predetermined covariates, which are added to the first stage:

********************************************************************************

* Castle Doctrine

********************************************************************************

use https://github.com/scunning1975/mixtape/raw/master/castle.dta, clear

* Define Covariates

global demo blackm_15_24 whitem_15_24 blackm_25_44 whitem_25_44

* No Covariates

did2s l_homicide [aweight=popwt], first_stage(i.sid i.year) treat_formula(i.post) treat_var(post) cluster(sid)

* Covariates

did2s l_homicide [aweight=popwt], first_stage(i.sid i.year $demo) treat_formula(i.post) treat_var(post) cluster(sid)

                                    (Std. Err. adjusted for clustering on sid)
------------------------------------------------------------------------------
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      1.post |   .1454483   .1012669     1.44   0.151    -.0530313    .3439279
------------------------------------------------------------------------------

                                    (Std. Err. adjusted for clustering on sid)
------------------------------------------------------------------------------
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      1.post |   .0802279   .0540177     1.49   0.137    -.0256447    .1861006
------------------------------------------------------------------------------

References

Gardner, John. 2021. “Two-Stage Difference-in-Differences.” Working Paper. https://jrgcmu.github.io/2sdd_current.pdf
Two-Stage Difference-in-Differences. https://kylebutts.com/did2s/articles/Two-Stage-Difference-in-Differences.html
Two Stage DiD and Taming the DiD Revolution. https://causalinf.substack.com/p/two-stage-did-and-taming-the-did

For the R package, see https://kylebutts.com/did2s/articles/Two-Stage-Difference-in-Differences.html
