Is Staggered DID Really Valid?

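Building a difference-in-differences (DID) model around a policy shock is a standard approach in empirical accounting and finance research. Staggered DID (also called multi-period DID) extends the classic DID design. Whereas classic DID assumes a single, uniform treatment date, staggered DID handles a policy that is rolled out gradually across the affected population (staggered adoption, for example different provinces adopting the same policy at different times), and it has been widely used over the past 20 years. According to Baker et al. (2021), the international top-5 accounting and finance journals published or accepted 751 papers using DID between 2000 and 2019, and 49% of them used staggered DID.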

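Most scholars used to regard staggered DID as more robust than classic DID, because it is better at ruling out contemporaneous trends that could confound the treatment effect. Over the past year, however, many econometricians have pointed out that staggered DID can easily produce invalid estimates (e.g., Callaway and Sant'Anna, 2020; de Chaisemartin and D'Haultfoeuille, 2020; Goodman-Bacon, 2020). Recently, three accounting and finance scholars from Stanford University and Harvard University, Andrew C. Baker, David F. Larcker, and Charles C.Y. Wang, summarized these arguments in the working paper "How Much Should We Trust Staggered Difference-In-Differences Estimates?" (JFE Revise & Resubmit)*, which examines how the flaws of staggered DID bias inference in empirical accounting and finance research. This post distills the paper's main points.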

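Why does staggered DID tend to bias inference? The core reason is that a staggered DID estimate is essentially a weighted average of several different treatment effects. The treatment and control groups change from one treatment date to the next and can have different or even opposite-signed treatment effects, so the weighted average can differ substantially from the effect of interest. This post borrows the figures in Baker et al. (2021) to illustrate**. Suppose the policy is rolled out at two dates, t_k and t_l, and the sample contains three groups: a group that is never treated (U), a group treated early (k), and a group treated late (l). The resulting staggered treatment is shown in Figure 1.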

Figure 1   Staggered Treatment Setting Example (three groups)

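This staggered setting can be split into four classic 2×2 (pre-post, treat-control) DID comparisons, as shown in Figure 2. The staggered DID estimate is essentially a weighted average of these four classic DID estimates.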

Figure 2   Four Simple (2×2) DID from the Three Group Case
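The decomposition can be written out for the three-group case. The notation below follows Goodman-Bacon (2020) rather than this post, so the symbols are an assumption about how the figures map to formulas; it sketches the structure only and omits the exact weight formulas.

```latex
% Static two-way fixed effects (TWFE) regression commonly used for staggered DID
y_{it} = \alpha_i + \lambda_t + \beta^{DD} D_{it} + \varepsilon_{it}

% Three groups (U, k, l): the estimate is a weighted average of four 2x2 DIDs
\hat{\beta}^{DD}
  = s_{kU}\,\hat{\beta}^{2\times 2}_{kU}
  + s_{lU}\,\hat{\beta}^{2\times 2}_{lU}
  + s^{k}_{kl}\,\hat{\beta}^{2\times 2,k}_{kl}
  + s^{l}_{kl}\,\hat{\beta}^{2\times 2,l}_{kl},
\qquad s_{kU} + s_{lU} + s^{k}_{kl} + s^{l}_{kl} = 1

% The comparison that uses the already-treated early group k as the control for l
\hat{\beta}^{2\times 2,l}_{kl}
  = \left(\bar{y}_l^{POST(l)} - \bar{y}_l^{MID(k,l)}\right)
  - \left(\bar{y}_k^{POST(l)} - \bar{y}_k^{MID(k,l)}\right)
```

Here MID(k,l) is the window between t_k and t_l, and POST(l) is the window after t_l. If group k's treatment effect is still changing after t_l, that change is subtracted in the last expression, which is how an already-treated control group can bias the overall estimate.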

Comparing Figures 1 and 2 yields several important takeaways:

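1) A staggered DID estimate is essentially a weighted average of several classic 2×2 DID estimates. The weights are a function of the size of each subsample (the sample behind each classic 2×2 DID), the relative sizes of its treatment and control groups, and the timing of treatment within the subsample.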

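2) Observations that have already been treated can still serve as controls (Figure 2, panel D), even though they are not themselves a control group!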

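3) Even when the treatment effect is constant throughout the rollout (it neither decays nor grows over time), the length of the sample window alone is enough to change the staggered DID estimate substantially, because changing the window changes the size of each subsample and therefore the weight it receives.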

4) Treatment groups in the middle of the panel (for example, between the two policy dates) receive higher weights.

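These four takeaways show that staggered DID is highly prone to bias, especially when the treatment effect varies over the course of the rollout (heterogeneous treatment effects). Baker et al. (2021) demonstrate the bias with simulated data and show that it can even flip the sign of the true treatment effect. The authors also re-examine three highly influential staggered DID papers after correcting for the bias: Beck et al. (JF 2010) on bank deregulation, Fauver et al. (JFE 2017) on board governance reforms worldwide, and Wang et al. (JFE 2021) on the legalization of open-market share repurchases. Baker et al. (2021) find that, once the bias is corrected, the findings of these three papers can hardly be reproduced.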
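The mechanism can be reproduced in a few lines. The sketch below is not the simulation in Baker et al. (2021); it is a noise-free toy panel with one unit per group (U, k, l), six periods, and made-up effect sizes, all of which are assumptions for illustration. It estimates the two-way fixed effects coefficient by double-demeaning the treatment dummy, which is equivalent to OLS with unit and time fixed effects in a balanced panel.

```python
import numpy as np


def twfe_coefficient(y, d):
    """TWFE coefficient on d for a balanced panel (rows = units, cols = periods)."""
    # Double-demean the treatment dummy: remove unit means and period means.
    d_tilde = (d - d.mean(axis=1, keepdims=True)
                 - d.mean(axis=0, keepdims=True) + d.mean())
    # Frisch-Waugh-Lovell: regress y on the residualized treatment dummy.
    return float((d_tilde * y).sum() / (d_tilde ** 2).sum())


n_periods = 6
first_treated = [None, 2, 4]            # hypothetical: U never, k at t=2, l at t=4
unit_fe = np.array([1.0, 3.0, 5.0])     # arbitrary unit fixed effects
time_fe = 0.5 * np.arange(n_periods)    # arbitrary common time trend

d = np.zeros((3, n_periods))
effect_constant = np.zeros((3, n_periods))
effect_growing = np.zeros((3, n_periods))
for i, g in enumerate(first_treated):
    if g is None:
        continue
    for t in range(g, n_periods):
        d[i, t] = 1.0
        effect_constant[i, t] = 2.0          # same effect in every treated cell
        effect_growing[i, t] = t - g + 1     # effect grows with time since adoption

baseline = unit_fe[:, None] + time_fe[None, :]
beta_constant = twfe_coefficient(baseline + effect_constant, d)
beta_growing = twfe_coefficient(baseline + effect_growing, d)
true_att_growing = float(effect_growing[d == 1].mean())

print("constant effect: TWFE =", round(beta_constant, 4), "true ATT = 2.0")
print("growing effect:  TWFE =", round(beta_growing, 4),
      "true ATT =", round(true_att_growing, 4))
```

With a constant effect the TWFE coefficient matches the true effect. With effects that grow after adoption it falls below the average effect on the treated observations: in this toy panel the residualized treatment dummy is zero for group k in the last two periods, so group k's largest effects receive no weight.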

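How can bias in staggered DID be diagnosed and addressed? Baker et al. (2021) present several approaches, including the Goodman-Bacon diagnostic, the Callaway and Sant'Anna estimator (which builds on a reweighting idea), and stacked regression, and they close with eight recommendations for researchers who use staggered DID. For reasons of space, this post does not cover them in detail. It is worth noting, though, that the Stata package "did_multiplegt" developed by de Chaisemartin and D'Haultfoeuille (AER 2020) and the R package "did" developed by Callaway and Sant'Anna (2020) both offer empirical researchers a convenient way to address the problem.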
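To give a flavor of the "clean comparison" idea behind these estimators, the sketch below computes a separate effect for each treated group and period, always using the never-treated group as the control and the last pre-treatment period as the baseline. It is only in the spirit of Callaway and Sant'Anna's group-time average treatment effects: it is not the "did" package, it ignores covariates and inference, and the toy panel and the function name `group_time_att` are hypothetical.

```python
import numpy as np


def group_time_att(y, first_treated, never_treated_row):
    """2x2 DID for every (treated group, post period) against a never-treated group.

    y: array of group-mean outcomes, shape (n_groups, n_periods).
    first_treated: dict mapping a row index to its first treated period.
    """
    att = {}
    for row, g in first_treated.items():
        base = g - 1  # last period before this group is treated
        for t in range(g, y.shape[1]):
            treated_change = y[row, t] - y[row, base]
            control_change = y[never_treated_row, t] - y[never_treated_row, base]
            att[(row, t)] = treated_change - control_change
    return att


# Hypothetical noise-free panel: row 0 = U (never treated),
# row 1 = k (treated from t=2), row 2 = l (treated from t=4).
n_periods = 6
cohorts = {1: 2, 2: 4}
y = np.array([1.0, 3.0, 5.0])[:, None] + 0.5 * np.arange(n_periods)[None, :]
for row, g in cohorts.items():
    for t in range(g, n_periods):
        y[row, t] += t - g + 1   # assumed effect: grows by 1 each period after adoption

att = group_time_att(y, cohorts, never_treated_row=0)
average_att = float(np.mean(list(att.values())))

for (row, t), value in sorted(att.items()):
    print("group row", row, "period", t, "ATT", round(float(value), 4))
print("simple average of group-time ATTs:", round(average_att, 4))
```

Because already-treated observations never enter as controls, each group-time effect in this toy panel equals the effect that was built into the data. How the group-time effects are then averaged into a single number is a separate choice; the simple average here is just one option.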

Notes from the author of this post:

* Interestingly, the paper's title is a nod to (an imitation of) the classic paper by Bertrand et al. (QJE 2004), "How Much Should We Trust Differences-In-Differences Estimates?"

** This set of example figures originally comes from Goodman-Bacon (2020).

How Much Should We Trust Staggered Difference-In-Differences Estimates?

Andrew C. Baker

David F. Larcker

Charles C.Y. Wang

Revise and Resubmit at Journal of Financial Economics (updated in March 2021)

(Submission information is disclosed by Andrew C. Baker)

Abstract: Difference-in-differences analysis with staggered treatment timing is frequently used to assess the impact of policy changes on corporate outcomes in academic research. However, recent advances in econometric theory show that such designs are likely to be biased in the presence of treatment effect heterogeneity. Given the pronounced use of staggered treatment designs in applied corporate finance and accounting research, this finding potentially impacts a large swath of prior findings in these fields. We survey the nascent literature and document how and when such bias arises from treatment effect heterogeneity. We apply recently proposed methods to a set of prior published results, and find that correcting for the bias induced by the staggered nature of policy adoption frequently impacts the estimated effect from standard difference-in-difference studies. In many cases, the reported effects in prior research become indistinguishable from zero.

References for this post:

Beck, T., R. Levine, and A. Levkov (2010). Big Bad Banks? The Winners and Losers from Bank Deregulation in the United States. Journal of Finance.

Bertrand, M., E. Duflo, and S. Mullainathan (2004). How Much Should We Trust Differences-In-Differences Estimates? Quarterly Journal of Economics.

Callaway, B. and P. H. Sant'Anna (2020). Difference-in-Differences with Multiple Time Periods. Journal of Econometrics.

de Chaisemartin, C. and X. D'Haultfoeuille (2020). Two-Way Fixed Effects Estimators with Heterogeneous Treatment Effects. American Economic Review.

Fauver, L., M. Hung, X. Li, and A. G. Taboada (2017). Board Reforms and Firm Value: Worldwide Evidence. Journal of Financial Economics.

Goodman-Bacon, A. (2020). Difference-in-Differences with Variation in Treatment Timing. NBER Working Paper.

Wang, Z., Q. E. Yin, and L. Yu (2021). Real Effects of Share Repurchases Legalization on Corporate Behaviors. Journal of Financial Economics.
