Comparison of Stata and R

Author

Kazuharu Yanagimoto

Published

May 26, 2023

Replication

Simulation

1. Generating Data

Stata

R

2. Diagnosis

Goodman-Bacon (2021)

Stata

R

Jakiela (2021)

Stata

R

Stata

R

3. Estimation Results

Stata

R

Stata

R

Application: Replication of Cicala (2022)

1. Data Overview

Stata

R

2. Diagnosis

Stata

R

Stata

R

3. Estimation Results

Stata

R

Stata

R

Discussion

Benchmark

Simulation

Method                                 Stata (mm:ss:mmm)   R (mm:ss:mmm)
Sun and Abraham                        00:04:000           00:00:153
Callaway and Sant'Anna                 00:11:000           00:01:442
de Chaisemartin and D'Haultfoeuille¹   08:29:000           42:08:740
Borusyak, Jaravel, Spiess              00:10:000           00:01:333
Gardner                                00:04:000           00:02:939
¹ Bootstrap with 100 replications, single thread.

Application

Method                                 Stata (hh:mm:ss)   R (hh:mm:ss)
Sun and Abraham                        06:00:40           00:02:04
Callaway and Sant'Anna¹                07:15:18           04:16:37
de Chaisemartin and D'Haultfoeuille¹   01:22:54           02:18:19
Borusyak, Jaravel, Spiess              00:03:41           00:01:30
Gardner                                00:00:19           00:00:05
¹ Bootstrap with 50 replications, single thread.
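  • Stata's reghdfe and R's fixest, the packages for regressions with fixed effects, differ in speed by a factor of several to several dozen. See fixest: Benchmark.
  • TWFE-based methods (Sun and Abraham (2021)) and imputation methods (Borusyak, Jaravel, and Spiess (2022), Gardner (2022)) can use fixest, so they run considerably faster in R than in Stata.
  • Callaway, Goodman-Bacon, and Sant’Anna (2021) and de Chaisemartin and D’Haultfœuille (2020) can be computed with multiple threads, so in practice they may run several times faster. However, multithreading was not supported in the Windows version of R.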
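To put numbers on the speed differences, the timings in the two tables can be turned into Stata/R ratios. The following is a minimal sketch in plain Python; the helper names `to_seconds` and `speedups` are hypothetical, and the only inputs are the timings reported in the tables above.

```python
# Timings copied from the benchmark tables: (Stata, R).
SIMULATION = {  # format mm:ss:mmm
    "Sun and Abraham": ("00:04:000", "00:00:153"),
    "Callaway and Sant'Anna": ("00:11:000", "00:01:442"),
    "de Chaisemartin and D'Haultfoeuille": ("08:29:000", "42:08:740"),
    "Borusyak, Jaravel, Spiess": ("00:10:000", "00:01:333"),
    "Gardner": ("00:04:000", "00:02:939"),
}
APPLICATION = {  # format hh:mm:ss
    "Sun and Abraham": ("06:00:40", "00:02:04"),
    "Callaway and Sant'Anna": ("07:15:18", "04:16:37"),
    "de Chaisemartin and D'Haultfoeuille": ("01:22:54", "02:18:19"),
    "Borusyak, Jaravel, Spiess": ("00:03:41", "00:01:30"),
    "Gardner": ("00:00:19", "00:00:05"),
}


def to_seconds(stamp, fmt):
    """Convert a timing string in one of the two table formats to seconds."""
    a, b, c = (int(part) for part in stamp.split(":"))
    if fmt == "mm:ss:mmm":
        return a * 60 + b + c / 1000
    if fmt == "hh:mm:ss":
        return a * 3600 + b * 60 + c
    raise ValueError(f"unknown format: {fmt}")


def speedups(table, fmt):
    """Return Stata time divided by R time for each method (>1 means R is faster)."""
    return {
        name: to_seconds(stata, fmt) / to_seconds(r, fmt)
        for name, (stata, r) in table.items()
    }


if __name__ == "__main__":
    for label, table, fmt in (
        ("Simulation", SIMULATION, "mm:ss:mmm"),
        ("Application", APPLICATION, "hh:mm:ss"),
    ):
        print(label)
        for name, ratio in speedups(table, fmt).items():
            print(f"  {name}: {ratio:.2f}x")
```

On these figures, Sun and Abraham comes out at roughly 26x in the simulation, while de Chaisemartin and D'Haultfoeuille is the one method whose ratio falls below 1 in both tables, meaning Stata was faster there.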

Stata ↔︎ R

de Chaisemartin and D’Haultfœuille (2020)

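  • The R package has not been maintained for the past three years and appears not to have kept up with the Stata package. kylebutts/did2s#19
  • It lacks the firstdiff_placebo and weight options of Stata's did_multiplegt. In addition, specifying cluster=pca_modate in the Cicala replication raises an error.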

Gardner (2022)

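  • did2s::did2s cannot compute analytical standard errors for large matrices. This appears to be by design. kylebutts/did2s#12
  • However, setting bootstrap=TRUE also fails with an error that looks like a bug. I therefore used feols to write the same implementation as in Stata.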
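For readers unfamiliar with what a manual two-stage implementation involves, here is a minimal sketch of the point estimate in Gardner (2022), written in plain Python rather than with feols. It is not the author's code: the function names, the row layout, and the toy panel are all hypothetical, and it computes no standard errors (those need a correction for the first-stage estimation, which this sketch omits). Stage 1 fits unit and time effects on untreated observations only, here by simple alternating averaging; stage 2 regresses the adjusted outcome on a binary treatment dummy without an intercept, which reduces to the mean adjusted outcome among treated observations.

```python
def fit_two_way_effects(rows, n_iter=200):
    """Fit y = unit effect + time effect by alternating averaging (backfitting)."""
    unit_fe = dict.fromkeys({r["unit"] for r in rows}, 0.0)
    time_fe = dict.fromkeys({r["time"] for r in rows}, 0.0)
    passes = (("unit", unit_fe, "time", time_fe), ("time", time_fe, "unit", unit_fe))
    for _ in range(n_iter):
        for key, own, other_key, other in passes:
            sums = dict.fromkeys(own, 0.0)
            counts = dict.fromkeys(own, 0)
            for r in rows:
                # Residual after removing the other dimension's current effect.
                sums[r[key]] += r["y"] - other[r[other_key]]
                counts[r[key]] += 1
            for k in own:
                own[k] = sums[k] / counts[k]
    return unit_fe, time_fe


def two_stage_did(rows, n_iter=200):
    """Two-stage DiD point estimate of a single average treatment effect."""
    # Stage 1: estimate fixed effects from untreated observations only.
    untreated = [r for r in rows if r["treated"] == 0]
    unit_fe, time_fe = fit_two_way_effects(untreated, n_iter)
    # Stage 2: remove the fitted effects from treated outcomes and average.
    # A KeyError here means a unit or period has no untreated observation,
    # so its effect cannot be estimated in stage 1.
    adjusted = [
        r["y"] - unit_fe[r["unit"]] - time_fe[r["time"]]
        for r in rows
        if r["treated"] == 1
    ]
    return sum(adjusted) / len(adjusted)


def example_panel(effect=2.0):
    """Noiseless toy panel with staggered adoption (purely illustrative)."""
    first_treated = {1: 3, 2: 3, 3: 4, 4: None}  # unit 4 is never treated
    rows = []
    for unit, start in first_treated.items():
        for time in range(6):
            treated = int(start is not None and time >= start)
            y = 1.0 * unit + 0.5 * time + effect * treated
            rows.append({"unit": unit, "time": time, "y": y, "treated": treated})
    return rows


if __name__ == "__main__":
    print(two_stage_did(example_panel()))
```

Because the toy panel has no noise and a constant effect, the estimate recovers the effect built into the data; with real data the same two steps would typically be run with a fixed-effects regression routine instead of the hand-written loop.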

References

Borusyak, Kirill, Xavier Jaravel, and Jann Spiess. 2022. “Revisiting Event Study Designs: Robust and Efficient Estimation.” arXiv. https://doi.org/10.48550/arXiv.2108.12419.
Callaway, Brantly, Andrew Goodman-Bacon, and Pedro H. C. Sant’Anna. 2021. “Difference-in-Differences with a Continuous Treatment.” arXiv. https://doi.org/10.48550/arXiv.2107.02637.
Cicala, Steve. 2022. “Imperfect Markets Versus Imperfect Regulation in US Electricity Generation.” American Economic Review 112 (2): 409–41. https://doi.org/10.1257/aer.20172034.
de Chaisemartin, Clément, and Xavier D’Haultfœuille. 2020. “Two-Way Fixed Effects Estimators with Heterogeneous Treatment Effects.” American Economic Review 110 (9): 2964–96. https://doi.org/10.1257/aer.20181169.
Gardner, John. 2022. “Two-Stage Differences in Differences.” arXiv. https://doi.org/10.48550/arXiv.2207.05943.
Goodman-Bacon, Andrew. 2021. “Difference-in-Differences with Variation in Treatment Timing.” Journal of Econometrics, Themed Issue: Treatment Effect 1, 225 (2): 254–77. https://doi.org/10.1016/j.jeconom.2021.03.014.
Jakiela, Pamela. 2021. “Simple Diagnostics for Two-Way Fixed Effects.” arXiv. https://doi.org/10.48550/arXiv.2103.13229.
Sun, Liyang, and Sarah Abraham. 2021. “Estimating Dynamic Treatment Effects in Event Studies with Heterogeneous Treatment Effects.” Journal of Econometrics 225 (2): 175–99. https://doi.org/10.1016/j.jeconom.2020.09.006.