  • Fixed effects in Stata

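    Panel data can be estimated in several ways, including pooled OLS, fixed effects (FE), random effects (RE), and least-squares dummy variables (LSDV). The most widely used of these is fixed effects (the within estimator). Stata's official command for the fixed-effects model is xtreg, but it is not always convenient (for example, it imposes requirements on the data format and can run slowly), so reg, areg, and reghdfe are also commonly used to estimate fixed effects.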

    xtreg

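    xtreg, fe is the official command for the fixed-effects model, and the coefficients it reports are the canonical fixed-effects (within) estimator. xtreg requires the data to be declared as panel data: before running it, use xtset to declare the panel structure, defining the cross-sectional (individual) dimension and the time dimension. Adding the fe option tells xtreg to use the within estimator, with the individual fixed effects defined on the cross-sectional dimension set by xtset. Time fixed effects are added by including the dummy variables i.year.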

    The examples below use lin_1992.dta, the data from Justin Yifu Lin's 1992 AER paper "Rural Reforms and Agricultural Growth in China", to demonstrate each command's usage and output.

    . xtset province year
           panel variable:  province (strongly balanced)
            time variable:  year, 70 to 87
                    delta:  1 unit
                    
    . xtreg ltvfo ltlan ltwlab ltpow ltfer hrs mci ngca i.year, fe vce(cluster province)
    
    Fixed-effects (within) regression               Number of obs     =        476
    Group variable: province                        Number of groups  =         28
    
    R-sq:                                           Obs per group:
         within  = 0.8932                                         min =         17
         between = 0.6596                                         avg =       17.0
         overall = 0.7156                                         max =         17
    
                                                    F(23,27)          =     949.82
    corr(u_i, Xb)  = -0.3425                        Prob > F          =     0.0000
    
                                  (Std. Err. adjusted for 28 clusters in province)
    ------------------------------------------------------------------------------
                 |               Robust
           ltvfo |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
           ltlan |   .5833594   .1745834     3.34   0.002     .2251439    .9415749
          ltwlab |   .1514909   .0585107     2.59   0.015     .0314368     .271545
           ltpow |   .0971114    .090911     1.07   0.295    -.0894225    .2836453
           ltfer |   .1693346   .0438098     3.87   0.001     .0794444    .2592248
             hrs |   .1503752   .0587581     2.56   0.016     .0298136    .2709368
             mci |   .1978373   .0810587     2.44   0.022     .0315186     .364156
            ngca |   .7784081   .4016301     1.94   0.063    -.0456688    1.602485
                 |
            year |
             71  |  -.0240404    .023366    -1.03   0.313    -.0719836    .0239027
             72  |  -.1323624   .0404832    -3.27   0.003    -.2154272   -.0492977
             73  |  -.0377336   .0357883    -1.05   0.301     -.111165    .0356979
             74  |   .0058554   .0500774     0.12   0.908     -.096895    .1086058
             75  |   .0096731   .0566898     0.17   0.866    -.1066448    .1259911
             76  |  -.0476465    .061423    -0.78   0.445    -.1736761    .0783832
             77  |  -.0869336   .0680579    -1.28   0.212    -.2265767    .0527096
             78  |  -.0325205   .0766428    -0.42   0.675    -.1897785    .1247376
             79  |  -.0076332   .0833462    -0.09   0.928    -.1786454     .163379
             81  |   -.093479   .1093614    -0.85   0.400    -.3178701    .1309121
             82  |  -.0447862   .1207405    -0.37   0.714    -.2925251    .2029528
             83  |  -.0309435   .1377207    -0.22   0.824     -.313523    .2516361
             84  |   .0442535   .1428764     0.31   0.759    -.2489048    .3374117
             85  |  -.0033372   .1561209    -0.02   0.983    -.3236709    .3169965
             86  |     .00484    .157992     0.03   0.976    -.3193329    .3290129
             87  |   .0386475   .1639608     0.24   0.815    -.2977723    .3750674
                 |
           _cons |   2.651286   .7738994     3.43   0.002     1.063376    4.239196
    -------------+----------------------------------------------------------------
         sigma_u |  .29344594
         sigma_e |  .09930555
             rho |  .89724523   (fraction of variance due to u_i)
    ------------------------------------------------------------------------------

    reg

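    Including a dummy variable for each individual in the regression gives the same coefficient estimates as the fixed-effects within estimator (FE); this equivalence is a proven result. The approach is called the least-squares dummy variable (LSDV) method, and some textbooks and papers also refer to it as fixed-effects estimation. Its advantage is that it produces estimates of the individual heterogeneity, which FE removes through the within transformation. If the number of individuals is large, however, many dummies must be included, which costs many degrees of freedom and may exceed the number of explanatory variables Stata allows.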
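
    The equivalence between LSDV and the within estimator can be written out as follows. This is a sketch in assumed notation, since the post's own formulas did not survive: $y_{it}$ is the outcome, $x_{it}$ the regressors, $u_i$ the individual effect (named after the u_i in Stata's output), $D^{j}_{it}$ a dummy equal to 1 when $i = j$, $N$ the number of individuals, and $T_i$ the number of periods observed for individual $i$.

    ```latex
    \begin{aligned}
    \text{Model:}\quad  & y_{it} = x_{it}'\beta + u_i + \varepsilon_{it} \\
    \text{LSDV:}\quad   & y_{it} = x_{it}'\beta + \sum_{j=1}^{N} u_j D^{j}_{it} + \varepsilon_{it} \\
    \text{Within:}\quad & y_{it} - \bar{y}_i = (x_{it} - \bar{x}_i)'\beta + (\varepsilon_{it} - \bar{\varepsilon}_i),
    \qquad \bar{y}_i = \frac{1}{T_i}\sum_{t} y_{it}
    \end{aligned}
    ```

    Partialling the individual dummies out of $y_{it}$ and $x_{it}$ amounts to subtracting individual means (the Frisch-Waugh-Lovell theorem), so both routes give the same $\hat\beta$.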

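    In Stata, LSDV is run with reg by adding i.id i.year to the regressors, where id is the individual variable and year is the time variable. reg imposes no requirement on the data format, which makes it more flexible, but it prints a long list of dummy-variable estimates.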

    . reg ltvfo ltlan ltwlab ltpow ltfer hrs mci ngca i.province i.year, vce(cluster province)
    
    Linear regression                               Number of obs     =        476
                                                    F(22, 27)         =          .
                                                    Prob > F          =          .
                                                    R-squared         =     0.9695
                                                    Root MSE          =     .09931
    
                                   (Std. Err. adjusted for 28 clusters in province)
    -------------------------------------------------------------------------------
                  |               Robust
            ltvfo |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    --------------+----------------------------------------------------------------
            ltlan |   .5833594   .1800436     3.24   0.003     .2139404    .9527783
           ltwlab |   .1514909   .0603407     2.51   0.018      .027682    .2752998
            ltpow |   .0971114   .0937543     1.04   0.309    -.0952565    .2894792
            ltfer |   .1693346   .0451799     3.75   0.001     .0766331    .2620362
              hrs |   .1503752   .0605958     2.48   0.020      .026043    .2747075
              mci |   .1978373   .0835939     2.37   0.025     .0263169    .3693578
             ngca |   .7784081   .4141914     1.88   0.071    -.0714423    1.628259
                  |
         province |
         beijing  |  -.1865095   .1172887    -1.59   0.123     -.427166     .054147
          fujian  |   .0434646   .0473107     0.92   0.366    -.0536089    .1405381
           gansu  |  -.7945197   .1228202    -6.47   0.000    -1.046526   -.5425134
       guangdong  |  -.0278664   .0609608    -0.46   0.651    -.1529476    .0972149
         guangxi  |  -.2539549   .0614801    -4.13   0.000    -.3801015   -.1278082
         guizhou  |  -.2526439   .0598147    -4.22   0.000    -.3753736   -.1299142
           hebei  |   -.270106   .0948694    -2.85   0.008    -.4647619     -.07545
    heilongjiang  |  -.0926732     .26542    -0.35   0.730      -.63727    .4519237
           henan  |  -.0920743   .0396983    -2.32   0.028    -.1735284   -.0106201
           hubei  |   .1024438   .0368811     2.78   0.010     .0267701    .1781176
           hunan  |  -.0434275   .0581142    -0.75   0.461    -.1626679    .0758129
         jiangsu  |   .1153335   .0352061     3.28   0.003     .0430965    .1875705
         jiangxi  |  -.1401737   .0596644    -2.35   0.026    -.2625949   -.0177525
           jilin  |  -.1783839   .2109985    -0.85   0.405    -.6113171    .2545493
        liaoning  |  -.2517315   .1563399    -1.61   0.119    -.5725145    .0690515
         neimong  |  -.8860432   .2325209    -3.81   0.001    -1.363137   -.4089498
         ningxia  |  -.8489859   .1732579    -4.90   0.000    -1.204482     -.49349
         qinghai  |  -.6982553   .1268849    -5.50   0.000    -.9586017   -.4379089
         shaanxi  |   -.320607   .0887091    -3.61   0.001     -.502623   -.1385911
       shangdong  |   .0040812   .0547494     0.07   0.941    -.1082554    .1164177
        shanghai  |   .0864336   .0982642     0.88   0.387    -.1151878     .288055
          shanxi  |  -.5005347   .1388718    -3.60   0.001     -.785476   -.2155934
         sichuan  |   .0335563   .0392453     0.86   0.400    -.0469685    .1140811
         tianjin  |     -.3011   .1049208    -2.87   0.008    -.5163796   -.0858203
        xinjiang  |  -.3740561   .2053926    -1.82   0.080    -.7954869    .0473746
          yunnan  |  -.2854833   .0590488    -4.83   0.000    -.4066415   -.1643251
        zhejiang  |   .1615248   .0760427     2.12   0.043     .0054981    .3175515
                  |
             year |
              71  |  -.0240404   .0240968    -1.00   0.327     -.073483    .0254022
              72  |  -.1323624   .0417494    -3.17   0.004    -.2180251   -.0466998
              73  |  -.0377336   .0369076    -1.02   0.316    -.1134616    .0379945
              74  |   .0058554   .0516436     0.11   0.911    -.1001086    .1118193
              75  |   .0096731   .0584628     0.17   0.870    -.1102827     .129629
              76  |  -.0476465   .0633441    -0.75   0.458    -.1776178    .0823249
              77  |  -.0869336   .0701864    -1.24   0.226    -.2309442     .057077
              78  |  -.0325205   .0790398    -0.41   0.684    -.1946968    .1296559
              79  |  -.0076332   .0859529    -0.09   0.930    -.1839939    .1687275
              81  |   -.093479   .1127818    -0.83   0.414     -.324888    .1379301
              82  |  -.0447862   .1245167    -0.36   0.722    -.3002733     .210701
              83  |  -.0309435    .142028    -0.22   0.829    -.3223608    .2604739
              84  |   .0442535    .147345     0.30   0.766    -.2580735    .3465804
              85  |  -.0033372   .1610037    -0.02   0.984    -.3336895    .3270151
              86  |     .00484   .1629333     0.03   0.977    -.3294716    .3391516
              87  |   .0386475   .1690888     0.23   0.821    -.3082941    .3855891
                  |
            _cons |   2.874582   .7510459     3.83   0.001     1.333563    4.415601
    -------------------------------------------------------------------------------

    areg

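    areg is a refinement of reg and likewise imposes no requirement on the data structure. When you want to control for a large set of dummies (such as i.id) without creating them or reporting their coefficients, use areg and put the categorical variable to control for inside absorb(). areg can therefore also be used for fixed-effects estimation, because the within estimator and LSDV are equivalent.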

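    In the Stata version used for this post, however, absorb() accepts only one variable, so for two-way or higher-dimensional fixed effects the remaining dimensions must still be entered as dummies via i.var.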

    . areg ltvfo ltlan ltwlab ltpow ltfer hrs mci ngca i.year, absorb(province) vce(cluster province)
    
    Linear regression, absorbing indicators         Number of obs     =        476
    Absorbed variable: province                     No. of categories =         28
                                                    F(  23,     27)   =     893.08
                                                    Prob > F          =     0.0000
                                                    R-squared         =     0.9695
                                                    Adj R-squared     =     0.9659
                                                    Root MSE          =     0.0993
    
                                  (Std. Err. adjusted for 28 clusters in province)
    ------------------------------------------------------------------------------
                 |               Robust
           ltvfo |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
           ltlan |   .5833594   .1800436     3.24   0.003     .2139404    .9527783
          ltwlab |   .1514909   .0603407     2.51   0.018      .027682    .2752998
           ltpow |   .0971114   .0937543     1.04   0.309    -.0952565    .2894792
           ltfer |   .1693346   .0451799     3.75   0.001     .0766331    .2620362
             hrs |   .1503752   .0605958     2.48   0.020      .026043    .2747075
             mci |   .1978373   .0835939     2.37   0.025     .0263169    .3693578
            ngca |   .7784081   .4141914     1.88   0.071    -.0714423    1.628259
                 |
            year |
             71  |  -.0240404   .0240968    -1.00   0.327     -.073483    .0254022
             72  |  -.1323624   .0417494    -3.17   0.004    -.2180251   -.0466998
             73  |  -.0377336   .0369076    -1.02   0.316    -.1134616    .0379945
             74  |   .0058554   .0516436     0.11   0.911    -.1001086    .1118193
             75  |   .0096731   .0584628     0.17   0.870    -.1102827     .129629
             76  |  -.0476465   .0633441    -0.75   0.458    -.1776178    .0823249
             77  |  -.0869336   .0701864    -1.24   0.226    -.2309442     .057077
             78  |  -.0325205   .0790398    -0.41   0.684    -.1946968    .1296559
             79  |  -.0076332   .0859529    -0.09   0.930    -.1839939    .1687275
             81  |   -.093479   .1127818    -0.83   0.414     -.324888    .1379301
             82  |  -.0447862   .1245167    -0.36   0.722    -.3002733     .210701
             83  |  -.0309435    .142028    -0.22   0.829    -.3223608    .2604739
             84  |   .0442535    .147345     0.30   0.766    -.2580735    .3465804
             85  |  -.0033372   .1610037    -0.02   0.984    -.3336895    .3270151
             86  |     .00484   .1629333     0.03   0.977    -.3294716    .3391516
             87  |   .0386475   .1690888     0.23   0.821    -.3082941    .3855891
                 |
           _cons |   2.651286   .7981036     3.32   0.003     1.013713    4.288859
    ------------------------------------------------------------------------------

    Note: if Stata reports "matsize too small", raise the limit:

    set matsize 5000

    reghdfe

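    reghdfe is mainly used for linear regression with multi-dimensional fixed effects. Sometimes we need to control for fixed effects in several dimensions (such as city, industry, and year). xtreg and similar commands can do this, but they run slowly; reghdfe addresses exactly this pain point and is far faster. reghdfe is a user-written command by Sergio Correia; see the author's GitHub page for more details. It must be installed before use (ssc install reghdfe).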
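
    A possible installation sequence is sketched below. It assumes an internet connection and that the versions hosted on SSC are acceptable; ftools is included because reghdfe depends on it.

    ```stata
    * Install reghdfe and its dependency ftools from SSC
    ssc install ftools, replace
    ssc install reghdfe, replace

    * Confirm that Stata can find the command and show which version is installed
    which reghdfe
    ```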

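    reghdfe handles multiple dimensions of fixed effects directly: list them, separated by spaces, in absorb(var1 var2 var3 ...). There is no need to enter dummies via i.var, which makes it much more convenient than xtreg and similar commands, and it does not print a long list of dummy-variable estimates.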

    . reghdfe ltvfo ltlan ltwlab ltpow ltfer hrs mci ngca, absorb(year province) vce(cluster province)
    (MWFE estimator converged in 2 iterations)
    
    HDFE Linear regression                            Number of obs   =        476
    Absorbing 2 HDFE groups                           F(   7,     27) =     229.56
    Statistics robust to heteroskedasticity           Prob > F        =     0.0000
                                                      R-squared       =     0.9695
                                                      Adj R-squared   =     0.9658
                                                      Within R-sq.    =     0.6751
    Number of clusters (province) =         28        Root MSE        =     0.0994
    
                                  (Std. Err. adjusted for 28 clusters in province)
    ------------------------------------------------------------------------------
                 |               Robust
           ltvfo |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
           ltlan |   .5833594   .1745834     3.34   0.002     .2251439    .9415749
          ltwlab |   .1514909   .0585107     2.59   0.015     .0314368     .271545
           ltpow |   .0971114    .090911     1.07   0.295    -.0894225    .2836453
           ltfer |   .1693346   .0438098     3.87   0.001     .0794444    .2592248
             hrs |   .1503752   .0587581     2.56   0.016     .0298136    .2709368
             mci |   .1978373   .0810587     2.44   0.022     .0315186     .364156
            ngca |   .7784081   .4016301     1.94   0.063    -.0456688    1.602485
           _cons |   2.625513   .7307092     3.59   0.001     1.126221    4.124804
    ------------------------------------------------------------------------------
    
    Absorbed degrees of freedom:
    -----------------------------------------------------+
     Absorbed FE | Categories  - Redundant  = Num. Coefs |
    -------------+---------------------------------------|
            year |        17           0          17     |
        province |        28          28           0    *|
    -----------------------------------------------------+
    * = FE nested within cluster; treated as redundant for DoF computation
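
    reghdfe y x, absorb(id year industry) controls for fixed effects in several dimensions.
    
    reghdfe y x, absorb(year#industry) controls for interacted (year-by-industry) fixed effects.
    
    reghdfe can also cluster the standard errors in the same command, through the vce(cluster ...) option.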
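
    As a sketch of the clustering options, the lines below reuse the placeholder names from the examples above (y, x, id, and year are hypothetical variables, not part of lin_1992.dta). reghdfe takes the cluster variables in vce(cluster ...), and listing two variables requests two-way clustering.

    ```stata
    * One-way clustering on the individual identifier
    reghdfe y x, absorb(id year) vce(cluster id)

    * Two-way clustering on individual and year
    reghdfe y x, absorb(id year) vce(cluster id year)
    ```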

    Summary

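    The regression output above shows that xtreg, reg, areg, and reghdfe produce identical slope coefficients, because the within estimator and LSDV are algebraically equivalent (the reported constant differs because each command normalizes the fixed effects differently). Only the standard errors differ slightly. xtreg and reghdfe report the same clustered standard errors, and reg and areg report the same, slightly larger ones. The gap comes from the degrees-of-freedom adjustment: reg and areg count the province dummies as estimated parameters, whereas xtreg, fe and reghdfe do not when the fixed effects are nested within the clusters, as the footnote in the reghdfe output notes.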
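
    One way to check this summary is to store the four sets of estimates and print them side by side. The sketch below assumes lin_1992.dta is in the working directory and reghdfe is installed; the stored-result names fe, lsdv, ar, and hdfe are arbitrary labels chosen for illustration.

    ```stata
    * Load the data and declare the panel structure
    use lin_1992.dta, clear
    xtset province year

    * Regressors shared by all four specifications
    local x ltlan ltwlab ltpow ltfer hrs mci ngca

    quietly xtreg ltvfo `x' i.year, fe vce(cluster province)
    estimates store fe
    quietly reg ltvfo `x' i.province i.year, vce(cluster province)
    estimates store lsdv
    quietly areg ltvfo `x' i.year, absorb(province) vce(cluster province)
    estimates store ar
    quietly reghdfe ltvfo `x', absorb(year province) vce(cluster province)
    estimates store hdfe

    * Coefficients and standard errors side by side, slopes only
    estimates table fe lsdv ar hdfe, keep(`x') b(%9.6f) se(%9.6f)

    * Ratio implied by the two degrees-of-freedom conventions:
    * 476 observations; 24 coefficients in xtreg; 24 + 27 province dummies in reg/areg
    display sqrt((476 - 24)/(476 - 51))
    ```

    The last line evaluates to about 1.031, which is the ratio between the areg and xtreg standard errors printed above (for ltlan, .1800436 / .1745834). The counts are read off the output: 476 observations, 24 coefficients in xtreg (7 regressors, 16 year dummies, and the constant), and 27 further province dummies that reg and areg count as estimated parameters.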

  • Original post: https://www.cnblogs.com/celine227/p/14903449.html