  • Final Exam Review: Applied Regression Analysis

    \[ y=\left[\begin{array}{c} y_{1} \\ y_{2} \\ \vdots \\ y_{n} \end{array}\right], \quad X=\left[\begin{array}{cccc} 1 & x_{11} & \cdots & x_{1 p} \\ 1 & x_{21} & \cdots & x_{2 p} \\ \vdots & \vdots & & \vdots \\ 1 & x_{n 1} & \cdots & x_{n p} \end{array}\right], \quad \varepsilon=\left[\begin{array}{c} \varepsilon_{1} \\ \varepsilon_{2} \\ \vdots \\ \varepsilon_{n} \end{array}\right], \quad \beta=\left[\begin{array}{c} \beta_{0} \\ \beta_{1} \\ \vdots \\ \beta_{p} \end{array}\right] \]

    \[ \boldsymbol{y}=\boldsymbol{X} \boldsymbol{\beta}+\varepsilon \]

    \[ \boldsymbol{X}=\left(\mathbf{1}, \boldsymbol{x}_{1}, \ldots, \boldsymbol{x}_{p}\right) \quad \text{is an } n \times(p+1) \text{ matrix} \]

    \[ \varepsilon=\left(\varepsilon_{1}, \ldots, \varepsilon_{n}\right)^{\prime} \]

    Gauss-Markov conditions:

    \[ \left\{\begin{array}{l} E\left(\varepsilon_{i}\right)=0, \quad i=1, \ldots, n \\ \operatorname{Cov}\left(\varepsilon_{i}, \varepsilon_{j}\right)=0, \quad i \neq j ; \quad \operatorname{Var}\left(\varepsilon_{i}\right)=\sigma^{2} \end{array}\right. \]

    Normality assumption:

    \[ \left\{\begin{array}{l} \varepsilon_{i} \sim N\left(0, \sigma^{2}\right), \quad i=1, \ldots, n \\ \varepsilon_{1}, \ldots, \varepsilon_{n} \quad \text{ mutually independent } \end{array}\right. \]

    Least squares estimation (LSE)

    \[ Q(\boldsymbol{\beta})=(\boldsymbol{y}-\boldsymbol{X} \boldsymbol{\beta})^{\prime}(\boldsymbol{y}-\boldsymbol{X} \boldsymbol{\beta}) \]

    \[ \frac{\partial Q(\boldsymbol{\beta})}{\partial \boldsymbol{\beta}}=-2 \boldsymbol{X}^{\prime}(\boldsymbol{y}-\boldsymbol{X} \boldsymbol{\beta})=0 \]

    Setting the derivative to zero gives the normal equations \(\boldsymbol{X}^{\prime} \boldsymbol{X} \boldsymbol{\beta}=\boldsymbol{X}^{\prime} \boldsymbol{y}\), so when \(\boldsymbol{X}^{\prime} \boldsymbol{X}\) is invertible,

    \[ \hat{\boldsymbol{\beta}}=\left(\boldsymbol{X}^{\prime} \boldsymbol{X}\right)^{-1} \boldsymbol{X}^{\prime} \boldsymbol{y} \]

    \[ \hat{\boldsymbol{y}}=\boldsymbol{X} \hat{\boldsymbol{\beta}}=\boldsymbol{X}\left(\boldsymbol{X}^{\prime} \boldsymbol{X}\right)^{-1} \boldsymbol{X}^{\prime} \boldsymbol{y} \stackrel{\text{def}}{=} \boldsymbol{H} \boldsymbol{y} \]

    The hat matrix \(H\) and \(I_n-H\) are both idempotent:

    \(H=X(X'X)^{-1}X' \rightarrow H^2=X(X'X)^{-1}X' X(X'X)^{-1}X'=X(X'X)^{-1}X'=H\)

    \((I_n-H)^2=I_n^2-2HI_n+H^2=I_n-2H+H=I_n-H\)
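The least squares formulas and the idempotency of \(H\) and \(I_n-H\) can be checked numerically. The following is a minimal NumPy sketch on simulated data; the sample size, the coefficients in `beta_true`, and the noise level are made-up values for illustration and do not come from the notes.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 50, 3  # hypothetical sample size and number of predictors

# Design matrix with a leading column of ones (intercept): n x (p+1)
X = np.column_stack([np.ones(n), rng.normal(size=(n, p))])
beta_true = np.array([1.0, 2.0, -1.0, 0.5])  # made-up coefficients
y = X @ beta_true + rng.normal(scale=0.3, size=n)

# Solve the normal equations X'X beta = X'y instead of forming an explicit inverse
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

# Hat matrix H = X (X'X)^{-1} X' and its complement I_n - H
H = X @ np.linalg.solve(X.T @ X, X.T)
M = np.eye(n) - H
y_hat = H @ y

print("H idempotent:      ", np.allclose(H @ H, H))
print("I - H idempotent:  ", np.allclose(M @ M, M))
print("H y equals X beta^:", np.allclose(y_hat, X @ beta_hat))
```

Forming \(H\) explicitly is only sensible for small \(n\), since it is an \(n \times n\) matrix; for fitting alone, `np.linalg.lstsq` is usually the more stable choice.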

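    Maximum likelihood estimation of the regression coefficients

    \(y \sim N(X\beta,\sigma^2 I_n)\)

    If the normality assumption holds,

    \[ \boldsymbol{y}=\boldsymbol{X} \boldsymbol{\beta}+\varepsilon, \quad \varepsilon \sim N\left(\mathbf{0}, \sigma^{2} \boldsymbol{I}_{n}\right) \]

    then the distribution of \(\boldsymbol{y}\) is \(\boldsymbol{y} \sim N\left(\boldsymbol{X} \boldsymbol{\beta}, \sigma^{2} \boldsymbol{I}_{n}\right)\), and the likelihood function is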

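    \[ L(\boldsymbol{\beta})=\left(2 \pi \sigma^{2}\right)^{-n / 2} \exp \left\{-\frac{1}{2 \sigma^{2}}(\boldsymbol{y}-\boldsymbol{X} \boldsymbol{\beta})^{\prime}(\boldsymbol{y}-\boldsymbol{X} \boldsymbol{\beta})\right\} \]

    Applying the \(-\ln\) transform to the likelihood and denoting the result by \(\ell(\boldsymbol{\beta})\), one can show that the maximum likelihood estimate \(\hat{\beta}_{MLE}\) of \(\boldsymbol{\beta}\) coincides with the least squares estimate. The maximum likelihood estimate of the error variance \(\sigma^{2}\) is

    \[ \hat{\sigma}_{MLE}^{2}=\frac{1}{n} SSE=\frac{1}{n} \boldsymbol{e}^{\prime} \boldsymbol{e} \]

    where \(\boldsymbol{e}=\boldsymbol{y}-\hat{\boldsymbol{y}}\) is the residual vector.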
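As a sanity check on the claim that the least squares estimate and \(SSE/n\) minimize the negative log-likelihood, the sketch below evaluates \(-\ln L\) at those values and at a few perturbed points. The helper `neg_log_likelihood`, which takes \(\sigma^2\) as an explicit argument, and the simulated data are assumptions made for illustration.

```python
import numpy as np

def neg_log_likelihood(beta, sigma2, X, y):
    """-ln L(beta, sigma^2) for the model y ~ N(X beta, sigma^2 I_n)."""
    n = len(y)
    resid = y - X @ beta
    return 0.5 * n * np.log(2 * np.pi * sigma2) + resid @ resid / (2 * sigma2)

rng = np.random.default_rng(1)
n, p = 40, 2  # hypothetical sizes
X = np.column_stack([np.ones(n), rng.normal(size=(n, p))])
y = X @ np.array([0.5, 1.0, -2.0]) + rng.normal(size=n)  # made-up coefficients

# Least squares estimate and the variance estimate SSE / n
beta_ls = np.linalg.solve(X.T @ X, X.T @ y)
e = y - X @ beta_ls
sigma2_mle = e @ e / n

best = neg_log_likelihood(beta_ls, sigma2_mle, X, y)

# Moving away from (beta_ls, SSE/n) in any of these directions should increase -ln L
worse_beta = neg_log_likelihood(beta_ls + 0.05, sigma2_mle, X, y)
worse_up = neg_log_likelihood(beta_ls, 1.2 * sigma2_mle, X, y)
worse_down = neg_log_likelihood(beta_ls, 0.8 * sigma2_mle, X, y)

print(best < worse_beta, best < worse_up, best < worse_down)
```

This is a spot check at a handful of points, not a proof; the analytic argument sets the partial derivatives of \(\ell\) with respect to \(\boldsymbol{\beta}\) and \(\sigma^2\) to zero.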

    Exercise:

    For the multiple linear regression model \(\mathrm{Y}=\mathrm{X} \beta+\varepsilon, \ \varepsilon \sim \mathrm{N}\left(0, \sigma^{2} \mathrm{I}_{n}\right)\):
    (1) Use maximum likelihood estimation to find \(\widehat{\sigma^{2}}\);
    (2) Find \(\mathrm{E}\left(\widehat{\sigma^{2}}\right)\).

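    (1) If the normality assumption holds,

    \[ \boldsymbol{y}=\boldsymbol{X} \boldsymbol{\beta}+\varepsilon, \quad \varepsilon \sim N\left(\mathbf{0}, \sigma^{2} \boldsymbol{I}_{n}\right) \]

    then the distribution of \(\boldsymbol{y}\) is \(\boldsymbol{y} \sim N\left(\boldsymbol{X} \boldsymbol{\beta}, \sigma^{2} \boldsymbol{I}_{n}\right)\), and the likelihood function is

    \[ L(\boldsymbol{\beta})=\left(2 \pi \sigma^{2}\right)^{-n / 2} \exp \left\{-\frac{1}{2 \sigma^{2}}(\boldsymbol{y}-\boldsymbol{X} \boldsymbol{\beta})^{\prime}(\boldsymbol{y}-\boldsymbol{X} \boldsymbol{\beta})\right\} \]

    Applying the \(-\ln\) transform to the likelihood and denoting the result by \(\ell(\boldsymbol{\beta})\), one can show that the maximum likelihood estimate \(\hat{\beta}_{MLE}\) of \(\boldsymbol{\beta}\) coincides with the least squares estimate:

    \[ \hat{\boldsymbol{\beta}}=\left(\boldsymbol{X}^{\prime} \boldsymbol{X}\right)^{-1} \boldsymbol{X}^{\prime} \boldsymbol{y} \]

    The maximum likelihood estimate of the error variance \(\sigma^{2}\) is

    \[ \widehat{\sigma^{2}}=\frac{SSE}{n} \]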

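    (2) Since \(SSE/(n-p-1)\) is an unbiased estimator of \(\sigma^{2}\),

    \[ E\left(\widehat{\sigma^{2}}\right)=E\left(\frac{SSE}{n}\right)=\frac{n-p-1}{n} E\left(\frac{SSE}{n-p-1}\right)=\frac{n-p-1}{n} \sigma^{2} \]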
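The bias of the maximum likelihood variance estimate can be illustrated by simulation. This sketch relies on two standard facts that the notes do not spell out: the residual vector is \((I_n-H)\varepsilon\), and \(\operatorname{tr}(I_n-H)=n-p-1\). The values of `n`, `p`, `sigma2`, and `reps` are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
n, p, sigma2 = 20, 3, 4.0  # hypothetical sizes and true error variance

X = np.column_stack([np.ones(n), rng.normal(size=(n, p))])
H = X @ np.linalg.solve(X.T @ X, X.T)
M = np.eye(n) - H  # residual-maker matrix, e = M y = M eps

# tr(I - H) = n - p - 1, which is why E(SSE) = sigma^2 (n - p - 1)
trace_M = np.trace(M)
print("trace(I - H):", trace_M, "expected:", n - p - 1)

# Monte Carlo: draw many error vectors and compute SSE for each one
reps = 20000
eps = rng.normal(scale=np.sqrt(sigma2), size=(reps, n))
resid = eps @ M  # each row is (I - H) eps, using the symmetry of M
sse = np.sum(resid ** 2, axis=1)

sigma2_mle = sse / n                # biased MLE
sigma2_unbiased = sse / (n - p - 1)  # unbiased estimator

print("mean of SSE/n:      ", sigma2_mle.mean(), "theory:", (n - p - 1) / n * sigma2)
print("mean of SSE/(n-p-1):", sigma2_unbiased.mean(), "theory:", sigma2)
```

The simulated means only approximate the theoretical values, with an error that shrinks as `reps` grows.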

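    Partial coefficient of determination

    \[ \begin{array}{l} \text{ Model1: } y_{i}=\beta_{0}+\beta_{1} x_{1 i}+\beta_{2} x_{2 i}+\varepsilon_{i}, \quad i=1, \ldots, n \\ \text{ Model0: } y_{i}=\beta_{0}+\beta_{2} x_{2 i}+\varepsilon_{i}, \quad i=1, \ldots, n \end{array} \]

    \[ r_{y 1 ; 2}^{2}=\frac{\operatorname{SSE}\left(x_{2}\right)-\operatorname{SSE}\left(x_{1}, x_{2}\right)}{\operatorname{SSE}\left(x_{2}\right)} \stackrel{?}{=} \frac{\operatorname{SSR}\left(x_{1}, x_{2}\right)-\operatorname{SSR}\left(x_{2}\right)}{\operatorname{SSE}\left(x_{2}\right)} \]

    The equality marked with a question mark does hold: both models have the same total sum of squares \(SST=SSR+SSE\), so \(\operatorname{SSE}\left(x_{2}\right)-\operatorname{SSE}\left(x_{1}, x_{2}\right)=\operatorname{SSR}\left(x_{1}, x_{2}\right)-\operatorname{SSR}\left(x_{2}\right)\).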

    When the model already contains \(x_{1}, \ldots, x_{j-1}, x_{j+1}, \ldots, x_{p}\), the partial coefficient of determination between \(y\) and \(x_{j}\) is:

    \[ \begin{aligned} r_{y j ;-j}^{2} &=\frac{\operatorname{SSE}\left(x_{1}, \ldots, x_{j-1}, x_{j+1}, \ldots, x_{p}\right)-\operatorname{SSE}\left(x_{1}, \ldots, x_{p}\right)}{\operatorname{SSE}\left(x_{1}, \ldots, x_{j-1}, x_{j+1}, \ldots, x_{p}\right)} \\ &=\frac{SSR-\operatorname{SSR}(-j)}{\operatorname{SSE}(-j)} \end{aligned} \]
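The two expressions for the partial coefficient of determination (one via SSE, one via SSR) can be compared numerically for the two-predictor case. The sketch below fits Model1 and Model0 on simulated data; the helper `fit_sums` and the data-generating coefficients are hypothetical.

```python
import numpy as np

def fit_sums(X, y):
    """Return (SSE, SSR) for an OLS fit of y on X; X must include the intercept column."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    fitted = X @ beta
    sse = np.sum((y - fitted) ** 2)
    ssr = np.sum((fitted - y.mean()) ** 2)
    return sse, ssr

rng = np.random.default_rng(3)
n = 60  # hypothetical sample size
x1 = rng.normal(size=n)
x2 = 0.5 * x1 + rng.normal(size=n)  # x2 correlated with x1
y = 1.0 + 2.0 * x1 - 1.0 * x2 + rng.normal(size=n)

ones = np.ones(n)
sse_full, ssr_full = fit_sums(np.column_stack([ones, x1, x2]), y)  # Model1
sse_red, ssr_red = fit_sums(np.column_stack([ones, x2]), y)        # Model0

# Partial R^2 of y on x1 given x2, computed both ways
r2_via_sse = (sse_red - sse_full) / sse_red
r2_via_ssr = (ssr_full - ssr_red) / sse_red

print(r2_via_sse, r2_via_ssr)
```

The two values agree up to floating-point error because both fits include an intercept, which is what makes \(SST=SSR+SSE\) hold for each model.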

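    Partial correlation coefficient:

    The simple correlation coefficient between \(x_{j}\) and \(x_{k}\) is

    \[ r_{j k}=\frac{S_{j k}}{\sqrt{S_{j j} S_{k k}}} \]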

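    \[ r_{12 ; 3, \ldots, p}=\frac{-\Delta_{12}}{\sqrt{\Delta_{11} \Delta_{22}}} \]

    where \(\Delta_{i j}\) is the cofactor of \(r_{i j}\) in the determinant of the correlation matrix of \(x_{1}, \ldots, x_{p}\).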

    \[ r_{12 ; 3}=\frac{r_{12}-r_{13} r_{23}}{\sqrt{\left(1-r_{13}^{2}\right)\left(1-r_{23}^{2}\right)}} \]
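For three variables, the closed-form expression for \(r_{12;3}\) and the cofactor expression should give the same number. The sketch below computes both and, as a third check, the correlation between the residuals of \(x_1\) and \(x_2\) after regressing each on \(x_3\). It assumes that \(\Delta_{ij}\) is the cofactor of the \((i,j)\) entry of the correlation matrix; the helper `cofactor` and the simulated data are hypothetical.

```python
import numpy as np

def cofactor(R, i, j):
    """Cofactor of entry (i, j) of the square matrix R (0-based indices)."""
    minor = np.delete(np.delete(R, i, axis=0), j, axis=1)
    return (-1) ** (i + j) * np.linalg.det(minor)

rng = np.random.default_rng(4)
n = 200  # hypothetical sample size
x3 = rng.normal(size=n)
x1 = 0.8 * x3 + rng.normal(size=n)
x2 = -0.6 * x3 + 0.3 * x1 + rng.normal(size=n)

# Correlation matrix of (x1, x2, x3); each row of the stacked array is one variable
R = np.corrcoef(np.vstack([x1, x2, x3]))
r12, r13, r23 = R[0, 1], R[0, 2], R[1, 2]

# Closed-form expression for r_{12;3}
by_formula = (r12 - r13 * r23) / np.sqrt((1 - r13 ** 2) * (1 - r23 ** 2))

# Cofactor expression -Delta_12 / sqrt(Delta_11 * Delta_22)
by_cofactor = -cofactor(R, 0, 1) / np.sqrt(cofactor(R, 0, 0) * cofactor(R, 1, 1))

# Correlation of the residuals after removing the linear effect of x3
Z = np.column_stack([np.ones(n), x3])
res1 = x1 - Z @ np.linalg.lstsq(Z, x1, rcond=None)[0]
res2 = x2 - Z @ np.linalg.lstsq(Z, x2, rcond=None)[0]
by_residuals = np.corrcoef(res1, res2)[0, 1]

print(by_formula, by_cofactor, by_residuals)
```

The residual-based value is the usual interpretation of a partial correlation: the correlation that remains between \(x_1\) and \(x_2\) once the linear influence of \(x_3\) has been removed from both.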

  • Original article: https://www.cnblogs.com/zonghanli/p/14247102.html