Final Exam Review: Applied Regression Analysis
\[y=\left[\begin{array}{c}
y_{1} \\
y_{2} \\
\vdots \\
y_{n}
\end{array}\right], \quad X=\left[\begin{array}{cccc}
1 & x_{11} & \cdots & x_{1 p} \\
1 & x_{21} & \cdots & x_{2 p} \\
\vdots & \vdots & & \vdots \\
1 & x_{n 1} & \cdots & x_{n p}
\end{array}\right], \quad \epsilon=\left[\begin{array}{c}
\epsilon_{1} \\
\epsilon_{2} \\
\vdots \\
\epsilon_{n}
\end{array}\right], \quad \beta=\left[\begin{array}{c}
\beta_{0} \\
\beta_{1} \\
\vdots \\
\beta_{p}
\end{array}\right]
\]
\[\boldsymbol{y}=\boldsymbol{X} \boldsymbol{\beta}+\varepsilon
\]
\[\boldsymbol{X}=\left(\mathbf{1}, \boldsymbol{x}_{1}, \ldots, \boldsymbol{x}_{p}\right), \quad \text { which is } n \times(p+1)
\]
\[\varepsilon=\left(\varepsilon_{1}, \ldots, \varepsilon_{n}\right)^{\prime}
\]
Gauss-Markov conditions:
\[\left\{\begin{array}{l}
E\left(\varepsilon_{i}\right)=0, \quad i=1, \ldots, n \\
\operatorname{Cov}\left(\varepsilon_{i}, \varepsilon_{j}\right)=0, \quad i \neq j ; \quad \operatorname{Var}\left(\varepsilon_{i}\right)=\sigma^{2}
\end{array}\right.
\]
Normality assumption:
\[\left\{\begin{array}{l}
\varepsilon_{i} \sim N\left(0, \sigma^{2}\right), \quad i=1, \ldots, n \\
\varepsilon_{1}, \ldots, \varepsilon_{n} \quad \text { mutually independent }
\end{array}\right.
\]
Least squares estimate (LSE)
\[Q(\boldsymbol{\beta})=(\boldsymbol{y}-\boldsymbol{X} \boldsymbol{\beta})^{\prime}(\boldsymbol{y}-\boldsymbol{X} \boldsymbol{\beta})
\]
\[\frac{\partial Q(\boldsymbol{\beta})}{\partial \boldsymbol{\beta}}=-\boldsymbol{X}^{\prime} 2(\boldsymbol{y}-\boldsymbol{X} \boldsymbol{\beta})=-2 \boldsymbol{X}^{\prime}(\boldsymbol{y}-\boldsymbol{X} \boldsymbol{\beta})=0
\]
\[\hat{\boldsymbol{\beta}}=\left(\boldsymbol{X}^{\prime} \boldsymbol{X}\right)^{-1} \boldsymbol{X}^{\prime} \boldsymbol{y}
\]
\[\hat{\boldsymbol{y}}=\boldsymbol{X} \hat{\boldsymbol{\beta}}=\boldsymbol{X}\left(\boldsymbol{X}^{\prime} \boldsymbol{X}\right)^{-1} \boldsymbol{X}^{\prime} \boldsymbol{y} \stackrel{\text {def}}{=} \boldsymbol{H} \boldsymbol{y}
\]
\(H=X(X'X)^{-1}X' \rightarrow H^2=X(X'X)^{-1}X' \cdot X(X'X)^{-1}X'=X(X'X)^{-1}X'=H\)
\((I_n-H)^2=I_n^2-2HI_n+H^2=I_n-H\)
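The normal equations and the idempotency of \(H\) and \(I_n-H\) can be checked numerically. The following is a minimal sketch in Python with NumPy; the data, the sample size, and the coefficient values are made up for illustration and are not part of the notes.

```python
import numpy as np

rng = np.random.default_rng(1)  # fixed seed, illustrative data only
n, p = 20, 2                    # hypothetical sample size and number of predictors

# Design matrix X is n x (p+1); its first column is the vector of ones.
X = np.column_stack([np.ones(n), rng.normal(size=(n, p))])
y = X @ np.array([1.0, 0.5, -2.0]) + rng.normal(size=n)  # hypothetical coefficients

# Normal equations: (X'X) beta_hat = X'y
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

# Hat matrix H = X (X'X)^{-1} X' and fitted values y_hat = H y
H = X @ np.linalg.solve(X.T @ X, X.T)
y_hat = H @ y
M = np.eye(n) - H  # I_n - H

print(np.allclose(y_hat, X @ beta_hat))  # H y agrees with X beta_hat
print(np.allclose(H @ H, H))             # H is idempotent
print(np.allclose(M @ M, M))             # I_n - H is idempotent
```

Solving the linear system with `np.linalg.solve` avoids forming \((X'X)^{-1}\) explicitly, which is usually the numerically safer choice.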
Maximum likelihood estimation of the regression coefficients
\(y \sim N(X\beta,\sigma^2 I_n)\)
If the normality assumption holds,
\[\boldsymbol{y}=\boldsymbol{X} \boldsymbol{\beta}+\varepsilon, \quad \varepsilon \sim N\left(\mathbf{0}, \sigma^{2} \boldsymbol{I}_{n}\right)
\]
then \(\boldsymbol{y}\) has the distribution \(\boldsymbol{y} \sim N\left(\boldsymbol{X} \boldsymbol{\beta}, \sigma^{2} \boldsymbol{I}_{n}\right)\), and the likelihood function is
\[L(\boldsymbol{\beta})=\left(2 \pi \sigma^{2}\right)^{-n / 2} \exp \left\{-\frac{1}{2 \sigma^{2}}(\boldsymbol{y}-\boldsymbol{X} \boldsymbol{\beta})^{\prime}(\boldsymbol{y}-\boldsymbol{X} \boldsymbol{\beta})\right\}
\]
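Apply the \(-\ln\) transformation to it and denote the result by \(\ell(\boldsymbol{\beta})\). One can show that the maximum likelihood estimate \(\hat{\beta}_{MLE}\) of \(\boldsymbol{\beta}\) is equivalent to the least squares estimate. The maximum likelihood estimate of the error variance \(\sigma^{2}\) is
\[\hat{\sigma}_{MLE}^{2}=\frac{1}{n} SSE=\frac{1}{n} \boldsymbol{e}^{\prime} \boldsymbol{e}
\]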
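The step from the likelihood to these two estimates can be sketched as follows. The notation \(\ell\left(\boldsymbol{\beta}, \sigma^{2}\right)\), which makes the dependence on \(\sigma^{2}\) explicit, and the reading of \(\boldsymbol{e}\) as the residual vector \(\boldsymbol{y}-\boldsymbol{X} \hat{\boldsymbol{\beta}}\) are assumptions made here for the derivation.
\[\ell\left(\boldsymbol{\beta}, \sigma^{2}\right)=-\ln L=\frac{n}{2} \ln \left(2 \pi \sigma^{2}\right)+\frac{1}{2 \sigma^{2}}(\boldsymbol{y}-\boldsymbol{X} \boldsymbol{\beta})^{\prime}(\boldsymbol{y}-\boldsymbol{X} \boldsymbol{\beta})
\]
For any fixed \(\sigma^{2}\), minimizing \(\ell\) over \(\boldsymbol{\beta}\) is the same as minimizing \((\boldsymbol{y}-\boldsymbol{X} \boldsymbol{\beta})^{\prime}(\boldsymbol{y}-\boldsymbol{X} \boldsymbol{\beta})\), which is the least squares criterion, so \(\hat{\boldsymbol{\beta}}_{MLE}=\left(\boldsymbol{X}^{\prime} \boldsymbol{X}\right)^{-1} \boldsymbol{X}^{\prime} \boldsymbol{y}\). Differentiating with respect to \(\sigma^{2}\) and substituting \(\hat{\boldsymbol{\beta}}\) gives
\[\frac{\partial \ell}{\partial \sigma^{2}}=\frac{n}{2 \sigma^{2}}-\frac{1}{2 \sigma^{4}}(\boldsymbol{y}-\boldsymbol{X} \hat{\boldsymbol{\beta}})^{\prime}(\boldsymbol{y}-\boldsymbol{X} \hat{\boldsymbol{\beta}})=0 \quad \Rightarrow \quad \hat{\sigma}_{MLE}^{2}=\frac{1}{n}(\boldsymbol{y}-\boldsymbol{X} \hat{\boldsymbol{\beta}})^{\prime}(\boldsymbol{y}-\boldsymbol{X} \hat{\boldsymbol{\beta}})=\frac{1}{n} \boldsymbol{e}^{\prime} \boldsymbol{e}
\]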
Exercise:
For the multiple linear regression model \(\mathrm{Y}=\mathrm{X} \beta+\varepsilon, \ \varepsilon \sim \mathrm{N}\left(0, \sigma^{2} \mathrm{I}_{n}\right)\):
(1) Use maximum likelihood estimation to find \(\widehat{\sigma^{2}}\);
(2) Find \(\mathrm{E}\left(\widehat{\sigma^{2}}\right)\).
(1) If the normality assumption holds,
\[\boldsymbol{y}=\boldsymbol{X} \boldsymbol{\beta}+\varepsilon, \quad \varepsilon \sim N\left(\mathbf{0}, \sigma^{2} \boldsymbol{I}_{n}\right)
\]
then \(\boldsymbol{y}\) has the distribution \(\boldsymbol{y} \sim N\left(\boldsymbol{X} \boldsymbol{\beta}, \sigma^{2} \boldsymbol{I}_{n}\right)\), and the likelihood function is
\[L(\boldsymbol{\beta})=\left(2 \pi \sigma^{2}\right)^{-n / 2} \exp \left\{-\frac{1}{2 \sigma^{2}}(\boldsymbol{y}-\boldsymbol{X} \boldsymbol{\beta})^{\prime}(\boldsymbol{y}-\boldsymbol{X} \boldsymbol{\beta})\right\}
\]
Apply the \(-\ln\) transformation to it and denote the result by \(\ell(\boldsymbol{\beta})\). One can show that the maximum likelihood estimate \(\hat{\beta}_{MLE}\) of \(\boldsymbol{\beta}\) is equivalent to the least squares estimate:
\[\hat{\boldsymbol{\beta}}=\left(\boldsymbol{X}^{\prime} \boldsymbol{X}\right)^{-1} \boldsymbol{X}^{\prime} \boldsymbol{y}
\]
The maximum likelihood estimate of the error variance \(\sigma^{2}\) is
\[\widehat{\sigma^{2}}=\frac{SSE}{n}
\]
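(2) Since \(\frac{SSE}{n-p-1}\) is an unbiased estimate of \(\sigma^{2}\),
\[E\left(\widehat{\sigma^{2}}\right)=E\left(\frac{SSE}{n}\right)=\left(\frac{n-p-1}{n}\right) E\left(\frac{SSE}{n-p-1}\right)=\frac{n-p-1}{n} \sigma^{2}
\]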
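The bias factor \((n-p-1)/n\) comes from \(E(SSE)=\sigma^{2} \operatorname{tr}(I_n-H)=\sigma^{2}(n-p-1)\). The sketch below checks the trace identity exactly and the expectation approximately by simulation. The sample size, true variance, coefficients, and number of simulated draws are hypothetical values chosen only for illustration.

```python
import numpy as np

def hat_matrix(X):
    """Return H = X (X'X)^{-1} X' for a full-column-rank design matrix X."""
    return X @ np.linalg.solve(X.T @ X, X.T)

def mle_sigma2(X, y):
    """MLE of the error variance: SSE / n, with SSE = e'e and e = (I - H) y."""
    n = X.shape[0]
    e = y - hat_matrix(X) @ y
    return float(e @ e) / n

rng = np.random.default_rng(0)          # fixed seed, illustrative only
n, p, sigma2 = 30, 3, 4.0               # hypothetical sizes and true variance
X = np.column_stack([np.ones(n), rng.normal(size=(n, p))])
beta = np.array([1.0, 2.0, -1.0, 0.5])  # hypothetical true coefficients

# Exact part: tr(I - H) = n - p - 1, so E(SSE) = sigma^2 (n - p - 1).
trace_I_minus_H = float(np.trace(np.eye(n) - hat_matrix(X)))

# Simulated part: average the MLE over many independent error draws.
draws = [
    mle_sigma2(X, X @ beta + rng.normal(scale=np.sqrt(sigma2), size=n))
    for _ in range(5000)
]
mc_mean = float(np.mean(draws))
theory = (n - p - 1) / n * sigma2

print(trace_I_minus_H, n - p - 1)  # the two values should agree
print(mc_mean, theory)             # simulated mean should be close to theory
```

The simulated mean only approximates the expectation, so it is compared to the theoretical value with a tolerance rather than for equality.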
Coefficient of partial determination
\[\begin{array}{l}
\text { Model 1: } y_{i}=\beta_{0}+\beta_{1} x_{1 i}+\beta_{2} x_{2 i}+\varepsilon_{i}, \quad i=1, \ldots, n \\
\text { Model 0: } y_{i}=\beta_{0}+\beta_{2} x_{2 i}+\varepsilon_{i}, \quad i=1, \ldots, n
\end{array}
\]
\[r_{y 1 ; 2}^{2}=\frac{\operatorname{SSE}\left(x_{2}\right)-\operatorname{SSE}\left(x_{1}, x_{2}\right)}{\operatorname{SSE}\left(x_{2}\right)} \stackrel{?}{=} \frac{\operatorname{SSR}\left(x_{1}, x_{2}\right)-\operatorname{SSR}\left(x_{2}\right)}{\operatorname{SSE}\left(x_{2}\right)}
\]
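The equality marked with a question mark does hold. A short justification, assuming the usual total sum of squares \(SST=\sum_{i=1}^{n}\left(y_{i}-\bar{y}\right)^{2}\) (a symbol not used elsewhere in these notes): both models contain an intercept, so each one decomposes the same \(SST\),
\[SST=\operatorname{SSR}\left(x_{2}\right)+\operatorname{SSE}\left(x_{2}\right)=\operatorname{SSR}\left(x_{1}, x_{2}\right)+\operatorname{SSE}\left(x_{1}, x_{2}\right)
\]
and rearranging gives
\[\operatorname{SSE}\left(x_{2}\right)-\operatorname{SSE}\left(x_{1}, x_{2}\right)=\operatorname{SSR}\left(x_{1}, x_{2}\right)-\operatorname{SSR}\left(x_{2}\right)
\]
so the two numerators are equal and the two fractions agree.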
When the model already contains \(x_{1}, \ldots, x_{j-1}, x_{j+1}, \ldots, x_{p}\), the coefficient of partial determination between \(y\) and \(x_{j}\) is:
\[\begin{aligned}
r_{y j ;-j}^{2} &=\frac{\operatorname{SSE}\left(x_{1}, \ldots, x_{j-1}, x_{j+1}, \ldots, x_{p}\right)-\operatorname{SSE}\left(x_{1}, \ldots, x_{p}\right)}{\operatorname{SSE}\left(x_{1}, \ldots, x_{j-1}, x_{j+1}, \ldots, x_{p}\right)} \\
&=\frac{SSR-\operatorname{SSR}(-j)}{\operatorname{SSE}(-j)}
\end{aligned}
\]
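Both forms of the coefficient of partial determination can be computed by fitting the full model and the model without \(x_j\). The following is a minimal sketch on simulated data; the helper names `fit_ss` and `partial_r2` and all numeric values are hypothetical, introduced only for this example.

```python
import numpy as np

def fit_ss(X, y):
    """OLS fit of y on X (X already contains the column of ones); return (SSE, SSR)."""
    beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    y_hat = X @ beta_hat
    sse = float(np.sum((y - y_hat) ** 2))
    ssr = float(np.sum((y_hat - y.mean()) ** 2))
    return sse, ssr

def partial_r2(X, y, j):
    """Partial determination of y and column j of X, given all other columns.

    X is n x (p+1) with a leading column of ones; j >= 1 indexes a predictor.
    """
    sse_reduced, _ = fit_ss(np.delete(X, j, axis=1), y)  # SSE(-j)
    sse_full, _ = fit_ss(X, y)                           # SSE(x_1, ..., x_p)
    return (sse_reduced - sse_full) / sse_reduced

rng = np.random.default_rng(2)  # fixed seed, illustrative data only
n = 50
x1, x2 = rng.normal(size=n), rng.normal(size=n)
y = 1.0 + 2.0 * x1 - 1.0 * x2 + rng.normal(size=n)
X = np.column_stack([np.ones(n), x1, x2])

# Form 1: difference of residual sums of squares.
r2_via_sse = partial_r2(X, y, 1)

# Form 2: (SSR - SSR(-j)) / SSE(-j), with SSR computed directly from fitted values.
sse_reduced, ssr_reduced = fit_ss(X[:, [0, 2]], y)
_, ssr_full = fit_ss(X, y)
r2_via_ssr = (ssr_full - ssr_reduced) / sse_reduced

print(r2_via_sse, r2_via_ssr)  # the two forms should agree up to rounding
```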
Partial correlation coefficient:
\[r_{j k}=\frac{S_{j k}}{\sqrt{S_{j j} S_{k k}}}
\]
\[r_{12 ; 3, \ldots, p}=\frac{-\Delta_{12}}{\sqrt{\Delta_{11} \Delta_{22}}}
\]
\[r_{12 ; 3}=\frac{r_{12}-r_{13} r_{23}}{\sqrt{\left(1-r_{13}^{2}\right)\left(1-r_{23}^{2}\right)}}
\]
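For three variables the cofactor formula and the explicit formula for \(r_{12 ; 3}\) give the same number. The sketch below checks this, reading \(\Delta_{i j}\) as the cofactor of element \((i, j)\) of the simple correlation matrix; that reading is the usual one but is an assumption here, since the notes do not define \(\Delta_{i j}\). The data and function names are hypothetical.

```python
import numpy as np

def cofactor(R, i, j):
    """Cofactor of R at (i, j): signed determinant of R with row i and column j removed."""
    minor = np.delete(np.delete(R, i, axis=0), j, axis=1)
    return (-1) ** (i + j) * np.linalg.det(minor)

def partial_corr_cofactor(R):
    """Partial correlation of the first two variables given all the others."""
    return -cofactor(R, 0, 1) / np.sqrt(cofactor(R, 0, 0) * cofactor(R, 1, 1))

def partial_corr_3(r12, r13, r23):
    """Explicit three-variable formula for r_{12;3}."""
    return (r12 - r13 * r23) / np.sqrt((1 - r13 ** 2) * (1 - r23 ** 2))

rng = np.random.default_rng(3)  # fixed seed, illustrative data only
data = rng.normal(size=(200, 3))
data[:, 0] += 0.8 * data[:, 2]                      # make variable 1 depend on variable 3
data[:, 1] += 0.5 * data[:, 2] + 0.3 * data[:, 0]   # make variable 2 depend on 3 and 1

R = np.corrcoef(data, rowvar=False)  # 3 x 3 simple correlation matrix

by_cofactor = partial_corr_cofactor(R)
by_formula = partial_corr_3(R[0, 1], R[0, 2], R[1, 2])
print(by_cofactor, by_formula)  # the two values should agree up to rounding
```

The indices in the code are 0-based, so `cofactor(R, 0, 1)` corresponds to \(\Delta_{12}\); the sign \((-1)^{i+j}\) is unaffected because shifting both indices by one preserves the parity of their sum.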