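In the previous section we defined a random vector following a multivariate normal distribution in four ways. In this section we turn to the independence of sub-vectors of such a random vector and to its conditional distributions.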
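- Partition the \(p\)-dimensional random vector \(X\sim N_p(\mu,\Sigma)\) as follows (the subscripts \(r\) and \(p-r\) give the dimensions of the sub-vectors):
\[X=
\left[
\begin{array}{c}
X^{(1)}_r\\
X^{(2)}_{p-r}
\end{array}
\right],
\mu=
\left[
\begin{array}{c}
\mu^{(1)}_r\\
\mu^{(2)}_{p-r}
\end{array}
\right],
\Sigma=
\left[
\begin{array}{c|c}
\Sigma_{11} &\Sigma_{12}\\ \hline
\Sigma_{21} &\Sigma_{22}
\end{array}
\right]>0,\quad(\Sigma_{11}\text{ is an }r\times r\text{ square matrix})
\]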
I. Independence
Let the \(p\)-dimensional random vector \(X\sim N_p(\mu,\Sigma)\) be partitioned as
\[X=
\left[
\begin{array}{c}
X^{(1)}\\
X^{(2)}
\end{array}
\right]\sim N_p
\left(
\left[
\begin{array}{c}
\mu^{(1)}\\
\mu^{(2)}
\end{array}
\right],
\left[
\begin{array}{cc}
\Sigma_{11} &\Sigma_{12}\\
\Sigma_{21} &\Sigma_{22}
\end{array}
\right]
\right)
\]
Then
\[X^{(1)}\text{ and }X^{(2)}\text{ are independent} \Leftrightarrow \Sigma_{12}=O
\]
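- This necessary and sufficient condition says that, when a normally distributed random vector is split into two sub-vectors, the two sub-vectors are mutually independent if and only if their cross-covariance matrix \(\Sigma_{12}\) is \(O\). In other words, under joint normality, being uncorrelated is equivalent to being independent.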
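(Proof)
Sufficiency. Suppose \(\Sigma_{12}=O\). Then \(|\Sigma|=|\Sigma_{11}|\cdot|\Sigma_{22}|\), and the joint density of \(X\) is:
\[\begin{align}
f(x^{(1)},x^{(2)})=&
\frac1{(2\pi)^{p/2}|\Sigma|^{1/2}}\exp\left(-\frac12(x-\mu)'
\left[
\begin{array}{cc}
\Sigma_{11}&O\\
O&\Sigma_{22}
\end{array}
\right]^{-1}
(x-\mu)
\right)\\
=&
\frac1{(2\pi)^{r/2}|\Sigma_{11}|^{1/2}}\exp\left(-\frac12(x^{(1)}-\mu^{(1)})'
\Sigma_{11}^{-1}
(x^{(1)}-\mu^{(1)})
\right)\\
&\cdot
\frac1{(2\pi)^{(p-r)/2}|\Sigma_{22}|^{1/2}}\exp\left(-\frac12(x^{(2)}-\mu^{(2)})'
\Sigma_{22}^{-1}
(x^{(2)}-\mu^{(2)})
\right)\\
=&f_1(x^{(1)})\cdot f_2(x^{(2)})
\end{align}
\]
Hence \(X^{(1)}\) and \(X^{(2)}\) are independent.
Necessity. If \(X^{(1)}\) and \(X^{(2)}\) are independent, then \(\Sigma_{12}=\operatorname{Cov}(X^{(1)},X^{(2)})=O\).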
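The factorization used in the proof can be checked numerically. The following is a minimal sketch, not part of the original notes: the helper `mvn_pdf` and the parameter values (\(p=3\), \(r=1\)) are assumptions chosen for illustration, and the density is coded directly from the formula above using NumPy only.

```python
import numpy as np

def mvn_pdf(x, mu, Sigma):
    """Density of N(mu, Sigma) at x, coded directly from the density formula."""
    x = np.asarray(x, dtype=float)
    mu = np.asarray(mu, dtype=float)
    Sigma = np.atleast_2d(np.asarray(Sigma, dtype=float))
    d = x - mu
    k = mu.size
    quad = d @ np.linalg.solve(Sigma, d)  # (x-mu)' Sigma^{-1} (x-mu)
    norm = (2 * np.pi) ** (k / 2) * np.sqrt(np.linalg.det(Sigma))
    return float(np.exp(-0.5 * quad) / norm)

# Hypothetical example: p = 3, r = 1, and Sigma_12 = O.
mu = np.array([1.0, -2.0, 0.5])
Sigma = np.array([[2.0, 0.0, 0.0],
                  [0.0, 1.0, 0.3],
                  [0.0, 0.3, 1.5]])
r = 1
x = np.array([0.7, -1.5, 1.0])

joint = mvn_pdf(x, mu, Sigma)
product = (mvn_pdf(x[:r], mu[:r], Sigma[:r, :r])
           * mvn_pdf(x[r:], mu[r:], Sigma[r:, r:]))
print(joint, product)  # the two values should agree up to rounding
```

Agreement at one point is only a sanity check of the algebra; the proof above is what establishes the identity for every \(x\).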
(Corollaries)
- Let \(r_i\geq1\ (i=1,\dots,k)\) with \(r_1+r_2+\dots+r_k=p\), let \(X^{(i)}\) be \(r_i\)-dimensional, and suppose
\[X=
\left[
\begin{array}{c}
X^{(1)}\\
\vdots\\
X^{(k)}
\end{array}
\right]\sim
N_p
\left(
\left[
\begin{array}{c}
\mu^{(1)}\\
\vdots\\
\mu^{(k)}
\end{array}
\right],
\left[
\begin{array}{ccc}
\Sigma_{11} &\cdots &\Sigma_{1k}\\
\vdots&&\vdots\\
\Sigma_{k1} &\cdots &\Sigma_{kk}
\end{array}
\right]_{p\times p}
\right)
\]
Then \(X^{(1)},X^{(2)},\dots,X^{(k)}\) are mutually independent \(\Leftrightarrow\) \(\Sigma_{ij}=O\ (i\neq j)\).
- Let \(X=(X_1,\dots,X_p)'\sim N_p(\mu,\Sigma)\). If \(\Sigma\) is a diagonal matrix, then \(X_1,\dots,X_p\) are mutually independent.
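The block criterion of the corollaries can be phrased as a small check on a covariance matrix. This is an illustrative sketch only: the function name `off_diagonal_blocks_zero`, its tolerance parameter, and the example matrix are assumptions made here, and the result says something about independence only when the vector is jointly normal.

```python
import numpy as np

def off_diagonal_blocks_zero(Sigma, sizes, tol=1e-12):
    """Return True if every block Sigma_ij (i != j) of the partition is zero.

    `sizes` lists the block dimensions r_1, ..., r_k, which must sum to p.
    """
    Sigma = np.asarray(Sigma, dtype=float)
    if sum(sizes) != Sigma.shape[0]:
        raise ValueError("block sizes must sum to p")
    edges = np.cumsum([0] + list(sizes))  # block boundaries 0, r_1, r_1+r_2, ...
    k = len(sizes)
    for i in range(k):
        for j in range(k):
            if i != j:
                block = Sigma[edges[i]:edges[i + 1], edges[j]:edges[j + 1]]
                if np.any(np.abs(block) > tol):
                    return False
    return True

# Hypothetical 4 x 4 covariance matrix.
Sigma = np.array([[2.0, 0.0, 0.0, 0.0],
                  [0.0, 1.0, 0.3, 0.0],
                  [0.0, 0.3, 1.5, 0.0],
                  [0.0, 0.0, 0.0, 3.0]])

coarse = off_diagonal_blocks_zero(Sigma, [1, 2, 1])   # blocks (X1), (X2, X3), (X4)
fine = off_diagonal_blocks_zero(Sigma, [1, 1, 1, 1])  # every component separately
print(coarse, fine)
```

With the partition \((1,2,1)\) the three sub-vectors are independent, while the fully split partition fails because \(\operatorname{Cov}(X_2,X_3)=0.3\neq0\), so \(\Sigma\) is not diagonal.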
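II. Conditional distribution
For a bivariate normal distribution, the definition of a conditional distribution gives the conditional density of \(X_1\) given \(X_2=x_2\):
\[f_1(x_1|x_2)=\frac{f(x_1,x_2)}{f_2(x_2)}
\]
We do not yet have a closed form for \(f_1(x_1|x_2)\), but we do know the joint density of the bivariate normal distribution:
\[f(x_1,x_2)
=(*)\\
=\frac{1}{2\pi\sigma_1\sigma_2\sqrt{1-\rho^2}}\exp\left\{-\frac{1}{2(1-\rho^2)}\left[\left(\frac{x_1-\mu_1}{\sigma_1}\right)^2-2\rho\left(\frac{x_1-\mu_1}{\sigma_1}\right)\left(\frac{x_2-\mu_2}{\sigma_2}\right)+\left(\frac{x_2-\mu_2}{\sigma_2}\right)^2\right]\right\}
\]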
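A simple manipulation: inside the exponent, add and subtract the same term, \(\left(+\rho^2\left(\frac{x_2-\mu_2}{\sigma_2}\right)^2-\rho^2\left(\frac{x_2-\mu_2}{\sigma_2}\right)^2\right)\), which gives:
\[(*)=\frac{1}{2\pi\sigma_1\sigma_2\sqrt{1-\rho^2}}\exp\left\{-\frac{1}{2(1-\rho^2)}\left[\left(\frac{x_1-\mu_1}{\sigma_1}\right)^2-2\rho\left(\frac{x_1-\mu_1}{\sigma_1}\right)\left(\frac{x_2-\mu_2}{\sigma_2}\right)+(1-\rho^2)\left(\frac{x_2-\mu_2}{\sigma_2}\right)^2+\rho^2\left(\frac{x_2-\mu_2}{\sigma_2}\right)^2\right]\right\}
\]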
Since \(e^{a+b}=e^a\cdot e^b\), we can factor out the term \(\exp\left[-\frac1{2(1-\rho^2)}(1-\rho^2)\left(\frac{x_2-\mu_2}{\sigma_2}\right)^2\right]\):
\[(*)=\frac{1}{\sqrt{2\pi}\sigma_2}\exp\left\{-\frac{1}{2}\left(\frac{x_2-\mu_2}{\sigma_2}\right)^2\right\}\\
\cdot\frac{1}{\sqrt{2\pi}\sigma_1\sqrt{1-\rho^2}}\exp\left\{-\frac{1}{2(1-\rho^2)}\left[\left(\frac{x_1-\mu_1}{\sigma_1}\right)^2-2\rho\left(\frac{x_1-\mu_1}{\sigma_1}\right)\left(\frac{x_2-\mu_2}{\sigma_2}\right)+\rho^2\left(\frac{x_2-\mu_2}{\sigma_2}\right)^2\right]\right\}
\]
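The first factor is exactly the univariate density \(f_2(x_2)\) of \(X_2\sim N(\mu_2,\sigma_2^2)\), and the bracket in the second factor is a perfect square, so:
\[(*)=f_2(x_2)\cdot\frac{1}{\sqrt{2\pi}\sigma_1\sqrt{1-\rho^2}}\cdot \exp\left\{-\frac{1}{2(1-\rho^2)}\left[\left(\frac{x_1-\mu_1}{\sigma_1}\right)-\rho\left(\frac{x_2-\mu_2}{\sigma_2}\right)\right]^2\right\}
\]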
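Since \(k^2\left(a+\frac{b}{k}\right)^2=(ka+b)^2\) (applied here with \(k=\sigma_1\)), the factor \(1/\sigma_1\) can be pulled out of the square, which gives:
\[(*)=f_2(x_2)\cdot\frac{1}{\sqrt{2\pi}\sigma_1\sqrt{1-\rho^2}}\cdot \exp\left\{-\frac{1}{2(1-\rho^2)\sigma_1^2}\left[x_1-\mu_1-\rho\frac{\sigma_1}{\sigma_2}(x_2-\mu_2)\right]^2\right\}
\]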
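We have thus obtained the factorization of the bivariate normal joint density (the multiplication rule for densities):
\[f(x_1,x_2)=f_2(x_2)\cdot f_1(x_1|x_2)
\]
where \(f_1(x_1|x_2)\) is the conditional density of \(X_1\) given \(X_2=x_2\):
\[f_1(x_1|x_2)=\frac{1}{\sqrt{2\pi}\sigma_1\sqrt{1-\rho^2}}\cdot
\exp\left\{
-\frac{1}{2(1-\rho^2)\sigma_1^2}\left[x_1-\left(\mu_1
+\rho\frac{\sigma_1}{\sigma_2}(x_2-\mu_2)
\right)\right]^2
\right\}
\]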
It follows that, given \(X_2=x_2\), \(X_1\) is normally distributed:
\[(X_1|X_2)\sim N_1\left(\mu_1+\rho\frac{\sigma_1}{\sigma_2}(x_2-\mu_2),\ \sigma_1^2(1-\rho^2)
\right)
\]
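As a sanity check of the bivariate result, the conditional mean and variance can be computed and compared with the ratio \(f(x_1,x_2)/f_2(x_2)\) at a single point. This is a sketch under assumed parameter values; the function names are hypothetical and the densities are coded from the formulas in this section.

```python
import numpy as np

def bivariate_conditional(mu1, mu2, sigma1, sigma2, rho, x2):
    """Mean and variance of X1 given X2 = x2 for a bivariate normal."""
    mean = mu1 + rho * sigma1 / sigma2 * (x2 - mu2)
    var = sigma1 ** 2 * (1 - rho ** 2)
    return mean, var

def normal_pdf(x, mean, var):
    """Univariate normal density with the given mean and variance."""
    return np.exp(-0.5 * (x - mean) ** 2 / var) / np.sqrt(2 * np.pi * var)

def bivariate_pdf(x1, x2, mu1, mu2, sigma1, sigma2, rho):
    """Joint density of the bivariate normal distribution."""
    u = (x1 - mu1) / sigma1
    v = (x2 - mu2) / sigma2
    q = (u ** 2 - 2 * rho * u * v + v ** 2) / (1 - rho ** 2)
    return np.exp(-0.5 * q) / (2 * np.pi * sigma1 * sigma2 * np.sqrt(1 - rho ** 2))

# Hypothetical parameter values and evaluation point.
mu1, mu2, sigma1, sigma2, rho = 1.0, 2.0, 2.0, 1.0, 0.5
x1, x2 = 0.3, 3.0

mean, var = bivariate_conditional(mu1, mu2, sigma1, sigma2, rho, x2)
ratio = (bivariate_pdf(x1, x2, mu1, mu2, sigma1, sigma2, rho)
         / normal_pdf(x2, mu2, sigma2 ** 2))  # f(x1, x2) / f2(x2)
conditional = normal_pdf(x1, mean, var)       # N(mean, var) density at x1
print(mean, var, ratio, conditional)
```

For these values the conditional mean is \(1+0.5\cdot\frac{2}{1}\cdot(3-2)=2\) and the conditional variance is \(4\cdot(1-0.25)=3\); note that the variance does not depend on \(x_2\).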
We now generalize this to higher dimensions.
Let
\[X=
\left[
\begin{array}{c}
X^{(1)}_r\\
X^{(2)}_{p-r}
\end{array}
\right]\sim N_p(\mu,\Sigma),\quad(\Sigma>0)
\]
Then, given \(X^{(2)}=x^{(2)}\), the conditional distribution of \(X^{(1)}\) is:
\[(X^{(1)}|X^{(2)})\sim N_r(\mu_{1\cdot2},\Sigma_{11\cdot2})
\]
where
\[\mu_{1\cdot2}=\mu^{(1)}+\Sigma_{12}\Sigma_{22}^{-1}(x^{(2)}-\mu^{(2)})\\
\Sigma_{11\cdot2}=\Sigma_{11}-\Sigma_{12}\Sigma_{22}^{-1}\Sigma_{21}
\]
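The two formulas translate almost line by line into code. The sketch below is an illustration, not a reference implementation: the function name `conditional_mvn` and the example values are assumptions, and linear systems are solved instead of forming \(\Sigma_{22}^{-1}\) explicitly, which is a common numerical choice rather than something the derivation requires.

```python
import numpy as np

def conditional_mvn(mu, Sigma, r, x2):
    """Parameters of (X1 | X2 = x2), where X1 holds the first r components."""
    mu = np.asarray(mu, dtype=float)
    Sigma = np.asarray(Sigma, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    mu1, mu2 = mu[:r], mu[r:]
    S11, S12 = Sigma[:r, :r], Sigma[:r, r:]
    S21, S22 = Sigma[r:, :r], Sigma[r:, r:]
    # mu_{1.2} = mu1 + S12 S22^{-1} (x2 - mu2)
    mu_cond = mu1 + S12 @ np.linalg.solve(S22, x2 - mu2)
    # Sigma_{11.2} = S11 - S12 S22^{-1} S21
    Sigma_cond = S11 - S12 @ np.linalg.solve(S22, S21)
    return mu_cond, Sigma_cond

# Hypothetical bivariate case: sigma1 = 2, sigma2 = 1, rho = 0.5,
# so the covariance is rho * sigma1 * sigma2 = 1.
mu = np.array([1.0, 2.0])
Sigma = np.array([[4.0, 1.0],
                  [1.0, 1.0]])
mu_cond, Sigma_cond = conditional_mvn(mu, Sigma, r=1, x2=[3.0])
print(mu_cond, Sigma_cond)
```

With \(p=2\) and \(r=1\) this reduces to the bivariate formulas: \(\Sigma_{12}\Sigma_{22}^{-1}=\rho\sigma_1/\sigma_2\) and \(\Sigma_{11\cdot2}=\sigma_1^2(1-\rho^2)\), giving mean \(2\) and variance \(3\) here.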
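The proof is given below. It is also quite instructive for solving exercises, and an end-of-chapter exercise from the textbook is attached later: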
(Lemma: block inversion formula for \(\Sigma\))
\[\left[
\begin{array}{c|c}
\Sigma_{11}&\Sigma_{12}\\\hline
\Sigma_{21}&\Sigma_{22}
\end{array}
\right]^{-1}
=\Sigma^{-1}=\left[
\begin{array}{c|c}
\Sigma_{11\cdot2}^{-1}&-\Sigma_{11\cdot2}^{-1}\Sigma_{12}\Sigma_{22}^{-1}\\\hline
-\Sigma_{22}^{-1}\Sigma_{21}\Sigma_{11\cdot2}^{-1}&\Sigma_{22}^{-1}+\Sigma_{22}^{-1}\Sigma_{21}\Sigma_{11\cdot2}^{-1}\Sigma_{12}\Sigma_{22}^{-1}
\end{array}
\right]
\]
where \(\Sigma_{11\cdot2}=\Sigma_{11}-\Sigma_{12}\Sigma_{22}^{-1}\Sigma_{21}\).
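The lemma can be verified numerically on a small example by assembling the right-hand side block by block and comparing it with a direct inverse. This is only a sketch; the \(3\times3\) positive definite matrix and the split \(r=1\) are assumed values chosen for illustration.

```python
import numpy as np

# Hypothetical positive definite covariance matrix (symmetric, diagonally dominant).
Sigma = np.array([[4.0, 1.0, 0.5],
                  [1.0, 3.0, 0.2],
                  [0.5, 0.2, 2.0]])
r = 1
S11, S12 = Sigma[:r, :r], Sigma[:r, r:]
S21, S22 = Sigma[r:, :r], Sigma[r:, r:]

S22_inv = np.linalg.inv(S22)
S11_2 = S11 - S12 @ S22_inv @ S21  # Sigma_{11.2}
S11_2_inv = np.linalg.inv(S11_2)

# Assemble the right-hand side of the lemma.
block_inverse = np.block([
    [S11_2_inv, -S11_2_inv @ S12 @ S22_inv],
    [-S22_inv @ S21 @ S11_2_inv,
     S22_inv + S22_inv @ S21 @ S11_2_inv @ S12 @ S22_inv],
])

direct_inverse = np.linalg.inv(Sigma)
print(np.max(np.abs(block_inverse - direct_inverse)))  # should be near machine precision
```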
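(Proof)
To find the distribution of \((X^{(1)}|X^{(2)})\) it suffices to construct its density. By the definition of the conditional distribution:
\[f(x^{(1)},x^{(2)})=f_1(x^{(1)}|x^{(2)})f_2(x^{(2)})
\]
In the same spirit as the bivariate derivation, where the marginal density of \(X_2\) was separated out, we construct a nonsingular linear transformation:
\[\begin{align}
Z=\left[\begin{array}{c}Z^{(1)}\\Z^{(2)}\end{array}
\right]=&\left[\begin{array}{c}X^{(1)}-\Sigma_{12}\Sigma_{22}^{-1}X^{(2)}\\X^{(2)}\end{array}
\right]\\
=&\left[\begin{array}{c|c}I_r&-\Sigma_{12}\Sigma_{22}^{-1}\\\hline O&I_{p-r}\end{array}
\right]\left[\begin{array}{c}X^{(1)}\\X^{(2)}\end{array}
\right]\\
=&BX
\end{align}
\]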
It follows that \(Z\sim N_p(B\mu,B\Sigma B')\), where (note that \(B'\) has \(-(\Sigma_{12}\Sigma_{22}^{-1})'=-\Sigma_{22}^{-1}\Sigma_{21}\) in its lower-left block):
\[\begin{align}
B\Sigma B'=&\left[\begin{array}{c|c}I_r&-\Sigma_{12}\Sigma_{22}^{-1}\\\hline O&I_{p-r}\end{array}
\right]\left[
\begin{array}{c|c}
\Sigma_{11}&\Sigma_{12}\\\hline
\Sigma_{21}&\Sigma_{22}
\end{array}
\right]\left[\begin{array}{c|c}I_r&O\\\hline -\Sigma_{22}^{-1}\Sigma_{21}&I_{p-r}\end{array}
\right]\\
=&\left[\begin{array}{c|c}\Sigma_{11\cdot2}&O\\\hline O&\Sigma_{22}\end{array}
\right]
\end{align}
\]
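The block-diagonal form of \(B\Sigma B'\) is easy to confirm numerically. The sketch below builds \(B\) for an assumed \(3\times3\) covariance matrix with \(r=1\); the matrix values and variable names are hypothetical and serve only to illustrate the computation.

```python
import numpy as np

# Hypothetical positive definite covariance matrix, split with r = 1.
Sigma = np.array([[4.0, 1.0, 0.5],
                  [1.0, 3.0, 0.2],
                  [0.5, 0.2, 2.0]])
p, r = 3, 1
S11, S12 = Sigma[:r, :r], Sigma[:r, r:]
S21, S22 = Sigma[r:, :r], Sigma[r:, r:]

A = S12 @ np.linalg.inv(S22)  # Sigma_12 Sigma_22^{-1}
B = np.block([
    [np.eye(r), -A],
    [np.zeros((p - r, r)), np.eye(p - r)],
])

cov_Z = B @ Sigma @ B.T       # covariance matrix of Z = B X
S11_2 = S11 - A @ S21         # Sigma_{11.2}
det_B = np.linalg.det(B)      # B is unit upper triangular, so this should be 1
print(cov_Z)
print(det_B)
```

The off-diagonal blocks of `cov_Z` vanish, its diagonal blocks are \(\Sigma_{11\cdot2}\) and \(\Sigma_{22}\), and \(\det B=1\) confirms that the transformation is nonsingular.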
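By the independence criterion proved earlier, \(Z^{(1)}\) and \(Z^{(2)}\) are independent, so the joint density \(g(z^{(1)},z^{(2)})\) of \(Z\) factorizes. Note also that \(Z^{(2)}=X^{(2)}\):
\[g(z^{(1)},z^{(2)})=g_1(z^{(1)})g_2(z^{(2)})=g_1(z^{(1)})f_2(z^{(2)})
\]
Moreover, since \(Z=BX\), the change-of-variables formula with the Jacobian \(J(z\to x)=|\det B|=1\) (\(B\) is block upper triangular with identity diagonal blocks) lets us express the density \(f(x)\) of \(X\) in terms of \(g(z)\):
\[\begin{align}
f(x^{(1)},x^{(2)})=&g(Bx)\cdot J(z\to x)\\
=&g_1(x^{(1)}-\Sigma_{12}\Sigma_{22}^{-1}x^{(2)})f_2(x^{(2)})
\end{align}
\]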
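To summarize what we have done so far:
- We constructed a nonsingular linear transformation and showed that \(Z\) is a normally distributed random vector whose components \(Z^{(1)}\) and \(Z^{(2)}=X^{(2)}\) are independent;
- Again using the properties of the linear transformation, we related the densities of \(X\) and \(Z\) through the Jacobian.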
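By the definition of the conditional distribution, we can now write down the density of \((X^{(1)}|X^{(2)})\):
\[\begin{align}
f_1(x^{(1)}|x^{(2)})=&\frac{f(x^{(1)},x^{(2)})}{f_2(x^{(2)})}=g_1(x^{(1)}-\Sigma_{12}\Sigma_{22}^{-1}x^{(2)})\\
=&\frac{1}{(2\pi)^{r/2}|\Sigma_{11\cdot2}|^{1/2}}\exp\left[-\frac12(x^{(1)}-\mu_{1\cdot2})'\Sigma_{11\cdot2}^{-1}(x^{(1)}-\mu_{1\cdot2})
\right]
\end{align}
\]
The second equality uses \(Z^{(1)}\sim N_r(\mu^{(1)}-\Sigma_{12}\Sigma_{22}^{-1}\mu^{(2)},\Sigma_{11\cdot2})\), so the centered argument of \(g_1\) is \(x^{(1)}-\Sigma_{12}\Sigma_{22}^{-1}x^{(2)}-\mu^{(1)}+\Sigma_{12}\Sigma_{22}^{-1}\mu^{(2)}=x^{(1)}-\mu_{1\cdot2}\). This is the density of a normal distribution, that is:
\[(X^{(1)}|X^{(2)})\sim N_r(\mu_{1\cdot2},\Sigma_{11\cdot2})
\]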
Important corollaries:
- \(X^{(1)}-\Sigma_{12}\Sigma_{22}^{-1}X^{(2)}\) and \(X^{(2)}\) are independent;
- \(X^{(2)}-\Sigma_{21}\Sigma_{11}^{-1}X^{(1)}\) and \(X^{(1)}\) are independent;
- \((X^{(2)}|X^{(1)})\sim N_{p-r}(\mu_{2\cdot1},\Sigma_{22\cdot1})\), with
\[\mu_{2\cdot1}=\mu^{(2)}+\Sigma_{21}\Sigma_{11}^{-1}(x^{(1)}-\mu^{(1)})\\
\Sigma_{22\cdot1}=\Sigma_{22}-\Sigma_{21}\Sigma_{11}^{-1}\Sigma_{12}
\]
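The first two corollaries can be illustrated by simulation. The sketch below draws samples from an assumed \(N_3(\mu,\Sigma)\) with \(r=1\) and checks that the sample cross-covariances are close to zero. The parameter values, sample size, and seed are arbitrary choices, and a near-zero sample covariance only indicates uncorrelatedness, which is equivalent to independence here because the transformed vectors are jointly normal.

```python
import numpy as np

def cross_cov(U, V):
    """Sample cross-covariance matrix between the columns of U and V."""
    Uc = U - U.mean(axis=0)
    Vc = V - V.mean(axis=0)
    return Uc.T @ Vc / (len(U) - 1)

rng = np.random.default_rng(0)  # fixed seed so the run is repeatable
mu = np.array([1.0, -2.0, 0.5])
Sigma = np.array([[4.0, 1.0, 0.5],
                  [1.0, 3.0, 0.2],
                  [0.5, 0.2, 2.0]])
r, n = 1, 200_000

X = rng.multivariate_normal(mu, Sigma, size=n)  # each row is one draw of X
X1, X2 = X[:, :r], X[:, r:]
S11, S12 = Sigma[:r, :r], Sigma[:r, r:]
S21, S22 = Sigma[r:, :r], Sigma[r:, r:]

# Row-wise versions of X1 - S12 S22^{-1} X2 and X2 - S21 S11^{-1} X1.
Z1 = X1 - X2 @ np.linalg.solve(S22, S21)
W2 = X2 - X1 @ np.linalg.solve(S11, S12)

print(cross_cov(X1, X2))  # close to Sigma_12, clearly nonzero
print(cross_cov(Z1, X2))  # close to zero
print(cross_cov(W2, X1))  # close to zero
```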