  • Sample Variance, Covariance, and the Covariance Matrix

    1. Sample Variance

    Let the sample mean be $\bar x$, the sample variance $S^2$, the population mean $\mu$, and the population variance $\sigma^2$. The sample variance is then

    $S^2 = \frac{1}{n - 1}\sum\limits_{i = 1}^n \left( x_i - \bar x \right)^2$

    Derivation: suppose we treat the sample as if it were the entire population and divide by $n$ instead, so that

    $S^2 = \frac{1}{n}\sum\limits_{i = 1}^n \left( x_i - \bar x \right)^2$

    If we draw samples repeatedly, we would like the sample variance to approach the population variance on average. Let the sample variances of the successive draws be

    ($S_1^2, S_2^2, S_3^2, \dots$) and denote their average by $E(S^2)$. Then

    $E\left( S^2 \right) = E\left( \frac{1}{n}\sum\limits_{i = 1}^n \left( x_i - \bar x \right)^2 \right)$

    $ = E\left( \frac{1}{n}\sum\limits_{i = 1}^n \left( \left( x_i - \mu \right) - \left( \bar x - \mu \right) \right)^2 \right)$

    Because

    $\frac{1}{n}\sum\limits_{i = 1}^n \left( x_i - \mu \right) = \frac{1}{n}\sum\limits_{i = 1}^n x_i - \mu = \bar x - \mu$

    continuing from the expression above, we get

    $E\left( \frac{1}{n}\sum\limits_{i = 1}^n \left( \left( x_i - \mu \right) - \left( \bar x - \mu \right) \right)^2 \right) = E\left( \frac{1}{n}\sum\limits_{i = 1}^n \left( x_i - \mu \right)^2 - \frac{1}{n}\sum\limits_{i = 1}^n 2\left( x_i - \mu \right)\left( \bar x - \mu \right) + \frac{1}{n}\sum\limits_{i = 1}^n \left( \bar x - \mu \right)^2 \right)$

    $ = E\left( \frac{1}{n}\sum\limits_{i = 1}^n \left( x_i - \mu \right)^2 - 2\left( \bar x - \mu \right)\left( \bar x - \mu \right) + \left( \bar x - \mu \right)^2 \right)$

    $ = E\left( \frac{1}{n}\sum\limits_{i = 1}^n \left( x_i - \mu \right)^2 - \left( \bar x - \mu \right)^2 \right)$

    $ = E\left( \frac{1}{n}\sum\limits_{i = 1}^n \left( x_i - \mu \right)^2 \right) - E\left( \left( \bar x - \mu \right)^2 \right) \le \sigma^2$

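    So the variance computed with a $1/n$ divisor is, in expectation, no larger than the population variance. Since $E\left( \left( \bar x - \mu \right)^2 \right) = \sigma^2 / n$ for independent draws,

    $E\left( \frac{1}{n}\sum\limits_{i = 1}^n \left( x_i - \mu \right)^2 \right) - E\left( \left( \bar x - \mu \right)^2 \right) = \sigma^2 - \frac{1}{n}\sigma^2 = \frac{n - 1}{n}\sigma^2$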
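
    The $\frac{1}{n}\sigma^2$ term is the variance of the sample mean. A short check, assuming the $x_i$ are independent draws from the population with $E(x_i) = \mu$ (an assumption the derivation relies on but does not state):

    $E\left( \left( \bar x - \mu \right)^2 \right) = {\rm Var}\left( \bar x \right) = {\rm Var}\left( \frac{1}{n}\sum\limits_{i = 1}^n x_i \right) = \frac{1}{n^2}\sum\limits_{i = 1}^n {\rm Var}\left( x_i \right) = \frac{1}{n^2} \cdot n\sigma^2 = \frac{\sigma^2}{n}$

    The first term follows from linearity of expectation: $E\left( \frac{1}{n}\sum\limits_{i = 1}^n \left( x_i - \mu \right)^2 \right) = \frac{1}{n}\sum\limits_{i = 1}^n E\left( \left( x_i - \mu \right)^2 \right) = \sigma^2$.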

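    So the expected value of the $1/n$ estimate is $(n-1)/n$ times the population variance. Multiplying by $n/(n-1)$ removes this bias, which is why the sample variance divides by $n - 1$.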
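
    The $(n-1)/n$ factor can also be checked numerically. The following is a minimal simulation sketch, not part of the original derivation: the function names, the normal population with $\sigma = 2$, the sample size $n = 5$, and the number of trials are all illustrative assumptions.

```python
import random


def biased_variance(xs):
    """Variance with a 1/n divisor (the estimator analysed in the derivation)."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / n


def unbiased_variance(xs):
    """Variance with a 1/(n-1) divisor, i.e. the 1/n estimate times n/(n-1)."""
    n = len(xs)
    return biased_variance(xs) * n / (n - 1)


def average_estimates(n=5, trials=20000, sigma=2.0, seed=0):
    """Average both estimators over many samples of size n (assumed normal population)."""
    rng = random.Random(seed)
    biased_total = 0.0
    unbiased_total = 0.0
    for _ in range(trials):
        sample = [rng.gauss(0.0, sigma) for _ in range(n)]
        biased_total += biased_variance(sample)
        unbiased_total += unbiased_variance(sample)
    return biased_total / trials, unbiased_total / trials


mean_biased, mean_unbiased = average_estimates()
print("average 1/n estimate:    ", mean_biased)
print("average 1/(n-1) estimate:", mean_unbiased)
```

    With these settings the population variance is 4, so the first average should land near $\frac{4}{5} \cdot 4 = 3.2$ and the second near 4, up to sampling noise.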

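    2. Covariance

    Covariance measures the degree of linear association in the joint distribution of two random variables. For given variances, the stronger the linear relationship, the larger the magnitude of the covariance (positive when the variables move together, negative when they move in opposite directions); if the two variables are linearly uncorrelated, the covariance is zero.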

    Cov(x,y) = E[(x-E(x))(y-E(y))]

    As a special case, when there is only one variable x, the covariance of x with itself equals its variance, written Var(x):

    Cov(x,x) =Var(x)= E[(x-E(x))(x-E(x))]

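    Sample covariance

    For a multidimensional random variable $Q(x_1, x_2, x_3, \dots, x_n)$, write the $j$-th sample as $x_{ij} = [x_{1j}, x_{2j}, \dots, x_{nj}]$ $(j = 1, 2, \dots, m)$, where $m$ is the number of samples. For any two dimensions $a, b$ $(a, b = 1, 2, \dots, n)$, the sample covariance is

    ${\rm{cov}}\left( x_a, x_b \right) = \frac{\sum\nolimits_{j = 1}^m \left( x_{aj} - \bar x_a \right)\left( x_{bj} - \bar x_b \right)}{m - 1}$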
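
    The formula translates directly into code. This is a small sketch using only the Python standard library; the function name `sample_covariance` and the example data are hypothetical and chosen purely for illustration.

```python
def sample_covariance(xa, xb):
    """Sample covariance of two equally long sequences, using an m - 1 divisor."""
    if len(xa) != len(xb):
        raise ValueError("both dimensions need the same number of samples")
    m = len(xa)
    if m < 2:
        raise ValueError("at least two samples are required")
    mean_a = sum(xa) / m
    mean_b = sum(xb) / m
    # Sum of products of deviations from the per-dimension means.
    total = sum((a - mean_a) * (b - mean_b) for a, b in zip(xa, xb))
    return total / (m - 1)


# Hypothetical data: dimension b is exactly twice dimension a.
dim_a = [1.0, 2.0, 3.0, 4.0]
dim_b = [2.0, 4.0, 6.0, 8.0]

cov_ab = sample_covariance(dim_a, dim_b)
var_a = sample_covariance(dim_a, dim_a)  # covariance with itself is the sample variance
print(cov_ab, var_a)
```

    Calling the function with the same sequence twice gives the sample variance, mirroring Cov(x,x) = Var(x).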

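    3. Covariance Matrix

    For a multidimensional random variable $Q(x_1, x_2, x_3, \dots, x_n)$ we want the linear relationship between every pair of variables $(x_i, x_j)$, so we compute the covariance of every pair and collect the results in the covariance matrix.

    $\mathrm{Cov}(x_i, x_j) = E[(x_i - E(x_i))(x_j - E(x_j))]$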

    \[{\rm{cov}}\left( {x_i},{x_j} \right) = \left[ {\begin{array}{*{20}{c}}
    {{\rm{cov}}\left( {x_1},{x_1} \right)}&{{\rm{cov}}\left( {x_1},{x_2} \right)}&{{\rm{cov}}\left( {x_1},{x_3} \right)}& \cdots &{{\rm{cov}}\left( {x_1},{x_n} \right)}\\
    {{\rm{cov}}\left( {x_2},{x_1} \right)}&{{\rm{cov}}\left( {x_2},{x_2} \right)}&{{\rm{cov}}\left( {x_2},{x_3} \right)}& \cdots &{{\rm{cov}}\left( {x_2},{x_n} \right)}\\
    {{\rm{cov}}\left( {x_3},{x_1} \right)}&{{\rm{cov}}\left( {x_3},{x_2} \right)}&{{\rm{cov}}\left( {x_3},{x_3} \right)}& \cdots &{{\rm{cov}}\left( {x_3},{x_n} \right)}\\
    \vdots & \vdots & \vdots & \ddots & \vdots \\
    {{\rm{cov}}\left( {x_n},{x_1} \right)}&{{\rm{cov}}\left( {x_n},{x_2} \right)}&{{\rm{cov}}\left( {x_n},{x_3} \right)}& \cdots &{{\rm{cov}}\left( {x_n},{x_n} \right)}
    \end{array}} \right]\]
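
    To make the $n \times n$ structure concrete, here is a minimal sketch that builds the matrix from $m$ samples with the standard library only. The function name `covariance_matrix`, the row-per-sample layout, and the example data are assumptions made for illustration.

```python
def covariance_matrix(samples):
    """Return the n x n sample covariance matrix.

    `samples` is a list of m observations, each a list of n values
    (one value per dimension). The divisor is m - 1.
    """
    m = len(samples)
    if m < 2:
        raise ValueError("at least two samples are required")
    n = len(samples[0])
    if any(len(row) != n for row in samples):
        raise ValueError("every sample needs the same number of dimensions")

    # Per-dimension sample means.
    means = [sum(row[k] for row in samples) / m for k in range(n)]

    cov = [[0.0] * n for _ in range(n)]
    for a in range(n):
        for b in range(a, n):
            total = sum((row[a] - means[a]) * (row[b] - means[b]) for row in samples)
            # cov(x_a, x_b) == cov(x_b, x_a), so fill both entries at once.
            cov[a][b] = cov[b][a] = total / (m - 1)
    return cov


# Hypothetical data: 4 samples of a 3-dimensional variable.
data = [
    [1.0, 2.0, 0.0],
    [2.0, 4.0, 1.0],
    [3.0, 6.0, 0.0],
    [4.0, 8.0, 1.0],
]
matrix = covariance_matrix(data)
for row in matrix:
    print(row)
```

    The diagonal entries are the per-dimension sample variances, and the matrix is symmetric, so only the upper triangle has to be computed.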

     

    [ End ]

  • Original article: https://www.cnblogs.com/fujj/p/9720357.html