  • Exponential family of distributions

    Choi H. I. Lecture 4: Exponential family of distributions and generalized linear model (GLM).

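    Definition

    Definition: A distribution belongs to the exponential family if its density function has the form

    \[f_{\theta}(x) = \frac{1}{Z(\theta)} h(x) e^{\langle T(x), \theta \rangle},\]

    where \(x \in \mathbb{R}^m\), \(T(x) = (T_1(x), T_2(x), \cdots, T_k(x)) \in \mathbb{R}^k\) is the sufficient statistic, \(\theta = (\theta_1, \theta_2, \cdots, \theta_k)\) is the unknown parameter, and \(Z(\theta) = \int h(x) e^{\langle T(x), \theta \rangle} \mathrm{d}x\) is the normalizing constant.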

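    Setting \(C(x) = \log h(x)\) and \(A(\theta) = \log Z(\theta)\) gives

    \[f_{\theta}(x) = \exp \left(\langle T(x), \theta \rangle - A(\theta) + C(x)\right).\]

    The exponential family also has a more general form:

    \[f_{\theta}(x) = \exp \left(\frac{\langle T(x), \theta \rangle - A(\theta)}{\phi} + C(x, \phi)\right),\]

    and, more generally still,

    \[f_{\theta}(x) = \exp \left(\frac{\langle T(x), \lambda(\theta) \rangle - A(\theta)}{\phi} + C(x, \phi)\right),\]

    where the dispersion parameter \(\phi > 0\) controls the shape of the distribution.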
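
    The general form above can be evaluated mechanically once \(T\), \(A\) and \(C\) are supplied. The following is a minimal sketch, not code from the lecture: the function name `expfam_logpdf` and the `bern_*` helpers are hypothetical, and the Bernoulli parameterization (\(\theta = \log\frac{p}{1-p}\), \(T(x) = x\), \(A(\theta) = \log(1 + e^{\theta})\), \(C = 0\), \(\phi = 1\)) is taken from the Examples section.

```python
import math

def expfam_logpdf(x, theta, T, A, C, phi=1.0):
    """log f_theta(x) = (<T(x), theta> - A(theta)) / phi + C(x, phi)."""
    inner = sum(t * th for t, th in zip(T(x), theta))
    return (inner - A(theta)) / phi + C(x, phi)

# Bernoulli(p) written as an exponential family member (k = 1).
p = 0.3  # illustrative value
theta = (math.log(p / (1 - p)),)

def bern_T(x):
    return (x,)

def bern_A(th):
    return math.log1p(math.exp(th[0]))

def bern_C(x, phi):
    return 0.0  # C(x) = log h(x) with h(x) = 1

prob_one = math.exp(expfam_logpdf(1, theta, bern_T, bern_A, bern_C))
prob_zero = math.exp(expfam_logpdf(0, theta, bern_T, bern_A, bern_C))
print(prob_one, prob_zero)
```

    Passing \(T\), \(A\) and \(C\) as callables keeps the sketch generic; a real implementation would probably vectorize over samples.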

    Properties

    \(A(\theta)\)

    Proposition 1:

    \[\nabla_{\theta}A(\theta) = \int f_{\theta}(x) T(x) \mathrm{d}x = \mathbb{E}[T(X)].\]

    proof:

    We know that

    \[\int f_{\theta}(x) \mathrm{d}x = \int \exp \left(\frac{\langle T(x), \theta \rangle - A(\theta)}{\phi} + C(x, \phi)\right) \mathrm{d}x = 1.\]

    Taking the gradient of both sides with respect to \(\theta\) gives

    \[\int f_{\theta}(x) \frac{T(x) - \nabla_{\theta} A(\theta)}{\phi} \mathrm{d}x = 0 \Rightarrow \nabla_{\theta} A(\theta) = \mathbb{E}[T(X)].\]
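
    Proposition 1 can be sanity-checked numerically. The sketch below assumes the Bernoulli case with \(\phi = 1\), where \(A(\theta) = \log(1 + e^{\theta})\) and \(T(x) = x\); it compares a central finite difference of \(A\) with \(\mathbb{E}[T(X)] = p\). The helper name `grad_fd` and the value of `theta` are illustrative choices, not from the source.

```python
import math

def A(theta):
    # Log-partition function of the Bernoulli family.
    return math.log1p(math.exp(theta))

def grad_fd(f, theta, eps=1e-5):
    # Central finite-difference approximation of f'(theta).
    return (f(theta + eps) - f(theta - eps)) / (2 * eps)

theta = 0.8  # illustrative value
p = 1.0 / (1.0 + math.exp(-theta))   # success probability implied by theta
mean_T = 0 * (1 - p) + 1 * p         # E[T(X)] = E[X] for x in {0, 1}
grad_A = grad_fd(A, theta)
print(grad_A, mean_T)
```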

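    Proposition 2:

    \[D^2_{\theta} A = \left(\frac{\partial^2 A}{\partial \theta_i \partial \theta_j}\right) = \frac{1}{\phi}\mathrm{Cov}(T(X), T(X)) = \frac{1}{\phi}\mathrm{Cov}(T(X)).\]

    proof:

    By Proposition 1,

    \[\frac{\partial A}{\partial \theta_i} = \int \exp \left(\frac{\langle T(x), \theta \rangle - A(\theta)}{\phi} + C(x, \phi)\right) T_i(x) \mathrm{d}x.\]

    Differentiating with respect to \(\theta_j\) gives

    \[\begin{array}{ll} \frac{\partial^2 A}{\partial \theta_i \partial \theta_j} &= \int f_{\theta}(x) \frac{T_j (x) - \frac{\partial A}{\partial \theta_j}}{\phi} T_i(x) \mathrm{d}x \\ &= \frac{1}{\phi}\int f_{\theta}(x) \left(T_j(x) - \frac{\partial A}{\partial \theta_j}\right) \left(T_i(x) - \frac{\partial A}{\partial \theta_i}\right)\mathrm{d}x \\ &= \frac{1}{\phi}\mathrm{Cov}(T_i(X), T_j(X)). \end{array}\]

    The second equality holds because \(\int f_{\theta}(x) \left(T_j(x) - \frac{\partial A}{\partial \theta_j}\right) \mathrm{d}x = 0\) by Proposition 1, so subtracting the constant \(\frac{\partial A}{\partial \theta_i}\) from \(T_i(x)\) does not change the integral.

    Corollary 1: \(A(\theta)\) is a convex function of \(\theta\).

    This holds because its Hessian \(\frac{1}{\phi}\mathrm{Cov}(T(X))\) is positive semidefinite for \(\phi > 0\).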
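
    The role of the factor \(\frac{1}{\phi}\) in Proposition 2 is easiest to see on a Gaussian with known variance. The sketch below makes a simplifying assumption that differs slightly from the Examples section: only \(\mu\) is treated as a free parameter, so \(T(x) = x\), \(\theta = \mu\), \(A(\theta) = \frac{1}{2}\theta^2\), \(\phi = \sigma^2\), and the \(-\frac{x^2}{2\phi}\) term is absorbed into \(C(x, \phi)\). The sample size, seed and parameter values are arbitrary.

```python
import random

random.seed(0)
mu, sigma = 1.5, 2.0      # illustrative values
phi = sigma ** 2          # dispersion parameter

def A(theta):
    # Log-partition function when only the mean is a free parameter.
    return 0.5 * theta ** 2

def second_diff(f, theta, eps=1e-3):
    # Central finite-difference approximation of f''(theta).
    return (f(theta + eps) - 2 * f(theta) + f(theta - eps)) / eps ** 2

n = 100_000
samples = [random.gauss(mu, sigma) for _ in range(n)]
sample_mean = sum(samples) / n
sample_var = sum((s - sample_mean) ** 2 for s in samples) / n

hessian_A = second_diff(A, mu)   # left-hand side of Proposition 2
scaled_var = sample_var / phi    # Monte Carlo estimate of Var(T(X)) / phi
print(hessian_A, scaled_var)
```

    Both printed values should be close to 1; without the \(\frac{1}{\phi}\) factor the right-hand side would instead be close to \(\sigma^2\).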

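    Maximum likelihood estimation

    Given \(n\) samples \(\{x^i\}_{i=1}^n\), the log-likelihood function is

    \[l(\theta) = \frac{1}{\phi}\left[\left\langle \theta, \sum_{i=1}^n T(x^i) \right\rangle - nA(\theta)\right] + \sum_{i=1}^n C(x^i, \phi).\]

    Because \(A(\theta)\) is convex, \(l(\theta)\) is concave, so any stationary point is a global maximizer. Since

    \[\nabla_{\theta} l(\theta) = \frac{1}{\phi}\left[\sum_{i=1}^n T(x^i) - n\nabla_{\theta}A(\theta)\right],\]

    the maximum is attained where

    \[\nabla_{\theta}A(\theta) = \frac{1}{n} \sum_{i=1}^n T(x^i).\]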
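
    The stationarity condition \(\nabla_{\theta}A(\theta) = \frac{1}{n}\sum_i T(x^i)\) rarely has to be solved by hand. As a minimal sketch, assuming the Bernoulli family (\(A(\theta) = \log(1 + e^{\theta})\), \(T(x) = x\), \(\phi = 1\)), the code below solves it with Newton's method and compares the result with the closed-form answer \(\log\frac{\bar{x}}{1 - \bar{x}}\). Newton's method is a choice made here for illustration, not something the source prescribes, and the data are made up.

```python
import math

data = [1, 0, 1, 1, 0, 1, 0, 1]      # hypothetical Bernoulli observations
mean_T = sum(data) / len(data)       # (1/n) * sum of T(x^i), with T(x) = x

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

theta_hat = 0.0
for _ in range(20):
    grad_A = sigmoid(theta_hat)            # A'(theta) = E[T(X)]
    hess_A = grad_A * (1.0 - grad_A)       # A''(theta) = Var(T(X)) > 0
    theta_hat -= (grad_A - mean_T) / hess_A  # Newton step on A'(theta) = mean_T

closed_form = math.log(mean_T / (1.0 - mean_T))
print(theta_hat, closed_form)
```

    The iteration is well defined only when the sample mean lies strictly between 0 and 1; otherwise the maximizer does not exist at a finite \(\theta\).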

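    Maximum entropy

    See: The Maximum Entropy Principle (最大熵原理), 科学空间.

    Exponential family distributions are in fact maximum entropy distributions: among all densities consistent with the known information, they are the ones that express no further preference.

    \[\max_{f} \quad H(f) = -\int f(x)\log f(x) \mathrm{d}x,\]

    which is equivalent to

    \[\min_f \int f(x)\log f(x) \mathrm{d}x.\]

    Usually we have some known statistical information, typically expressed as expectations:

    \[\int f(x) h_i(x) \mathrm{d}x = c_i, \quad i=1,2,\cdots, s.\]

    The problem is therefore

    \[\begin{array}{ll} \min_f & \int f(x)\log f(x) \mathrm{d}x \\ \mathrm{s.t.} & \int f(x) h_i(x) \mathrm{d}x = c_i, \quad i=0,1,\cdots, s, \end{array}\]

    where \(h_0 = 1, c_0 = 1\), i.e. the density must satisfy \(\int f(x) \mathrm{d}x = 1\).

    Introducing Lagrange multipliers gives

    \[J(f,\lambda) = \int f(x)\log f(x) \mathrm{d}x + \lambda_0 \left(1 - \int f(x) \mathrm{d}x\right) + \sum_{i=1}^s \lambda_i \left[c_i - \int f(x) h_i(x) \mathrm{d}x\right].\]

    The optimality condition is that the variation of \(J\) with respect to \(f\) vanishes, i.e.

    \[1 + \log f(x) - \lambda_0 - \sum_{i=1}^s \lambda_i h_i(x) = 0.\]

    Solving for \(f\) gives

    \[f(x) = \frac{1}{Z} \exp\left(\sum_{i=1}^s \lambda_i h_i(x)\right), \quad Z = e^{1 - \lambda_0},\]

    which belongs to the exponential family.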
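
    A discrete analogue makes the result concrete. The sketch below is an illustration under assumptions not in the source: the support is a six-sided die, the only constraint is a prescribed mean of 4.5, and the multiplier is found by bisection. The maximum entropy solution then has the form \(f(x) \propto e^{\lambda x}\), and \(\lambda\) is chosen so that the mean constraint holds.

```python
import math

support = [1, 2, 3, 4, 5, 6]   # faces of a die (illustrative support)
target_mean = 4.5              # the single moment constraint, h_1(x) = x

def maxent_pmf(lam):
    # f(x) = exp(lam * x) / Z, the exponential-family form derived above.
    weights = [math.exp(lam * x) for x in support]
    Z = sum(weights)
    return [w / Z for w in weights]

def mean_under(lam):
    return sum(x * q for x, q in zip(support, maxent_pmf(lam)))

# The mean is increasing in lam (its derivative is a variance), so bisect.
lo, hi = -5.0, 5.0
for _ in range(100):
    mid = 0.5 * (lo + hi)
    if mean_under(mid) < target_mean:
        lo = mid
    else:
        hi = mid

lam = 0.5 * (lo + hi)
pmf = maxent_pmf(lam)
print(lam, pmf)
```

    Because the target mean exceeds the uniform mean of 3.5, the multiplier comes out positive and the probabilities increase with the face value.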

    Examples

    Bernoulli

    \[P(x) = p^x (1-p)^{1-x} = \exp\left[x\log\frac{p}{1-p} + \log (1 - p)\right].\]

    \[\theta = \log \frac{p}{1-p}, \\ T(x) = x, \\ A(\theta) = \log (1 + e^{\theta}),\\ h(x) = 1.\]
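
    As a worked check (a sketch added for illustration, not part of the source lecture), inverting \(\theta = \log\frac{p}{1-p}\) recovers \(A(\theta)\), and differentiating it reproduces the mean and variance that Propositions 1 and 2 predict with \(\phi = 1\):

```latex
\[
p = \frac{e^{\theta}}{1 + e^{\theta}}, \qquad
A(\theta) = -\log(1 - p) = \log(1 + e^{\theta}),
\]
\[
A'(\theta) = \frac{e^{\theta}}{1 + e^{\theta}} = p = \mathbb{E}[X], \qquad
A''(\theta) = \frac{e^{\theta}}{(1 + e^{\theta})^2} = p(1 - p) = \mathrm{Var}(X).
\]
```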

    Exponential distribution

    \[p(x) = \lambda \cdot e^{-\lambda x}=\exp[-\lambda x +\log \lambda ], \quad x \ge 0.\]

    \[\theta = \lambda,\\ T(x) =-x, \\ A(\theta) = \log \frac{1}{\lambda} = -\log \theta, \\ h(x) = \mathbb{I}(x\ge0).\]
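
    For this family the maximum likelihood condition \(\nabla_{\theta}A(\theta) = \frac{1}{n}\sum_i T(x^i)\) can be solved in closed form: \(A'(\theta) = -\frac{1}{\theta}\), so \(\hat{\theta} = 1/\bar{x}\). The sketch below checks this on simulated data; the true rate, seed and sample size are arbitrary illustrative choices.

```python
import random

random.seed(0)
true_rate = 2.0  # illustrative value, not from the source
samples = [random.expovariate(true_rate) for _ in range(50_000)]

# Sufficient statistic T(x) = -x, so the sample mean of T is -mean(x).
mean_T = -sum(samples) / len(samples)

# A(theta) = -log(theta) gives A'(theta) = -1/theta; setting A'(theta) = mean_T
# yields the closed-form estimate theta_hat = -1 / mean_T = 1 / mean(x).
theta_hat = -1.0 / mean_T
print(theta_hat)
```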

    Normal distribution

    \[p(x) = \frac{1}{\sqrt{2\pi \sigma^2}} \exp \left[-\frac{(x-\mu)^2}{2\sigma^2}\right].\]

    Treating \(\sigma\) as a known parameter:

    \[p(x) = \exp \left[\frac{-\frac{1}{2}x^2 + x\mu - \frac{1}{2}\mu^2}{\sigma^2} - \frac{1}{2}\log (2\pi \sigma^2)\right].\]

    \[\theta = (\mu, 1), \\ T(x) = (x, -\frac{1}{2}x^2), \\ \phi = \sigma^2, \\ A(\theta) = \frac{1}{2}\mu^2, \\ C(x, \phi) = -\frac{1}{2} \log (2\pi \sigma^2).\]

    Treating \(\sigma\) as an unknown parameter:

    \[p(x) = \exp \left[-\frac{1}{2\sigma^2}x^2 + \frac{\mu}{\sigma^2}x - \frac{1}{2\sigma^2}\mu^2 - \log \sigma - \frac{1}{2}\log 2\pi\right].\]

    \[T(x) = (x, \frac{1}{2}x^2), \\ \theta = (\frac{\mu}{\sigma^2}, -\frac{1}{\sigma^2}), \\ A(\theta) = \frac{\mu^2}{2\sigma^2} + \log\sigma, \\ C(x) = -\frac{1}{2}\log(2\pi).\]
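
    In the unknown-variance case it can help to write \(A\) purely in terms of \(\theta = (\theta_1, \theta_2)\) with \(\theta_2 < 0\). The following worked check is a sketch added for illustration; it confirms that the gradient of \(A\) reproduces \(\mathbb{E}[T(X)]\), as Proposition 1 requires with \(\phi = 1\):

```latex
\[
\sigma^2 = -\frac{1}{\theta_2}, \qquad \mu = -\frac{\theta_1}{\theta_2}, \qquad
A(\theta) = -\frac{\theta_1^2}{2\theta_2} - \frac{1}{2}\log(-\theta_2),
\]
\[
\frac{\partial A}{\partial \theta_1} = -\frac{\theta_1}{\theta_2} = \mu = \mathbb{E}[X], \qquad
\frac{\partial A}{\partial \theta_2} = \frac{\theta_1^2}{2\theta_2^2} - \frac{1}{2\theta_2}
= \frac{\mu^2 + \sigma^2}{2} = \mathbb{E}\left[\tfrac{1}{2}X^2\right].
\]
```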

  • Original article: https://www.cnblogs.com/MTandHJ/p/14852936.html