EM Algorithm and Gaussian Mixture Models (GMM)

The EM Algorithm

The EM algorithm is mainly used to compute maximum likelihood estimates of probability density parameters. It converts the problem $\arg\max_{\theta_{1}} \sum_{i=1}^{n} \ln p\left(x_{i} | \theta_{1}\right)$ into the easier-to-handle $\sum_{i=1}^{n} \ln p\left(x_{i}, \theta_{2} | \theta_{1}\right)$, where $\theta_{2}$ may follow an arbitrary prior distribution $q(\theta_{2})$. The derivation of the EM algorithm is as follows: $$\begin{aligned} \ln p\left(x | \theta_{1}\right) &=\int q\left(\theta_{2}\right) \ln p\left(x | \theta_{1}\right) d\theta_{2}=\int q\left(\theta_{2}\right) \ln \frac{p\left(x, \theta_{2} | \theta_{1}\right)}{p\left(\theta_{2} | x, \theta_{1}\right)} d\theta_{2}=\int q\left(\theta_{2}\right) \ln \frac{p\left(x, \theta_{2} | \theta_{1}\right) q\left(\theta_{2}\right)}{p\left(\theta_{2} | x, \theta_{1}\right) q\left(\theta_{2}\right)} d\theta_{2} \\ &=\underbrace{\int q\left(\theta_{2}\right) \ln \frac{p\left(x, \theta_{2} | \theta_{1}\right)}{q\left(\theta_{2}\right)} d\theta_{2}}_{\text{define this to be } \mathcal{L}\left(x, \theta_{1}\right)}+\underbrace{\int q\left(\theta_{2}\right) \ln \frac{q\left(\theta_{2}\right)}{p\left(\theta_{2} | x, \theta_{1}\right)} d\theta_{2}}_{\text{Kullback-Leibler divergence}} \end{aligned}$$ By the convexity of $-\ln(\cdot)$ (Jensen's inequality), $\text{KL divergence}=E\left[-\ln \frac{p\left(\theta_{2} | x, \theta_{1}\right)}{q\left(\theta_{2}\right)}\right] \geq -\ln E\left[\frac{p\left(\theta_{2} | x, \theta_{1}\right)}{q\left(\theta_{2}\right)}\right]=-\ln 1=0$, with equality if and only if $q\left(\theta_{2}\right)=p\left(\theta_{2} | x, \theta_{1}\right)$.
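The non-negativity of the KL term can be checked numerically for discrete distributions. A minimal Python sketch (the distributions `p` and `q` below are arbitrary illustrative choices, not part of the derivation):

```python
import numpy as np

rng = np.random.default_rng(0)

# Two arbitrary discrete distributions q and p over the same support
q = rng.random(5)
q /= q.sum()
p = rng.random(5)
p /= p.sum()

# KL(q || p) = E_q[-ln(p/q)], non-negative by Jensen's inequality
kl = np.sum(q * np.log(q / p))
print(kl >= 0)  # True

# Equality holds iff q = p
print(np.isclose(np.sum(q * np.log(q / q)), 0.0))  # True
```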

Based on the derivation above, the EM algorithm proceeds as follows:

Given an initial value $\theta_{1}^{(0)}$, iterate the following two steps until convergence (step $t+1$ shown):

• E-step: set $q_{t}\left(\theta_{2}\right)=p\left(\theta_{2} | x, \theta_{1}^{(t)}\right)$, so that $\mathcal{L}_{t}\left(x, \theta_{1}\right)=\int q_{t}\left(\theta_{2}\right) \ln p\left(x, \theta_{2} | \theta_{1}\right) d\theta_{2}-\underbrace{\int q_{t}\left(\theta_{2}\right) \ln q_{t}\left(\theta_{2}\right) d\theta_{2}}_{\text{can ignore this term}}$
• M-step: set $\theta_{1}^{(t+1)}=\arg\max_{\theta_{1}} \mathcal{L}_{t}\left(x, \theta_{1}\right)$

Why the algorithm works:

$$
\begin{aligned} \ln p\left(x | \theta_{1}^{(t)}\right) &=\mathcal{L}_{t}\left(x, \theta_{1}^{(t)}\right)+\underbrace{KL\left(q_{t}\left(\theta_{2}\right) \,\|\, p\left(\theta_{2} | x, \theta_{1}^{(t)}\right)\right)}_{=0 \text{ by setting } q=p} \quad \leftarrow \text{E-step} \\ & \leq \mathcal{L}_{t}\left(x, \theta_{1}^{(t+1)}\right) \quad \leftarrow \text{M-step} \\ & \leq \mathcal{L}_{t}\left(x, \theta_{1}^{(t+1)}\right)+\underbrace{KL\left(q_{t}\left(\theta_{2}\right) \,\|\, p\left(\theta_{2} | x, \theta_{1}^{(t+1)}\right)\right)}_{\geq 0 \text{ because } q \neq p} \\ &=\ln p\left(x | \theta_{1}^{(t+1)}\right) \end{aligned}
$$

Gaussian Mixture Model (GMM)

A Gaussian mixture model is a probabilistic model for clustering. For data $\vec{x}_1,\vec{x}_2,\cdots,\vec{x}_n$, each point $\vec{x}_i$ has an assignment variable $c_i$ indicating that $\vec{x}_i$ belongs to the $c_i$-th cluster, with $c_i\in\{1,2,\cdots,K\}$. The model is defined as follows:

1. Prior cluster assignment: $c_{i} \stackrel{\text{iid}}{\sim} \text{Discrete}(\vec{\pi}) \Rightarrow \operatorname{Prob}\left(c_{i}=k | \vec{\pi}\right)=\pi_{k}$
2. Generate observation: $\vec{x}_i \sim N\left(\vec{\mu}_{c_{i}}, \Sigma_{c_{i}}\right)$
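The two-step generative process can be simulated directly. A minimal sketch for $K=2$ clusters in two dimensions (the values of $\vec{\pi}$, $\vec{\mu}_k$, and $\Sigma_k$ are my own illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(42)

# Illustrative parameters for K = 2 clusters in 2 dimensions
pi = np.array([0.3, 0.7])                        # prior cluster probabilities
mu = np.array([[0.0, 0.0], [5.0, 5.0]])          # cluster means
sigma = np.array([np.eye(2), 0.5 * np.eye(2)])   # cluster covariances

n = 1000
# Step 1: draw cluster assignments c_i ~ Discrete(pi)
c = rng.choice(2, size=n, p=pi)
# Step 2: draw x_i ~ N(mu_{c_i}, Sigma_{c_i})
x = np.array([rng.multivariate_normal(mu[k], sigma[k]) for k in c])

print(x.shape)  # (1000, 2)
```

The EM algorithm below tries to invert exactly this process: given only `x`, recover `pi`, `mu`, and `sigma`.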

The quantities to estimate are the prior probabilities $\vec{\pi}=(\pi_1,\pi_2,\cdots,\pi_K)$, the per-cluster Gaussian means $\{\vec{\mu}_1,\vec{\mu}_2,\cdots,\vec{\mu}_K\}$, and the covariance matrices $\{\Sigma_1,\Sigma_2,\cdots,\Sigma_K\}$. To estimate them by maximum likelihood, define the objective to be maximized as

$$\sum_{i=1}^{n} \ln p\left(\vec{x}_{i} | \vec{\pi}, \boldsymbol{\mu}, \boldsymbol{\Sigma}\right) \text{, where } \boldsymbol{\mu}=\{\vec{\mu}_1,\vec{\mu}_2,\cdots,\vec{\mu}_K\} \text{ and } \boldsymbol{\Sigma}=\{\Sigma_1,\Sigma_2,\cdots,\Sigma_K\}$$

To maximize this objective with the EM algorithm, rewrite it as $$\sum_{i=1}^{n} \ln p\left(\vec{x}_{i} | \vec{\pi}, \boldsymbol{\mu}, \boldsymbol{\Sigma}\right)=\sum_{i=1}^{n} \underbrace{\sum_{k=1}^{K} q\left(c_{i}=k\right) \ln \frac{p\left(\vec{x}_{i}, c_{i}=k | \vec{\pi}, \boldsymbol{\mu}, \boldsymbol{\Sigma}\right)}{q\left(c_{i}=k\right)}}_{\mathcal{L}}+\sum_{i=1}^{n}\underbrace{\sum_{k=1}^{K} q\left(c_{i}=k\right) \ln \frac{q\left(c_{i}=k\right)}{p\left(c_{i}=k | \vec{x}_{i}, \vec{\pi}, \boldsymbol{\mu}, \boldsymbol{\Sigma}\right)}}_{\text{KL divergence}}$$

• E-step: by Bayes' rule, set $q_{t}\left(c_{i}=k\right)=p\left(c_{i}=k | \vec{x}_{i}, \vec{\pi}^{(t)}, \boldsymbol{\mu}^{(t)}, \boldsymbol{\Sigma}^{(t)}\right) \propto p\left(c_{i}=k | \vec{\pi}^{(t)}\right) p\left(\vec{x}_{i} | c_{i}=k, \boldsymbol{\mu}^{(t)}, \boldsymbol{\Sigma}^{(t)}\right)$, from which it is easy to see that $$q_{t}\left(c_{i}=k\right)=\frac{\pi_{k}^{(t)} N\left(\vec{x}_{i} | \vec{\mu}_{k}^{(t)}, \Sigma_{k}^{(t)}\right)}{\sum_{j} \pi_{j}^{(t)} N\left(\vec{x}_{i} | \vec{\mu}_{j}^{(t)}, \Sigma_{j}^{(t)}\right)}$$
• M-step: $$\arg\max_{\vec{\pi}, \boldsymbol{\mu}, \boldsymbol{\Sigma}} \sum_{i=1}^{n} \sum_{k=1}^{K} q_{t}\left(c_{i}=k\right) \ln p\left(\vec{x}_{i}, c_{i}=k | \vec{\pi}, \boldsymbol{\mu}, \boldsymbol{\Sigma}\right)=\arg\max_{\vec{\pi}, \boldsymbol{\mu}, \boldsymbol{\Sigma}} \sum_{i=1}^{n} \sum_{k=1}^{K} q_{t}\left(c_{i}=k\right)\left[\ln \pi_{k}+\ln N\left(\vec{x}_{i} | \vec{\mu}_{k}, \Sigma_{k}\right)\right]$$ which yields $\pi_{k}^{(t+1)}=\frac{\sum_{i=1}^{n} q_{t}\left(c_{i}=k\right)}{\sum_{j=1}^{K} \sum_{i=1}^{n} q_{t}\left(c_{i}=j\right)}=\frac{\sum_{i=1}^{n} q_{t}\left(c_{i}=k\right)}{n}, \quad \vec{\mu}_{k}^{(t+1)}=\frac{\sum_{i=1}^{n} q_{t}\left(c_{i}=k\right) \vec{x}_{i}}{\sum_{i=1}^{n} q_{t}\left(c_{i}=k\right)}, \quad \Sigma_{k}^{(t+1)}=\frac{\sum_{i=1}^{n} q_{t}\left(c_{i}=k\right)\left(\vec{x}_{i}-\vec{\mu}_{k}^{(t+1)}\right)\left(\vec{x}_{i}-\vec{\mu}_{k}^{(t+1)}\right)^{T}}{\sum_{i=1}^{n} q_{t}\left(c_{i}=k\right)}$
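The E- and M-step updates above translate almost line by line into NumPy. The following is a sketch under my own naming (`em_gmm` and `gaussian_pdf` are illustrative names, and the small ridge `1e-6 * I` added to each covariance for numerical stability is not part of the derivation):

```python
import numpy as np

def gaussian_pdf(x, mu, sigma):
    """Multivariate normal density N(x | mu, sigma), evaluated row-wise."""
    d = x.shape[1]
    diff = x - mu
    maha = np.einsum('ij,jk,ik->i', diff, np.linalg.inv(sigma), diff)
    return np.exp(-0.5 * maha) / np.sqrt((2 * np.pi) ** d * np.linalg.det(sigma))

def em_gmm(x, K, iters=50, seed=0):
    rng = np.random.default_rng(seed)
    n, d = x.shape
    pi = np.full(K, 1.0 / K)                      # start from a uniform prior
    mu = x[rng.choice(n, size=K, replace=False)]  # init means at random points
    sigma = np.array([np.cov(x.T) for _ in range(K)])
    log_lik = []
    for _ in range(iters):
        # E-step: q_t(c_i = k) = pi_k N(x_i|mu_k,Sigma_k) / sum_j pi_j N(x_i|mu_j,Sigma_j)
        dens = np.stack([pi[k] * gaussian_pdf(x, mu[k], sigma[k])
                         for k in range(K)], axis=1)
        log_lik.append(np.log(dens.sum(axis=1)).sum())
        q = dens / dens.sum(axis=1, keepdims=True)
        # M-step: the closed-form updates derived above
        nk = q.sum(axis=0)                        # sum_i q_t(c_i = k)
        pi = nk / n
        mu = (q.T @ x) / nk[:, None]
        for k in range(K):
            diff = x - mu[k]
            sigma[k] = (q[:, k, None] * diff).T @ diff / nk[k] + 1e-6 * np.eye(d)
    return pi, mu, sigma, log_lik

# Fit on synthetic data drawn from two well-separated Gaussians
rng = np.random.default_rng(1)
x = np.vstack([rng.normal([0, 0], 1, size=(150, 2)),
               rng.normal([6, 6], 1, size=(150, 2))])
pi, mu, sigma, log_lik = em_gmm(x, K=2)
print(log_lik[-1] > log_lik[0])  # True: EM improves the log-likelihood
```

Tracking `log_lik` makes the earlier guarantee visible: apart from the tiny ridge term, each iteration can only increase the log-likelihood, exactly as the inequality chain in the algorithm-explanation section states.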
Original post: https://www.cnblogs.com/sunwq06/p/11052072.html