zoukankan      html  css  js  c++  java
  • HMM-前向后向算法理解与实现(python)

    HMM-前向后向算法理解与实现(python)
    HMM-维特比算法理解与实现(python)

    基本要素

    • 状态 (N)

    • 状态序列 (S = s_1,s_2,...)

    • 观测序列 (O=O_1,O_2,...)

    • (lambda(A,B,pi))

      • 状态转移概率 (A = {a_{ij}})
      • 发射概率 (B = {b_{ik}})
      • 初始概率分布 (pi = {pi_i})
    • 观测序列生成过程

      • 初始状态
      • 选择观测
      • 状态转移
      • 返回step2

    HMM三大问题

    • 概率计算问题(评估问题)

    给定观测序列 (O=O_1O_2...O_T),模型 (lambda (A,B,pi)),计算 (P(O|lambda)),即计算观测序列的概率

    • 解码问题

    给定观测序列 (O=O_1O_2...O_T),模型 (lambda (A,B,pi)),找到对应的状态序列 (S)

    • 学习问题

    给定观测序列 (O=O_1O_2...O_T),找到模型参数 (lambda (A,B,pi)),以最大化 (P(O|lambda))

    概率计算问题

    给定模型 (lambda) 和观测序列 (O),如何计算(P(O| lambda))

    暴力枚举每一个可能的状态序列 (S)

    • 对每一个给定的状态序列

      [P(O|S,lambda) = prod^T_{t=1} P(O_t|s_t,lambda) =prod^T_{t=1} b_{s_tO_t} ]

    • 一个状态序列的产生概率

      [P(S|lambda) = P(s_1)prod^T_{t=2}P(s_t|s_{t-1})=pi_1prod^T_{t=2}a_{s_{t-1}s_t} ]

    • 联合概率

      [P(O,S|lambda) = P(S|lambda)P(O|S,lambda) =pi_1prod^T_{t=2}a_{s_{t-1}s_t}prod^T_{t=1} b_{s_tO_t} ]

    • 考虑所有的状态序列

      [P(O|lambda)=sum_Spi_1b_{s_1O_1}prod^T_{t=2}a_{s_{t-1}s_t}b_{s_tO_t} ]

    (O) 可能由任意一个状态得到,所以需要将每个状态的可能性相加。

    这样做什么问题?时间复杂度高达 (O(2TN^T))。每个序列需要计算 (2T) 次,一共 (N^T) 个序列。

    前向算法

    在时刻 (t),状态为 (i) 时,前面的时刻观测到 (O_1,O_2, ..., O_t) 的概率,记为 (alpha _i(t))

    [alpha_{i}(t)=Pleft(O_{1}, O_{2}, ldots O_{t}, s_{t}=i | lambda ight) ]

    (t=1) 时,输出为 (O_1),假设有三个状态,(O_1) 可能是任意一个状态发出,即

    [P(O_1|lambda) = pi_1b_1(O_1)+pi_2b_2(O_1)+pi_2b_3(O_1) = alpha_1(1)+alpha_2(1)+alpha_3(1) ]

    image-20200511202908264

    (t=2) 时,输出为 (O_1O_2)(O_2) 可能由任一个状态发出,同时产生 (O_2) 对应的状态可以由 (t=1) 时刻任意一个状态转移得到。假设 (O_2) 由状态 1 发出,如下图

    image-20200511203749699

    [P(O_1O_2,s_2=q_1|lambda) = pi_1b_1(O_1)a_{11}b_1(O_2)+pi_2b_2(O_1)a_{21}b_1(O_2)+pi_2b_3(O_1)a_{31}b_1(O_2) \ =old{alpha_1(1)}a_{11}b_1(O_2)+old{alpha_2(1)}a_{21}b_1(O_2)+old{alpha_3(1)}a_{31}b_1(O_2) = old{alpha_1(2)} ]

    同理可得 (alpha_2(2),alpha_3(2))

    [old{alpha_2(2)} = P(O_1O_2,s_2=q_2|lambda) =old{alpha_1(1)}a_{12}b_2(O_2)+old{alpha_2(1)}a_{22}b_2(O_2)+old{alpha_3(1)}a_{32}b_2(O_2) \ old{alpha_3(2)} = P(O_1O_2,s_2=q_3|lambda) =old{alpha_1(1)}a_{13}b_3(O_2)+old{alpha_2(1)}a_{23}b_3(O_2)+old{alpha_3(1)}a_{33}b_3(O_2) ]

    所以

    [P(O_1O_2|lambda) =P(O_1O_2,s_2=q_1|lambda)+ P(O_1O_2,s_2=q_2|lambda) +P(O_1O_2,s_2=q_3|lambda)\ = alpha_1(2)+alpha_2(2)+alpha_3(2) ]

    所以前向算法过程如下:

    ​ step1:初始化 (alpha_i(1)= pi_i*b_i(O_1))

    ​ step2:计算 (alpha_i(t) = (sum^{N}_{j=1} alpha_j(t-1)a_{ji})b_i(O_{t}))

    ​ step3:(P(O|lambda) = sum^N_{i=1}alpha_i(T))

    相比暴力法,时间复杂度降低了吗?

    当前时刻有 (N) 个状态,每个状态可能由前一时刻 (N) 个状态中的任意一个转移得到,所以单个时刻的时间复杂度为 (O(N^2)),总时间复杂度(O(TN^2))

    代码实现

    例子:

    假设从三个 袋子 {1,2,3}中 取出 4 个球 O={red,white,red,white},模型参数(lambda = (A,B,pi)) 如下,计算序列O出现的概率

    #状态 1 2 3
    A = [[0.5,0.2,0.3],
    	 [0.3,0.5,0.2],
    	 [0.2,0.3,0.5]]
    
    pi = [0.2,0.4,0.4]
    
    # red white
    B = [[0.5,0.5],
    	 [0.4,0.6],
    	 [0.7,0.3]]
    

    ​ step1:初始化 (alpha_i(1)= pi_i*b_i(O_1))

    ​ step2:计算 (alpha_i(t) = (sum^{N}_{j=1} alpha_j(t-1)a_{ji})b_i(O_{t}))

    ​ step3:(P(O|lambda) = sum^N_{i=1}alpha_i( T))

    #前向算法
    def hmm_forward(A,B,pi,O):
        T = len(O)
        N = len(A[0])
        #step1 初始化
        alpha = [[0]*T for _ in range(N)]
        for i in range(N):
            alpha[i][0] = pi[i]*B[i][O[0]]
    
        #step2 计算alpha(t)
        for t in range(1,T):
            for i in range(N):
                temp = 0
                for j in range(N):
                    temp += alpha[j][t-1]*A[j][i]
                alpha[i][t] = temp*B[i][O[t]]
                
        #step3
        proba = 0
        for i in range(N):
            proba += alpha[i][-1]
        return proba,alpha
    
    A = [[0.5,0.2,0.3],[0.3,0.5,0.2],[0.2,0.3,0.5]]
    B = [[0.5,0.5],[0.4,0.6],[0.7,0.3]]
    pi = [0.2,0.4,0.4]
    O = [0,1,0,1]
    hmm_forward(A,B,pi,O)  #结果为 0.06009
    

    结果

    image-20200512195503450

    后向算法

    在时刻 (t),状态为 (i) 时,观测到 (O_{t+1},O_{t+2}, ..., O_T) 的概率,记为 (eta _i(t))

    [eta_{i}(t)=Pleft(O_{t+1},O_{t+2}, ..., O_T | s_{t}=i, lambda ight) ]

    (t=T) 时,由于 (T) 时刻之后为空,没有观测,所以 (eta_i(t)=1)

    (t = T-1) 时,观测 (O_T)(O_T) 可能由任意一个状态产生

    [eta_i(T-1) = P(O_T|s_{t}=i,lambda) = a_{i1}b_1(O_T)eta_1(T)+a_{i2}b_2(O_T)eta_2(T)+a_{i3}b_3(O_T)eta_3(T) ]

    image-20200511214910979

    (t=1) 时,观测为 (O_{2},O_{3}, ..., O_T)

    [egin{aligned} eta_1(1) &= P(O_{2},O_{3}, ..., O_T|s_1=1,lambda)\ &=a_{11}b_1(O_2)eta_1(2)+a_{12}b_2(O_2)eta_2(2)+a_{13}b_3(O_2)eta_3(2) \ quad \ eta_2(1) &= P(O_{2},O_{3}, ..., O_T|s_1=2,lambda)\ &=a_{21}b_1(O_2)eta_1(2)+a_{22}b_2(O_2)eta_2(2)+a_{23}b_3(O_2)eta_3(2) \ quad \ eta_3(1) &=P(O_{2},O_{3}, ..., O_T|s_1=3,lambda)\ &=a_{31}b_1(O_2)eta_1(2)+a_{32}b_2(O_2)eta_2(2)+a_{33}b_3(O_2)eta_3(2) end{aligned} ]

    所以

    [P(O_{2},O_{3}, ..., O_T|lambda) = eta_1(1)+eta_2(1)+eta_3(1) ]

    后向算法过程如下:

    ​ step1:初始化 (eta_i(T)=1)

    ​ step2:计算 (eta_i(t) = sum^N_{j=1}a_{ij}b_j(O_{t+1})eta_j(t+1))

    ​ step3:(P(O|lambda) = sum^N_{i=1}pi_ib_i(O_1)eta_i(1))

    • 时间复杂度 (O(N^2T))

    代码实现

    还是上面的例子

    #后向算法
    def hmm_backward(A,B,pi,O):
        T = len(O)
        N = len(A[0])
        #step1 初始化
        beta = [[0]*T for _ in range(N)]
        for i in range(N):
            beta[i][-1] = 1
            
        #step2 计算beta(t)
        for t in reversed(range(T-1)):
            for i in range(N):
                for j in range(N):
                    beta[i][t]  += A[i][j]*B[j][O[t+1]]*beta[j][t+1]
                
        #step3
        proba = 0
        for i in range(N):
            proba += pi[i]*B[i][O[0]]*beta[i][0]
        return proba,beta
    
    A = [[0.5,0.2,0.3],[0.3,0.5,0.2],[0.2,0.3,0.5]]
    B = [[0.5,0.5],[0.4,0.6],[0.7,0.3]]
    pi = [0.2,0.4,0.4]
    O = [0,1,0,1]
    hmm_backward(A,B,pi,O)  #结果为 0.06009
    

    结果

    image-20200512195526215

    前向-后向算法

    image-20200511201506794

    回顾前向、后向变量:

    • (a_i(t)) 时刻 (t),状态为 (i) ,观测序列为 (O_1,O_2, ..., O_t) 的概率
    • (eta_i(t)) 时刻 (t),状态为 (i) ,观测序列为 (O_{t+1},O_{t+2}, ..., O_T) 的概率

    [egin{aligned} P(O,s_t=i|lambda) &= P(O_1,O_2, ..., O_T,s_t=i|lambda)\ &= P(O_1,O_2, ..., O_t,s_t=i,O_{t+1},O_{t+2}, ..., O_T|lambda)\ &= P(O_1,O_2, ..., O_t,s_t=i|lambda)*P(O_{t+1},O_{t+2}, ..., O_T|O_1,O_2, ..., O_t,s_t=i,lambda) \ &= P(O_1,O_2, ..., O_t,s_t=i|lambda)*P(O_{t+1},O_{t+2}, ..., O_T,s_t=i|lambda)\ &= a_i(t)*eta_i(t) end{aligned} ]

    即在给定的状态序列中,(t) 时刻状态为 (i) 的概率。

    使用前后向算法可以计算隐状态,记 (gamma_i(t) = P(s_t=i|O,lambda)) 表示时刻 (t) 位于隐状态 (i) 的概率

    [Pleft(s_{t}=i, O | lambda ight)=alpha_{i}(t) eta_{i}(t) ]

    [egin{aligned} gamma_{i}(t) &=Pleft(s_{t}={i} | O, lambda ight)=frac{Pleft(s_{t}={i}, O | lambda ight)}{P(O | lambda)} \ &=frac{alpha_{i}(t) eta_{i}(t)}{P(O | lambda)}=frac{alpha_{i}(t) eta_{i}(t)}{sum_{i=1}^{N} alpha_{i}(t) eta_{i}(t)} end{aligned} ]

    references:

    [1] https://www.cs.sjsu.edu/~stamp/RUA/HMM.pdf

    [2]https://www.cnblogs.com/fulcra/p/11065474.html

    [3] https://www.cnblogs.com/sjjsxl/p/6285629.html

    [4] https://blog.csdn.net/xueyingxue001/article/details/52396494

  • 相关阅读:
    使用VisualVM检测
    《自控力》读书笔记
    【转】Oracle索引的类型
    【转】Oracle索引HINT的使用
    【转】Oracle 执行计划(Explain Plan) 说明
    Oracle分区
    【转】Oracle索引失效问题
    【转】《从入门到精通云服务器》第七讲—负载均衡和CDN技术
    【转】《从入门到精通云服务器》第七讲—IAAS、PAAS、SAAS
    【转】《从入门到精通云服务器》第六讲—OpenStack基础
  • 原文地址:https://www.cnblogs.com/gongyanzh/p/12880387.html
Copyright © 2011-2022 走看看