  • A brief introduction to using hmmlearn

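    The hidden Markov model (HMM) was first described in the second half of the 1960s by Leonard E. Baum and other authors in a series of statistics papers. It was initially applied to speech recognition.

    In the second half of the 1980s, HMMs began to be applied to biological sequences, in particular to the analysis of DNA sequences. Since then, the HMM has become an indispensable technique in bioinformatics.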

    This article draws on:
    [1] 用hmmlearn学习隐马尔科夫模型HMM (Learning hidden Markov models with hmmlearn)
    [2] The official documentation

    1. hmmlearn

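    hmmlearn was once part of the scikit-learn project and is now a standalone Python package that can be installed directly with pip. It implements unsupervised hidden Markov models. Its official documentation is at https://hmmlearn.readthedocs.io/en/stable/. The supervised counterpart is seqlearn.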

    pip3 install hmmlearn
    

    hmmlearn provides three models:

    Name                 Description                                                  Observations
    hmm.GaussianHMM      Hidden Markov Model with Gaussian emissions                  continuous
    hmm.GMMHMM           Hidden Markov Model with Gaussian mixture emissions          continuous
    hmm.MultinomialHMM   Hidden Markov Model with multinomial (discrete) emissions    discrete

    2. MultinomialHMM

    The class signature is:

    class hmmlearn.hmm.MultinomialHMM(n_components=1, startprob_prior=1.0, transmat_prior=1.0,
    algorithm='viterbi', random_state=None, n_iter=10, tol=0.01, verbose=False,  params='ste', init_params='ste')
    

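    The more commonly used parameters (or those that will be updated) are:

    • n_components: (int) number of hidden states.
    • n_iter: (int, optional) maximum number of iterations during training.
    • tol: (float, optional) convergence threshold. EM stops when the gain in log-likelihood falls below this value.
    • verbose: (bool, optional) when True, a convergence report is printed to standard error (sys.stderr) for each iteration, including that iteration's log probability (score).
    • init_params: (string, optional) controls which parameters are initialized before training: 's' for startprob, 't' for transmat, 'e' for emissionprob. An empty string "" means training starts entirely from the parameters supplied by the user.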

    2.1 Definition and usage

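    import numpy as np
    from hmmlearn import hmm

    # Hidden states: which box the ball was drawn from
    states = ["box 1", "box 2", "box3"]
    n_states = len(states)

    # Observable symbols: the color of the ball
    observations = ["red", "white"]
    n_observations = len(observations)

    # Initial state distribution, shape (n_states,)
    start_probability = np.array([0.2, 0.4, 0.4])

    # transition_probability[i, j] = P(next state j | current state i)
    transition_probability = np.array([
      [0.5, 0.2, 0.3],
      [0.3, 0.5, 0.2],
      [0.2, 0.3, 0.5]
    ])

    # emission_probability[i, k] = P(observation k | state i)
    emission_probability = np.array([
      [0.5, 0.5],
      [0.4, 0.6],
      [0.7, 0.3]
    ])

    # Note: recent hmmlearn releases name this discrete-symbol model
    # hmm.CategoricalHMM; use that class there instead.
    model = hmm.MultinomialHMM(n_components=n_states, n_iter=20, tol=0.001)
    model.startprob_ = start_probability
    model.transmat_ = transition_probability
    model.emissionprob_ = emission_probability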
    

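    2.2 Predicting the states with the Viterbi algorithm

    decode returns the log probability of the most likely state sequence together with the sequence itself. The documentation calls the first value “the log probability”; it is a natural logarithm, ln(prob).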

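    seen = np.array([[0, 1, 0]]).T  # column vector: one observation per row
    logprob, box = model.decode(seen, algorithm="viterbi")
    # Flatten seen so that each element is a plain integer index into the list
    print("The ball picked:", ", ".join(observations[i] for i in seen.ravel()))
    print("The hidden box", ", ".join(states[i] for i in box))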
    

    Output:

    The ball picked: red, white, red
    The hidden box box3, box3, box3
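
    To check both the decoded path and the claim that the returned value is a natural logarithm, the Viterbi recursion can be written out by hand. This is a minimal NumPy sketch of the textbook algorithm in log space, not hmmlearn's own implementation; the function name viterbi is just an illustrative choice.

```python
import numpy as np

def viterbi(startprob, transmat, emissionprob, obs):
    """Return (log probability, most likely state path) for a symbol sequence."""
    log_pi = np.log(startprob)
    log_A = np.log(transmat)
    log_B = np.log(emissionprob)
    T, N = len(obs), len(startprob)
    delta = np.zeros((T, N))           # best log probability ending in each state
    psi = np.zeros((T, N), dtype=int)  # best predecessor of each state
    delta[0] = log_pi + log_B[:, obs[0]]
    for t in range(1, T):
        scores = delta[t - 1][:, None] + log_A  # scores[i, j]: come from i, go to j
        psi[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) + log_B[:, obs[t]]
    # Backtrack from the best final state
    path = np.zeros(T, dtype=int)
    path[-1] = delta[-1].argmax()
    for t in range(T - 2, -1, -1):
        path[t] = psi[t + 1, path[t + 1]]
    return delta[-1].max(), path

startprob = np.array([0.2, 0.4, 0.4])
transmat = np.array([[0.5, 0.2, 0.3],
                     [0.3, 0.5, 0.2],
                     [0.2, 0.3, 0.5]])
emissionprob = np.array([[0.5, 0.5],
                         [0.4, 0.6],
                         [0.7, 0.3]])

logprob, path = viterbi(startprob, transmat, emissionprob, [0, 1, 0])
print(path.tolist())    # [2, 2, 2], i.e. box3, box3, box3
print(np.exp(logprob))  # about 0.0147, so logprob = ln(0.0147)
```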
    

    2.3 Computing the probability of the observations

    print(model.score(seen))
    

    Output:

    -2.03854530992
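
    The score can be reproduced with the forward algorithm, which sums over all hidden paths instead of keeping only the best one. The following is a minimal NumPy sketch for the box example; the function name forward_probability is an illustrative choice, and no scaling is applied, which is fine for a sequence this short but would underflow on long ones.

```python
import numpy as np

def forward_probability(startprob, transmat, emissionprob, obs):
    """P(observation sequence) under the model, via the forward recursion."""
    # alpha[i] = P(observations so far, current state = i)
    alpha = startprob * emissionprob[:, obs[0]]
    for symbol in obs[1:]:
        # Propagate one step through the transitions, then weight by the emission
        alpha = (alpha @ transmat) * emissionprob[:, symbol]
    return alpha.sum()

startprob = np.array([0.2, 0.4, 0.4])
transmat = np.array([[0.5, 0.2, 0.3],
                     [0.3, 0.5, 0.2],
                     [0.2, 0.3, 0.5]])
emissionprob = np.array([[0.5, 0.5],
                         [0.4, 0.6],
                         [0.7, 0.3]])

prob = forward_probability(startprob, transmat, emissionprob, [0, 1, 0])
print(prob)          # about 0.130218
print(np.log(prob))  # about -2.038545, the value reported by score
```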
    

    3. Training and data preparation

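    import numpy as np
    from hmmlearn import hmm

    states = ["box 1", "box 2", "box3"]
    n_states = len(states)

    observations = ["red", "white"]
    n_observations = len(observations)
    model = hmm.MultinomialHMM(n_components=n_states, n_iter=20, tol=0.01)

    # Three independent observation sequences, one symbol index per row
    D1 = [[1], [0], [0], [0], [1], [1], [1]]
    D2 = [[1], [0], [0], [0], [1], [1], [1], [0], [1], [1]]
    D3 = [[1], [0], [0]]

    # Stack the sequences into one array and record the length of each,
    # so that they are not treated as a single long sequence
    X = np.concatenate([D1, D2, D3])
    lengths = [len(D1), len(D2), len(D3)]

    model.fit(X, lengths)
    print(model.startprob_)
    print(model.transmat_)
    print(model.emissionprob_)
    print(model.score(X, lengths))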
    
  • Original article: https://www.cnblogs.com/esctrionsit/p/13415013.html