  • :2014-11-07:The basic structure of lda-c

    The basic structure of lda-c


    corpus

    • docs[]
    • num_terms :The vocabulary size, i.e. the range of word ids (largest word id + 1), not a count of word tokens
    • num_docs :The number of documents in the corpus

    doc

    • words[] :(type:int) The integer id of each distinct word in the document
    • counts[] :(type:int) counts[i] is the number of times words[i] occurs in the document
    • length :The number of distinct words in the document, i.e. the number of entries in words[] and counts[]
    • total :The total number of word tokens in the document, i.e. the sum of counts[]
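
    The corpus and doc fields above can be written as two C structs. The following is a minimal sketch reconstructed from the field lists, not a copy of the lda-c headers; the helper names doc_total and corpus_num_terms are hypothetical and only illustrate how total and num_terms relate to the arrays, assuming word ids start at 0.

```c
#include <assert.h>
#include <stddef.h>

/* One document in sparse bag-of-words form. */
typedef struct {
    int *words;   /* ids of the distinct words in the document */
    int *counts;  /* counts[i] = occurrences of words[i] */
    int length;   /* number of distinct words (entries in words/counts) */
    int total;    /* sum of counts = number of word tokens */
} document;

typedef struct {
    document *docs;
    int num_terms;  /* vocabulary size */
    int num_docs;   /* number of documents */
} corpus;

/* Hypothetical helper: recompute total as the sum of counts[]. */
int doc_total(const document *d) {
    assert(d != NULL && d->length >= 0);
    int total = 0;
    for (int i = 0; i < d->length; i++)
        total += d->counts[i];
    return total;
}

/* Hypothetical helper: vocabulary size as largest word id + 1
   (assumes ids start at 0). */
int corpus_num_terms(const corpus *c) {
    assert(c != NULL);
    int max_id = -1;
    for (int d = 0; d < c->num_docs; d++)
        for (int i = 0; i < c->docs[d].length; i++)
            if (c->docs[d].words[i] > max_id)
                max_id = c->docs[d].words[i];
    return max_id + 1;
}
```

    For a document with words = {0, 3, 7} and counts = {2, 1, 4}, length is 3 and total is 7.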

    lda-model

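    • alpha :The Dirichlet hyperparameter on the per-document topic proportions (a single scalar shared by all topics)
    • log_prob_w[NTOPICS][num_terms] :The log probability of each word under each topic, log(ss->class_word[k][w]/ss->class_total[k]) (topic ~ words)
    • num_topics :(NTOPICS) The number of topics to be trained
    • num_terms :The vocabulary size (same as corpus num_terms)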

    ss - sufficient statistics

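    • class_word[NTOPICS][num_terms] :The expected count of each word under each topic; with random initialization each entry starts at 1.0/num_terms plus a random value
    • class_total[NTOPICS] :The sum of class_word[k][w] over all words w for topic k
    • alpha_suffstats :The statistic accumulated over documents that is used to re-estimate alpha
    • num_docs :The number of documents accumulated into the statistics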
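
    The relationship between class_word and class_total can be sketched as below. This is an illustration under assumptions, not the lda-c source: the type name suffstats and all three function names are hypothetical, and rand() is used as a stand-in random source.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct {
    double **class_word;    /* [num_topics][num_terms] */
    double *class_total;    /* [num_topics] */
    double alpha_suffstats;
    int num_docs;
} suffstats;

/* Release everything new_suffstats allocated; safe on partial allocations. */
void free_suffstats(suffstats *ss, int num_topics) {
    if (!ss) return;
    if (ss->class_word)
        for (int k = 0; k < num_topics; k++)
            free(ss->class_word[k]);
    free(ss->class_word);
    free(ss->class_total);
    free(ss);
}

/* Allocate zeroed statistics; returns NULL on allocation failure. */
suffstats *new_suffstats(int num_topics, int num_terms) {
    assert(num_topics > 0 && num_terms > 0);
    suffstats *ss = calloc(1, sizeof *ss);
    if (!ss) return NULL;
    ss->class_word = calloc((size_t)num_topics, sizeof *ss->class_word);
    ss->class_total = calloc((size_t)num_topics, sizeof *ss->class_total);
    if (!ss->class_word || !ss->class_total) {
        free_suffstats(ss, 0);
        return NULL;
    }
    for (int k = 0; k < num_topics; k++) {
        ss->class_word[k] = calloc((size_t)num_terms, sizeof **ss->class_word);
        if (!ss->class_word[k]) {
            free_suffstats(ss, num_topics);
            return NULL;
        }
    }
    return ss;
}

/* Add 1.0/num_terms plus a random value in [0, 1) to every entry and
   keep class_total[k] equal to the sum of row k. */
void random_init_suffstats(suffstats *ss, int num_topics, int num_terms) {
    for (int k = 0; k < num_topics; k++) {
        for (int w = 0; w < num_terms; w++) {
            double r = rand() / (RAND_MAX + 1.0);  /* stand-in random source */
            ss->class_word[k][w] += 1.0 / num_terms + r;
            ss->class_total[k] += ss->class_word[k][w];
        }
    }
}
```

    Starting every entry above zero means no word has zero probability under any topic when the first model is derived from these statistics.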

    var_gamma[num_docs][NTOPICS]

    The variational Dirichlet parameters, one row per document (doc ~ topics)

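    phi[max_corpus_length][NTOPICS]

    The variational topic probabilities for each word of the document being processed, sized for the longest document in the corpus (word ~ topics)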
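
    Both arrays are read row by row. As a minimal sketch, a row of var_gamma can be normalized into expected topic proportions (the mean of a Dirichlet distribution is gamma_k divided by the sum of the gammas), and a row of phi can be reduced to its most probable topic. The function names below are hypothetical.

```c
#include <assert.h>

/* Expected topic proportions for one document:
   theta[k] = gamma[k] / sum_j gamma[j] (mean of a Dirichlet). */
void expected_topic_proportions(const double *gamma, int num_topics,
                                double *theta) {
    assert(num_topics > 0);
    double sum = 0.0;
    for (int k = 0; k < num_topics; k++)
        sum += gamma[k];
    assert(sum > 0.0);
    for (int k = 0; k < num_topics; k++)
        theta[k] = gamma[k] / sum;
}

/* Index of the largest entry in one phi row, i.e. the topic most
   strongly associated with that word. */
int most_likely_topic(const double *phi_row, int num_topics) {
    assert(num_topics > 0);
    int best = 0;
    for (int k = 1; k < num_topics; k++)
        if (phi_row[k] > phi_row[best])
            best = k;
    return best;
}
```

    For example, a var_gamma row of {1.0, 2.0, 5.0} gives proportions {0.125, 0.25, 0.625}.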

  • Original post: https://www.cnblogs.com/cyno/p/4082478.html