  • Topic modeling [Classic models]

    http://www.cs.princeton.edu/~blei/topicmodeling.html

    Topic models are a suite of algorithms that uncover the hidden thematic structure in document collections. These algorithms help us develop new ways to search, browse and summarize large archives of texts.
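
    As a rough illustration of what "uncovering hidden thematic structure" involves, the sketch below fits latent Dirichlet allocation (LDA) to a toy corpus with a collapsed Gibbs sampler. It is not the code of any package on this page: the function name `fit_lda_gibbs`, the hyperparameter defaults, and the toy documents are all assumptions made for the example, and it favors readability over speed.

```python
import random


def fit_lda_gibbs(docs, num_topics, alpha=0.1, beta=0.01, iters=200, seed=0):
    """Collapsed Gibbs sampling for LDA (illustrative sketch, hypothetical API).

    docs is a list of tokenized documents (lists of strings).
    Returns (vocab, theta, phi): theta[d][k] is the estimated share of topic k
    in document d, phi[k][v] the estimated probability of word v in topic k.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    V = len(vocab)

    doc_topic = [[0] * num_topics for _ in docs]           # tokens in doc d assigned to topic k
    topic_word = [[0] * V for _ in range(num_topics)]      # times word v is assigned to topic k
    topic_total = [0] * num_topics                         # tokens assigned to topic k overall
    assignments = []

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_d.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(z_d)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = word_id[w]
                k_old = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k_old] -= 1
                topic_word[k_old][v] -= 1
                topic_total[k_old] -= 1
                # Unnormalized conditional probability of each topic for this token.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][v] + beta)
                    / (topic_total[k] + V * beta)
                    for k in range(num_topics)
                ]
                k_new = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k_new
                doc_topic[d][k_new] += 1
                topic_word[k_new][v] += 1
                topic_total[k_new] += 1

    # Smoothed point estimates from the final counts.
    theta = [
        [(doc_topic[d][k] + alpha) / (len(doc) + num_topics * alpha)
         for k in range(num_topics)]
        for d, doc in enumerate(docs)
    ]
    phi = [
        [(topic_word[k][v] + beta) / (topic_total[k] + V * beta) for v in range(V)]
        for k in range(num_topics)
    ]
    return vocab, theta, phi


if __name__ == "__main__":
    # Toy corpus, invented for this example.
    toy_docs = [
        "gene dna cell gene protein".split(),
        "cell protein dna gene".split(),
        "court law judge law trial".split(),
        "judge trial court law".split(),
    ]
    vocab, theta, phi = fit_lda_gibbs(toy_docs, num_topics=2)
    for k, row in enumerate(phi):
        top = sorted(range(len(vocab)), key=lambda v: -row[v])[:3]
        print("topic", k, [vocab[v] for v in top])
```

    The per-topic word lists (`phi`) are what a corpus browser would display as topics, and the per-document proportions (`theta`) are what it would use to link documents to them. Real collections need far more efficient inference than this loop.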

    Below, you will find links to introductory materials, corpus browsers based on topic models, and open source software (from my research group) for topic modeling.

    Introductory materials

    Corpus browsers based on topic models

    The structure uncovered by topic models can be used to explore an otherwise unorganized collection. The following are browsers of large collections of documents, built with topic models.

    Also see Sean Gerrish's discipline browser for an interesting application of topic modeling at JSTOR.

    To build your own browsers, see Allison Chaney's excellent Topic Model Visualization Engine (TMVE). For example, here is a browser of 100,000 Wikipedia articles that uses TMVE.

    Topic modeling software

    Our research group has released many open-source software packages for topic modeling. Please post questions, comments, and suggestions about this code to the topic models mailing list. 

    Link | Model/Algorithm | Language | Author | Notes
    lda-c | Latent Dirichlet allocation | C | D. Blei | This implements variational inference for LDA.
    class-slda | Supervised topic models for classification | C++ | C. Wang | Implements supervised topic models with a categorical response.
    lda | R package for Gibbs sampling in many models | R | J. Chang | Implements many models and is fast. Supports LDA, RTMs (for networked documents), MMSB (for network data), and sLDA (with a continuous response).
    online lda | Online inference for LDA | Python | M. Hoffman | Fits topic models to massive data. The demo downloads random Wikipedia articles and fits a topic model to them.
    online hdp | Online inference for the HDP | Python | C. Wang | Fits hierarchical Dirichlet process topic models to massive data. The algorithm determines the number of topics.
    tmve (online) | Topic Model Visualization Engine | Python | A. Chaney | A package for creating corpus browsers. See, for example, Wikipedia.
    ctr | Collaborative modeling for recommendation | C++ | C. Wang | Implements variational inference for collaborative topic models. These models recommend items to users based on item content and other users' ratings.
    dtm | Dynamic topic models and the influence model | C++ | S. Gerrish | This implements topics that change over time and a model of how individual documents predict that change.
    hdp | Hierarchical Dirichlet processes | C++ | C. Wang | Topic models where the data determine the number of topics. This implements Gibbs sampling.
    ctm-c | Correlated topic models | C | D. Blei | This implements variational inference for the CTM.
    diln | Discrete infinite logistic normal | C | J. Paisley | This implements the discrete infinite logistic normal, a Bayesian nonparametric topic model that finds correlated topics.
    hlda | Hierarchical latent Dirichlet allocation | C | D. Blei | This implements a topic model that finds a hierarchy of topics. The structure of the hierarchy is determined by the data.
    turbotopics | Turbo topics | Python | D. Blei | Turbo topics find significant multiword phrases in topics.
  • Original article: https://www.cnblogs.com/heidsoft/p/3874381.html