  • An Introduction to Conditional Random Fields

    1.Structured prediction methods are essentially a combination of classification and graphical modeling.

    2.They combine the ability of graphical models to compactly model multivariate data with the ability of classification methods to perform prediction using large sets of input features.

    3.The input x is divided into feature vectors {x0,x1, . . . ,xT }. Each xs contains various information about the word at position s, such as its identity, orthographic features such as prefixes and suffixes, membership in domain-specific lexicons, and information in semantic databases such as WordNet.
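The per-position feature vectors described above can be sketched in code. This is an illustrative example only: the feature names and the choice of orthographic features are assumptions, not taken from the text.

```python
# Minimal sketch of per-position feature extraction for sequence
# labeling. Each x_s collects information about the word at position s:
# its identity plus orthographic features such as prefixes and suffixes.

def word_features(words, s):
    """Build a sparse feature vector x_s for the word at position s."""
    w = words[s]
    return {
        "identity=" + w.lower(): 1.0,        # word identity
        "prefix3=" + w[:3].lower(): 1.0,     # orthographic: prefix
        "suffix3=" + w[-3:].lower(): 1.0,    # orthographic: suffix
        "is_capitalized": float(w[0].isupper()),
        "is_digit": float(w.isdigit()),
    }

sentence = ["London", "is", "rainy"]
x0 = word_features(sentence, 0)
```

In a full system, membership tests against domain-specific lexicons or WordNet lookups would add further entries to the same dictionary.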

    4.CRFs are essentially a way of combining the advantages of discriminative classification and graphical modeling, combining the ability to compactly model multivariate outputs y with the ability to leverage a large number of input features x for prediction.

    5.The difference between generative models and CRFs is thus exactly analogous to the difference between the naive Bayes and logistic regression classifiers. Indeed, the multinomial logistic regression model can be seen as the simplest kind of CRF, in which there is only one output variable.
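The "logistic regression as the simplest CRF" view can be made concrete: with a single output variable y, the model is just p(y|x) ∝ exp(Σ_k θ_{y,k} x_k). A minimal sketch, with illustrative labels and weights that are not from the text:

```python
import math

# Multinomial logistic regression viewed as a CRF with one output
# variable: p(y|x) is a normalized exponential of feature scores.

def predict_proba(theta, x):
    # Linear score for each label y: sum_k theta[y][k] * x[k]
    scores = {y: sum(w * x.get(k, 0.0) for k, w in feats.items())
              for y, feats in theta.items()}
    z = sum(math.exp(s) for s in scores.values())  # partition function
    return {y: math.exp(s) / z for y, s in scores.items()}

# Hypothetical weights over two labels and two features.
theta = {"NOUN": {"is_capitalized": 2.0},
         "VERB": {"suffix3=ing": 1.5}}
p = predict_proba(theta, {"is_capitalized": 1.0})
```

A linear-chain CRF generalizes this by tying such scores together across positions and adding transition features between adjacent outputs.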

    6.The insight of the graphical modeling perspective is that a distribution over very many variables can often be represented as a product of local functions that each depend on a much smaller subset of variables. This factorization turns out to have a close connection to certain conditional independence relationships among the variables — both types of information being easily summarized by a graph. Indeed, this relationship between factorization, conditional independence, and graph structure comprises much of the power of the graphical modeling framework: the conditional independence viewpoint is most useful for designing models, and the factorization viewpoint is most useful for designing inference algorithms.
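The factorization idea can be illustrated directly: a distribution over three binary variables written as a product of local factors, each depending only on a pair of variables. The factor tables below are arbitrary illustrative values.

```python
from itertools import product

# Local factors: psi_ab depends only on (a, b), psi_bc only on (b, c).
psi_ab = {(0, 0): 1.0, (0, 1): 0.5, (1, 0): 0.5, (1, 1): 2.0}
psi_bc = {(0, 0): 2.0, (0, 1): 1.0, (1, 0): 1.0, (1, 1): 3.0}

def unnormalized(a, b, c):
    # p(a, b, c) is proportional to a product of local functions.
    return psi_ab[(a, b)] * psi_bc[(b, c)]

# Global normalization over all 2^3 assignments.
Z = sum(unnormalized(a, b, c) for a, b, c in product([0, 1], repeat=3))

def p(a, b, c):
    return unnormalized(a, b, c) / Z
```

Because a and c interact only through b, this factorization encodes the conditional independence a ⊥ c | b, which is exactly the factorization/independence correspondence the graphical-model view exploits.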

    7.The principal advantage of discriminative modeling is that it is better suited to including rich, overlapping features.

    8.In principle, it may not be clear why these approaches should be so different, because we can always convert between the two methods using Bayes rule. For example, in the naive Bayes model, it is easy to convert the joint p(y)p(x|y) into a conditional distribution p(y|x). Indeed, this conditional has the same form as the logistic regression model (2.9). And if we managed to obtain a “true” generative model for the data, that is, a distribution p∗(y,x) = p∗(y)p∗(x|y) from which the data were actually sampled, then we could simply compute the true p∗(y|x), which is exactly the target of the discriminative approach. But it is precisely because we never have the true distribution that the two approaches are different in practice. Estimating p(y)p(x|y) first, and then computing the resulting p(y|x) (the generative approach) yields a different estimate than estimating p(y|x) directly. In other words, generative and discriminative models both have the aim of estimating p(y|x), but they get there in different ways.
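The Bayes-rule conversion mentioned above is a one-line computation. A minimal sketch with made-up probabilities for a single binary feature (the numbers and the spam/ham labels are illustrative, not from the text):

```python
# Converting a naive-Bayes-style joint p(y)p(x|y) into the conditional
# p(y|x) via Bayes' rule, for one binary feature x and two classes.

p_y = {"spam": 0.4, "ham": 0.6}            # class prior p(y)
p_x1_given_y = {"spam": 0.8, "ham": 0.1}   # p(x = 1 | y)

def posterior(x):
    # Joint p(y, x) = p(y) * p(x|y), then renormalize over y.
    joint = {y: p_y[y] * (p_x1_given_y[y] if x else 1 - p_x1_given_y[y])
             for y in p_y}
    z = sum(joint.values())
    return {y: v / z for y, v in joint.items()}

post = posterior(x=1)
# p(spam | x=1) = 0.4 * 0.8 / (0.4 * 0.8 + 0.6 * 0.1)
```

The point of the passage is that this conversion is exact only for the true distribution; with estimated probabilities, fitting p(y)p(x|y) and converting generally gives a different p(y|x) than fitting the conditional directly.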

  • Original article: https://www.cnblogs.com/kevinGaoblog/p/3880687.html