  • Evaluation and Perplexity

    Every natural language processing tool has to be evaluated, and language models are no exception.

    There are two ways to evaluate a language model.

    One is extrinsic evaluation:

    The best way of comparing two language models, A and B, is to put each model in a task, measure the accuracy of each, and compare the two accuracies. But this is time-consuming in many cases.

    The other is intrinsic evaluation; the most common intrinsic evaluation metric is called perplexity.

    Perplexity happens to be a bad approximation to an extrinsic evaluation unless the test data looks a lot like the training data. So perplexity is generally useful only in pilot experiments, but it does help us think about the problem, and it is a useful tool as long as we also use extrinsic evaluation.

    Perplexity is the inverse probability of the test set, normalized by the number of words: PP(W) = P(w1w2w3...wN)^(-1/N)

    We want a normalizing factor so we can compare test sets of different lengths. Minimizing perplexity is the same as maximizing probability. Perplexity is also related to the average branching factor.

    For example, if there are ten possible words that can come next, all with equal probability, the perplexity will be ten. Suppose a sentence consists of N random digits:

    PP(W) = P(w1w2...wN)^(-1/N) = ((1/10 * 1/10 * ... * 1/10))^(-1/N) = ((1/10)^N)^(-1/N) = 10
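    The definition above can be sketched as a small Python function. This is a minimal illustration, not any particular library's API: it takes the model's per-word probabilities as a list and works in log space, which is how perplexity is usually computed in practice to avoid numerical underflow on long texts.

    ```python
    import math

    def perplexity(word_probs):
        """Perplexity from per-word model probabilities:
        PP = (p1 * p2 * ... * pN)^(-1/N).
        Computed in log space to avoid underflow on long word sequences."""
        n = len(word_probs)
        log_prob = sum(math.log(p) for p in word_probs)
        return math.exp(-log_prob / n)

    # Digit-sentence example: each of N digits has probability 1/10,
    # so the perplexity is 10 regardless of the sentence length N.
    probs = [0.1] * 12
    print(perplexity(probs))  # ≈ 10.0
    ```

    Note that the length N cancels out in the digit example, matching the branching-factor intuition: the model is, on average, choosing among 10 equally likely options at each step.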

    Conclusion: Low perplexity = better model

  • Original post: https://www.cnblogs.com/chuanlong/p/3035623.html