Evaluation and Perplexity

Every natural language processing tool has to be evaluated, and language models are no exception.

There are two ways to evaluate a language model.

One is extrinsic evaluation:

The best way to compare two language models, A and B, is to put each model into a real task, measure the accuracy each one achieves, and compare the two accuracies. But this is time-consuming in many cases.
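A minimal sketch of such a comparison, assuming a hypothetical next-word prediction task. Here `model_a` and `model_b` are toy stand-ins rather than real language models, and `test_cases` is made-up illustrative data; each model is scored by the fraction of examples it gets right.

```python
def accuracy(model, test_cases):
    """Fraction of test cases where the model's output matches the expected answer."""
    correct = sum(1 for context, gold in test_cases if model(context) == gold)
    return correct / len(test_cases)

# Toy stand-ins for two language models plugged into the task (hypothetical).
def model_a(context):
    # Looks at the last context word; falls back to "the".
    return {"good": "morning", "thank": "you"}.get(context[-1], "the")

def model_b(context):
    # Always predicts the same word, ignoring the context.
    return "the"

# Made-up task data: (context words, expected next word).
test_cases = [
    (["good"], "morning"),
    (["thank"], "you"),
    (["in"], "the"),
    (["see"], "you"),
]

acc_a = accuracy(model_a, test_cases)
acc_b = accuracy(model_b, test_cases)
print(f"A: {acc_a:.2f}  B: {acc_b:.2f}")
```

In a real setting the task would be something like speech recognition or machine translation, and running it end to end for every model variant is what makes this approach slow.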

The other is intrinsic evaluation. The most common intrinsic measure is called perplexity.

Perplexity is a bad approximation of extrinsic evaluation unless the test data looks a lot like the training data. So in general perplexity is useful only in pilot experiments. It does help in thinking about the problem, though, and it is a useful tool as long as we also use extrinsic evaluation.

Perplexity is the inverse probability of the test set, normalized by the number of words: PP(W) = P(w1 w2 w3 ... wN)^(-1/N)
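As a minimal sketch of this formula, the function below computes perplexity from the probabilities a model assigns to each word of a test sequence. It assumes the joint probability P(w1 w2 ... wN) has already been factored by the chain rule into one conditional probability per word; the function name `perplexity` and its input format are illustrative assumptions. The computation is done in log space so that long sequences do not underflow to zero.

```python
import math

def perplexity(word_probs):
    """Perplexity of a test sequence.

    word_probs: list with one entry per word, the model's probability of
    that word given its history (assumed input format).
    Computes P(w1...wN)^(-1/N) as exp(-(1/N) * sum(log p_i)).
    """
    if not word_probs:
        raise ValueError("need at least one word probability")
    if any(p <= 0.0 or p > 1.0 for p in word_probs):
        raise ValueError("probabilities must be in (0, 1]")
    n = len(word_probs)
    log_prob = sum(math.log(p) for p in word_probs)
    return math.exp(-log_prob / n)

print(perplexity([0.5, 0.5, 0.5]))
```

A model that gives every word probability 1/2 has perplexity of about 2, whatever the length of the sequence.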

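We want a normalizing factor so that we can compare test sets of different lengths. Because of the negative exponent, minimizing perplexity is the same as maximizing probability. Perplexity is also related to the average branching factor, that is, the average number of words the model considers possible at each position.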
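The effect of normalizing by length can be sketched with a small experiment. The per-word probabilities below (0.2 and 0.4) are made-up values chosen only for illustration, and the helper names are hypothetical. The raw probability of a test set shrinks as the set gets longer, while perplexity stays the same when the model is equally good on every word.

```python
import math

def sequence_probability(word_probs):
    """Raw probability of the whole sequence: the product of per-word probabilities."""
    return math.prod(word_probs)

def perplexity(word_probs):
    """Length-normalized inverse probability, computed in log space."""
    n = len(word_probs)
    return math.exp(-sum(math.log(p) for p in word_probs) / n)

short_test = [0.2] * 5    # 5 words, each given probability 0.2
long_test = [0.2] * 50    # 50 words, each given probability 0.2
better_model = [0.4] * 5  # same 5 words scored by a model that assigns 0.4

# Raw probabilities differ by many orders of magnitude across lengths.
print(sequence_probability(short_test), sequence_probability(long_test))

# Perplexities are directly comparable: both are about 5.
print(perplexity(short_test), perplexity(long_test))

# Higher probability on the same test set means lower perplexity.
print(perplexity(better_model))
```

The last line also shows the link to probability: the model that assigns more probability to the same five words gets the lower perplexity.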

For example, if ten possible words can come next and all of them are equally probable, the perplexity is ten. Suppose a sentence consists of N random digits, and the model assigns each digit probability 1/10:

PP(W) = P(w1 w2 ... wN)^(-1/N) = ((1/10)^N)^(-1/N) = (1/10)^(-1) = 10
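The digit example can be checked numerically. The sketch below assumes a hypothetical uniform model stored as a dictionary (`uniform_model`) and generates random digit sentences of several lengths; the names and the chosen lengths are illustrative only.

```python
import math
import random

# Hypothetical uniform model: each of the ten digits has probability 1/10.
uniform_model = {str(d): 0.1 for d in range(10)}

def perplexity(sentence, model):
    """Perplexity of a sentence under a model that scores each word independently."""
    log_prob = sum(math.log(model[w]) for w in sentence)
    return math.exp(-log_prob / len(sentence))

random.seed(0)  # fixed seed so the run is repeatable
for n in (1, 10, 1000):
    sentence = [str(random.randint(0, 9)) for _ in range(n)]
    print(n, round(perplexity(sentence, uniform_model), 6))
```

Up to floating-point rounding, the perplexity comes out as 10 for every length, which matches the branching factor of ten equally likely choices at each position.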

Conclusion: Low perplexity = better model

Original article: https://www.cnblogs.com/chuanlong/p/3035623.html