  • Smoothing: Add-one smoothing

    From the previous blog post, we know that there are many zero counts, which cause problems: unseen events in test data get probability zero, and perplexity becomes undefined. So we now introduce a smoothing method.

    previous (MLE): P(wi | wi-1) = c(wi-1, wi) / c(wi-1)

    using smoothing: P(wi | wi-1) = ( c(wi-1, wi) + 1 ) / ( c(wi-1) + V ), where V is the vocabulary size

    This ensures that the probability is never zero. Now we can evaluate this method.
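The formula above can be sketched in a few lines of Python. The toy corpus and the resulting counts are my own illustrative assumptions, not from the post:

```python
from collections import Counter

# Toy corpus (an assumption for illustration).
tokens = "the cat sat on the mat the cat ate".split()

bigrams = Counter(zip(tokens, tokens[1:]))  # c(w_{i-1}, w_i)
unigrams = Counter(tokens)                  # c(w_{i-1})
V = len(set(tokens))                        # vocabulary size

def p_addone(prev, word):
    # P(w_i | w_{i-1}) = (c(w_{i-1}, w_i) + 1) / (c(w_{i-1}) + V)
    return (bigrams[(prev, word)] + 1) / (unigrams[prev] + V)

print(p_addone("the", "cat"))  # a seen bigram
print(p_addone("cat", "mat"))  # an unseen bigram: still nonzero
```

Because `Counter` returns 0 for missing keys, the unseen bigram ("cat", "mat") gets a small but strictly positive probability instead of zero.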

    We can use the reconstituted-count formula: c*(wi-1, wi) = P(wi | wi-1) * c(wi-1) = ( c(wi-1, wi) + 1 ) * c(wi-1) / ( c(wi-1) + V ).

    Using this formula, we can see how far the reconstituted counts drift from the original ones.
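Plugging in some illustrative numbers (my own assumptions, not figures from the post) shows how drastic the change can be when the vocabulary is large:

```python
# Illustrative values, not taken from the post:
c_prev = 100      # c(w_{i-1}): how often the context word was seen
c_bigram = 5      # c(w_{i-1}, w_i): how often the bigram was seen
V = 10000         # vocabulary size

# Reconstituted count: c* = P_addone(w_i | w_{i-1}) * c(w_{i-1})
p = (c_bigram + 1) / (c_prev + V)
c_star = p * c_prev
print(c_star)
```

Here a bigram observed 5 times is reconstituted to a count of roughly 0.06, because the large vocabulary term V in the denominator shifts most of the probability mass onto unseen events.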


    So add-one smoothing makes massive changes to our counts. In other words, add-one estimation is a very blunt instrument, so in practice we don't actually use add-one smoothing for n-grams; we have better methods. But we do use add-one smoothing for other kinds of NLP models, such as text classification, or in similar domains where the number of zeros isn't so enormous.

  • Original post: https://www.cnblogs.com/chuanlong/p/3047705.html