Deep Learning Interview Questions 28: Label Smoothing

Contents

  Background

  How It Works

  References

Background

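Suppose we train a three-class classifier with softmax cross-entropy, and for some sample the last layer of the network outputs the vector x = (1.0, 5.0, 4.0). Applying softmax to x gives:

softmax(x) ≈ (0.013, 0.721, 0.265)

If the label of this sample is y = [0, 1, 0], the loss is:

loss = -log(0.721) ≈ 0.327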
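The softmax output and the loss for this sample can be reproduced in a few lines. The following is a minimal sketch in plain Python; the helper names `softmax` and `cross_entropy` are illustrative and do not come from the original post.

```python
import math

def softmax(logits):
    """Convert a list of logits into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probs, target):
    """Cross-entropy between a target distribution and predicted probabilities."""
    return -sum(t * math.log(p) for t, p in zip(target, probs) if t > 0)

x = [1.0, 5.0, 4.0]  # last-layer output from the example
y = [0, 1, 0]        # one-hot label

probs = softmax(x)
loss = cross_entropy(probs, y)
print([round(p, 3) for p in probs])  # [0.013, 0.721, 0.265]
print(round(loss, 3))                # 0.327
```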

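For this sample, optimizing softmax cross-entropy keeps pushing 0.721 toward 1, because doing so reduces the loss. This can cause overfitting. One way to see it: with a one-hot target the loss reaches zero only when the predicted probability is exactly 1, so even when 0.721 is already close to 1 the network keeps "focusing" on this sample and grows overconfident in it, which is overfitting. Label smoothing is one way to address this.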
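To make the overconfidence argument concrete, the sketch below enlarges the output value of the correct class step by step and compares the loss under a one-hot target with the loss under a smoothed target. The smoothing value 0.1, the logit layout (0, m, 0), and the helper name `loss_for_margin` are assumptions chosen for illustration; none of them come from the original post.

```python
import math

def loss_for_margin(m, target):
    """Softmax cross-entropy for logits (0, m, 0) against a target distribution."""
    logits = [0.0, m, 0.0]
    top = max(logits)
    log_sum_exp = top + math.log(sum(math.exp(v - top) for v in logits))
    # -sum(t * log_softmax(z)) == sum(t * (log_sum_exp - z))
    return sum(t * (log_sum_exp - v) for t, v in zip(target, logits))

eps = 0.1  # illustrative smoothing strength (assumption)
hard = [0.0, 1.0, 0.0]
smooth = [t * (1 - eps) + eps / 3 for t in hard]

margins = [1.0, 2.0, 4.0, 8.0, 16.0]
hard_losses = [loss_for_margin(m, hard) for m in margins]
smooth_losses = [loss_for_margin(m, smooth) for m in margins]

for m, h, s in zip(margins, hard_losses, smooth_losses):
    print(f"margin={m:5.1f}  one-hot loss={h:.4f}  smoothed loss={s:.4f}")
```

With the one-hot target the loss keeps shrinking as the margin grows, so there is always an incentive to enlarge the output further. With the smoothed target the loss is smallest at a finite margin and rises again beyond it, which discourages extreme confidence.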

The paper Rethinking the Inception Architecture for Computer Vision (see References) discusses this problem.


How It Works

Suppose we have the last-layer outputs of a neural network for a batch of data, together with their true labels:

out = np.array([[4.0, 5.0, 10.0], [1.0, 5.0, 4.0], [1.0, 15.0, 4.0]])

y = np.array([[0, 0, 1], [0, 1, 0], [0, 1, 0]])

Compute the softmax cross-entropy loss directly:

res = tf.losses.softmax_cross_entropy(onehot_labels=y, logits=out, label_smoothing=0)

print(tf.Session().run(res))

Result: 0.11191821843385696

With label smoothing:

res2 = tf.losses.softmax_cross_entropy(onehot_labels=y, logits=out, label_smoothing=0.001)

print(tf.Session().run(res2))

Result: 0.11647378653287888

The loss is slightly higher than before. Label smoothing works by modifying the true labels; the formula in the source code is:

# new_onehot_labels = onehot_labels * (1 - label_smoothing) + label_smoothing / num_classes

new_onehot_labels = y * (1 - 0.001) + 0.001 / 3

print(y)

print(new_onehot_labels)

Output:

[[0 0 1]
 [0 1 0]
 [0 1 0]]
[[3.33333333e-04 3.33333333e-04 9.99333333e-01]
 [3.33333333e-04 9.99333333e-01 3.33333333e-04]
 [3.33333333e-04 9.99333333e-01 3.33333333e-04]]

Computing softmax cross-entropy against the smoothed labels then gives the final result. We can verify this:

res3 = tf.losses.softmax_cross_entropy(onehot_labels=new_onehot_labels, logits=out, label_smoothing=0)

print(tf.Session().run(res3))

Result: 0.11647378653287888
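The same check can be done without TensorFlow. The NumPy sketch below applies the smoothing formula quoted from the source code and averages the per-sample cross-entropy over the batch. It assumes that the default reduction of `tf.losses.softmax_cross_entropy` with unit weights is a plain batch mean, which is consistent with the numbers printed in this post. The helper names `smooth_labels` and `softmax_cross_entropy` are illustrative.

```python
import numpy as np

def smooth_labels(onehot, label_smoothing):
    """Apply onehot * (1 - label_smoothing) + label_smoothing / num_classes."""
    num_classes = onehot.shape[1]
    return onehot * (1.0 - label_smoothing) + label_smoothing / num_classes

def softmax_cross_entropy(labels, logits):
    """Batch mean of -sum(labels * log_softmax(logits)) (assumed reduction)."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return float(-(labels * log_probs).sum(axis=1).mean())

out = np.array([[4.0, 5.0, 10.0], [1.0, 5.0, 4.0], [1.0, 15.0, 4.0]])
y = np.array([[0, 0, 1], [0, 1, 0], [0, 1, 0]])

plain_loss = softmax_cross_entropy(y, out)
smoothed_loss = softmax_cross_entropy(smooth_labels(y, 0.001), out)
print(plain_loss)     # approximately 0.111918
print(smoothed_loss)  # approximately 0.116474
```

Each row of the smoothed labels still sums to 1, so the target remains a valid probability distribution.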

Complete code:

# Requires TensorFlow 1.x (tf.losses.softmax_cross_entropy and tf.Session are TF1 APIs)
import numpy as np
import tensorflow as tf

out = np.array([[4.0, 5.0, 10.0], [1.0, 5.0, 4.0], [1.0, 15.0, 4.0]])
y = np.array([[0, 0, 1], [0, 1, 0], [0, 1, 0]])

res = tf.losses.softmax_cross_entropy(onehot_labels=y, logits=out, label_smoothing=0)
print(tf.Session().run(res))

res2 = tf.losses.softmax_cross_entropy(onehot_labels=y, logits=out, label_smoothing=0.001)
print(tf.Session().run(res2))

# new_onehot_labels = onehot_labels * (1 - label_smoothing)
#                     + label_smoothing / num_classes
new_onehot_labels = y * (1 - 0.001) + 0.001 / 3
print(y)
print(new_onehot_labels)

res3 = tf.losses.softmax_cross_entropy(onehot_labels=new_onehot_labels, logits=out, label_smoothing=0)
print(tf.Session().run(res3))


References

Rethinking the Inception Architecture for Computer Vision

Label Smoothing: A Remedy for Mislabeled Data in Classification Problems

https://www.datalearner.com/blog/1051561454844661


Original article: https://www.cnblogs.com/mfryf/p/11381448.html
