label smoothing

文章目录[隐藏]

为什么需要 label smoothing
label smoothing
用法

label 可分为 hard label 和 soft label；

label smoothing，标签平滑，像 L1 L2 dropout 一样，是一种正则化的方法；

用于防止模型过分自信，提升泛化能力；

不过这个方法不是很常用，有很多 tricks 可以起到类似的作用，学到最后你会发现，很多算法之间彼此互通。

为什么需要 label smoothing

在分类问题中，常用交叉熵作为 loss

label smoothing

这是二分类的专用公式，其实交叉熵的本质是 log 损失，把上式中的 y 变成 1 即可转化成 log 损失

label smoothing

其中 p 对应 y，q 对应 softmax，如下

label smoothing

传统one-hot编码标签的网络学习过程中，鼓励模型预测为目标类别的概率趋近1，非目标类别的概率趋近0，即最终预测的logits向量中目标类别z的值会趋于无穷大（如果不无穷大，就会有loss），使得模型向预测正确与错误标签的logit差值无限增大的方向学习，

而过大的logit差值会使模型缺乏适应性，对它的预测过于自信。

在训练数据不足以覆盖所有情况下，这就会导致网络过拟合，泛化能力差，而且实际上有些标注数据不一定准确，这时候使用交叉熵损失函数作为目标函数也不一定是最优的了。

label smoothing

合起来这么写

label smoothing

其中K为多分类的类别总个数，α是一个较小的超参数（一般取0.1）；

通俗理解：将真实标签减去一个很小的数，然后平均分配到其他类上，实现了标签软化。

用法

tensorflow

tf.losses.softmax_cross_entropy(
    onehot_labels,
    logits,
    weights=1.0,
    label_smoothing=0,
    scope=None,
    loss_collection=tf.GraphKeys.LOSSES,
    reduction=Reduction.SUM_BY_NONZERO_WEIGHTS
)

pytorch

from torch.nn.modules.loss import _WeightedLoss
## 继承_WeightedLoss类
class SmoothingBCELossWithLogits(_WeightedLoss):
    def __init__(self, weight=None, reduction='mean', smoothing=0.0):
        super(SmoothingBCELossWithLogits, self).__init__(weight=weight, reduction=reduction)
        self.smoothing = smoothing
        self.weight  = weight
        self.reduction = reduction
    @staticmethod
    def _smooth(targets, n_labels, smoothing=0.0):
        assert 0 <= smoothing < 1
        with torch.no_grad():
            targets = targets  * (1 - smoothing) + 0.5 * smoothing
        return targets
    def forward(self, inputs, targets):
        targets = _smooth(targets, inputs.size(-1), self.smoothing)
        loss = F.binary_cross_entropy_with_logits(inputs, targets, self.weights)
        
        if self.reduction == 'sum':
            loss = loss.item()
        elif self.reduction == 'mean':
            loss = loss.mean()
        return loss

参考资料：

https://www.cnblogs.com/irvingluo/p/13873699.html　　标签平滑（Label Smoothing）详解

https://blog.csdn.net/qq_43211132/article/details/100510113　　标签平滑Label Smoothing

https://blog.51cto.com/u_15279692/2943213

https://www.zhihu.com/question/65339831　　神经网络中的label smooth为什么没有火？

原创文章，作者：ItWorker，如若转载，请注明出处：https://blog.ytso.com/tech/pnotes/279361.html

label smoothing

为什么需要 label smoothing

label smoothing

用法

相关推荐

发表回复