  • Loss Function

    For binary classification with labels \(y \in \{+1, -1\}\) and score \(f = \theta^Tx\), a correct classification gives \(y\cdot f = y\cdot \theta^Tx > 0\); an incorrect one gives \(y\cdot f = y\cdot \theta^Tx < 0\). This leads to the following loss functions:

    • 0/1 loss
      \(\min_\theta \sum_i L_{0/1}(\theta^Tx_i)\). We define \(L_{0/1}(\theta^Tx) = 1\) if \(y\cdot f < 0\), and \(= 0\) otherwise. It is non-convex and very hard to optimize.
    • Hinge loss
      An upper bound on the 0/1 loss. We approximate the 0/1 loss by \(\min_\theta \sum_i H(\theta^Tx_i)\), where \(H(\theta^Tx) = \max(0, 1 - y\cdot f)\). \(H\) is small when we classify correctly, and exactly zero once \(y\cdot f \ge 1\).
    • Logistic loss
      \(\min_\theta \sum_i \log(1+\exp(-y_i\cdot \theta^Tx_i))\).
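
    The three losses above can be sketched in a few lines of Python. This is a minimal illustration rather than code from the original post: the function names (`margin`, `zero_one_loss`, `hinge_loss`, `logistic_loss`) and the example values of `theta` and `x` are assumptions chosen for the demonstration.

```python
import math


def margin(y, theta, x):
    """Return y * theta^T x for a label y in {+1, -1}."""
    return y * sum(t * xi for t, xi in zip(theta, x))


def zero_one_loss(m):
    """0/1 loss: 1 if the margin y * f is negative, 0 otherwise."""
    return 1.0 if m < 0 else 0.0


def hinge_loss(m):
    """Hinge loss: max(0, 1 - y * f)."""
    return max(0.0, 1.0 - m)


def logistic_loss(m):
    """Logistic loss: log(1 + exp(-y * f)), rearranged to avoid overflow."""
    if m >= 0:
        return math.log1p(math.exp(-m))
    # For negative margins, log(1 + e^{-m}) = -m + log(1 + e^{m}).
    return -m + math.log1p(math.exp(m))


# Hypothetical parameters and input, used only for illustration.
theta = [1.0, -2.0]
x = [3.0, 1.0]

for y in (+1, -1):
    m = margin(y, theta, x)
    print(y, m, zero_one_loss(m), hinge_loss(m), logistic_loss(m))
```

    With these values the score is \(\theta^Tx = 1\), so the label \(y = +1\) is classified correctly (margin 1, hinge loss 0) and \(y = -1\) is misclassified (margin -1, hinge loss 2).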

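    Fortunately, hinge loss, logistic loss and square loss (\((1 - y\cdot f)^2\), which is the squared error \((y - f)^2\) rewritten using \(y^2 = 1\)) are all convex functions. Convexity guarantees that any local minimum is also a global minimum, which makes these losses computationally appealing.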
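
    The convexity claim can be spot-checked numerically. The sketch below tests the midpoint inequality \(L(\tfrac{a+b}{2}) \le \tfrac{1}{2}(L(a) + L(b))\) on a small grid of margins. It is a sanity check under assumed names (`find_violation`, `grid`), not a proof: a violation shows non-convexity, while the absence of one on a grid is only suggestive.

```python
import math


def hinge(z):
    return max(0.0, 1.0 - z)


def logistic(z):
    return math.log1p(math.exp(-z))


def zero_one(z):
    return 1.0 if z < 0 else 0.0


# Margins from -3 to 3 in steps of 0.25 (an arbitrary illustrative grid).
grid = [i / 4 for i in range(-12, 13)]


def find_violation(loss, tol=1e-12):
    """Return a pair (a, b) that breaks the midpoint inequality, or None."""
    for a in grid:
        for b in grid:
            mid = 0.5 * (a + b)
            if loss(mid) > 0.5 * (loss(a) + loss(b)) + tol:
                return (a, b)
    return None


for name, loss in [("hinge", hinge), ("logistic", logistic), ("0/1", zero_one)]:
    print(name, find_violation(loss))
```

    On this grid only the 0/1 loss yields a violating pair, which matches the statement that it is non-convex.
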
    Figure 7.5 from Chris Bishop's PRML book. The Hinge Loss E(z) = max(0,1-z) is plotted in blue, the Log Loss in red, the Square Loss in green and the 0/1 error in black.

    The figure shows that hard instances (those near the decision boundary) strongly influence the loss, so the model must be robust enough to handle them.

    For binary classification, the single quantity \(y\cdot f\) covers both cases (classified correctly or not). For multi-class classification with labels (0, 1, 2, ..., k), no single signed margin covers all the cases, so we use cross-entropy as the loss.
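
    As a rough sketch of the multi-class case, the snippet below computes the cross-entropy loss for one example. The post does not say how class probabilities are obtained, so converting raw scores with a softmax is an assumption here, as are the function names `softmax` and `cross_entropy` and the example scores.

```python
import math


def softmax(scores):
    """Turn raw class scores into probabilities (assumed model output layer)."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(scores, label):
    """Negative log-probability assigned to the true class index `label`."""
    probs = softmax(scores)
    return -math.log(probs[label])


# Hypothetical scores for three classes; the true class is index 0.
scores = [2.0, 0.5, -1.0]
print(cross_entropy(scores, 0))  # small: the true class has the highest score
print(cross_entropy(scores, 2))  # larger: the true class has the lowest score
```

    The loss depends only on the probability given to the true class, so it applies to any number of classes without needing a signed margin.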

    There is a vivid example of transforming the target function: given a noisy picture, we want to output the clean one. The clean picture is hard to control directly, so we can make the noise the target function instead and minimize the amplitude of the noise. The problem then becomes controllable.

  • Original article: https://www.cnblogs.com/EIMadrigal/p/14530003.html