  • Starting from Entropy

    Starting from Shannon's information entropy, we then move on to logistic regression and softmax.

    The gradient derivation of the softmax loss (fully-connected form) is as follows:
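    A sketch of the standard derivation, assuming a cross-entropy loss over softmax probabilities and scores $z = W^T x$ with $W$ of shape (D, C), matching the code conventions below:

    $$p_i = \frac{e^{z_i}}{\sum_k e^{z_k}}, \qquad L = -\log p_y$$

    $$\frac{\partial L}{\partial z_i} = p_i - \mathbb{1}[i = y], \qquad \frac{\partial L}{\partial W} = x\,(p - \mathbb{1}_y)^{T}$$

    where $\mathbb{1}_y$ is the one-hot vector of the true class $y$.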

    In more general form:
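    As a sketch of the batched form (an assumption consistent with the code below): stacking a minibatch of N examples in X of shape (N, D), with scores S = XW, softmax probabilities P of shape (N, C), and one-hot labels Y of shape (N, C),

    $$L = -\frac{1}{N}\sum_{n=1}^{N}\log P_{n,\,y_n}, \qquad \frac{\partial L}{\partial S} = \frac{1}{N}(P - Y), \qquad \frac{\partial L}{\partial W} = \frac{1}{N}\,X^{T}(P - Y)$$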

    Two examples of forward/backward implementations follow:

    Example 1:

    import numpy as np

    class SoftmaxLayer:
        def __init__(self, name='Softmax'):
            pass

        def forward(self, in_data):
            # subtract each row's maximum before exponentiating, for numerical stability
            shift_scores = in_data - np.max(in_data, axis=1).reshape(-1, 1)
            self.top_val = np.exp(shift_scores) / np.sum(np.exp(shift_scores), axis=1).reshape(-1, 1)
            return self.top_val

        def backward(self, residual):
            # residual holds the ground-truth class indices of the batch
            N = residual.shape[0]
            dscores = self.top_val.copy()
            # gradient of the mean cross-entropy loss w.r.t. the softmax input: p - 1 at the true class
            dscores[range(N), list(residual)] -= 1
            dscores /= N
            return dscores
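    A minimal usage sketch (the shapes and labels here are assumptions for illustration; note that backward takes the batch's integer class labels rather than an upstream gradient):

    layer = SoftmaxLayer()
    scores = np.random.randn(4, 10)     # a batch of 4 examples, 10 class scores each
    labels = np.array([3, 0, 7, 1])     # ground-truth class indices
    probs = layer.forward(scores)       # (4, 10) softmax probabilities
    dscores = layer.backward(labels)    # gradient of the mean cross-entropy loss w.r.t. scores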

    Example 2:

    """
        Structured softmax and SVM loss function.
        Inputs have dimension D, there are C classes, and we operate on minibatches
        of N examples.
     
        Inputs:
        - W: A numpy array of shape (D, C) containing weights.
        - X: A numpy array of shape (N, D) containing a minibatch of data.
        - y: A numpy array of shape (N,) containing training labels; y[i] = c means
          that X[i] has label c, where 0 <= c < C.
        
        Returns a tuple of:
        - loss as single float
        - gradient with respect to weights W; an array of same shape as W
        """
    def softmax_loss_vectorized(W, X, y):
     
     
        loss = 0.0
        dW = np.zeros_like(W)
     
        num_train = X.shape[0]
        score = X.dot(W)
        shift_score = score - np.max(score, axis=1, keepdims=True)  # 对数据做了一个平移
        shift_score_exp = np.exp(shift_score)
        shift_score_exp_sum = np.sum(shift_score_exp, axis=1, keepdims=True)
        score_norm = shift_score_exp / shift_score_exp_sum
     
        loss = np.sum(-np.log(score_norm[range(score_norm.shape[0]), y])) / num_train
        
        # dW
        d_score = score_norm
        d_score[range(d_score.shape[0]), y] -= 1
        dW = X.T.dot(score_norm) / num_train 
        return loss, dW
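    A minimal usage sketch (the shapes and data below are made up for illustration):

    np.random.seed(0)
    X = np.random.randn(5, 4)           # N=5 examples, D=4 features
    y = np.array([0, 2, 1, 2, 0])       # integer labels in [0, C)
    W = 0.01 * np.random.randn(4, 3)    # C=3 classes
    loss, dW = softmax_loss_vectorized(W, X, y)
    print(loss, dW.shape)               # scalar loss, dW has the same shape as W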

    Additional notes:

    (1) Using cross-entropy in PyTorch with nn.CrossEntropyLoss():
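    A minimal sketch of how nn.CrossEntropyLoss is typically called (the tensor shapes here are assumptions for illustration); it expects raw, unnormalized logits plus integer class indices, and internally combines LogSoftmax with NLLLoss:

    import torch
    import torch.nn as nn

    criterion = nn.CrossEntropyLoss()                    # LogSoftmax + NLLLoss, mean reduction by default
    logits = torch.randn(4, 10, requires_grad=True)      # raw scores, shape (N=4, C=10); no softmax applied beforehand
    target = torch.tensor([3, 0, 7, 1])                  # class indices, shape (N,)
    loss = criterion(logits, target)
    loss.backward()                                      # logits.grad == (softmax(logits) - one_hot(target)) / N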

    (2) The derivative of the softmax function is as follows:
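    As a sketch of the standard result (writing $p_i$ for the softmax outputs and $z_j$ for its inputs):

    $$\frac{\partial p_i}{\partial z_j} = \frac{\partial}{\partial z_j}\,\frac{e^{z_i}}{\sum_k e^{z_k}} = p_i\,(\delta_{ij} - p_j)$$

    i.e. $p_i(1 - p_i)$ when $i = j$ and $-p_i p_j$ when $i \neq j$; combined with the chain rule on $L = -\log p_y$, this yields the $p - \mathbb{1}_y$ gradient used above.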

  • Original post: https://www.cnblogs.com/zf-blog/p/9005124.html