
Deep Learning Tutorial (Translation): Classifying MNIST Handwritten Digits with Logistic Regression

For the original English text, see http://www.deeplearning.net/tutorial/logreg.html

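The Model

In this section we use Theano to implement the most basic classifier, logistic regression, and learn how mathematical expressions map onto Theano graphs.

Logistic regression is a probabilistic, linear classifier parameterized by a weight matrix W and a bias vector b. Classification is done by projecting an input vector onto a set of hyperplanes, each of which corresponds to one class; the distance from the input to a hyperplane reflects the probability that the input belongs to the corresponding class.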

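The probability that an input vector x belongs to class i is then written as:

$$P(Y=i \mid x, W, b) = \mathrm{softmax}_i(Wx + b) = \frac{e^{W_i x + b_i}}{\sum_j e^{W_j x + b_j}}$$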

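The predicted class is the one with the highest probability, that is:

$$y_{pred} = \operatorname{argmax}_i \, P(Y=i \mid x, W, b)$$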
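To make the class probabilities and the prediction rule concrete, here is a minimal sketch in plain Python (standard library only, not Theano). The toy weights, bias and input are hypothetical values chosen for illustration, and the helper names `softmax`, `predict_proba` and `predict` are assumptions of this sketch rather than part of the tutorial code.

```python
import math


def softmax(scores):
    """Turn a list of real-valued scores into probabilities that sum to 1."""
    # Subtracting the maximum avoids overflow in exp() and does not change the result.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_proba(x, W, b):
    """P(Y=i | x, W, b) for every class i; W has shape (n_in, n_out)."""
    n_out = len(b)
    scores = [sum(x[k] * W[k][i] for k in range(len(x))) + b[i]
              for i in range(n_out)]
    return softmax(scores)


def predict(x, W, b):
    """Return the index of the most probable class."""
    probs = predict_proba(x, W, b)
    return max(range(len(probs)), key=lambda i: probs[i])


# Hypothetical toy problem: 2 input features, 3 classes.
W = [[1.0, 0.0, -1.0],
     [0.0, 1.0, 0.5]]
b = [0.0, 0.0, 0.0]
x = [2.0, 1.0]

probs = predict_proba(x, W, b)   # scores are 2.0, 1.0 and -1.5
y_pred = predict(x, W, b)        # class 0 has the largest score
```

With all-zero parameters, as in the initialization used by the Theano code, every class receives the same probability, so the untrained model carries no information about the input.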

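The corresponding Theano code is:

import numpy
import theano
import theano.tensor as T


class LogisticRegression(object):
    def __init__(self, input, n_in, n_out):
        # initialize the weights W with zeros as a matrix of shape (n_in, n_out)
        self.W = theano.shared(
            value=numpy.zeros(
                (n_in, n_out),
                dtype=theano.config.floatX
            ),
            name='W',
            borrow=True
        )
        # initialize the biases b with zeros as a vector of n_out elements
        self.b = theano.shared(
            value=numpy.zeros(
                (n_out,),
                dtype=theano.config.floatX
            ),
            name='b',
            borrow=True
        )
        # matrix of class-membership probabilities, one row per example
        self.p_y_given_x = T.nnet.softmax(T.dot(input, self.W) + self.b)
        # predicted class: index of the largest probability in each row
        self.y_pred = T.argmax(self.p_y_given_x, axis=1)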

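Because the model parameters must keep a persistent state throughout training, W and b are allocated as shared variables, which are also Theano symbolic variables.

The model defined so far does not do anything useful yet; next we look at how to learn the optimal parameters.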

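Defining a Loss Function

For multi-class logistic regression, it is common to use the negative log-likelihood as the loss.

This is equivalent to maximizing the likelihood of the data set D under the model parameterized by θ. Let us first define the log-likelihood and the loss:

$$\mathcal{L}(\theta=\{W,b\}, \mathcal{D}) = \sum_{i=0}^{|\mathcal{D}|-1} \log P(Y=y^{(i)} \mid x^{(i)}, W, b)$$

$$\ell(\theta=\{W,b\}, \mathcal{D}) = -\mathcal{L}(\theta=\{W,b\}, \mathcal{D})$$

Here we use stochastic gradient descent to find the minimum of the loss.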
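As a small worked example of the loss, the following plain-Python sketch computes the negative log-likelihood both as a sum and as a mean over a batch. The probabilities and labels are made-up values, and the `average` flag is an assumption of this sketch, added only to show how the two variants relate.

```python
import math


def negative_log_likelihood(p_y_given_x, y, average=True):
    """Negative log-likelihood of the correct labels y.

    p_y_given_x: list of rows, one probability vector per example.
    y: list of integer class labels, one per example.
    """
    # For each example, pick the log-probability assigned to its correct class.
    log_probs = [math.log(row[label]) for row, label in zip(p_y_given_x, y)]
    total = -sum(log_probs)
    return total / len(y) if average else total


# Hypothetical predictions for 3 examples over 2 classes.
p = [[0.9, 0.1],
     [0.2, 0.8],
     [0.5, 0.5]]
labels = [0, 1, 0]

loss_sum = negative_log_likelihood(p, labels, average=False)
loss_mean = negative_log_likelihood(p, labels)
```

The mean is the sum divided by the batch size, so it stays on the same scale when the batch size changes. That is why averaging makes the choice of learning rate less sensitive to the batch size.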

Creating the Logistic Regression Class

For the code, see the original page: http://www.deeplearning.net/tutorial/logreg.html

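    # methods of the logistic regression class

    def negative_log_likelihood(self, y):
        '''
        Return the mean negative log-likelihood of the predictions
        given the correct labels.

        :type y: theano.tensor.TensorType
        :param y: vector holding the correct label of each example

        Note: we use the mean rather than the sum so that the learning
        rate depends less on the batch size.
        p_y_given_x is a matrix with one row per example and one column
        per class.
        '''
        # y is a vector with one label per example, so y.shape[0] is the
        # number of examples n in the minibatch.
        # T.arange(n) produces the vector [0, 1, ..., n-1].
        # T.log(x) takes the elementwise logarithm of x; call the result LP.
        # LP[T.arange(y.shape[0]), y] is the vector
        # [LP[0, y[0]], LP[1, y[1]], LP[2, y[2]], ..., LP[n-1, y[n-1]]].
        # T.mean(x) is the mean of the elements of x.
        return -T.mean(T.log(self.p_y_given_x)[T.arange(y.shape[0]), y])

    def errors(self, y):
        # y must have the same number of dimensions as y_pred
        if y.ndim != self.y_pred.ndim:
            raise TypeError('y should have the same shape as self.y_pred',
                            ('y', y.type, 'y_pred', self.y_pred.type))
        # y must hold integer class labels
        if y.dtype.startswith('int'):
            # T.neq(y1, y2) compares y1 and y2 elementwise: 0 where the
            # elements are equal, 1 where they differ.
            # Example: y1=[1,2,3,4,5,6,7,8,9,0], y2=[1,1,3,3,5,6,7,8,9,0]
            # gives err = T.neq(y1, y2) = [0,1,0,1,0,0,0,0,0,0], which has
            # two 1s, i.e. two elements differ.
            # T.mean(err) = (0+1+0+1+0+0+0+0+0+0)/10 = 0.2, an error rate of 20%.
            return T.mean(T.neq(self.y_pred, y))
        else:
            raise NotImplementedError()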
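The error rate computed by errors can be checked by hand on the two label vectors used in the comments. The sketch below mirrors T.mean(T.neq(...)) in plain Python; the function name `error_rate` is an assumption of this sketch, not part of Theano.

```python
def error_rate(y_pred, y):
    """Fraction of positions where the prediction differs from the label."""
    if len(y_pred) != len(y):
        raise ValueError('y_pred and y must have the same length')
    # 1 where the two vectors disagree, 0 where they agree (like T.neq).
    mismatches = [int(p != t) for p, t in zip(y_pred, y)]
    # The mean of the 0/1 vector is the misclassification rate (like T.mean).
    return sum(mismatches) / len(y)


y1 = [1, 2, 3, 4, 5, 6, 7, 8, 9, 0]
y2 = [1, 1, 3, 3, 5, 6, 7, 8, 9, 0]
rate = error_rate(y1, y2)   # positions 1 and 3 differ: 2 / 10
```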

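Training the Model

To implement gradient descent in most programming languages, you would have to derive the gradient expressions by hand. This derivation is tedious and the final result is complicated, especially once numerical stability is taken into account.

In Theano, however, this becomes very simple. Gradient computation is built in, so there is no need to derive the formulas by hand; you only pass in the cost and the variables to differentiate with respect to.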

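# x, y, index, learning_rate, batch_size, train_set_x and train_set_y
# are defined in the full tutorial code

# the cost minimized during training is the negative log-likelihood
cost = classifier.negative_log_likelihood(y)

# gradients of the cost with respect to W and b
g_W = T.grad(cost=cost, wrt=classifier.W)
g_b = T.grad(cost=cost, wrt=classifier.b)

# one gradient descent step for each parameter
updates = [(classifier.W, classifier.W - learning_rate * g_W),
           (classifier.b, classifier.b - learning_rate * g_b)]

# compiled function that returns the cost of minibatch `index`
# and applies the updates to W and b
train_model = theano.function(
    inputs=[index],
    outputs=cost,
    updates=updates,
    givens={
        x: train_set_x[index * batch_size: (index + 1) * batch_size],
        y: train_set_y[index * batch_size: (index + 1) * batch_size]
    }
)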

Each call to train_model(index) computes and returns the cost of the given minibatch, then performs one step of minibatch SGD (MSGD), updating W and b.
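For comparison, here is a rough plain-Python sketch of what one such training call does when the gradient is derived by hand instead of obtained from T.grad. It relies on the standard result that the gradient of the mean negative log-likelihood with respect to the class scores is (p - one_hot(y)) / n. The function names, toy data and learning rate are assumptions of this sketch, not the tutorial's code.

```python
import math


def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def forward(X, W, b):
    """Class probabilities for every row of X; W has shape (n_in, n_out)."""
    n_in, n_out = len(W), len(b)
    return [softmax([sum(x[k] * W[k][j] for k in range(n_in)) + b[j]
                     for j in range(n_out)]) for x in X]


def mean_nll(X, y, W, b):
    probs = forward(X, W, b)
    return -sum(math.log(p[t]) for p, t in zip(probs, y)) / len(y)


def sgd_step(X, y, W, b, learning_rate):
    """One minibatch update of W and b in place; returns the cost before the update."""
    n_in, n_out, n = len(W), len(b), len(y)
    probs = forward(X, W, b)
    cost = -sum(math.log(p[t]) for p, t in zip(probs, y)) / n
    # Gradient of the mean NLL with respect to the scores: (p - one_hot(y)) / n.
    delta = [[(p[j] - (1.0 if j == t else 0.0)) / n for j in range(n_out)]
             for p, t in zip(probs, y)]
    for j in range(n_out):
        g_b = sum(d[j] for d in delta)
        for k in range(n_in):
            g_W = sum(x[k] * d[j] for x, d in zip(X, delta))
            W[k][j] -= learning_rate * g_W
        b[j] -= learning_rate * g_b
    return cost


# Hypothetical toy data: 4 examples, 2 features, 2 classes.
X = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [0.2, 0.8]]
y = [0, 1, 0, 1]
W = [[0.0, 0.0], [0.0, 0.0]]
b = [0.0, 0.0]
batch_size = 2

costs = []
for index in range(len(X) // batch_size):
    lo, hi = index * batch_size, (index + 1) * batch_size
    costs.append(sgd_step(X[lo:hi], y[lo:hi], W, b, learning_rate=0.5))
```

Even for this small model the hand-written gradient takes several lines and is easy to get wrong, which is the bookkeeping that T.grad removes.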

As you will see shortly, validate_model is key to our early-stopping implementation.

test_model = theano.function(
    inputs=[index],
    outputs=classifier.errors(y),
    givens={
        x: test_set_x[index * batch_size: (index + 1) * batch_size],
        y: test_set_y[index * batch_size: (index + 1) * batch_size]
    }
)
validate_model = theano.function(
    inputs=[index],
    outputs=classifier.errors(y),
    givens={
        x: valid_set_x[index * batch_size: (index + 1) * batch_size],
        y: valid_set_y[index * batch_size: (index + 1) * batch_size]
    }
)
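Both functions return the error rate of a single minibatch, so a score for the whole validation or test set has to be averaged over all minibatches. The full training loop is omitted here; the plain-Python sketch below shows one plausible way to do that averaging. The names `batch_slice`, `mean_validation_error` and `toy_predict` are hypothetical, and `toy_predict` is only a stand-in for a trained classifier.

```python
def batch_slice(data, index, batch_size):
    """Rows belonging to minibatch number `index`."""
    return data[index * batch_size:(index + 1) * batch_size]


def mean_validation_error(predict_batch, valid_x, valid_y, batch_size):
    """Average the per-minibatch error rate over the whole validation set."""
    n_batches = len(valid_x) // batch_size
    errors = []
    for index in range(n_batches):
        xs = batch_slice(valid_x, index, batch_size)
        ys = batch_slice(valid_y, index, batch_size)
        preds = predict_batch(xs)
        errors.append(sum(p != t for p, t in zip(preds, ys)) / len(ys))
    return sum(errors) / n_batches


def toy_predict(xs):
    # Hypothetical stand-in for a trained model: class 1 when x > 0, else class 0.
    return [1 if x > 0 else 0 for x in xs]


valid_x = [-2.0, -1.0, 1.0, 2.0, 3.0, -3.0]
valid_y = [0, 0, 1, 0, 1, 0]
err = mean_validation_error(toy_predict, valid_x, valid_y, batch_size=2)
```

An early-stopping loop would typically track this averaged validation error across epochs and keep the parameters that gave the lowest value.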

Complete Code

Omitted (please refer to the official tutorial).

References

1. Deep Learning (DL) and Convolutional Neural Network (CNN) study notes, part 03: LR in the Python-based LeNet

2. The official tutorial


Original post: https://www.cnblogs.com/liwei33/p/5578056.html