  • Fish Book Study Notes: The Neural Network Learning Algorithm

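    The steps for training a neural network are as follows:

    Premise

    A neural network has weights and biases that suit the task. The process of adjusting those weights and biases to fit the training data is called "learning". Learning proceeds in the following four steps.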

    Step 1 (mini-batch)

    Randomly select a subset of the training data; this subset is called a mini-batch. The goal is to reduce the value of the loss function on the mini-batch.
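
    As a minimal sketch of drawing a mini-batch, assuming NumPy and a toy random dataset in place of the real training data (the array shapes and the seed are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is reproducible

# Toy stand-ins for the training set: 1000 samples, 784 features, 10 classes.
x_train = rng.random((1000, 784))
t_train = np.eye(10)[rng.integers(0, 10, size=1000)]  # one-hot labels

batch_size = 100
# Draw batch_size row indices at random (sampling with replacement by default).
batch_mask = rng.choice(x_train.shape[0], batch_size)
x_batch = x_train[batch_mask]
t_batch = t_train[batch_mask]

print(x_batch.shape, t_batch.shape)
```

    Indexing both arrays with the same mask keeps each input row paired with its label.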

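    Step 2 (compute the gradient)

    To reduce the value of the loss function on the mini-batch, compute the gradient of the loss with respect to each weight parameter. The gradient points in the direction in which the loss increases fastest, so moving the parameters in the opposite direction reduces the loss fastest.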
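
    To make the direction of the gradient concrete, here is a small sketch that estimates a gradient by central differences. The function below is a stand-in written for this note, not the `numerical_gradient` from `common.gradient`, and the test function f(v) = v0^2 + v1^2 is an illustrative assumption:

```python
import numpy as np

def numerical_gradient(f, x, h=1e-4):
    """Central-difference estimate of the gradient of f at x (x is a float array)."""
    grad = np.zeros_like(x)
    for idx in np.ndindex(x.shape):
        original = x[idx]
        x[idx] = original + h
        f_plus = f(x)
        x[idx] = original - h
        f_minus = f(x)
        grad[idx] = (f_plus - f_minus) / (2 * h)
        x[idx] = original  # restore the entry before moving on
    return grad

f = lambda v: float(np.sum(v ** 2))
x = np.array([3.0, 4.0])
grad = numerical_gradient(f, x)

# Stepping against the gradient lowers f; stepping along it raises f.
f_down = f(x - 0.1 * grad)
f_up = f(x + 0.1 * grad)
print(grad, f_down, f(x), f_up)
```

    The estimate is close to the analytic gradient (6, 8), and only the step against the gradient reduces the function value.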

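    Step 3 (update the parameters)

    Update the weight parameters by a small step in the direction opposite to the gradient; the step size is set by the learning rate.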
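
    The update rule can be sketched as follows, assuming the parameters and gradients are stored in dictionaries of NumPy arrays with matching keys (the values here are made up for illustration):

```python
import numpy as np

learning_rate = 0.1

# Hypothetical parameters and gradients with matching shapes.
params = {'W': np.array([[1.0, -2.0], [0.5, 3.0]]), 'b': np.array([0.0, 1.0])}
grads = {'W': np.array([[0.2, -0.4], [0.0, 1.0]]), 'b': np.array([0.5, -0.5])}

# Move each parameter a small step against its gradient (in-place update).
for key in params:
    params[key] -= learning_rate * grads[key]

print(params['W'])
print(params['b'])
```

    The in-place `-=` matters when other objects hold references to the same arrays, since they then see the updated values without being reassigned.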

    Step 4 (repeat)

    Repeat steps 1, 2, and 3.
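
    Put together, the four steps form a loop. The sketch below runs that loop on a toy linear-regression problem rather than a neural network, so that it stays self-contained; the data, model, learning rate, and iteration count are all assumptions chosen for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: y = 2x + 1 plus a little noise.
x = rng.uniform(-1, 1, size=(500, 1))
y = 2.0 * x + 1.0 + 0.01 * rng.standard_normal((500, 1))

w = np.zeros((1, 1))
b = np.zeros(1)
learning_rate = 0.1
batch_size = 50
losses = []

for i in range(1000):                           # step 4: repeat
    mask = rng.choice(x.shape[0], batch_size)   # step 1: pick a mini-batch
    xb, yb = x[mask], y[mask]

    err = xb @ w + b - yb                       # prediction error on the batch
    losses.append(float(np.mean(err ** 2)))     # mean squared error

    grad_w = 2 * xb.T @ err / batch_size        # step 2: gradient of the loss
    grad_b = 2 * np.mean(err, axis=0)

    w -= learning_rate * grad_w                 # step 3: small step against it
    b -= learning_rate * grad_b

print(w.ravel(), b, losses[0], losses[-1])
```

    The fitted slope and intercept end up close to 2 and 1, and the mini-batch loss falls well below its starting value.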

    Implementation of a two-layer neural network

    # coding: utf-8
    import sys, os
    sys.path.append(os.pardir)  # allow imports from the parent directory
    import numpy as np
    from common.layers import *
    from common.gradient import numerical_gradient
    from collections import OrderedDict
    
    
    class TwoLayerNet:
    
        def __init__(self, input_size, hidden_size, output_size, weight_init_std = 0.01):
            # initialize weights and biases
            self.params = {}
            self.params['W1'] = weight_init_std * np.random.randn(input_size, hidden_size)
            self.params['b1'] = np.zeros(hidden_size)
            self.params['W2'] = weight_init_std * np.random.randn(hidden_size, output_size) 
            self.params['b2'] = np.zeros(output_size)
    
            # build the layers (kept in forward order)
            self.layers = OrderedDict()
            self.layers['Affine1'] = Affine(self.params['W1'], self.params['b1'])
            self.layers['Relu1'] = Relu()
            self.layers['Affine2'] = Affine(self.params['W2'], self.params['b2'])
    
            self.lastLayer = SoftmaxWithLoss()
            
        def predict(self, x):
            for layer in self.layers.values():
                x = layer.forward(x)
            
            return x
            
        # x: input data, t: supervised labels
        def loss(self, x, t):
            y = self.predict(x)
            return self.lastLayer.forward(y, t)
        
        def accuracy(self, x, t):
            y = self.predict(x)
            y = np.argmax(y, axis=1)
            if t.ndim != 1 : t = np.argmax(t, axis=1)
            
            accuracy = np.sum(y == t) / float(x.shape[0])
            return accuracy
            
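        # x: input data, t: supervised labels
        def numerical_gradient(self, x, t):
            # W is unused: the imported numerical_gradient is expected to perturb
            # the parameter array in place, which self.loss then picks up
            loss_W = lambda W: self.loss(x, t)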
            
            grads = {}
            grads['W1'] = numerical_gradient(loss_W, self.params['W1'])
            grads['b1'] = numerical_gradient(loss_W, self.params['b1'])
            grads['W2'] = numerical_gradient(loss_W, self.params['W2'])
            grads['b2'] = numerical_gradient(loss_W, self.params['b2'])
            
            return grads
            
        def gradient(self, x, t):
            # forward
            self.loss(x, t)
    
            # backward
            dout = 1
            dout = self.lastLayer.backward(dout)
            
            layers = list(self.layers.values())
            layers.reverse()
            for layer in layers:
                dout = layer.backward(dout)
    
            # collect the gradients stored by each Affine layer
            grads = {}
            grads['W1'], grads['b1'] = self.layers['Affine1'].dW, self.layers['Affine1'].db
            grads['W2'], grads['b2'] = self.layers['Affine2'].dW, self.layers['Affine2'].db
    
            return grads
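
    The listing above imports `Affine` and `Relu` from `common.layers`, which is not shown here. The sketch below is a hypothetical minimal version of those two layers, inferred only from how they are used above (`forward`, `backward`, and the `dW` / `db` attributes); the real module may differ. It also checks one backprop gradient entry against a central-difference estimate, using a plain sum of the outputs as a stand-in loss.

```python
import numpy as np

class Affine:
    """Hypothetical fully connected layer: out = x @ W + b."""
    def __init__(self, W, b):
        self.W, self.b = W, b      # references, not copies
        self.x = None
        self.dW = None
        self.db = None

    def forward(self, x):
        self.x = x
        return x @ self.W + self.b

    def backward(self, dout):
        self.dW = self.x.T @ dout          # gradient w.r.t. W
        self.db = np.sum(dout, axis=0)     # gradient w.r.t. b
        return dout @ self.W.T             # gradient w.r.t. the input

class Relu:
    """Hypothetical ReLU layer: passes positive inputs, zeroes the rest."""
    def __init__(self):
        self.mask = None

    def forward(self, x):
        self.mask = x <= 0
        return np.where(self.mask, 0.0, x)

    def backward(self, dout):
        return np.where(self.mask, 0.0, dout)

rng = np.random.default_rng(0)
W = rng.standard_normal((3, 2))
b = np.zeros(2)
x = rng.standard_normal((4, 3))
affine, relu = Affine(W, b), Relu()

def total():
    # Stand-in loss: the sum of all ReLU outputs.
    return float(np.sum(relu.forward(affine.forward(x))))

# Backprop: forward pass, then propagate d(loss)/d(out) = 1 backwards.
out = relu.forward(affine.forward(x))
dx = affine.backward(relu.backward(np.ones_like(out)))
dW_backprop = affine.dW.copy()

# Central-difference estimate for the single entry W[0, 0].
h = 1e-4
W[0, 0] += h
plus = total()
W[0, 0] -= 2 * h
minus = total()
W[0, 0] += h
numeric = (plus - minus) / (2 * h)

print(dW_backprop[0, 0], numeric)
```

    Because this `Affine` keeps a reference to `W` rather than a copy, changing `W` in place changes what the layer computes. Under that assumption, an in-place update of the arrays in `params` is enough for the layers to see the new weights.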

    The learning algorithm

    # coding: utf-8
    import sys, os
    sys.path.append(os.pardir)
    
    import numpy as np
    from dataset.mnist import load_mnist
    from two_layer_net import TwoLayerNet
    
    # load the data
    (x_train, t_train), (x_test, t_test) = load_mnist(normalize=True, one_hot_label=True)
    
    network = TwoLayerNet(input_size=784, hidden_size=50, output_size=10)
    
    iters_num = 10000
    train_size = x_train.shape[0]
    batch_size = 100
    learning_rate = 0.1
    
    train_loss_list = []
    train_acc_list = []
    test_acc_list = []
    
    iter_per_epoch = max(train_size // batch_size, 1)  # whole number of iterations per epoch
    
    for i in range(iters_num):
        batch_mask = np.random.choice(train_size, batch_size)
        x_batch = x_train[batch_mask]
        t_batch = t_train[batch_mask]
        
        # compute gradients (backprop; the numerical version is kept for comparison)
        #grad = network.numerical_gradient(x_batch, t_batch)
        grad = network.gradient(x_batch, t_batch)
        
        # update the parameters
        for key in ('W1', 'b1', 'W2', 'b2'):
            network.params[key] -= learning_rate * grad[key]
        
        loss = network.loss(x_batch, t_batch)
        train_loss_list.append(loss)
        
        if i % iter_per_epoch == 0:
            train_acc = network.accuracy(x_train, t_train)
            test_acc = network.accuracy(x_test, t_test)
            train_acc_list.append(train_acc)
            test_acc_list.append(test_acc)
            print(train_acc, test_acc)
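
    As a rough check of how often the accuracy is evaluated, the sketch below repeats the epoch arithmetic from the script. It assumes the training set has 60,000 samples, the size of the standard MNIST training split; `load_mnist` is not shown here, so that figure is an assumption.

```python
train_size = 60000   # assumed: standard MNIST training split
batch_size = 100
iters_num = 10000

# Iterations needed to see roughly train_size samples once.
iter_per_epoch = max(train_size // batch_size, 1)

# Iterations at which the script would evaluate accuracy.
eval_steps = [i for i in range(iters_num) if i % iter_per_epoch == 0]

print(iter_per_epoch, len(eval_steps), eval_steps[-1])
```

    Under that assumption, accuracy is recorded every 600 iterations, starting at iteration 0.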
  • Original article: https://www.cnblogs.com/J14nWe1/p/14572709.html