  • Dive into Deep Learning 10: Implementing a Multilayer Perceptron from Scratch in PyTorch

    Multilayer perceptron

    import torch
    import numpy as np
    import sys
    sys.path.append('..')
    import d2lzh_pytorch as d2l
    

    We again use the Fashion-MNIST dataset and classify its images with a multilayer perceptron.

    batch_size = 256
    # Loader from the d2lzh_pytorch package; returns the training and test iterators
    train_iter, test_iter = d2l.load_data_fashion_mnist(batch_size)
    
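    Defining the model parameters

    Images in the Fashion-MNIST dataset have shape 28x28, and there are 10 classes. As before, we represent each image as a vector of length 28x28=784, so the number of inputs is 784 and the number of outputs is 10. We set the number of hidden units, a hyperparameter, to 256.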

    num_inputs, num_outputs, num_hiddens = 784, 10, 256

    # Weights are drawn from a normal distribution with mean 0 and std 0.01; biases start at zero
    W1 = torch.tensor(np.random.normal(0, 0.01, (num_inputs, num_hiddens)), dtype=torch.float32)
    b1 = torch.zeros(num_hiddens, dtype=torch.float32)
    W2 = torch.tensor(np.random.normal(0, 0.01, (num_hiddens, num_outputs)), dtype=torch.float32)
    b2 = torch.zeros(num_outputs, dtype=torch.float32)

    # Track gradients for every parameter
    params = [W1, b1, W2, b2]
    for param in params:
        param.requires_grad_(requires_grad=True)
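
    As a quick sanity check on the sizes described above, the sketch below rebuilds tensors with the same shapes and counts the trainable values. It uses torch.randn in place of np.random.normal purely for brevity, which is an assumption of this example and not part of the original code.

```python
import torch

num_inputs, num_outputs, num_hiddens = 784, 10, 256

# Same shapes as the parameters above; torch.randn stands in for np.random.normal
W1 = torch.randn(num_inputs, num_hiddens) * 0.01
b1 = torch.zeros(num_hiddens)
W2 = torch.randn(num_hiddens, num_outputs) * 0.01
b2 = torch.zeros(num_outputs)
params = [W1, b1, W2, b2]

shapes = [tuple(p.shape) for p in params]
total = sum(p.numel() for p in params)
print(shapes)  # [(784, 256), (256,), (256, 10), (10,)]
print(total)   # 784*256 + 256 + 256*10 + 10 = 203530
```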
        
    
    Defining the activation function

    We implement ReLU with the max function rather than calling the built-in relu function directly.

    def relu(X):
        return torch.max(input=X, other=torch.tensor(0.0))
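
    To see that the max-based definition behaves like the built-in, the sketch below applies it to a small hand-written tensor and compares the result with torch.relu. The input values are made up for illustration.

```python
import torch

def relu(X):
    # Element-wise maximum of X and 0; the scalar tensor is broadcast over X
    return torch.max(input=X, other=torch.tensor(0.0))

X = torch.tensor([[-2.0, -0.5, 0.0], [0.5, 1.0, 3.0]])  # illustrative values
out = relu(X)
print(out.tolist())  # [[0.0, 0.0, 0.0], [0.5, 1.0, 3.0]]

same_as_builtin = torch.equal(out, torch.relu(X))
print(same_as_builtin)  # True
```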
    
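    Defining the model

    As with softmax regression, we use the view function to reshape each original image into a vector of length num_inputs. We then implement the multilayer perceptron expression from the previous section.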

    def net(X):
        X = X.view((-1, num_inputs))
        H = relu(torch.matmul(X, W1) + b1)
        return torch.matmul(H, W2) + b2
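
    The reshape and the two matrix products can be checked on a dummy batch. The following is a minimal sketch that assumes a random batch of four 1x28x28 images in place of real Fashion-MNIST data, and re-creates the parameters with torch.randn so that it runs on its own.

```python
import torch

num_inputs, num_outputs, num_hiddens = 784, 10, 256
W1 = torch.randn(num_inputs, num_hiddens) * 0.01
b1 = torch.zeros(num_hiddens)
W2 = torch.randn(num_hiddens, num_outputs) * 0.01
b2 = torch.zeros(num_outputs)

def relu(X):
    return torch.max(input=X, other=torch.tensor(0.0))

def net(X):
    X = X.view((-1, num_inputs))         # (batch, 1, 28, 28) -> (batch, 784)
    H = relu(torch.matmul(X, W1) + b1)   # hidden layer: (batch, 256)
    return torch.matmul(H, W2) + b2      # output layer: (batch, 10)

X = torch.rand(4, 1, 28, 28)  # hypothetical batch, not real data
out = net(X)
print(out.shape)  # torch.Size([4, 10])
```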
    
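    Defining the loss function

    def sgd(params, lr, batch_size):
        for param in params:
            # param.data -= lr * param.grad / batch_size
            # The loss is PyTorch's cross-entropy, which already averages over the
            # batch, so the gradient is not divided by batch_size again.
            param.data -= lr * param.grad

    '''
    MXNet's SoftmaxCrossEntropyLoss effectively sums over the batch dimension during
    backpropagation, whereas PyTorch averages by default. The loss computed by PyTorch
    is therefore much smaller than MXNet's, roughly 1/batch_size of it, and the
    gradients obtained from backpropagation are correspondingly smaller.
    With an sgd that also divides by batch_size, matching the original book would
    require scaling the learning rate by about batch_size: the book uses 0.5, which
    would become about 100.
    PyTorch has already divided once when computing the loss, so the sgd here does
    not divide again.
    '''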
    loss = torch.nn.CrossEntropyLoss()
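
    The difference between summing and averaging described in the note above can be checked directly. The following is a minimal sketch that uses random logits and labels as stand-ins for real data; reduction='sum' is used here only to imitate a loss summed over the batch.

```python
import torch

torch.manual_seed(0)
batch_size, num_classes = 256, 10
logits = torch.randn(batch_size, num_classes, requires_grad=True)  # stand-in for net(X)
labels = torch.randint(0, num_classes, (batch_size,))              # stand-in for y

mean_loss = torch.nn.CrossEntropyLoss()(logits, labels)                 # default: mean
sum_loss = torch.nn.CrossEntropyLoss(reduction='sum')(logits, labels)   # summed over batch

(grad_mean,) = torch.autograd.grad(mean_loss, logits)
(grad_sum,) = torch.autograd.grad(sum_loss, logits)

# Both the loss and its gradient differ by a factor of batch_size
ratio_ok = torch.allclose(sum_loss, mean_loss * batch_size, rtol=1e-4)
grads_ok = torch.allclose(grad_sum, grad_mean * batch_size, rtol=1e-4, atol=1e-6)
print(ratio_ok, grads_ok)
```

    This is why dividing the gradient by batch_size inside sgd would shrink the update a second time when the default mean reduction is used.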
    
    
    Training the model
    def evaluate_accuracy(data_iter, net):
        acc_sum, n = 0.0, 0
        for X, y in data_iter:
            acc_sum += (net(X).argmax(dim=1) == y).float().sum().item()
            n += y.shape[0]
        return acc_sum / n
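
    The accuracy computation can be traced by hand on a toy input. In the sketch below, toy_net and batches are hypothetical stand-ins: the "network" returns its input unchanged, so each row already holds class scores.

```python
import torch

def evaluate_accuracy(data_iter, net):
    acc_sum, n = 0.0, 0
    for X, y in data_iter:
        # Count predictions whose highest-scoring class matches the label
        acc_sum += (net(X).argmax(dim=1) == y).float().sum().item()
        n += y.shape[0]
    return acc_sum / n

toy_net = lambda X: X  # hypothetical identity "network"
batches = [
    (torch.tensor([[0.1, 0.9], [0.8, 0.2]]), torch.tensor([1, 0])),  # 2 of 2 correct
    (torch.tensor([[0.3, 0.7], [0.6, 0.4]]), torch.tensor([0, 0])),  # 1 of 2 correct
]
acc = evaluate_accuracy(batches, toy_net)
print(acc)  # 0.75
```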
    
    def train_ch3(net, train_iter, test_iter, loss, num_epochs, batch_size,
                  params=None, lr=None, optimizer=None):
        for epoch in range(num_epochs):
            train_l_sum, train_acc_sum, n = 0.0, 0.0, 0
            for X, y in train_iter:
                y_hat = net(X)
                l = loss(y_hat, y).sum()
    
                # Zero the gradients
                if optimizer is not None:
                    optimizer.zero_grad()
                elif params is not None and params[0].grad is not None:
                    for param in params:
                        param.grad.data.zero_()
    
                l.backward()
                if optimizer is None:
                    sgd(params, lr, batch_size)
                else:
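                    optimizer.step()  # used in the section "Concise implementation of softmax regression"

                # l is already the batch mean, so train_l_sum / n below is about
                # 1/batch_size of the average per-sample loss
                train_l_sum += l.item()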
                train_acc_sum += (y_hat.argmax(dim=1) == y).sum().item()
                n += y.shape[0]
            test_acc = evaluate_accuracy(test_iter, net)
            print('epoch %d, loss %.4f, train acc %.3f, test acc %.3f'
                  % (epoch + 1, train_l_sum / n, train_acc_sum / n, test_acc))
    
    
    
    num_epochs, lr = 5, 0.5
    train_ch3(net, train_iter, test_iter, loss, num_epochs, batch_size, params, lr)
    
    epoch 1, loss 0.0031, train acc 0.702, test acc 0.775
    epoch 2, loss 0.0019, train acc 0.821, test acc 0.807
    epoch 3, loss 0.0016, train acc 0.843, test acc 0.831
    epoch 4, loss 0.0015, train acc 0.855, test acc 0.818
    epoch 5, loss 0.0014, train acc 0.863, test acc 0.816
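
    The loss values logged above are small because train_ch3 adds up per-batch mean losses and then divides by the number of samples. The sketch below, which uses random scores and labels as stand-ins for real batches, compares that bookkeeping with one that weights each batch mean by its size; per_sample is a name introduced here for illustration.

```python
import torch

torch.manual_seed(0)
loss = torch.nn.CrossEntropyLoss()
batch_size, num_batches = 256, 4

as_logged, per_sample, n = 0.0, 0.0, 0
for _ in range(num_batches):
    y_hat = torch.randn(batch_size, 10)        # stand-in for net(X)
    y = torch.randint(0, 10, (batch_size,))    # stand-in for labels
    l = loss(y_hat, y)
    as_logged += l.item()                # what train_ch3 accumulates
    per_sample += l.item() * y.shape[0]  # weight each batch mean by its size
    n += y.shape[0]

# With equal-sized batches the two differ by exactly a factor of batch_size
print(as_logged / n, per_sample / n)
```

    Under this reading, a logged value of 0.0031 with batch_size 256 corresponds to an average per-sample loss of roughly 0.8.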
    

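    Summary

    • A simple multilayer perceptron can be implemented by hand, defining the model and its parameters manually.
    • As the number of layers grows, this style of implementation becomes noticeably tedious, especially when defining the model parameters.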
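
    One way to reduce that tedium while still managing parameters by hand is to build them in a loop. The following is only a sketch: init_params and mlp are hypothetical helpers that do not appear in the original code, the layer sizes are chosen for illustration, and torch.relu is used in place of the hand-written relu.

```python
import torch

def init_params(sizes, std=0.01):
    """Hypothetical helper: one (W, b) pair per consecutive pair of layer sizes."""
    params = []
    for fan_in, fan_out in zip(sizes[:-1], sizes[1:]):
        W = (torch.randn(fan_in, fan_out) * std).requires_grad_()
        b = torch.zeros(fan_out, requires_grad=True)
        params += [W, b]
    return params

def mlp(X, params):
    """Hypothetical forward pass: ReLU after every layer except the last."""
    H = X.view(X.shape[0], -1)
    for i in range(0, len(params), 2):
        H = torch.matmul(H, params[i]) + params[i + 1]
        if i + 2 < len(params):
            H = torch.relu(H)
    return H

params = init_params([784, 256, 128, 10])   # two hidden layers, sizes are illustrative
out = mlp(torch.rand(4, 1, 28, 28), params)
print(len(params), out.shape)  # 6 torch.Size([4, 10])
```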
  • Original article: https://www.cnblogs.com/onemorepoint/p/11811339.html