Neural Network Architecture with PyTorch: The Feedforward Neural Network

First, let's get familiar with how to implement a feedforward neural network in PyTorch. To keep things easy to follow, we use a feedforward network with only a single hidden layer as the example.

The source code of the feedforward network, with comments, is shown below. It is fairly simple, so we won't dwell on it:

    class NeuralNet(nn.Module):
        def __init__(self, input_size, hidden_size, num_classes):
            super(NeuralNet, self).__init__()
            self.fc1 = nn.Linear(input_size, hidden_size)  # input layer
            self.relu = nn.ReLU()  # hidden activation: ReLU sets every element of the input tensor that is below zero to zero
            self.fc2 = nn.Linear(hidden_size, num_classes)  # output layer

        def forward(self, x):
            out = self.fc1(x)
            out = self.relu(out)
            out = self.fc2(out)
            return out
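As a quick sanity check, here is a minimal sketch (our own addition; the sizes are illustrative and happen to match the MNIST setup used later) that instantiates the network and pushes a dummy batch through it:

    import torch
    import torch.nn as nn

    # Instantiate the network defined above and run a random batch through it.
    net = NeuralNet(input_size=784, hidden_size=500, num_classes=10)
    dummy = torch.randn(4, 784)   # a batch of 4 flattened 28x28 "images"
    logits = net(dummy)           # invokes forward() under the hood
    print(logits.shape)           # torch.Size([4, 10]) - one raw score per class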

Next, let's look at how to instantiate and use the feedforward network. To improve computational efficiency, the network should run on the GPU when one is available. The input size here has to be consistent with the (flattened) training images, and the hidden size with the model definition.

    # Device configuration
    device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
    
    model = NeuralNet(input_size, hidden_size, num_classes).to(device)
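To confirm which device the model actually landed on, a quick check (a small sketch we add here, not part of the original) is to inspect one of its parameters:

    # All of the model's parameters were moved by .to(device).
    print(next(model.parameters()).device)   # e.g. cuda:0 if a GPU is available, otherwise cpu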

To train the network, we need to define a loss function that measures how well the model solves the problem: the smaller the loss, the smaller the gap between the model's output and the ground truth. Here we use CrossEntropyLoss() to compute it. For optimization we use Adam, an algorithm that optimizes stochastic objective functions based on first-order gradients. We will analyze the details and the derivation in a dedicated later post.

    criterion = nn.CrossEntropyLoss()  # for single-target classification; combines nn.LogSoftmax() and nn.NLLLoss() to compute the loss
    optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)  # optimizer: sets the learning rate and the model parameters to update
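To make the "combines nn.LogSoftmax() and nn.NLLLoss()" remark concrete, here is a small sketch of our own verifying that the two formulations produce the same loss value:

    import torch
    import torch.nn as nn

    logits = torch.randn(3, 10)           # raw scores for 3 samples, 10 classes
    targets = torch.tensor([1, 0, 4])     # ground-truth class indices

    ce = nn.CrossEntropyLoss()(logits, targets)
    nll = nn.NLLLoss()(nn.LogSoftmax(dim=1)(logits), targets)
    print(torch.allclose(ce, nll))        # True - the two formulations agree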

Next we train the model. This part is a little convoluted, so let's look at the code first and go through the individual functions afterwards:

    total_step = len(train_loader)
    for epoch in range(num_epochs):
        for i, (images, labels) in enumerate(train_loader):
            # Move tensors to the configured device
            images = images.reshape(-1, 28*28).to(device)
            labels = labels.to(device)

            # Forward pass
            outputs = model(images)
            loss = criterion(outputs, labels)

            # Backward and optimize
            optimizer.zero_grad()  # zero the gradients, i.e. reset the derivatives of the loss w.r.t. the weights to 0
            loss.backward()
            optimizer.step()

To train the model, each image is first reshaped into a flattened 28*28 (= 784) vector; the tensors are then moved onto the configured device.

Then comes the network's forward pass:

    outputs = model(images)

Feeding the resulting outputs, together with the labels we loaded, into the loss function then yields the loss:

    loss = criterion(outputs, labels)

Once the loss is computed, it has to be backpropagated. Note that this only happens during training; at test time there is only the forward pass.

    loss.backward()

Backpropagating the loss computes the gradients, and the parameters then need to be updated from those gradients; optimizer.step() is what performs that update. After optimizer.step(), you can inspect the gradients and weight values of each layer through optimizer.param_groups[0]['params'].

    optimizer.step()
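As a small illustration (our own sketch, reusing the model and optimizer above, and assuming loss.backward() has already run), you can peek at those parameters and their gradients like this:

    # optimizer.param_groups is a list of dicts; 'params' holds the tensors being optimized.
    for p in optimizer.param_groups[0]['params']:
        print(p.shape, p.grad.abs().mean())   # each weight/bias shape and its mean gradient magnitude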

Finally we test the model. Running it without gradient tracking greatly reduces memory usage and improves computational efficiency. The test code really hinges on a single key statement that produces the predictions: _, predicted = torch.max(outputs.data, 1).

    with torch.no_grad():
        correct = 0
        total = 0
        for images, labels in test_loader:
            images = images.reshape(-1, 28*28).to(device)
            labels = labels.to(device)
            outputs = model(images)
            _, predicted = torch.max(outputs.data, 1)
            total += labels.size(0)
            print(labels.size(0))
            correct += (predicted == labels).sum().item()

One question remains: how does the trained network connect to prediction?
The outputs produced by the network are torch.autograd.Variable objects holding the raw scores of the final fully connected layer. Beyond that, we want to know which class the model predicts for each sample, and torch.max provides it. The first input to torch.max() has to be a plain tensor, which is why outputs.data rather than outputs is passed in; the second argument, 1, is the dim, meaning we take the maximum of each row, which is the familiar trick of picking the index with the highest score. torch.max() returns two values: the maximum scores themselves (discarded as _ above) and their indices, which are exactly the predicted class labels.
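A small sketch (ours, with made-up scores) makes the two return values explicit:

    import torch

    outputs = torch.tensor([[0.1, 2.5, 0.3],
                            [1.2, 0.4, 3.0]])   # scores for 2 samples, 3 classes
    values, predicted = torch.max(outputs, 1)   # max over dim=1, i.e. over each row
    print(values)      # tensor([2.5000, 3.0000]) - the maximum scores
    print(predicted)   # tensor([1, 2]) - the indices, i.e. the predicted classes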
The full source code:

    import torch
    import torch.nn as nn
    import torchvision
    import torchvision.transforms as transforms


    # Device configuration
    device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

    # Hyper-parameters
    input_size = 784
    hidden_size = 500
    num_classes = 10
    #input_size = 84
    #hidden_size = 50
    #num_classes = 2
    num_epochs = 5
    batch_size = 100
    learning_rate = 0.001

    # MNIST dataset
    train_dataset = torchvision.datasets.MNIST(root='../../data',
                                               train=True,
                                               transform=transforms.ToTensor(),
                                               download=True)

    test_dataset = torchvision.datasets.MNIST(root='../../data',
                                              train=False,
                                              transform=transforms.ToTensor())

    # Data loader
    train_loader = torch.utils.data.DataLoader(dataset=train_dataset,
                                               batch_size=batch_size,
                                               shuffle=True)

    test_loader = torch.utils.data.DataLoader(dataset=test_dataset,
                                              batch_size=batch_size,
                                              shuffle=False)

    # Fully connected neural network with one hidden layer
    class NeuralNet(nn.Module):
        def __init__(self, input_size, hidden_size, num_classes):
            super(NeuralNet, self).__init__()
            self.fc1 = nn.Linear(input_size, hidden_size)
            self.relu = nn.ReLU()
            self.fc2 = nn.Linear(hidden_size, num_classes)

        def forward(self, x):
            out = self.fc1(x)
            out = self.relu(out)
            out = self.fc2(out)
            return out

    model = NeuralNet(input_size, hidden_size, num_classes).to(device)

    # Loss and optimizer
    criterion = nn.CrossEntropyLoss()
    optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)

    # Train the model
    total_step = len(train_loader)
    for epoch in range(num_epochs):
        for i, (images, labels) in enumerate(train_loader):
            # Move tensors to the configured device
            images = images.reshape(-1, 28*28).to(device)
            labels = labels.to(device)

            # Forward pass
            outputs = model(images)
            loss = criterion(outputs, labels)

            # Backward and optimize
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()

            if (i+1) % 100 == 0:
                print('Epoch [{}/{}], Step [{}/{}], Loss: {:.4f}'
                      .format(epoch+1, num_epochs, i+1, total_step, loss.item()))

    # Test the model
    # In test phase, we don't need to compute gradients (for memory efficiency)
    with torch.no_grad():
        correct = 0
        total = 0
        for images, labels in test_loader:
            images = images.reshape(-1, 28*28).to(device)
            labels = labels.to(device)
            outputs = model(images)
            _, predicted = torch.max(outputs.data, 1)
            total += labels.size(0)
            #print(predicted)
            correct += (predicted == labels).sum().item()

        print('Accuracy of the network on the 10000 test images: {} %'.format(100 * correct / total))

    # Save the model checkpoint
    torch.save(model.state_dict(), 'model.ckpt')
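As a follow-up sketch (not in the original post): to reuse the saved checkpoint later, rebuild the same architecture and load the weights back in:

    # Re-create the architecture, then restore the trained weights from disk.
    model = NeuralNet(input_size, hidden_size, num_classes).to(device)
    model.load_state_dict(torch.load('model.ckpt', map_location=device))
    model.eval()   # switch to evaluation mode before running inference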

Quote of the day: What others fear, one must also fear.

