  • [Deep Learning] PyTorch Study Notes

    Course video: https://www.youtube.com/watch?v=ogZi5oIo4fI
    Youdao Cloud notes: http://note.youdao.com/noteshare?id=d86bd8fc60cb4fe87005a2d2e2d5b70d&sub=6911732F9FA44C68AD53A09072155ED3

    PyTorch Lecture 05: Linear Regression in the PyTorch Way

    Part 1: build your model as a class; you need to write the forward function

    import torch
    
    x_data = torch.tensor([[1.0], [2.0], [3.0]])
    y_data = torch.tensor([[2.0], [4.0], [6.0]])
    
    
    class Model(torch.nn.Module):
    
        def __init__(self):
            """
            In the constructor we instantiate one nn.Linear module
            """
            super(Model, self).__init__()
            self.linear = torch.nn.Linear(1, 1)  # One in and one out
    
        def forward(self, x):
            """
            In the forward function we accept a tensor of input data and we must
            return a tensor of output data. We can use Modules defined in the
            constructor as well as arbitrary operators on tensors.
            """
            y_pred = self.linear(x)
            return y_pred
    
    # our model
    model = Model()
    

    Part 2: construct the loss and the optimizer used to fit the parameters

    
    # Construct our loss function and an Optimizer. The call to model.parameters()
    # in the SGD constructor will contain the learnable parameters of the
    # nn.Linear module which is a member of the model.
    # criterion: the criterion used to compute the loss
    # (reduction='sum' replaces the deprecated size_average=False)
    criterion = torch.nn.MSELoss(reduction='sum')
    # optimizer
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
    

    Part 3: train: forward -> backward -> update parameters

    # Training loop
    for epoch in range(1000):
        # Forward pass: Compute predicted y by passing x to the model
        y_pred = model(x_data)
    
        # Compute and print loss
        loss = criterion(y_pred, y_data)
        print(epoch, loss.item())
    
        # Zero gradients, perform a backward pass, and update the weights.
        # reset the gradients accumulated in the previous step
        optimizer.zero_grad()
        # backward pass
        loss.backward()
        # update the weights held by the optimizer, i.e. model.parameters()
        optimizer.step()
    

    Part 4: test

    # After training
    hour_var = torch.tensor([[4.0]])
    y_pred = model(hour_var)
    print("predict (after training)", 4, y_pred.item())
    
    

    Summary of the basic training framework:

    1. Build your model by writing a class
    2. Construct the loss and the optimizer
    3. Train: Forward -> compute loss -> backward -> update
    • Forward: y_pred = model(x_data)
    • Compute loss: loss = criterion(y_pred,y_data)
    • Backward: optimizer.zero_grad() && loss.backward()
    • Update: optimizer.step()

    Exercise: try other optimizers:

    • torch.optim.Adagrad
    • torch.optim.Adam
    • torch.optim.Adamax
    • torch.optim.ASGD
    • torch.optim.LBFGS
    • torch.optim.RMSprop
    • torch.optim.Rprop
    • torch.optim.SGD
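
    One way to approach this exercise is a small loop like the one below. It is only a sketch: the helper `fit`, the 200 epochs, and the shared learning rate of 0.01 are illustrative assumptions, not part of the lecture. torch.optim.LBFGS is left out because its step() expects a closure that re-evaluates the loss.

```python
import torch

x_data = torch.tensor([[1.0], [2.0], [3.0]])
y_data = torch.tensor([[2.0], [4.0], [6.0]])


def fit(optimizer_cls, epochs=200, lr=0.01):
    """Train a fresh 1-in/1-out linear model; return (initial, final) loss."""
    torch.manual_seed(0)  # same initial weights for every optimizer
    model = torch.nn.Linear(1, 1)
    criterion = torch.nn.MSELoss()
    optimizer = optimizer_cls(model.parameters(), lr=lr)
    initial_loss = criterion(model(x_data), y_data).item()
    for _ in range(epochs):
        loss = criterion(model(x_data), y_data)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    final_loss = criterion(model(x_data), y_data).item()
    return initial_loss, final_loss


optimizers = [torch.optim.Adagrad, torch.optim.Adam, torch.optim.Adamax,
              torch.optim.ASGD, torch.optim.RMSprop, torch.optim.Rprop,
              torch.optim.SGD]
results = {cls.__name__: fit(cls) for cls in optimizers}
for name, (initial_loss, final_loss) in results.items():
    print(name, initial_loss, final_loss)
```

    Comparing the final losses gives a rough feel for how fast each optimizer converges on this toy problem; a fair comparison would tune the learning rate per optimizer.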

    Logistic Regression - binary classification

    Before:

    graph LR
    
    x-->Linear
    Linear-->y
    
    $$\hat{y} = x * w + b$$
    
    $$loss = \frac{1}{N}\sum_{n=1}^{N}(\hat{y}_n-y_n)^2$$
    

    Activation function:

    using sigmoid functions:

    graph LR
    
    x --> Linear
    Linear --> Sigmoid
    Sigmoid --> y
    

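    Y lies in [0, 1]: the sigmoid squashes the linear output into this range, so it can be read as the probability of the positive class

    $$\sigma(z) = \frac{1}{1+e^{-z}}$$
    
    $$\hat{y} = \sigma(x * w + b)$$
    
    $$loss = -\frac{1}{N}\sum_{n=1}^{N}\left[y_n\log\hat{y}_n + (1-y_n)\log(1-\hat{y}_n)\right]$$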
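
    As a quick sanity check of the two formulas, the sketch below evaluates them by hand and compares them with torch.sigmoid and torch.nn.BCELoss. The numbers in `y_pred`, `y_true`, and `z` are made up for illustration.

```python
import torch

y_pred = torch.tensor([0.9, 0.2, 0.6])   # y_hat: made-up sigmoid outputs
y_true = torch.tensor([1.0, 0.0, 1.0])   # y: made-up binary labels

# loss = -1/N * sum(y*log(y_hat) + (1-y)*log(1-y_hat))
manual = -(y_true * torch.log(y_pred)
           + (1 - y_true) * torch.log(1 - y_pred)).mean()
builtin = torch.nn.BCELoss()(y_pred, y_true)
print(manual.item(), builtin.item())

# sigma(z) = 1 / (1 + exp(-z)) should match torch.sigmoid
z = torch.tensor([-2.0, 0.0, 2.0])
manual_sigmoid = 1 / (1 + torch.exp(-z))
print(manual_sigmoid, torch.sigmoid(z))
```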
    

    Code:

    import torch
    
    x_data = torch.tensor([[1.0], [2.0], [3.0], [4.0], [5.0]])
    y_data = torch.tensor([[0.], [0.], [1.], [1.], [1.]])
    
    
    class Model(torch.nn.Module):
    
        def __init__(self):
            """
            In the constructor we instantiate one nn.Linear module
            """
            super(Model, self).__init__()
            self.linear = torch.nn.Linear(1, 1)  # One in and one out
    
        def forward(self, x):
            """
            In the forward function we accept a tensor of input data and we must
            return a tensor of output data.
            """
            # torch.sigmoid replaces the deprecated F.sigmoid
            y_pred = torch.sigmoid(self.linear(x))
            return y_pred
    
    # our model
    model = Model()
    
    
    # Construct our loss function and an Optimizer. The call to model.parameters()
    # in the SGD constructor will contain the learnable parameters of the
    # nn.Linear module which is a member of the model.
    # (reduction='mean' replaces the deprecated size_average=True)
    criterion = torch.nn.BCELoss(reduction='mean')
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
    
    # Training loop
    for epoch in range(400):
        # Forward pass: Compute predicted y by passing x to the model
        y_pred = model(x_data)
    
        # Compute and print loss
        loss = criterion(y_pred, y_data)
        print(epoch, loss.item())
    
        # Zero gradients, perform a backward pass, and update the weights.
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    
    # After training
    hour_var = torch.tensor([[1.0]])
    print("predict 1 hour ", 1.0, model(hour_var).item() > 0.5)
    hour_var = torch.tensor([[7.0]])
    print("predict 7 hours", 7.0, model(hour_var).item() > 0.5)
    

    What changes when the activation function is added:

    1. Design your model using class

    y_pred = torch.sigmoid(self.linear(x))

    2. Construct loss and optimizer

    change loss into:

    criterion = torch.nn.BCELoss(reduction='mean')

    3. Training cycle (forward, backward, update)

    Exercise: try other activation functions:

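    • ReLU

    ReLU is short for Rectified Linear Unit. It has been used heavily in deep learning in recent years and mitigates the vanishing-gradient problem, because its derivative is either 1 or 0. Compared with the sigmoid and tanh activations, the gradient of ReLU is trivial to compute and the function itself is cheap, which can greatly speed up the convergence of stochastic gradient descent (ReLU is piecewise linear and does not saturate for positive inputs, whereas sigmoid and tanh saturate at both ends). The drawback is that ReLU units are fragile: as training proceeds, neurons can "die". For example, after a very large gradient flows through a ReLU unit, the weights may be updated so that no data point can ever activate it again. If that happens, the gradient flowing through the neuron is 0 from that point on; the ReLU neuron has died irreversibly during training.

    • ReLU6
    • ELU

    ELU equals x itself for positive inputs, which mitigates vanishing gradients (the derivative is 1 everywhere for x > 0), just like ReLU and Leaky ReLU. For negative inputs, ELU saturates softly as the input becomes more negative, which improves robustness to noise.

    • SELU
    • PReLU
    • LeakyReLU

    Leaky ReLU is mainly meant to avoid vanishing gradients: when the neuron is inactive it still allows a small non-zero gradient, so the gradient does not vanish and convergence is fast. Its other pros and cons are similar to those of ReLU.

    • Threshold
    • Hardtanh
    • Sigmoid

    This is probably the most frequently used activation function in neural networks. It squashes a real number into the range 0 to 1: a very large input gives a result close to 1, and a very negative input gives a result close to 0. It was used a lot in early neural networks because it models well whether a neuron fires after being stimulated and passes the signal on (0: barely activated, 1: fully activated). In recent years it appears less often in deep learning, because sigmoid easily leads to vanishing or saturated gradients. In a network with many layers, if every layer uses sigmoid, the gradients vanish: backpropagation multiplies by the derivative of the sigmoid (at most 0.25) at each layer, so the gradient keeps shrinking. If the input is very large or very small (for example 100, where the sigmoid output is close to 1 and the gradient is close to 0), the unit saturates and the neuron behaves as if it were dead.

    • Tanh

    The tanh function squashes its input into the range -1 to 1. Like sigmoid, it suffers from vanishing or saturated gradients.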
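
    To see how some of these functions behave, the sketch below applies their torch.nn.functional versions to one made-up input vector, then compares the gradient of sigmoid and ReLU at a large input (100, the value used in the sigmoid description). The input values are arbitrary.

```python
import torch
import torch.nn.functional as F

x = torch.tensor([-3.0, -0.5, 0.0, 0.5, 8.0])  # arbitrary sample inputs

outputs = {
    "relu": F.relu(x),
    "relu6": F.relu6(x),
    "leaky_relu": F.leaky_relu(x, negative_slope=0.01),
    "elu": F.elu(x),
    "hardtanh": F.hardtanh(x),
    "tanh": torch.tanh(x),
    "sigmoid": torch.sigmoid(x),
}
for name, out in outputs.items():
    print(name, out)

# Saturation: at a large input the sigmoid gradient is (almost) zero,
# while the ReLU gradient is still 1.
z = torch.tensor(100.0, requires_grad=True)
torch.sigmoid(z).backward()
sigmoid_grad = z.grad.item()

r = torch.tensor(100.0, requires_grad=True)
F.relu(r).backward()
relu_grad = r.grad.item()
print(sigmoid_grad, relu_grad)
```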

    Lecture 07: How to make a neural network wide and deep?

    graph LR
    a-->Linear
    b-->Linear
    Linear-->Sigmoid
    Sigmoid-->y
    
    

    For a wider (multi-dimensional input) and deeper network, the changes are mainly in the "Design your model using class" step

    import torch
    import numpy as np
    
    xy = np.loadtxt('./data/diabetes.csv.gz', delimiter=',', dtype=np.float32)
    x_data = torch.from_numpy(xy[:, 0:-1])
    y_data = torch.from_numpy(xy[:, [-1]])
    
    print(x_data.shape)
    print(y_data.shape)
    
    
    class Model(torch.nn.Module):
    
        def __init__(self):
            """
            In the constructor we instantiate three nn.Linear modules
            """
            super(Model, self).__init__()
            self.l1 = torch.nn.Linear(8, 6)
            self.l2 = torch.nn.Linear(6, 4)
            self.l3 = torch.nn.Linear(4, 1)
    
            self.sigmoid = torch.nn.Sigmoid()
    
        def forward(self, x):
            """
            In the forward function we accept a tensor of input data and we must
            return a tensor of output data. We can use Modules defined in the
            constructor as well as arbitrary operators on tensors.
            """
            out1 = self.sigmoid(self.l1(x))
            out2 = self.sigmoid(self.l2(out1))
            y_pred = self.sigmoid(self.l3(out2))
            return y_pred
    
    # our model
    model = Model()
    
    
    # Construct our loss function and an Optimizer. The call to model.parameters()
    # in the SGD constructor will contain the learnable parameters of the three
    # nn.Linear modules which are members of the model.
    # (reduction='mean' replaces size_average=True and 'elementwise_mean')
    criterion = torch.nn.BCELoss(reduction='mean')
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
    
    # Training loop
    for epoch in range(1200000):
        # Forward pass: Compute predicted y by passing x to the model
        y_pred = model(x_data)
    
        # Compute and print loss
        loss = criterion(y_pred, y_data)
        print(epoch, loss.item())
    
        # Zero gradients, perform a backward pass, and update the weights.
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    

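    Exercise:

    1. Train a deeper network with more than 10 layers
      Observation: going deeper did not make the results better
    2. Change the activation function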

    Lecture 08: Pytorch DataLoader

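    Building a Dataset mainly involves three steps:

    Inherit from Dataset:

    1. download, read data, etc. (__init__)
    2. return one item for a given index (__getitem__)
    3. return the data length (__len__)

    Instantiate the dataset and use it in a DataLoader: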

    train_loader = DataLoader(dataset=dataset,
                              batch_size=1,
                              shuffle=True,
                              num_workers=1)
    

    Code:

    # References
    # https://github.com/yunjey/pytorch-tutorial/blob/master/tutorials/01-basics/pytorch_basics/main.py
    # http://pytorch.org/tutorials/beginner/data_loading_tutorial.html#dataset-class
    import torch
    import numpy as np
    from torch.autograd import Variable
    from torch.utils.data import Dataset, DataLoader
    
    
    class DiabetesDataset(Dataset):
        """ Diabetes dataset."""
    
        # Initialize your data, download, etc.
        def __init__(self):
            xy = np.loadtxt('./data/diabetes.csv.gz',
                            delimiter=',', dtype=np.float32)
            self.len = xy.shape[0]
            self.x_data = torch.from_numpy(xy[:, 0:-1])
            self.y_data = torch.from_numpy(xy[:, [-1]])
    
        def __getitem__(self, index):
            return self.x_data[index], self.y_data[index]
    
        def __len__(self):
            return self.len
    
    
    dataset = DiabetesDataset()
    train_loader = DataLoader(dataset=dataset,
                              batch_size=1,
                              shuffle=True,
                              num_workers=1)
    
    for epoch in range(2):
        for i, data in enumerate(train_loader, 0):
            # get the inputs
            inputs, labels = data
    
            # DataLoader already yields tensors, so no Variable wrapper is needed
    
            # Run your training process
            print(epoch, i, "inputs", inputs, "labels", labels)
    
    

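    Exercise:
    Use another dataset, MNIST, with reference to the code on the official site:

    Summary of the training workflow:

    1. Write your own dataset class that inherits from Dataset
    • [ ] Read the dataset, np.loadtxt("datas.csv"), and build trainset and testset
    • [ ] Build the DataLoaders: trainLoader, testLoader
    • [ ] Fetch data from a DataLoader: dataiter = iter(trainloader); images, labels = next(dataiter)
    • [ ] Train
    • [ ] Test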
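
    The data-handling steps of this checklist can be sketched as follows. Everything here is a stand-in: `ToyDataset` and its random data are hypothetical and replace the CSV file, and the 80/20 split and batch size of 16 are arbitrary choices.

```python
import torch
from torch.utils.data import Dataset, DataLoader, random_split


class ToyDataset(Dataset):
    """Hypothetical stand-in for a CSV-backed dataset: 100 rows, 8 features."""

    def __init__(self):
        generator = torch.Generator().manual_seed(0)
        self.x_data = torch.randn(100, 8, generator=generator)
        self.y_data = (self.x_data.sum(dim=1, keepdim=True) > 0).float()

    def __getitem__(self, index):
        return self.x_data[index], self.y_data[index]

    def __len__(self):
        return len(self.x_data)


dataset = ToyDataset()
trainset, testset = random_split(dataset, [80, 20])
trainloader = DataLoader(trainset, batch_size=16, shuffle=True)
testloader = DataLoader(testset, batch_size=16, shuffle=False)

# Fetch one batch; next(iterator) is the built-in way to advance it
dataiter = iter(trainloader)
inputs, labels = next(dataiter)
print(inputs.shape, labels.shape)
```

    Training and testing then loop over trainloader and testloader in the same way as the loop shown earlier in this lecture.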

    Lecture 09: softmax Classifier

    part one

    MNIST softmax

    before:

    graph LR
    
    x{x} --> Linear
    Linear --> Activation
    Activation --> ...
    ... --> Linear2
    Linear2-->Activation2
    Activation2-->h{y}
    

    now:

    graph LR
    
    x{x} --> Linear
    Linear --> Activation
    Activation --> ...
    ... --> Linear2
    Linear2-->Activation2
    Activation2-->P_y=0
    Activation2-->P_y=1
    Activation2-->....
    Activation2-->P_y=9
    

    what is softmax?

    
    $$\sigma(z)_j = \frac{e^{z_j}}{\sum_{k=1}^{K}e^{z_k}} \quad \text{for } j = 1, 2, \ldots, K$$
    
    

    using softmax to get probabilities.

    what is cross entropy?

    $$loss = \frac{1}{N}\sum_i D(Softmax(wx_i+b), Y_i)$$
    
    $$D(\hat{Y}, Y) = -Y\log\hat{Y}$$
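
    The two definitions can be checked by hand with a few lines of plain Python. This is a minimal sketch: the helper names `softmax` and `cross_entropy` and the example scores are made up, and the max-shift inside `softmax` is a common numerical-stability trick rather than part of the formula.

```python
import math


def softmax(logits):
    """sigma(z)_j = exp(z_j) / sum_k exp(z_k), shifted by max(z) for stability."""
    shift = max(logits)
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, one_hot):
    """D(Y_hat, Y) = -sum_j Y_j * log(Y_hat_j)."""
    return -sum(y * math.log(p) for p, y in zip(probs, one_hot) if y > 0)


logits = [2.0, 1.0, 0.1]                        # made-up scores for 3 classes
probs = softmax(logits)
loss_correct = cross_entropy(probs, [1, 0, 0])  # true class has the top score
loss_wrong = cross_entropy(probs, [0, 0, 1])    # true class has the lowest score
print(probs, loss_correct, loss_wrong)
```

    The loss is small when the true class receives the highest probability and grows as that probability shrinks.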
    

    The whole pipeline:

    graph LR
    x--LinearModel-->Z
    Z--Softmax-->y'
    y'--Cross_Entropy-->Y
    

    Implementation in PyTorch:

    loss = torch.nn.CrossEntropyLoss()
    This combines both Softmax and Cross_Entropy

    graph LR
    X--Softmax-->y'
    y'--Cross_Entropy-->Y
    

    Code:

    import torch
    import torch.nn as nn
    import torch.nn.functional as F
    import torch.optim as optim
    from torchvision import datasets, transforms
    from torch.autograd import Variable
    
    
    # Cross entropy example
    import numpy as np
    # One hot
    # 0: 1 0 0
    # 1: 0 1 0
    # 2: 0 0 1
    Y = np.array([1, 0, 0])
    
    Y_pred1 = np.array([0.7, 0.2, 0.1])
    Y_pred2 = np.array([0.1, 0.3, 0.6])
    print("loss1 = ", np.sum(-Y * np.log(Y_pred1)))
    print("loss2 = ", np.sum(-Y * np.log(Y_pred2)))
    
    ################################################################################
    
    # Softmax + CrossEntropy (logSoftmax + NLLLoss)
    loss = nn.CrossEntropyLoss()
    
    # target is of size nBatch
    # each element in target has to have 0 <= value < nClasses (0-2)
    # Input is class, not one-hot
    Y = torch.tensor([0])
    
    # input is of size nBatch x nClasses = 1 x 3
    # Y_pred are logits (not softmax)
    Y_pred1 = torch.tensor([[2.0, 1.0, 0.1]])
    Y_pred2 = torch.tensor([[0.5, 2.0, 0.3]])
    
    l1 = loss(Y_pred1, Y)
    l2 = loss(Y_pred2, Y)
    
    print("PyTorch Loss1 = ", l1.item(), "\nPyTorch Loss2=", l2.item())
    
    print("Y_pred1=", torch.max(Y_pred1, 1)[1])
    print("Y_pred2=", torch.max(Y_pred2, 1)[1])
    
    
    ################################################################################
    """Batch loss"""
    # target is of size nBatch
    # each element in target has to have 0 <= value < nClasses (0-2)
    # Input is class, not one-hot
    Y = torch.tensor([2, 0, 1])
    
    # input is of size nBatch x nClasses = 3 x 3
    # Y_pred are logits (not softmax)
    Y_pred1 = torch.tensor([[0.1, 0.2, 0.9],
                            [1.1, 0.1, 0.2],
                            [0.2, 2.1, 0.1]])
    
    
    Y_pred2 = torch.tensor([[0.8, 0.2, 0.3],
                            [0.2, 0.3, 0.5],
                            [0.2, 0.2, 0.5]])
    
    l1 = loss(Y_pred1, Y)
    l2 = loss(Y_pred2, Y)
    
    print("Batch Loss1 = ", l1.item(), "\nBatch Loss2=", l2.item())
    
    

    Exercise: CrossEntropyLoss vs. NLLLoss?
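
    A short experiment that hints at the answer: nn.CrossEntropyLoss takes raw logits, while nn.NLLLoss expects log-probabilities, so applying nn.LogSoftmax first should give the same value. The logits and targets below are made up.

```python
import torch
import torch.nn as nn

logits = torch.tensor([[2.0, 1.0, 0.1],
                       [0.5, 2.0, 0.3]])   # made-up raw scores, nBatch x nClasses
target = torch.tensor([0, 1])              # class indices, not one-hot

# CrossEntropyLoss works directly on the logits
ce = nn.CrossEntropyLoss()(logits, target)

# NLLLoss needs log-probabilities, so LogSoftmax is applied first
log_probs = nn.LogSoftmax(dim=1)(logits)
nll = nn.NLLLoss()(log_probs, target)
print(ce.item(), nll.item())
```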

    part two : real problem - MNIST input

    MNIST Network

    graph LR
    inputLayer -.-> HiddenLayer
    HiddenLayer -.-> OutputLayer
    

    Code:

    # https://github.com/pytorch/examples/blob/master/mnist/main.py
    from __future__ import print_function
    import torch
    import torch.nn as nn
    import torch.nn.functional as F
    import torch.optim as optim
    from torchvision import datasets, transforms
    from torch.autograd import Variable
    
    # Training settings
    batch_size = 16
    
    # MNIST Dataset
    train_dataset = datasets.MNIST(root='./mnist_data/',
                                   train=True,
                                   transform=transforms.ToTensor(),
                                   download=True)
    
    test_dataset = datasets.MNIST(root='./mnist_data/',
                                  train=False,
                                  transform=transforms.ToTensor())
    
    # Data Loader (Input Pipeline)
    train_loader = torch.utils.data.DataLoader(dataset=train_dataset,
                                               batch_size=batch_size,
                                               shuffle=True)
    
    test_loader = torch.utils.data.DataLoader(dataset=test_dataset,
                                              batch_size=batch_size,
                                              shuffle=False)
    
    
    class Net(nn.Module):
    
        def __init__(self):
            super(Net, self).__init__()
            self.l1 = nn.Linear(784, 520)
            self.l2 = nn.Linear(520, 320)
            self.l3 = nn.Linear(320, 240)
            self.l4 = nn.Linear(240, 120)
            self.l5 = nn.Linear(120, 10)
    
        def forward(self, x):
            x = x.view(-1, 784)  # Flatten the data (n, 1, 28, 28)-> (n, 784)
            x = F.relu(self.l1(x))
            x = F.relu(self.l2(x))
            x = F.relu(self.l3(x))
            x = F.relu(self.l4(x))
            return self.l5(x)
    
    
    model = Net()
    
    criterion = nn.CrossEntropyLoss()
    optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.5)
    
    
    def train(epoch):
        model.train()
        for batch_idx, (data, target) in enumerate(train_loader):
            optimizer.zero_grad()
            output = model(data)
            loss = criterion(output, target)
            loss.backward()
            optimizer.step()
            if batch_idx % 10 == 0:
                print('Train Epoch: {} [{}/{} ({:.0f}%)]\tLoss: {:.6f}'.format(
                    epoch, batch_idx * len(data), len(train_loader.dataset),
                    100. * batch_idx / len(train_loader), loss.item()))
    
    
    def test():
        model.eval()
        test_loss = 0
        correct = 0
        # no_grad() replaces the removed volatile=True flag
        with torch.no_grad():
            for data, target in test_loader:
                output = model(data)
                # sum up the per-sample losses so the division below is an average
                test_loss += F.cross_entropy(output, target, reduction='sum').item()
                # get the index of the max
                pred = output.max(1, keepdim=True)[1]
                correct += pred.eq(target.view_as(pred)).sum().item()
    
        test_loss /= len(test_loader.dataset)
        print('\nTest set: Average loss: {:.4f}, Accuracy: {}/{} ({:.0f}%)\n'.format(
            test_loss, correct, len(test_loader.dataset),
            100. * correct / len(test_loader.dataset)))
    
    
    for epoch in range(1, 10):
        train(epoch)
        test()
    

    Exercise:

    Use DataLoader

    Lecture 10 : basic CNN

    Simple convolution layer

    for Example:

    graph LR
    3*3*1_image-->2*2*1_filter_W
    3*3*1_image-->1*1_Stride
    3*3*1_image-->NoPadding
    NoPadding-->2*2_featureMap
    2*2*1_filter_W-->2*2_featureMap
    1*1_Stride-->2*2_featureMap
    

    How to handle multi-channel images?

    • 32 * 32 * 3 image
    • 5 * 5 * 3 filter W
    $$w^T x + b$$
    

    Get: 28 * 28 * 1 feature map * N (the number of filters used)

    Formula

    
    $$OutputSize = \frac{InputSize + PaddingSize \times 2 - FilterSize}{Stride} + 1$$
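
    The formula is easy to turn into a helper for checking layer sizes. In this sketch the function name `conv_output_size` is made up, and the floor division is an assumption about how a non-integer result is rounded.

```python
def conv_output_size(input_size, filter_size, padding=0, stride=1):
    """OutputSize = (InputSize + PaddingSize*2 - FilterSize) / Stride + 1."""
    # Floor division: assumes a partial window at the border is dropped
    return (input_size + 2 * padding - filter_size) // stride + 1


print(conv_output_size(32, 5))             # 32*32 image, 5*5 filter
print(conv_output_size(3, 2))              # 3*3 image, 2*2 filter
print(conv_output_size(28, 5, padding=2))  # padding=2 keeps a 5*5 conv size-preserving
```

    The first two calls reproduce the 28 * 28 and 2 * 2 feature maps from the examples above.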
    
    

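    A few components that need explanation:

    CONV

    Convolution layer; it should be used together with an activation function
    The output size follows from the filter size, padding, and stride via the formula above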

    torch.nn.Conv2d(in_channels,out_channels,kernel_size)

    self.conv1=nn.Conv2d(1,10,kernel_size=5)

    Activation function

    activation functions

    Max Pooling

    Takes the largest value inside each n*m window as the pooling result
    Average pooling works similarly

    nn.MaxPool2d(kernel_size)

    self.mp = nn.MaxPool2d(2)

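    Fully connected layer

    self.fc = nn.Linear(320,10)

    Difference between a CNN and a fully connected network

    In a CNN, a neuron is not connected to every pixel, only to a local patch

    In a fully connected network, every neuron is connected to every pixel.

    Implementation of a simple CNN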

    graph TB
    
    
    ConvolutionalLayer1 --> PoolingLayer1
    PoolingLayer1 --> ConvolutionalLayer2
    ConvolutionalLayer2 --> PoolingLayer2
    PoolingLayer2 --> Fully-ConnectedLayer
    
    

    Model:

    class Net(nn.Module):
        def __init__(self):
            super(Net,self).__init__()
            self.conv1 = nn.Conv2d(1,10,kernel_size=5)
            self.conv2 = nn.Conv2d(10,20,kernel_size=5)
            self.mp = nn.MaxPool2d(2)
            self.fc = nn.Linear(???,10)
        def forward(self,x):
            in_size = x.size(0)
            x = F.relu(self.mp(self.conv1(x)))
            x = F.relu(self.mp(self.conv2(x)))
            x = x.view(in_size,-1) # flatten the tensor
            x = self.fc(x)
            return F.log_softmax(x, dim=1)
    

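    How to fill in ???

    • Put any number in place of ??? first, then correct it from the error message the program raises
    • Alternatively, print(x.size()) inside the forward function to get the dimensions of the tensor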
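
    The second approach can be sketched outside the model by pushing one dummy image through the same layers. The 28 * 28 single-channel input is an assumption (the MNIST image size); the layer definitions are copied from the model above.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

conv1 = nn.Conv2d(1, 10, kernel_size=5)
conv2 = nn.Conv2d(10, 20, kernel_size=5)
mp = nn.MaxPool2d(2)

# One dummy MNIST-sized image: (batch, channels, height, width)
x = torch.zeros(1, 1, 28, 28)
with torch.no_grad():
    x = F.relu(mp(conv1(x)))   # -> (1, 10, 12, 12)
    x = F.relu(mp(conv2(x)))   # -> (1, 20, 4, 4)

# Number of features per sample after flattening
flat_features = x.view(x.size(0), -1).size(1)
print(x.size(), flat_features)
```

    For this input size the flattened length is 20 * 4 * 4 = 320, which matches the nn.Linear(320,10) shown earlier.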

    Exercise:

    Try a deeper network with more fully connected layers

    Lecture 11 Advanced CNN

    Why 1*1 convolution?

    Using 32 1*1 filters turns a 64-channel feature map into a 32-channel one.

    Reducing the channel count with 1*1 filters before a larger convolution can save a significant amount of computation.
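
    A rough count of multiplications shows where the saving comes from. The sizes below (192 input channels, a 28 * 28 output map, 32 output channels, a 16-channel bottleneck) are illustrative assumptions, and `conv_multiplications` is a made-up helper that ignores bias terms.

```python
def conv_multiplications(out_h, out_w, in_channels, out_channels, kernel_size):
    """Multiplications for a conv layer producing an out_h x out_w map."""
    return kernel_size ** 2 * out_h * out_w * in_channels * out_channels


# Direct 5*5 convolution: 192 channels -> 32 channels
direct = conv_multiplications(28, 28, 192, 32, kernel_size=5)

# 1*1 convolution down to 16 channels first, then the 5*5 convolution
bottleneck = (conv_multiplications(28, 28, 192, 16, kernel_size=1)
              + conv_multiplications(28, 28, 16, 32, kernel_size=5))

print(direct, bottleneck, direct / bottleneck)
```

    Under these assumptions the bottleneck version needs roughly a tenth of the multiplications.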

    Inception Module

    graph LR
    
    Filter_concat_in --> 1*1Conv0_16
    Filter_concat_in --> 1*1Conv1_16
    Filter_concat_in --> 1*1Conv2_16
    Filter_concat_in --> AvgPooling
    AvgPooling --> 1*1Conv3_16
    1*1Conv0_16 --> 3*3Conv0_24
    3*3Conv0_24 --> 3*3Conv1_24
    3*3Conv1_24 --> Filter_Concat_out
    1*1Conv1_16 --> 5*5Conv_24
    5*5Conv_24 --> Filter_Concat_out
    1*1Conv3_16 --> Filter_Concat_out
    1*1Conv2_16 --> Filter_Concat_out
    
    

    Implement

    1. The bottom path (the fourth one)
    self.branch1x1 = nn.Conv2d(in_channels,16,kernel_size=1)
    
    branch1x1 = self.branch1x1(x)
    
    2. The second path from the bottom
    self.branch_pool = nn.Conv2d(in_channels,24,kernel_size=1)
    
    branch_pool = F.avg_pool2d(x,kernel_size=3,stride=1,padding=1)
    branch_pool = self.branch_pool(branch_pool)
    
    3. The second path from the top
    self.branch5x5_1 = nn.Conv2d(in_channels,16,kernel_size=1)
    self.branch5x5_2 = nn.Conv2d(16,24,kernel_size=5,padding=2)
    
    branch5x5 = self.branch5x5_1(x)
    branch5x5 = self.branch5x5_2(branch5x5)
    
    4. The first path
    self.branch3x3_1=nn.Conv2d(in_channels,16,kernel_size=1)
    self.branch3x3_2=nn.Conv2d(16,24,kernel_size=3,padding=1)
    self.branch3x3_3=nn.Conv2d(24,24,kernel_size=3,padding=1)
    
    branch3x3 = self.branch3x3_1(x)
    branch3x3 = self.branch3x3_2(branch3x3)
    branch3x3 = self.branch3x3_3(branch3x3)
    
    5. Output
    outputs = [branch1x1,branch_pool,branch5x5,branch3x3]
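
    The branches are then merged along the channel dimension. The sketch below uses zero tensors as stand-ins for the real branch outputs; the 12 * 12 spatial size is an arbitrary assumption, while the channel counts follow the layer definitions above.

```python
import torch

# Dummy branch outputs: (batch, channels, height, width)
branch1x1 = torch.zeros(1, 16, 12, 12)
branch_pool = torch.zeros(1, 24, 12, 12)
branch5x5 = torch.zeros(1, 24, 12, 12)
branch3x3 = torch.zeros(1, 24, 12, 12)

outputs = [branch1x1, branch_pool, branch5x5, branch3x3]
merged = torch.cat(outputs, 1)   # concatenate along dim 1 (channels)
print(merged.shape)
```

    Because every branch keeps the spatial size, the result has 16 + 24 + 24 + 24 = 88 channels, so a layer that follows the module takes 88 input channels.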
    

    ALL CODE:

    # https://github.com/pytorch/examples/blob/master/mnist/main.py
    from __future__ import print_function
    import argparse
    import torch
    import torch.nn as nn
    import torch.nn.functional as F
    import torch.optim as optim
    from torchvision import datasets, transforms
    from torch.autograd import Variable
    
    # Training settings
    batch_size = 64
    
    # MNIST Dataset
    train_dataset = datasets.MNIST(root='./data/',
                                   train=True,
                                   transform=transforms.ToTensor(),
                                   download=True)
    
    test_dataset = datasets.MNIST(root='./data/',
                                  train=False,
                                  transform=transforms.ToTensor())
    
    # Data Loader (Input Pipeline)
    train_loader = torch.utils.data.DataLoader(dataset=train_dataset,
                                               batch_size=batch_size,
                                               shuffle=True)
    
    test_loader = torch.utils.data.DataLoader(dataset=test_dataset,
                                              batch_size=batch_size,
                                              shuffle=False)
    
    
    class InceptionA(nn.Module):
    
        def __init__(self, in_channels):
            super(InceptionA, self).__init__()
            self.branch1x1 = nn.Conv2d(in_channels, 16, kernel_size=1)
    
            self.branch5x5_1 = nn.Conv2d(in_channels, 16, kernel_size=1)
            self.branch5x5_2 = nn.Conv2d(16, 24, kernel_size=5, padding=2)
    
            self.branch3x3dbl_1 = nn.Conv2d(in_channels, 16, kernel_size=1)
            self.branch3x3dbl_2 = nn.Conv2d(16, 24, kernel_size=3, padding=1)
            self.branch3x3dbl_3 = nn.Conv2d(24, 24, kernel_size=3, padding=1)
    
            self.branch_pool = nn.Conv2d(in_channels, 24, kernel_size=1)
    
        def forward(self, x):
            branch1x1 = self.branch1x1(x)
    
            branch5x5 = self.branch5x5_1(x)
            branch5x5 = self.branch5x5_2(branch5x5)
    
            branch3x3dbl = self.branch3x3dbl_1(x)
            branch3x3dbl = self.branch3x3dbl_2(branch3x3dbl)
            branch3x3dbl = self.branch3x3dbl_3(branch3x3dbl)
    
            branch_pool = F.avg_pool2d(x, kernel_size=3, stride=1, padding=1)
            branch_pool = self.branch_pool(branch_pool)
    
            outputs = [branch1x1, branch5x5, branch3x3dbl, branch_pool]
            return torch.cat(outputs, 1)
    
    
    class Net(nn.Module):
    
        def __init__(self):
            super(Net, self).__init__()
            self.conv1 = nn.Conv2d(1, 10, kernel_size=5)
            self.conv2 = nn.Conv2d(88, 20, kernel_size=5)
    
            self.incept1 = InceptionA(in_channels=10)
            self.incept2 = InceptionA(in_channels=20)
    
            self.mp = nn.MaxPool2d(2)
            self.fc = nn.Linear(1408, 10)
    
        def forward(self, x):
            in_size = x.size(0)
            x = F.relu(self.mp(self.conv1(x)))
            x = self.incept1(x)
            x = F.relu(self.mp(self.conv2(x)))
            x = self.incept2(x)
            x = x.view(in_size, -1)  # flatten the tensor
            x = self.fc(x)
            return F.log_softmax(x, dim=1)
    
    
    model = Net()
    
    optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.5)
    
    
    def train(epoch):
        model.train()
        for batch_idx, (data, target) in enumerate(train_loader):
            optimizer.zero_grad()
            output = model(data)
            loss = F.nll_loss(output, target)
            loss.backward()
            optimizer.step()
            if batch_idx % 10 == 0:
                print('Train Epoch: {} [{}/{} ({:.0f}%)]\tLoss: {:.6f}'.format(
                    epoch, batch_idx * len(data), len(train_loader.dataset),
                    100. * batch_idx / len(train_loader), loss.item()))
    
    
    def test():
        model.eval()
        test_loss = 0
        correct = 0
        # no_grad() replaces the removed volatile=True flag
        with torch.no_grad():
            for data, target in test_loader:
                output = model(data)
                # sum up batch loss (reduction='sum' replaces size_average=False)
                test_loss += F.nll_loss(output, target, reduction='sum').item()
                # get the index of the max log-probability
                pred = output.max(1, keepdim=True)[1]
                correct += pred.eq(target.view_as(pred)).sum().item()
    
        test_loss /= len(test_loader.dataset)
        print('\nTest set: Average loss: {:.4f}, Accuracy: {}/{} ({:.0f}%)\n'.format(
            test_loss, correct, len(test_loader.dataset),
            100. * correct / len(test_loader.dataset)))
    
    
    for epoch in range(1, 10):
        train(epoch)
        test()
    
    

    Lecture 12: RNN

    Recurrent NN

    graph LR
    
    X1 --> A1
    A1 --> h1
    
    X2 --> A2
    A2 --> h2
    
    X3 --> A3
    A3 --> h3
    
    X4 --> A4
    A4 --> h4
    
    A1 --> A2
    A2 --> A3
    A3 --> A4
    

    PyTorch provides RNN modules that can be used directly

    Different RNN implementations

    cell = nn.RNN(input_size=4,hidden_size=2,batch_first=True)
    cell = nn.GRU(input_size=4,hidden_size=2,batch_first=True)
    cell = nn.LSTM(input_size=4,hidden_size=2,batch_first=True)
    

    How to use RNN?

    cell = nn.RNN(input_size=4,hidden_size=2,batch_first=True)
    
    
    inputs = ... # batch_size, seq_len,inputSize
    hidden = (...) # numLayers,batch_size, hidden_size
    
    out, hidden = cell(inputs,hidden)
    

    There are two outputs: out, the output at every time step, and hidden, the hidden state after the last step
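
    A small shape check makes the two outputs concrete. The batch size of 3 and sequence length of 5 are made-up values; input_size=4 and hidden_size=2 follow the snippet above.

```python
import torch
import torch.nn as nn

cell = nn.RNN(input_size=4, hidden_size=2, batch_first=True)

batch_size, seq_len = 3, 5                     # made-up sizes
inputs = torch.randn(batch_size, seq_len, 4)   # (batch_size, seq_len, input_size)
hidden = torch.zeros(1, batch_size, 2)         # (num_layers, batch_size, hidden_size)

out, hidden = cell(inputs, hidden)
print(out.shape)     # one hidden vector per time step
print(hidden.shape)  # hidden state after the last time step
```

    For a single-layer, unidirectional RNN the last time step of out equals the returned hidden state.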

    # Lab 12 RNN
    import sys
    import torch
    import torch.nn as nn
    from torch.autograd import Variable
    
    torch.manual_seed(777)  # reproducibility
    #            0    1    2    3    4
    idx2char = ['h', 'i', 'e', 'l', 'o']
    
    # Teach hihell -> ihello
    x_data = [0, 1, 0, 2, 3, 3]   # hihell
    one_hot_lookup = [[1, 0, 0, 0, 0],  # 0
                      [0, 1, 0, 0, 0],  # 1
                      [0, 0, 1, 0, 0],  # 2
                      [0, 0, 0, 1, 0],  # 3
                      [0, 0, 0, 0, 1]]  # 4
    
    y_data = [1, 0, 2, 3, 3, 4]    # ihello
    x_one_hot = [one_hot_lookup[x] for x in x_data]
    
    # As we have one batch of samples, we will convert them to tensors only once
    inputs = torch.tensor(x_one_hot, dtype=torch.float)
    labels = torch.tensor(y_data)
    
    num_classes = 5
    input_size = 5  # one-hot size
    hidden_size = 5  # output from the RNN. 5 to directly predict one-hot
    batch_size = 1   # one sentence
    sequence_length = 1  # One by one
    num_layers = 1  # one-layer rnn
    
    
    class Model(nn.Module):
    
        def __init__(self):
            super(Model, self).__init__()
            self.rnn = nn.RNN(input_size=input_size,
                              hidden_size=hidden_size, batch_first=True)
    
        def forward(self, hidden, x):
            # Reshape input (batch first)
            x = x.view(batch_size, sequence_length, input_size)
    
            # Propagate input through RNN
            # Input: (batch, seq_len, input_size)
            # hidden: (num_layers * num_directions, batch, hidden_size)
            out, hidden = self.rnn(x, hidden)
            return hidden, out.view(-1, num_classes)
    
        def init_hidden(self):
            # Initialize the hidden state
            # (num_layers * num_directions, batch, hidden_size)
            return torch.zeros(num_layers, batch_size, hidden_size)
    
    
    # Instantiate RNN model
    model = Model()
    print(model)
    
    # Set loss and optimizer function
    # CrossEntropyLoss = LogSoftmax + NLLLoss
    criterion = nn.CrossEntropyLoss()
    optimizer = torch.optim.Adam(model.parameters(), lr=0.1)
    
    # Train the model
    for epoch in range(100):
        optimizer.zero_grad()
        loss = 0
        hidden = model.init_hidden()
    
        sys.stdout.write("predicted string: ")
        for input, label in zip(inputs, labels):
            # print(input.size(), label.size())
            hidden, output = model(hidden, input)
            val, idx = output.max(1)
            sys.stdout.write(idx2char[idx.item()])
            # the loss expects a target of shape (batch,), so reshape the scalar label
            loss += criterion(output, label.view(1))
    
        print(", epoch: %d, loss: %1.3f" % (epoch + 1, loss.item()))
    
        loss.backward()
        optimizer.step()
    
    print("Learning finished!")
    
  • Original post: https://www.cnblogs.com/pprp/p/9748901.html