  • A Simple Visualization of Convolutional Neural Networks


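    This post walks through a simple visualization of convolutional neural network (CNN) weights.

    In the first half of the tutorial, we define an extremely simple CNN model and hand-specify a few filter weights to use as its convolution kernels.

    In the second half, we use the FashionMNIST dataset, define a CNN with two convolutional layers, train it to above 85% accuracy, and then visualize its convolution kernels.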

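    1. Visualizing a simple convolutional network

    1.1 Visualizing a convolutional layer with hand-specified filters

    In the exercise below, we manually define a few filters resembling the Sobel operator and assign them to an extremely simple convolutional neural network. We then visualize the outputs (the feature maps) of the convolutional layer's 4 filters.

    Loading the target image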

    import cv2
    import matplotlib.pyplot as plt
    %matplotlib inline
    
    img_path = 'images/udacity_sdc.png'
    bgr_img = cv2.imread(img_path)
    
    gray_img = cv2.cvtColor(bgr_img, cv2.COLOR_BGR2GRAY)
    gray_img = gray_img.astype("float32")/255
    
    plt.imshow(gray_img, cmap='gray')
    plt.show()
    

    Defining the filters by hand

    import numpy as np
    
    filter_vals = np.array([[-1, -1, 1, 1], [-1, -1, 1, 1], [-1, -1, 1, 1], [-1, -1, 1, 1]])
    
    # derive variations to get a richer set of filters
    filter_1 = filter_vals
    filter_2 = -filter_1
    filter_3 = filter_1.T
    filter_4 = -filter_3
    filters = np.array([filter_1, filter_2, filter_3, filter_4])
    
    fig = plt.figure(figsize=(10, 5))
    for i in range(4):
        ax = fig.add_subplot(1, 4, i+1, xticks=[], yticks=[])
        ax.imshow(filters[i], cmap='gray')
        ax.set_title('Filter %s' % str(i+1))
        n_rows, n_cols = filters[i].shape
        for r in range(n_rows):
            for c in range(n_cols):
                # annotate() takes xy=(x, y), i.e. (column, row)
                ax.annotate(str(filters[i][r][c]), xy=(c, r),
                            horizontalalignment='center',
                            verticalalignment='center',
                            color='white' if filters[i][r][c] < 0 else 'black')
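
To get a feel for what these Sobel-like filters respond to, here is a minimal sketch in plain NumPy. The helper `correlate_valid` and the 6x8 test image `toy_img` are made up for illustration and are not part of the original exercise. The helper slides the kernel over the image without flipping it, which is also what a convolutional layer computes.

```python
import numpy as np

def correlate_valid(image, kernel):
    """Slide `kernel` over `image` without flipping it ('valid' cross-correlation)."""
    k_rows, k_cols = kernel.shape
    out_rows = image.shape[0] - k_rows + 1
    out_cols = image.shape[1] - k_cols + 1
    out = np.zeros((out_rows, out_cols))
    for r in range(out_rows):
        for c in range(out_cols):
            out[r, c] = np.sum(image[r:r + k_rows, c:c + k_cols] * kernel)
    return out

# same values as filter_1 and filter_3 in the tutorial
filter_1 = np.array([[-1, -1, 1, 1]] * 4)
filter_3 = filter_1.T

# hypothetical 6x8 test image: dark left half, bright right half (a vertical edge)
toy_img = np.zeros((6, 8))
toy_img[:, 4:] = 1.0

resp_vertical = correlate_valid(toy_img, filter_1)
resp_horizontal = correlate_valid(toy_img, filter_3)

print(resp_vertical)
print(resp_horizontal)
```

Each row of `resp_vertical` is `[0, 4, 8, 4, 0]`, peaking where the window is centered on the edge, while `resp_horizontal` is zero everywhere because the test image has no horizontal edge.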
    

    Defining the simple convolutional network

    import torch
    import torch.nn as nn
    import torch.nn.functional as F
    
    class Net(nn.Module):
        def __init__(self, weight):
            super(Net, self).__init__()
            k_height, k_width = weight.shape[2:]
            self.conv = nn.Conv2d(1, 4, kernel_size=(k_height, k_width), bias=False)
            self.conv.weight = torch.nn.Parameter(weight)
            self.pool = nn.MaxPool2d(4,4)
            
        def forward(self, x):
            conv_x = self.conv(x)
            activated_x = F.relu(conv_x)
            pooled_x = self.pool(activated_x)
            
            return conv_x, activated_x, pooled_x
        
    # filters has shape (4, 4, 4)
    # weight is expanded to shape (4, 1, 4, 4); the added dimension of size 1 is the single input channel
    weight = torch.from_numpy(filters).unsqueeze(1).type(torch.FloatTensor)
    model = Net(weight)
    
    print('Filters shape: ', filters.shape)
    print('weights shape: ', weight.shape)
    print(model)
    
    Filters shape:  (4, 4, 4)
    weights shape:  torch.Size([4, 1, 4, 4])
    Net(
      (conv): Conv2d(1, 4, kernel_size=(4, 4), stride=(1, 1), bias=False)
      (pool): MaxPool2d(kernel_size=4, stride=4, padding=0, dilation=1, ceil_mode=False)
    )
    

    Visualizing the convolution output

    We define a function, viz_layer, that visualizes the output of a given layer.

    def viz_layer(layer, n_filters=4):
        fig = plt.figure(figsize=(20, 20))
        
        for i in range(n_filters):
            ax = fig.add_subplot(1, n_filters, i+1, xticks=[], yticks=[])
            ax.imshow(np.squeeze(layer[0,i].data.numpy()), cmap='gray')
            ax.set_title('Output %s' % str(i+1))
    
    # show the original image
    plt.imshow(gray_img, cmap='gray')
    # plot the filters (convolution kernels)
    fig = plt.figure(figsize=(12, 6))
    fig.subplots_adjust(left=0, right=1.5, bottom=0.8, top=1, hspace=0.05, wspace=0.05)
    for i in range(4):
        ax = fig.add_subplot(1, 4, i+1, xticks=[], yticks=[])
        ax.imshow(filters[i], cmap='gray')
        ax.set_title('Filter %s' % str(i+1))
        
    # add a batch dimension and a channel dimension to the gray image, and convert it to a tensor
    gray_img_tensor = torch.from_numpy(gray_img).unsqueeze(0).unsqueeze(1)
    print(gray_img.shape)
    print(gray_img_tensor.shape)
    
    # pass the input image through the model to get the outputs
    conv_layer, activated_layer, pooled_layer = model(gray_img_tensor)
    
    # visualize the convolution output
    viz_layer(conv_layer)
    
    (213, 320)
    torch.Size([1, 1, 213, 320])
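
The input shape printed above fixes the size of every feature map the model returns. The arithmetic can be sketched as follows. The helper `conv_output_size` is a name introduced here for illustration, and it assumes the usual floor-based size formula for convolution and pooling windows with no dilation.

```python
def conv_output_size(size, kernel, stride=1, padding=0):
    """Output length along one axis of a convolution or pooling window."""
    return (size + 2 * padding - kernel) // stride + 1

img_h, img_w = 213, 320   # shape printed above

# 4x4 convolution, stride 1, no padding
conv_h = conv_output_size(img_h, kernel=4)
conv_w = conv_output_size(img_w, kernel=4)

# 4x4 max pooling with stride 4
pool_h = conv_output_size(conv_h, kernel=4, stride=4)
pool_w = conv_output_size(conv_w, kernel=4, stride=4)

print((conv_h, conv_w))   # (210, 317)
print((pool_h, pool_w))   # (52, 79)
```

So the convolution and activation outputs should each hold four 210x317 maps, and the pooled output four 52x79 maps.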
    

    # visualize the output after the activation function is applied to the convolution
    viz_layer(activated_layer)
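
The effect of the activation can be illustrated with a few numbers. This is a small NumPy sketch. The response values are invented for illustration, and `relu` is a stand-in for what `F.relu` does element-wise.

```python
import numpy as np

def relu(x):
    """Element-wise max(0, x)."""
    return np.maximum(x, 0)

# hypothetical raw response of filter_1 along one image row
response_1 = np.array([2.0, 4.0, 8.0, -4.0, -8.0])
# filter_2 = -filter_1, so its raw response is the negated one
response_2 = -response_1

print(relu(response_1))   # [2. 4. 8. 0. 0.]
print(relu(response_2))   # [0. 0. 0. 4. 8.]
```

Because ReLU discards negative responses, a filter and its negation end up highlighting complementary parts of the image, which is one reason to keep both in the filter set.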
    

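    1.2 Visualizing the pooling layer with hand-specified filters

    Next, we visualize the output of the pooling layer.

    # visualize the output after the pooling layer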
    viz_layer(pooled_layer)
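
What the 4x4 max pooling does to a feature map can be sketched in NumPy. The function `max_pool2d` below is a simplified stand-in written for this note, not PyTorch's implementation. It assumes non-overlapping windows (stride equal to the window size) and drops rows and columns that do not fill a complete window.

```python
import numpy as np

def max_pool2d(x, size):
    """Non-overlapping max pooling; rows/columns that do not fill a window are dropped."""
    rows = (x.shape[0] // size) * size
    cols = (x.shape[1] // size) * size
    blocks = x[:rows, :cols].reshape(rows // size, size, cols // size, size)
    return blocks.max(axis=(1, 3))

# hypothetical 8x8 feature map holding the values 0..63
feature_map = np.arange(64, dtype=float).reshape(8, 8)
pooled = max_pool2d(feature_map, size=4)

print(pooled)
```

Each 4x4 block is reduced to its largest value, here `[[27, 31], [59, 63]]`, so the pooled map is coarser but keeps the strongest responses.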
    

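    2. Visualizing a multi-layer convolutional network

    In the exercise below, we define a somewhat more complex network, train it on the FashionMNIST dataset to above 85% accuracy, and then analyze the network through visualization.

    2.1 Loading the FashionMNIST dataset

    FashionMNIST can be seen as an upgrade of the MNIST dataset. The digit patterns in MNIST are fairly simple by today's standards, so MNIST may be a little too easy as a target dataset for deep neural networks. FashionMNIST replaces the image content with items of clothing while keeping the image format unchanged, so it is used almost exactly like MNIST yet tests a model's ability to learn patterns in the data more thoroughly.

    FashionMNIST class list: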

    0: T-shirt/top
    1: Trouser
    2: Pullover
    3: Dress
    4: Coat
    5: Sandal
    6: Shirt
    7: Sneaker
    8: Bag
    9: Ankle boot
    

    Loading the FashionMNIST dataset

    import torch
    import torchvision
    
    from torchvision.datasets import FashionMNIST
    from torch.utils.data import DataLoader
    from torchvision import transforms
    
    data_transform = transforms.ToTensor()
    
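    # download=False assumes the dataset is already stored under ./data;
    # set download=True on the first run to fetch it
    train_data = FashionMNIST(root='./data', train=True,
                             download=False, transform=data_transform)
    test_data = FashionMNIST(root='./data', train=False,
                             download=False, transform=data_transform)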
    
    # Print out some stats about the training and test data
    print('Train data, number of images: ', len(train_data))
    print('Test data, number of images: ', len(test_data))
    
    Train data, number of images:  60000
    Test data, number of images:  10000
    

    Creating the data loaders

    batch_size = 20
    
    train_loader = DataLoader(train_data, batch_size=batch_size, shuffle=True)
    test_loader = DataLoader(test_data, batch_size=batch_size, shuffle=True)
    
    # specify the image classes
    classes = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat', 
               'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']
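
With the dataset sizes printed earlier and this batch size, the number of batches per pass over the data follows directly. It is a small piece of arithmetic, shown here as a sketch. The variable names are chosen for this note.

```python
import math

n_train, n_test = 60000, 10000   # dataset sizes printed above
batch_size = 20

# DataLoader keeps a final partial batch by default, hence the ceiling
train_batches = math.ceil(n_train / batch_size)
test_batches = math.ceil(n_test / batch_size)

print(train_batches)   # 3000
print(test_batches)    # 500
```

Both sizes divide evenly by 20, so every batch is full.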
    

    Visualizing some of the data in the dataset

    import numpy as np
    import matplotlib.pyplot as plt
    
    %matplotlib inline
    
    dataiter = iter(train_loader)
    # the built-in next() works regardless of the PyTorch version
    images, labels = next(dataiter)
    images = images.numpy()
    
    # plot the images in the batch, along with the corresponding labels
    fig = plt.figure(figsize=(25, 4))
    for idx in np.arange(batch_size):
        # add_subplot expects integer grid sizes, hence the integer division
        ax = fig.add_subplot(2, batch_size // 2, idx+1, xticks=[], yticks=[])
        ax.imshow(np.squeeze(images[idx]), cmap='gray')
        ax.set_title(classes[labels[idx]])
    

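    2.2 Training the multi-layer convolutional model

    Defining the model

    Below we define a model with two convolutional layers. The added dropout helps, to some extent, to prevent overfitting.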

    import torch.nn as nn
    import torch.nn.functional as F
    
    class Net(nn.Module):
        def __init__(self):
            super(Net, self).__init__()
            
            self.conv1 = nn.Conv2d(1, 16, 3, padding=1)
            self.pool1 = nn.MaxPool2d(2, 2)
            self.conv2 = nn.Conv2d(16, 32, 3, padding=1)
            self.pool2 = nn.MaxPool2d(2, 2)
            self.activation_l = nn.ReLU()
            
            self.fc = nn.Linear(32 * 7 * 7, 24)
            self.out = nn.Linear(24, 10)
            self.dropout = nn.Dropout(p=0.5)
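            # note: nn.CrossEntropyLoss (used for training below) applies log-softmax
            # internally, so the loss does not need this extra Softmax
            self.activation_out = nn.Softmax(dim=1)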
            
        def forward(self, x):
            x = self.activation_l(self.conv1(x))
            x = self.pool1(x)
            x = self.activation_l(self.conv2(x))
            x = self.pool2(x)
            
            x = x.view(x.size(0), -1)
            x = self.activation_l(self.fc(x))
            x = self.dropout(x)
            x = self.activation_out(self.out(x))
            
            return x
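
The `32 * 7 * 7` input size of the first fully connected layer can be traced through the layers. This is a sketch. `conv_output_size` is a helper name introduced here, using the standard floor-based size formula, and it relies on FashionMNIST images being 28x28 like MNIST.

```python
def conv_output_size(size, kernel, stride=1, padding=0):
    """Output length along one axis of a convolution or pooling window."""
    return (size + 2 * padding - kernel) // stride + 1

size = 28                                            # input images are 28x28
size = conv_output_size(size, kernel=3, padding=1)   # conv1 keeps 28
size = conv_output_size(size, kernel=2, stride=2)    # pool1 halves to 14
size = conv_output_size(size, kernel=3, padding=1)   # conv2 keeps 14
size = conv_output_size(size, kernel=2, stride=2)    # pool2 halves to 7

flat_features = 32 * size * size   # conv2 outputs 32 channels
print(size, flat_features)         # 7 1568
```

The 3x3 convolutions with `padding=1` preserve the spatial size, so only the two pooling layers shrink the image.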
    

    Training the model

    import torch.optim as optim
    
    # create the model instance whose parameters the optimizer will update
    net = Net()
    
    criterion = nn.CrossEntropyLoss()
    
    optimizer = optim.Adam(net.parameters())
    
    def train(n_epochs):
        for epoch in range(n_epochs):
            running_loss = 0.0
            for batch_i, data in enumerate(train_loader):
                inputs, labels = data
                optimizer.zero_grad()
                outputs = net(inputs)
                loss = criterion(outputs, labels)
                loss.backward()
                optimizer.step()
                
                running_loss += loss.item()
                
                if batch_i % 1000 == 999:
                    print('Epoch: {}, Batch: {}, Avg. Loss: {}'.format(epoch + 1, batch_i+1, running_loss/1000))
                    running_loss = 0.0
                    
        print('Finished Training')
        
    n_epochs = 10
    
    train(n_epochs)
    
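    import os
    
    model_dir = 'saved_models/'
    model_name = 'model_best.pt'
    
    # make sure the target directory exists before saving
    os.makedirs(model_dir, exist_ok=True)
    torch.save(net.state_dict(), model_dir+model_name)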
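
A side observation on the setup above, offered as a sketch rather than as part of the original exercise: the model ends in a softmax, while `nn.CrossEntropyLoss` applies a log-softmax of its own. The NumPy function below, `cross_entropy_from_scores`, is a hand-written stand-in for that loss on a single sample. It shows that when probabilities are fed in instead of raw scores, the loss cannot drop below roughly 1.46 for 10 classes.

```python
import numpy as np

def cross_entropy_from_scores(scores, target):
    """Log-softmax followed by negative log-likelihood, for one sample."""
    shifted = scores - np.max(scores)                 # for numerical stability
    log_probs = shifted - np.log(np.sum(np.exp(shifted)))
    return -log_probs[target]

# best possible softmax output for class 0: probability 1 on the correct class
perfect_probs = np.zeros(10)
perfect_probs[0] = 1.0

# loss when probabilities (not raw scores) are passed in
loss_floor = float(cross_entropy_from_scores(perfect_probs, target=0))
# uniform probabilities, i.e. a model that has learned nothing
loss_uniform = float(cross_entropy_from_scores(np.full(10, 0.1), target=0))

print(round(loss_floor, 3))     # 1.461
print(round(loss_uniform, 3))   # 2.303
```

The printed training loss therefore moves in a narrow band between about 1.46 and 2.30. Returning the raw scores from `forward` during training would remove that floor, at the cost of applying softmax separately whenever probabilities are needed.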
    

    Loading the trained model

    net = Net()
    
    net.load_state_dict(torch.load('saved_models/model_best.pt'))
    
    print(net)
    
    Net(
      (conv1): Conv2d(1, 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (pool1): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
      (conv2): Conv2d(16, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (pool2): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
      (activation_l): ReLU()
      (fc): Linear(in_features=1568, out_features=24, bias=True)
      (out): Linear(in_features=24, out_features=10, bias=True)
      (dropout): Dropout(p=0.5)
      (activation_out): Softmax()
    )
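
The layer sizes in the printed architecture are enough to count the trainable parameters by hand. The two helper functions below are written for this note. They assume every layer has a bias, which matches the printout.

```python
def conv_params(in_ch, out_ch, kernel):
    """Weights plus one bias per output channel."""
    return in_ch * out_ch * kernel * kernel + out_ch

def linear_params(in_features, out_features):
    """Weights plus one bias per output feature."""
    return in_features * out_features + out_features

layer_params = {
    'conv1': conv_params(1, 16, 3),
    'conv2': conv_params(16, 32, 3),
    'fc': linear_params(1568, 24),
    'out': linear_params(24, 10),
}
total_params = sum(layer_params.values())

print(layer_params)
print(total_params)   # 42706
```

Most of the parameters sit in the first fully connected layer. The two convolutional layers that we visualize later hold only 160 and 4640 of them.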
    

    Evaluating the model on the test set

    test_loss = torch.zeros(1)
    class_correct = list(0. for i in range(10))
    class_total = list(0. for i in range(10))
    
    print(class_correct)
    print(test_loss)
    
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
    tensor([ 0.])
    
    net.eval()
    
    criterion = torch.nn.CrossEntropyLoss()
    
    # no gradients are needed for evaluation
    with torch.no_grad():
        for batch_i, data in enumerate(test_loader):
            inputs, labels = data
            output = net(inputs)
            # loss for the current test batch
            loss = criterion(output, labels)
            
            # update the running average of the test loss
            test_loss = test_loss + ((torch.ones(1) / (batch_i + 1)) * (loss.data - test_loss))
            
            # predicted class = index of the largest output
            _, predicted = torch.max(output.data, 1)
            
            # compare the predictions with the true labels
            correct = predicted.eq(labels.data.view_as(predicted))
            
            # tally per-class results; labels.size(0) also covers a smaller last batch
            for i in range(labels.size(0)):
                label = labels[i].item()
                class_correct[label] += correct[i].item()
                class_total[label] += 1
            
    print('Test Loss: {:.6f}\n'.format(test_loss.numpy()[0]))
    
    for i in range(10):
        if class_total[i] > 0:
            print('Test Accuracy of %5s: %2d%% (%2d/%2d)' % (
                classes[i], 100 * class_correct[i] / class_total[i],
                np.sum(class_correct[i]), np.sum(class_total[i])))
        else:
            print('Test Accuracy of %5s: N/A (no test examples)' % (classes[i]))
    
    print('\nTest Accuracy (Overall): %2d%% (%2d/%2d)' % (
        100. * np.sum(class_correct) / np.sum(class_total),
        np.sum(class_correct), np.sum(class_total)))
    
    Test Loss: 2.362950
    
    Test Accuracy of T-shirt/top: 85% (850/1000)
    Test Accuracy of Trouser: 96% (963/1000)
    Test Accuracy of Pullover: 84% (842/1000)
    Test Accuracy of Dress: 91% (911/1000)
    Test Accuracy of  Coat: 85% (856/1000)
    Test Accuracy of Sandal: 98% (989/1000)
    Test Accuracy of Shirt: 49% (495/1000)
    Test Accuracy of Sneaker: 94% (948/1000)
    Test Accuracy of   Bag: 97% (978/1000)
    Test Accuracy of Ankle boot: 93% (930/1000)
    
    Test Accuracy (Overall): 87% (8762/10000)
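
The test loop keeps a running average of the loss instead of summing and dividing at the end. That update rule can be checked in isolation. The sketch below uses made-up batch losses, and `running_mean` is a name introduced for this note.

```python
def running_mean(values):
    """Incremental mean: mean += (x - mean) / (k + 1), as in the test loop."""
    mean = 0.0
    for k, x in enumerate(values):
        mean = mean + (x - mean) / (k + 1)
    return mean

# hypothetical per-batch losses, for illustration only
batch_losses = [2.0, 4.0, 3.0, 5.0]

print(running_mean(batch_losses))               # 3.5
print(sum(batch_losses) / len(batch_losses))    # 3.5
```

Both approaches give the same mean. The incremental form simply avoids keeping a separate sum and count.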
    

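    2.3 Feature visualization

    The model has been trained and reaches 87% accuracy on the test data, so let us move on to visualization.

    The strategy is to extract the parameters of each convolutional layer from the model, treat them as standalone filters, and apply them with OpenCV's filter2D function to an image sampled from the test set. We then observe each filter's effect on the image and try to explain what kind of filtering it performs on the original.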
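
For readers without OpenCV at hand, the operation behind filter2D can be approximated in NumPy. This is a sketch based on the OpenCV documentation, which describes filter2D as computing a correlation (the kernel is not flipped) and uses reflected borders by default. The function `filter2d_like` and the test arrays are invented for illustration. It handles odd-sized kernels only and is not a drop-in replacement.

```python
import numpy as np

def filter2d_like(image, kernel):
    """Same-size correlation with reflected borders, for odd-sized kernels."""
    k_rows, k_cols = kernel.shape
    pad_r, pad_c = k_rows // 2, k_cols // 2
    # mode='reflect' mirrors the image without repeating the edge pixel
    padded = np.pad(image, ((pad_r, pad_r), (pad_c, pad_c)), mode='reflect')
    out = np.zeros(image.shape, dtype=float)
    for r in range(image.shape[0]):
        for c in range(image.shape[1]):
            out[r, c] = np.sum(padded[r:r + k_rows, c:c + k_cols] * kernel)
    return out

# hypothetical 5x5 image and two hand-made 3x3 kernels
toy_img = np.arange(25, dtype=float).reshape(5, 5)
identity_kernel = np.zeros((3, 3))
identity_kernel[1, 1] = 1.0
horizontal_diff = np.array([[0.0, 0.0, 0.0],
                            [-1.0, 0.0, 1.0],
                            [0.0, 0.0, 0.0]])

same = filter2d_like(toy_img, identity_kernel)
diff = filter2d_like(toy_img, horizontal_diff)
print(same.shape)
print(diff[2])
```

The identity kernel returns the image unchanged, and the difference kernel gives `[0, 2, 2, 2, 0]` on every row: a constant horizontal gradient in the interior and zero at the mirrored borders. Unlike the network's own forward pass, this applies a single 2D kernel with no bias and no activation.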

    Sampling a single image from the dataset

    dataiter = iter(test_loader)
    images, labels = next(dataiter)
    images = images.numpy()
    
    idx = 15
    img = np.squeeze(images[idx])
    
    import cv2
    plt.imshow(img, cmap='gray')
    
    <matplotlib.image.AxesImage at 0x124832a90>
    

    Visualizing the kernels of the first convolutional layer

    weights = net.conv1.weight.data
    w = weights.numpy()
    print(w.shape)
    
    fig = plt.figure(figsize=(30, 10))
    columns = 4 * 2
    row = 4
    for i in range(0, columns * row):
        fig.add_subplot(row, columns, i+1)
        if ((i%2)==0):
            # even slots show the kernel itself
            plt.imshow(w[i // 2][0], cmap='gray')
        else:
            # odd slots show the sampled image filtered with that kernel
            c = cv2.filter2D(img, -1, w[(i - 1) // 2][0])
            plt.imshow(c, cmap='gray')
    plt.show()
    
    (16, 1, 3, 3)
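
The plotting loop interleaves kernels and filtered images in one grid. How subplot slots map to kernel indices can be spelled out as follows. This is a small sketch with names chosen for this note.

```python
columns, rows = 8, 4   # same grid as in the loop above

layout = []
for i in range(columns * rows):
    kernel_index = i // 2
    kind = 'kernel' if i % 2 == 0 else 'filtered image'
    layout.append((i + 1, kernel_index, kind))

print(layout[:4])
# [(1, 0, 'kernel'), (2, 0, 'filtered image'), (3, 1, 'kernel'), (4, 1, 'filtered image')]
print(layout[-1])   # (32, 15, 'filtered image')
```

Each of the 16 kernels thus occupies two neighboring slots: the kernel on the left and its effect on the sampled image on the right.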
    

    Visualizing the kernels of the second convolutional layer

    weights = net.conv2.weight.data
    w = weights.numpy()
    print(w.shape)
    
    fig = plt.figure(figsize=(30, 20))
    columns = 4 * 2
    row = 8
    for i in range(0, columns * row):
        fig.add_subplot(row, columns, i+1)
        if ((i%2)==0):
            # each conv2 kernel has 16 input channels; only channel 0 is shown
            plt.imshow(w[i // 2][0], cmap='gray')
        else:
            # apply that same channel-0 slice to the sampled image
            c = cv2.filter2D(img, -1, w[(i - 1) // 2][0])
            plt.imshow(c, cmap='gray')
    plt.show()
    
    (32, 16, 3, 3)
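
As the printed shape shows, every second-layer kernel has 16 input channels, while the loop above looks only at channel 0. A few alternative ways to slice such a weight array are sketched below. The array `w` here holds random stand-in values with the same shape, not the trained weights, and the variants are suggestions rather than part of the original exercise.

```python
import numpy as np

# stand-in for the conv2 weights with the shape printed above (random, not trained)
rng = np.random.default_rng(0)
w = rng.normal(size=(32, 16, 3, 3))

first_channel = w[:, 0]             # what the loop above displays
channel_mean = w.mean(axis=1)       # average over the 16 input channels
per_channel = w.reshape(-1, 3, 3)   # every 3x3 slice on its own

print(first_channel.shape, channel_mean.shape, per_channel.shape)
# (32, 3, 3) (32, 3, 3) (512, 3, 3)
```

Note that applying any of these 2D slices to the raw input image is only a loose proxy: in the network, conv2 operates on the 16 feature maps produced by the first layer, not on the image itself.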
    

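    We can see that some kernels act as edge detectors, and that different kernels are sensitive to different orientations, different textures, or in other words different image content.

    Interpreting the convolutions subjectively in this way still feels somewhat thin, but it probably counts as a simple way to visualize a neural network. Besides the convolution kernels, the fully connected layers can also be visualized.

    As for the fully connected layers, some tutorials visualize them through the distances between the "embedding vectors" of individual samples from similar classes. The embeddings produced by the fully connected layer may first need to be reduced in dimensionality with t-SNE before being plotted. If I come across related material later, I will add it to this post.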
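
The distance idea mentioned above can be sketched without a trained model. The embeddings below are random stand-ins (24 dimensions, to match the `fc` layer), and `pairwise_distances` is a helper written for this note. A real analysis would use the activations of the fully connected layer for actual test images.

```python
import numpy as np

def pairwise_distances(embeddings):
    """Euclidean distance between every pair of row vectors."""
    diff = embeddings[:, np.newaxis, :] - embeddings[np.newaxis, :, :]
    return np.sqrt(np.sum(diff ** 2, axis=-1))

# hypothetical 24-dimensional embeddings for four samples (two per class)
rng = np.random.default_rng(0)
class_a = rng.normal(loc=0.0, scale=0.1, size=(2, 24))
class_b = rng.normal(loc=3.0, scale=0.1, size=(2, 24))
embeddings = np.vstack([class_a, class_b])

dist = pairwise_distances(embeddings)
print(dist.shape)                # (4, 4)
print(dist[0, 1] < dist[0, 2])   # True
```

Samples of the same class sit close together and samples of different classes far apart. A dimensionality reduction such as t-SNE would then project these vectors to 2D for plotting.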

    Postscript

    This post is based on an exercise from the Udacity Computer Vision Nanodegree. Link to the official source code:

    https://github.com/udacity/CVND_Exercises/tree/master/1_5_CNN_Layers

  • Original article: https://www.cnblogs.com/alexme/p/11366792.html