zoukankan      html  css  js  c++  java
  • 迁移学习

    在这一教程中,你将会学习到怎么使用迁移学习训练网络。你可以在cs231n课程中学习更多有关迁移学习的内容。

    引用如下笔记:

      实践中,很少有人从随机开始训练一个完整的网络,因为缺乏足够的数据。通用的做法是在一个非常大的数据集上(比如ImageNet,它有120万图片,1000个类别)预训练一个ConvNet,然后使用这个ConvNet作为初始化或一个固定的特征提取器用在你感兴趣的任务上。

     有如下两种主要的迁移学习形式:

    • 微调ConvNet:相对于随机初始化,我们使用一个预训练网络参数来初始化网络,剩下的与常规的相同。
    • ConvNet作为一个固定的特征提取器:除了最后的全连接层,我们将会锁住所有的网络权重。最后的全连接层替换成一个新的随机初始化的层,只有这层被训练。

     导入需要的库:

    import torch
    import torch.nn as nn
    import torch.optim as optim
    from torch.optim import lr_scheduler
    import torchvision
    from torchvision import datasets,models,transforms
    import matplotlib.pyplot as plt
    import time
    import os
    import copy
    
    plt.ion()

    加载数据:

    我们将使用torchvision和torch.utils.data包来加载数据。

    我们今天想要解决的问题是训练一个模型来分类蚂蚁和蜜蜂。我们有将近120张关于蚂蚁和蜜蜂的训练图片。对于每个类有75张验证图片。通常这是一个非常小的数据集,如果从随机开始训练不足以泛化。使用迁移学习,能够相对泛化地很好。

    这个数据集是imgnet的一个小的子集。

    注意:点击这里下载,并提取到当下文件夹。

    data_transforms={
        'train':transforms.Compose([
            transforms.RandomResizedCrop(224),
            transforms.RandomHorizontalFlip(),
            transforms.ToTensor(),
            transforms.Normalize([0.485,0.456,0.406],[0.229,0.224,0.225])
        ]),
        'val':transforms.Compose([
            transforms.Resize(256),
            transforms.CenterCrop(224),
            transforms.ToTensor(),
            transforms.Normalize([0.485,0.456,0.406],[0.229,0.224,0.225])
        ])
    }
    data_dir='hymenoptera_data'
    image_datasets={x:datasets.ImageFolder(os.path.join(data_dir,x),data_transforms[x])
                   for x in ['train','val']}
    dataloaders
    ={x:torch.utils.data.DataLoader(image_datasets[x],batch_size=4,shuffle=True,num_workers=0) for x in ['train','val']}
    dataset_size
    ={x:len(image_datasets[x]) for x in ['train','val']} class_name=image_datasets['train'].classes device=torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

     可视化部分图像

    让我们可视化一些训练图片来了解下数据。

    import numpy as np
    def imshow(inp,title=None):
        """Imshow for Tensor"""
        inp=inp.numpy().transpose((1,2,0))
        mean=np.array([0.485,0.456,0.406])
        std=np.array([0.229,0.224,0.225])
       #输出*标准差+均值 inp
    =std*inp+mean
    #限制范围为[0,1] inp
    =np.clip(inp,0,1) plt.imshow(inp) if title is not None: plt.title(title) plt.pause(0.001) inputs,classes=next(iter(dataloaders['train'])) out=torchvision.utils.make_grid(inputs) imshow(out,title=[class_name[x] for x in classes])

    训练模型

    现在,让我们写一个常规的函数来训练模型,我们将会实现:

    • 调整学习率
    • 保存最好的模型

    在下面,参数scheduler是来自torch.optim.lr_scheduler的LR调度目标

    def train_model(model,criterion,optimizer,scheduler,num_epoch=25):
        since=time.time()
        
        # 最佳的模型参数
        best_model_wts=copy.deepcopy(model.state_dict())
        best_acc=0.0
        
        for epoch in range(num_epoch):
            print('Epoch {}/{}'.format(epoch,num_epoch-1))
            print('-'*10)
            
            # 在每个epoch划分训练与验证集合
            for phase in ['train','val']:
                if phase=='train':
                    scheduler.step()
                    model.train()
                else:
                    model.eval()
                
                running_loss=0.0
                running_corrects=0
                
                #迭代数据
                for inputs,labels in dataloaders[phase]:
                    inputs=inputs.to(device)
                    labels=labels.to(device)
                    
                    # 清空梯度
                    optimizer.zero_grad()
                    
                    # 前向传播
                    # 在训练中追踪历史
                    with torch.set_grad_enabled(phase=='train'):
                        outputs=model(inputs)
                        _,preds=torch.max(outputs,1)
                        loss=criterion(outputs,labels)
                        
                        #在训练中后向传播并优化
                        if phase=='train':
                            loss.backward()
                            optimizer.step()
                    
                        running_loss+=loss.item()*inputs.size(0)
                        running_corrects+=torch.sum(preds==labels.data)
                        
                epoch_loss=running_loss/dataset_sizes[phase]
                epoch_acc=running_corrects.double()/dataset_sizes[phase]
                
                print('{} loss:{:.4f} Acc:{:.4f}'.format(phase,epoch_loss,epoch_acc))
                
                #对模型深度复制
                if phase=='val' and epoch_acc > best_acc:
                    best_acc=epoch_acc
                    best_model_wts=copy.deepcopy(model.state_dict())
                
            print()
        
        time_elapsed=time.time()-since
        print('Traning complete in {:.0f}m {:.0f}s'.format(time_elapsed//60,time_elapsed%60))
        print('Best val Acc: {:4f}'.format(best_acc))
        
        # 加载最佳的模型参数
        model.load_state_dict(best_model_wts)
        return model  

    可视化模型预测

    常规的函数来展示部分图片的预测

    def visualize_model(model,num_images=6):
        was_training=model.training
        model.eval()
        images_so_far=0
        fig=plt.figure()
        
        with torch.no_grad():
            for i,(inputs,labels) in enumerate(dataloaders['val']):
                inputs=inputs.to(device)
                lables=labels.to(device)
                
                outputs=model(inputs)
                _,preds=torch.max(outputs,1)
                
                for j in range(inputs.size()[0]):
                    images_so_far+=1
                    #被分为num_images//2行,2列,显示位置从1开始
                    ax =plt.subplot(num_images//2,2,images_so_far)
                    ax.axis('off')
                    ax.set_title('predicted:{}'.format(class_names[preds[j]]))
                    imshow(inputs.cpu.data[j])
                    
                    if images_so_far==num_images:
                        model.train(mode=was_training)
                        return
                    
            # 在测试之后将模型恢复之前的形式        
            model.train(mode=was_trainning)

    微调卷积网络

    加载预训练模型和替换最后的全连接层

    model_ft=models.resnet18(pretrained=True)
    num_ftrs=model_ft.fc.in_features
    model_ft.fc=nn.Linear(num_ftrs,2)
    
    model_ft=model_ft.to(device)
    
    criterion=nn.CrossEntropyLoss()
    
    # 这样所有的参数都将优化
    optimizer_ft=optim.SGD(model_ft.parameters(),lr=0.001,momentum=0.9)
    
    # 每7个epoch LR衰减0.1
    exp_lr_scheduler=lr_scheduler.StepLR(optimizer_ft,step_size=7,gamma=0.1)

    训练和评估

    model_ft=train_model(model_ft,criterion,optimizer_ft,exp_lr_scheduler,num_epoch=25)
    out:
    Epoch 0/24 ----------
    train loss:0.5504 Acc:0.7582
    val loss:0.2265 Acc:0.9085
    Epoch 1/24 ----------
    train loss:0.5732 Acc:0.7377
    val loss:0.3290 Acc:0.8693
    Epoch 2/24 ----------
    train loss:0.4693 Acc:0.7664
    val loss:0.3554 Acc:0.8824
    Epoch 3/24 ----------
    train loss:0.5210 Acc:0.8402
    val loss:0.5970 Acc:0.7843
    Epoch 4/24 ----------
    train loss:0.4709 Acc:0.8361
    val loss:0.2318 Acc:0.8758
    Epoch 5/24 ----------
    train loss:0.4802 Acc:0.8115
    val loss:0.2669 Acc:0.8693
    Epoch 6/24 ----------
    train loss:0.5484 Acc:0.7910
    val loss:0.2531 Acc:0.8889
    Epoch 7/24 ----------
    train loss:0.3559 Acc:0.8443
    val loss:0.2264 Acc:0.9085
    Epoch 8/24 ----------
    train loss:0.3517 Acc:0.8689
    val loss:0.2716 Acc:0.8954
    Epoch 9/24 ----------
    train loss:0.3331 Acc:0.8607
    val loss:0.2068 Acc:0.9150
    Epoch 10/24 ----------
    train loss:0.3589 Acc:0.8402
    val loss:0.1906 Acc:0.9216
    Epoch 11/24 ----------
    train loss:0.2559 Acc:0.9057
    val loss:0.1732 Acc:0.9346
    Epoch 12/24 ----------
    train loss:0.3467 Acc:0.8279
    val loss:0.1772 Acc:0.9346
    Epoch 13/24 ----------
    train loss:0.3280 Acc:0.8730
    val loss:0.1722 Acc:0.9346
    Epoch 14/24 ----------
    train loss:0.2826 Acc:0.8975
    val loss:0.1651 Acc:0.9346
    Epoch 15/24 ----------
    train loss:0.2317 Acc:0.9098
    val loss:0.1984 Acc:0.9346
    Epoch 16/24 ----------
    train loss:0.2708 Acc:0.8811
    val loss:0.2128 Acc:0.9150
    Epoch 17/24 ----------
    train loss:0.3020 Acc:0.8811
    val loss:0.1926 Acc:0.9281
    Epoch 18/24 ----------
    train loss:0.3015 Acc:0.8689
    val loss:0.2119 Acc:0.8954
    Epoch 19/24 ----------
    train loss:0.3194 Acc:0.8525
    val loss:0.1766 Acc:0.9281
    Epoch 20/24 ----------
    train loss:0.2548 Acc:0.8893
    val loss:0.1709 Acc:0.9346
    Epoch 21/24 ----------
    train loss:0.2439 Acc:0.8893
    val loss:0.1996 Acc:0.9216
    Epoch 22/24 ----------
    train loss:0.2203 Acc:0.9262
    val loss:0.1606 Acc:0.9281
    Epoch 23/24 ----------
    train loss:0.2406 Acc:0.9057
    val loss:0.1681 Acc:0.9412
    Epoch 24/24 ----------
    train loss:0.1897 Acc:0.9385
    val loss:0.1548 Acc:0.9281
    Traning complete in 6m 39s Best val Acc: 0.941176
    visualize_model(model_ft)

     

    卷积网络作为固定的特征提取器

    现在,我们需要固定住所有的网络,除了最后一层。我们需要设置requires_grad==False来固定网络参数,以便于它们的梯度在反向传播不进行计算。

    你可以点击这里了解更多。

    model_conv=torchvision.models.resnet18(pretrained=True)
    for param in model_conv.parameters():
        param.requires_grad=False
    
    # 新加的模块默认是requires_grad=True
    num_ftrs=model_conv.fc.in_features
    model_conv.fc=nn.Linear(num_ftrs,2)
    
    model_conv=model_conv.to(device)
    criterion=nn.CrossEntropyLoss()
    
    # 只有最后一层的参数被优化
    optimizer_conv=optim.SGD(model_conv.fc.parameters(),lr=0.001,momentum=0.9)
    
    exp_lr_scheduler=lr_scheduler.StepLR(optimizer_conv,step_size=7,gamma=0.1)

     训练和评估

    model_conv=train_model(model_conv,criterion,optimizer_conv,exp_lr_scheduler,num_epoch=25)
    out:
    Epoch 0/24
    ----------
    train loss:0.6060 Acc:0.6311
    val loss:0.3218 Acc:0.8562
    
    Epoch 1/24
    ----------
    train loss:0.6493 Acc:0.7418
    val loss:0.5251 Acc:0.7712
    
    Epoch 2/24
    ----------
    train loss:0.5541 Acc:0.7623
    val loss:0.1878 Acc:0.9477
    
    Epoch 3/24
    ----------
    train loss:0.4401 Acc:0.7828
    val loss:0.3141 Acc:0.8758
    
    Epoch 4/24
    ----------
    train loss:0.5697 Acc:0.7869
    val loss:0.3362 Acc:0.8627
    
    Epoch 5/24
    ----------
    train loss:0.3668 Acc:0.8402
    val loss:0.4343 Acc:0.8301
    
    Epoch 6/24
    ----------
    train loss:0.4692 Acc:0.8238
    val loss:0.2586 Acc:0.9085
    
    Epoch 7/24
    ----------
    train loss:0.2712 Acc:0.8770
    val loss:0.1950 Acc:0.9477
    
    Epoch 8/24
    ----------
    train loss:0.3284 Acc:0.8443
    val loss:0.1944 Acc:0.9542
    
    Epoch 9/24
    ----------
    train loss:0.3115 Acc:0.8852
    val loss:0.2000 Acc:0.9542
    
    Epoch 10/24
    ----------
    train loss:0.3889 Acc:0.8402
    val loss:0.1896 Acc:0.9542
    
    Epoch 11/24
    ----------
    train loss:0.3071 Acc:0.8689
    val loss:0.1981 Acc:0.9542
    
    Epoch 12/24
    ----------
    train loss:0.2208 Acc:0.9098
    val loss:0.1956 Acc:0.9542
    
    Epoch 13/24
    ----------
    train loss:0.3622 Acc:0.8320
    val loss:0.2058 Acc:0.9477
    
    Epoch 14/24
    ----------
    train loss:0.3290 Acc:0.8525
    val loss:0.2212 Acc:0.9412
    
    Epoch 15/24
    ----------
    train loss:0.3359 Acc:0.8525
    val loss:0.2120 Acc:0.9542
    
    Epoch 16/24
    ----------
    train loss:0.3550 Acc:0.8279
    val loss:0.1864 Acc:0.9477
    
    Epoch 17/24
    ----------
    train loss:0.3395 Acc:0.8402
    val loss:0.2104 Acc:0.9542
    
    Epoch 18/24
    ----------
    train loss:0.2966 Acc:0.8811
    val loss:0.2044 Acc:0.9477
    
    Epoch 19/24
    ----------
    train loss:0.3477 Acc:0.8320
    val loss:0.1918 Acc:0.9542
    
    Epoch 20/24
    ----------
    train loss:0.3185 Acc:0.8607
    val loss:0.1891 Acc:0.9477
    
    Epoch 21/24
    ----------
    train loss:0.4269 Acc:0.8238
    val loss:0.2043 Acc:0.9412
    
    Epoch 22/24
    ----------
    train loss:0.3475 Acc:0.8566
    val loss:0.1913 Acc:0.9542
    
    Epoch 23/24
    ----------
    train loss:0.4412 Acc:0.7951
    val loss:0.1972 Acc:0.9542
    
    Epoch 24/24
    ----------
    train loss:0.3772 Acc:0.8279
    val loss:0.2329 Acc:0.9346
    
    Traning complete in 3m 2s
    Best val Acc: 0.954248
    visualize_model(model_conv)
    plt.ioff()
    plt.show()

  • 相关阅读:
    最炫数学风
    A fine property of the convective terms of axisymmetric MHD system, and a regularity criterion in terms of $om^ t$
    Regularity criteria for NSE 4: $p_3u$
    2017-2018-2 PDE 讨论班
    2017-2018-2点集拓扑
    轴对称 Navier-Stokes 方程组的点态正则性准则 II
    轴对称 Navier-Stokes 方程组的点态正则性准则 I
    二维空间中的一个向量场的散度
    word插入公式不自动斜体的解决办法
    2017-2018-2课表
  • 原文地址:https://www.cnblogs.com/Thinker-pcw/p/9675673.html
Copyright © 2011-2022 走看看