  • [colab pytorch] Common template code for training and testing

    Contents:

    1. Classification model training code
    2. Classification model testing code
    3. Custom loss function
    4. Label smoothing
    5. Mixup training
    6. L1 regularization
    7. No weight decay on bias terms
    8. Gradient clipping
    9. Getting the current learning rate
    10. Learning rate decay
    11. Chained scheduler updates
    12. Visualizing model training
    13. Saving and loading checkpoints
    14. Extracting single-layer features from an ImageNet-pretrained model
    15. Extracting multi-layer features from an ImageNet-pretrained model
    16. Fine-tuning the fully connected layer
    17. Fine-tuning the fully connected layer with a larger learning rate and the convolutional layers with a smaller one

    1. Classification model training code

    # Loss and optimizer
    criterion = nn.CrossEntropyLoss()
    optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)
    
    # Train the model
    total_step = len(train_loader)
    for epoch in range(num_epochs):
        for i, (images, labels) in enumerate(train_loader):
            images = images.to(device)
            labels = labels.to(device)
    
            # Forward pass
            outputs = model(images)
            loss = criterion(outputs, labels)
    
            # Backward and optimize
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    
            if (i+1) % 100 == 0:
                print('Epoch: [{}/{}], Step: [{}/{}], Loss: {:.4f}'
                      .format(epoch+1, num_epochs, i+1, total_step, loss.item()))

    2. Classification model testing code

    # Test the model
    model.eval()  # eval mode (batch norm uses running mean/variance
                  # instead of mini-batch statistics)
    with torch.no_grad():
        correct = 0
        total = 0
        for images, labels in test_loader:
            images = images.to(device)
            labels = labels.to(device)
            outputs = model(images)
            _, predicted = torch.max(outputs, 1)
            total += labels.size(0)
            correct += (predicted == labels).sum().item()
    
        print('Test accuracy of the model on the {} test images: {} %'
              .format(total, 100 * correct / total))

    3. Custom loss function

    Subclass torch.nn.Module to write your own loss:

    class MyLoss(torch.nn.Module):
        def __init__(self):
            super(MyLoss, self).__init__()
    
        def forward(self, x, y):
            loss = torch.mean((x - y) ** 2)
            return loss
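
    A quick usage sketch (assuming outputs and targets are float tensors of the same shape; this particular loss matches nn.MSELoss() with the default reduction):

    criterion = MyLoss()
    loss = criterion(outputs, targets)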

    4. Label smoothing

    Write a file label_smoothing.py, import it in the training code, and use LSR in place of the cross-entropy loss. The content of label_smoothing.py is as follows:

    import torch
    import torch.nn as nn
    
    class LSR(nn.Module):
    
        def __init__(self, e=0.1, reduction='mean'):
            super().__init__()
    
            self.log_softmax = nn.LogSoftmax(dim=1)
            self.e = e
            self.reduction = reduction
    
        def _one_hot(self, labels, classes, value=1):
            """
                Convert labels to one hot vectors
    
            Args:
                labels: torch tensor in format [label1, label2, label3, ...]
                classes: int, number of classes
                value: label value in one hot vector, default to 1
    
            Returns:
                return one hot format labels in shape [batchsize, classes]
            """
    
            one_hot = torch.zeros(labels.size(0), classes)
    
            #labels and value_added  size must match
            labels = labels.view(labels.size(0), -1)
            value_added = torch.Tensor(labels.size(0), 1).fill_(value)
    
            value_added = value_added.to(labels.device)
            one_hot = one_hot.to(labels.device)
    
            one_hot.scatter_add_(1, labels, value_added)
    
            return one_hot
    
        def _smooth_label(self, target, length, smooth_factor):
            """convert targets to one-hot format, and smooth
            them.
            Args:
            target: targets in the form [label1, label2, ..., label_batchsize]
                length: length of one-hot format(number of classes)
                smooth_factor: smooth factor for label smooth
    
            Returns:
                smoothed labels in one hot format
            """
            # Assign 1 - smooth_factor to the target class and spread smooth_factor
            # evenly over the remaining length - 1 classes, so each row sums to 1.
            one_hot = self._one_hot(target, length,
                                    value=1 - smooth_factor - smooth_factor / (length - 1))
            one_hot += smooth_factor / (length - 1)
    
            return one_hot.to(target.device)
    
        def forward(self, x, target):
    
            if x.size(0) != target.size(0):
                raise ValueError('Expected input batch size ({}) to match target batch size ({})'
                        .format(x.size(0), target.size(0)))

            if x.dim() < 2:
                raise ValueError('Expected input tensor to have at least 2 dimensions (got {})'
                        .format(x.dim()))

            if x.dim() != 2:
                raise ValueError('Only 2-dimensional tensors are supported (got {})'
                        .format(x.size()))
    
            smoothed_target = self._smooth_label(target, x.size(1), self.e)
            x = self.log_softmax(x)
            loss = torch.sum(- x * smoothed_target, dim=1)
    
            if self.reduction == 'none':
                return loss
    
            elif self.reduction == 'sum':
                return torch.sum(loss)
    
            elif self.reduction == 'mean':
                return torch.mean(loss)
    
            else:
                raise ValueError('unrecognized reduction: expected one of none, mean, sum')
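
    A minimal usage sketch in the training code (model, images, and labels as in section 1):

    from label_smoothing import LSR

    criterion = LSR(e=0.1)  # drop-in replacement for nn.CrossEntropyLoss()
    outputs = model(images)
    loss = criterion(outputs, labels)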

    Alternatively, apply label smoothing directly in the training loop:

    for images, labels in train_loader:
        images, labels = images.cuda(), labels.cuda()
        N = labels.size(0)
        # C is the number of classes.
        smoothed_labels = torch.full(size=(N, C), fill_value=0.1 / (C - 1)).cuda()
        smoothed_labels.scatter_(dim=1, index=torch.unsqueeze(labels, dim=1), value=0.9)
    
        score = model(images)
        log_prob = torch.nn.functional.log_softmax(score, dim=1)
        loss = -torch.sum(log_prob * smoothed_labels) / N
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

    5. Mixup training

    # alpha is the mixup hyper-parameter (a common value is 0.2); loss_function is the usual criterion.
    beta_distribution = torch.distributions.beta.Beta(alpha, alpha)
    for images, labels in train_loader:
        images, labels = images.cuda(), labels.cuda()
    
        # Mixup images and labels.
        lambda_ = beta_distribution.sample([]).item()
        index = torch.randperm(images.size(0)).cuda()
        mixed_images = lambda_ * images + (1 - lambda_) * images[index, :]
        label_a, label_b = labels, labels[index]
    
        # Mixup loss.
        scores = model(mixed_images)
        loss = (lambda_ * loss_function(scores, label_a)
                + (1 - lambda_) * loss_function(scores, label_b))
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

    6. L1 regularization

    # Add the scaled sum of absolute parameter values to the loss.
    l1_lambda = 1e-5  # regularization strength; tune for your task
    loss = ...  # Standard cross-entropy loss
    for param in model.parameters():
        loss += l1_lambda * torch.sum(torch.abs(param))
    loss.backward()

    7. No weight decay on bias terms

    Weight decay in PyTorch is equivalent to L2 regularization.

    bias_list = (param for name, param in model.named_parameters() if name[-4:] == 'bias')
    others_list = (param for name, param in model.named_parameters() if name[-4:] != 'bias')
    parameters = [{'params': bias_list, 'weight_decay': 0},
                  {'params': others_list}]
    optimizer = torch.optim.SGD(parameters, lr=1e-2, momentum=0.9, weight_decay=1e-4)

    8. Gradient clipping

    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=20)
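
    Clipping operates on the gradients already computed by backward(), so the call goes between loss.backward() and optimizer.step(). A minimal sketch of one training step:

    optimizer.zero_grad()
    loss.backward()
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=20)
    optimizer.step()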

    9. Getting the current learning rate

    # If there is one global learning rate (which is the common case).
    lr = next(iter(optimizer.param_groups))['lr']
    
    # If there are multiple learning rates for different layers.
    all_lr = []
    for param_group in optimizer.param_groups:
        all_lr.append(param_group['lr'])

    Alternatively, inside the per-batch training code, the current lr can be read as optimizer.param_groups[0]['lr'].

    10. Learning rate decay

    # Reduce learning rate when validation accuracy plateaus.
    scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer, mode='max', patience=5, verbose=True)
    for t in range(0, 80):
        train(...)
        val(...)
        scheduler.step(val_acc)
    
    # Cosine annealing learning rate.
    scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=80)
    # Reduce learning rate by a factor of 10 at the given epochs.
    scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[50, 70], gamma=0.1)
    for t in range(0, 80):
        train(...)
        val(...)
        scheduler.step()  # since PyTorch 1.1, call step() after the epoch's training
    
    # Learning rate warmup over the first 10 epochs.
    scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda=lambda t: (t + 1) / 10)
    for t in range(0, 10):
        train(...)
        val(...)
        scheduler.step()

    11. Chained scheduler updates

    Starting with version 1.4, torch.optim.lr_scheduler supports chained updates (chaining): you can define two schedulers and step both of them in turn during training.

    import torch
    from torch.optim import SGD
    from torch.optim.lr_scheduler import ExponentialLR, StepLR
    model = [torch.nn.Parameter(torch.randn(2, 2, requires_grad=True))]
    optimizer = SGD(model, 0.1)
    scheduler1 = ExponentialLR(optimizer, gamma=0.9)
    scheduler2 = StepLR(optimizer, step_size=3, gamma=0.1)
    for epoch in range(4):
        print(epoch, scheduler2.get_last_lr()[0])
        optimizer.step()
        scheduler1.step()
        scheduler2.step()

    12. Visualizing model training

    pip install tensorboard

    tensorboard --logdir=runs

    Use the SummaryWriter class to collect and visualize the relevant data. To keep results easy to browse, use separate tags such as 'Loss/train' and 'Loss/test'.

    from torch.utils.tensorboard import SummaryWriter
    import numpy as np
    
    writer = SummaryWriter()
    
    for n_iter in range(100):
        writer.add_scalar('Loss/train', np.random.random(), n_iter)
        writer.add_scalar('Loss/test', np.random.random(), n_iter)
        writer.add_scalar('Accuracy/train', np.random.random(), n_iter)
        writer.add_scalar('Accuracy/test', np.random.random(), n_iter)

    writer.close()  # flush pending events to disk

    13. Saving and loading checkpoints

    import os
    import shutil

    start_epoch = 0
    best_acc = 0
    # Load checkpoint.
    if resume:  # resume flag: 0 for the first training run, 1 when resuming after an interruption
        model_path = os.path.join('model', 'best_checkpoint.pth.tar')
        assert os.path.isfile(model_path)
        checkpoint = torch.load(model_path)
        best_acc = checkpoint['best_acc']
        start_epoch = checkpoint['epoch']
        model.load_state_dict(checkpoint['model'])
        optimizer.load_state_dict(checkpoint['optimizer'])
        print('Load checkpoint at epoch {}.'.format(start_epoch))
        print('Best accuracy so far {}.'.format(best_acc))
    
    # Train the model
    for epoch in range(start_epoch, num_epochs): 
        ... 
    
        # Test the model
        ...
    
        # save checkpoint
        is_best = current_acc > best_acc
        best_acc = max(current_acc, best_acc)
        checkpoint = {
            'best_acc': best_acc,
            'epoch': epoch + 1,
            'model': model.state_dict(),
            'optimizer': optimizer.state_dict(),
        }
        model_path = os.path.join('model', 'checkpoint.pth.tar')
        best_model_path = os.path.join('model', 'best_checkpoint.pth.tar')
        torch.save(checkpoint, model_path)
        if is_best:
            shutil.copy(model_path, best_model_path)

    14. Extracting single-layer features from an ImageNet-pretrained model

    import collections
    import torch
    import torchvision

    # VGG-16 relu5-3 feature.
    model = torchvision.models.vgg16(pretrained=True).features[:-1]
    # VGG-16 pool5 feature.
    model = torchvision.models.vgg16(pretrained=True).features
    # VGG-16 fc7 feature.
    model = torchvision.models.vgg16(pretrained=True)
    model.classifier = torch.nn.Sequential(*list(model.classifier.children())[:-3])
    # ResNet GAP feature.
    model = torchvision.models.resnet18(pretrained=True)
    model = torch.nn.Sequential(collections.OrderedDict(
        list(model.named_children())[:-1]))
    
    with torch.no_grad():
        model.eval()
        conv_representation = model(image)

    15. Extracting multi-layer convolutional features from an ImageNet-pretrained model

    class FeatureExtractor(torch.nn.Module):
        """Helper class to extract several convolution features from the given
        pre-trained model.
    
        Attributes:
            _model, torch.nn.Module.
            _layers_to_extract, list<str> or set<str>
    
        Example:
            >>> model = torchvision.models.resnet152(pretrained=True)
            >>> model = torch.nn.Sequential(collections.OrderedDict(
                    list(model.named_children())[:-1]))
            >>> conv_representation = FeatureExtractor(
                    pretrained_model=model,
                    layers_to_extract={'layer1', 'layer2', 'layer3', 'layer4'})(image)
        """
        def __init__(self, pretrained_model, layers_to_extract):
            torch.nn.Module.__init__(self)
            self._model = pretrained_model
            self._model.eval()
            self._layers_to_extract = set(layers_to_extract)
    
        def forward(self, x):
            with torch.no_grad():
                conv_representation = []
                for name, layer in self._model.named_children():
                    x = layer(x)
                    if name in self._layers_to_extract:
                        conv_representation.append(x)
                return conv_representation

    16. Fine-tuning the fully connected layer

    model = torchvision.models.resnet18(pretrained=True)
    for param in model.parameters():
        param.requires_grad = False
    model.fc = nn.Linear(512, 100)  # Replace the last fc layer
    optimizer = torch.optim.SGD(model.fc.parameters(), lr=1e-2, momentum=0.9, weight_decay=1e-4)

    17. Fine-tuning the fully connected layer with a larger learning rate and the convolutional layers with a smaller one

    model = torchvision.models.resnet18(pretrained=True)
    finetuned_parameters = list(map(id, model.fc.parameters()))
    conv_parameters = (p for p in model.parameters() if id(p) not in finetuned_parameters)
    parameters = [{'params': conv_parameters, 'lr': 1e-3}, 
                  {'params': model.fc.parameters()}]
    optimizer = torch.optim.SGD(parameters, lr=1e-2, momentum=0.9, weight_decay=1e-4)
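
    Parameter groups that do not set their own lr fall back to the optimizer-level lr=1e-2, so the fully connected layer here is updated with a 10x larger learning rate than the convolutional layers.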

    Source: http://bbs.cvmart.net/topics/1472?from=timeline
