【colab pytorch】Model Definition

    If all you need is the model itself, it is actually quite easy: start from someone else's model and build on it brick by brick.

    1. A simple model example

    import torch
    import torch.nn as nn

    class ConvNet(nn.Module):
        def __init__(self, num_classes=10):
            super(ConvNet, self).__init__()
            self.layer1 = nn.Sequential(
                nn.Conv2d(1, 16, kernel_size=5, stride=1, padding=2),
                nn.BatchNorm2d(16),
                nn.ReLU(),
                nn.MaxPool2d(kernel_size=2, stride=2))
            self.layer2 = nn.Sequential(
                nn.Conv2d(16, 32, kernel_size=5, stride=1, padding=2),
                nn.BatchNorm2d(32),
                nn.ReLU(),
                nn.MaxPool2d(kernel_size=2, stride=2))
            self.fc = nn.Linear(7*7*32, num_classes)
    
        def forward(self, x):
            out = self.layer1(x)
            out = self.layer2(out)
            out = out.reshape(out.size(0), -1)
            out = self.fc(out)
            return out
    
    device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
    model = ConvNet(num_classes=10).to(device)

    For convolutions, pooling, activation functions, batch normalization, fully connected layers, and so on, as long as you keep track of each layer's input and output dimensions, you can define them one at a time.
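    As a quick sanity check (a minimal sketch; the 1×28×28 input matches the 7*7*32 flattened size the fc layer above assumes), feed a dummy batch through the model and verify the output shape:

    x = torch.randn(4, 1, 28, 28).to(device)  # batch of 4 single-channel 28x28 images
    out = model(x)
    print(out.shape)                          # expected: torch.Size([4, 10])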

    2. Bilinear pooling

    X = torch.reshape(X, (N, D, H * W))                   # Assume X has shape N*D*H*W
    X = torch.bmm(X, torch.transpose(X, 1, 2)) / (H * W)  # Bilinear pooling
    assert X.size() == (N, D, D)
    X = torch.reshape(X, (N, D * D))
    X = torch.sign(X) * torch.sqrt(torch.abs(X) + 1e-5)   # Signed-sqrt normalization
    X = torch.nn.functional.normalize(X)                  # L2 normalization
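    Putting the same steps into a self-contained sketch (the shapes here are invented for illustration, e.g. the last conv feature map of a CNN):

    import torch

    N, D, H, W = 4, 64, 7, 7
    X = torch.randn(N, D, H, W)                           # e.g. a conv feature map

    X = torch.reshape(X, (N, D, H * W))
    X = torch.bmm(X, torch.transpose(X, 1, 2)) / (H * W)  # average outer product over locations
    X = torch.reshape(X, (N, D * D))
    X = torch.sign(X) * torch.sqrt(torch.abs(X) + 1e-5)   # signed-sqrt normalization
    X = torch.nn.functional.normalize(X)                  # L2 normalization
    print(X.shape)                                        # torch.Size([4, 4096])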

    3. Synchronized BN across multiple GPUs

    When code runs on multiple GPUs via torch.nn.DataParallel, PyTorch's BN layers by default compute the mean and standard deviation independently on each card from that card's share of the data. Synchronized BN instead uses the data on all cards together to compute the BN statistics, which mitigates the inaccurate mean/std estimates that arise when the batch size is small. It is an effective trick for improving performance in tasks such as object detection.

    # num_features must equal the number of channels entering the BN layer.
    sync_bn = torch.nn.SyncBatchNorm(num_features, eps=1e-05, momentum=0.1, affine=True, 
                                     track_running_stats=True)

    To replace all BN layers in a network with synchronized BN layers:

    def convertBNtoSyncBN(module, process_group=None):
        '''Recursively replace all BN layers with SyncBN layers.
    
        Args:
            module[torch.nn.Module]. Network
        '''
        if isinstance(module, torch.nn.modules.batchnorm._BatchNorm):
            sync_bn = torch.nn.SyncBatchNorm(module.num_features, module.eps, module.momentum, 
                                             module.affine, module.track_running_stats, process_group)
            sync_bn.running_mean = module.running_mean
            sync_bn.running_var = module.running_var
            if module.affine:
                # Assigning to a module's .weight/.bias requires an nn.Parameter.
                sync_bn.weight = torch.nn.Parameter(module.weight.clone().detach())
                sync_bn.bias = torch.nn.Parameter(module.bias.clone().detach())
            return sync_bn
        else:
            for name, child_module in module.named_children():
                setattr(module, name, convertBNtoSyncBN(child_module, process_group=process_group))
            return module
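    PyTorch also ships a built-in converter, torch.nn.SyncBatchNorm.convert_sync_batchnorm, which performs the same recursive replacement and is usually the simpler choice. Note that SyncBatchNorm is designed to be used with torch.nn.parallel.DistributedDataParallel rather than DataParallel:

    # Built-in equivalent of the hand-written converter above.
    model = torch.nn.SyncBatchNorm.convert_sync_batchnorm(model, process_group=None)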

    To maintain a BN-style moving average yourself, register the statistic as a buffer (a minimal sketch):

    class BN(torch.nn.Module):
        def __init__(self, num_features, momentum=0.1):
            super().__init__()
            self.momentum = momentum
            # A buffer is saved in state_dict but never updated by the optimizer.
            self.register_buffer('running_mean', torch.zeros(num_features))
    
        def forward(self, X):
            if self.training:
                current = X.mean(dim=0)  # per-feature batch mean
                # Exponential moving average; detach so no gradient flows into the buffer.
                self.running_mean += self.momentum * (current.detach() - self.running_mean)
            return X  # the normalization itself is omitted for brevity
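    With the sketch above, a quick check shows why register_buffer is the right tool: the statistic appears in state_dict (so it survives checkpointing) but not in parameters() (so the optimizer never updates it):

    bn = BN(num_features=16)
    print('running_mean' in bn.state_dict())  # True: saved/loaded with the model
    print(list(bn.parameters()))              # []: invisible to the optimizer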

    4. Counting the total number of model parameters

    num_parameters = sum(torch.numel(parameter) for parameter in model.parameters())
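    If some layers are frozen and you only want to count the trainable parameters, filter on requires_grad:

    num_trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)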

    5. Model weight initialization

    Note the difference between model.modules() and model.children(): model.modules() recursively iterates over all submodules of the model, while model.children() only iterates over the model's immediate children.

    # Common practice for initialization.
    for layer in model.modules():
        if isinstance(layer, torch.nn.Conv2d):
            torch.nn.init.kaiming_normal_(layer.weight, mode='fan_out',
                                          nonlinearity='relu')
            if layer.bias is not None:
                torch.nn.init.constant_(layer.bias, val=0.0)
        elif isinstance(layer, torch.nn.BatchNorm2d):
            torch.nn.init.constant_(layer.weight, val=1.0)
            torch.nn.init.constant_(layer.bias, val=0.0)
        elif isinstance(layer, torch.nn.Linear):
            torch.nn.init.xavier_normal_(layer.weight)
            if layer.bias is not None:
                torch.nn.init.constant_(layer.bias, val=0.0)
    
    # Initialization with given tensor.
    layer.weight = torch.nn.Parameter(tensor)
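    An equivalent pattern is to wrap the per-layer logic in a function and call model.apply, which applies it recursively to every submodule (a sketch reusing the rules above):

    def init_weights(layer):
        # Called once for every submodule, innermost layers included.
        if isinstance(layer, torch.nn.Conv2d):
            torch.nn.init.kaiming_normal_(layer.weight, mode='fan_out',
                                          nonlinearity='relu')

    model.apply(init_weights)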

    6. Extracting specific layers from a model

    modules() returns an iterator over all modules in the model and reaches the innermost ones, e.g. the self.layer1.conv1 module. Correspondingly, named_children() and named_modules() return not only iterators over the modules but also the layers' names.

    # Take the first two layers of the model.
    new_model = nn.Sequential(*list(model.children())[:2])

    # To extract all convolutional layers from the model:
    conv_model = nn.Sequential()
    for layer in model.named_modules():
        if isinstance(layer[1], nn.Conv2d):
            # add_module does not allow '.' in names, so replace it.
            conv_model.add_module(layer[0].replace('.', '_'), layer[1])
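    For example, with the ConvNet defined earlier, the first two children are layer1 and layer2, so new_model maps a 1×28×28 input to a 32×7×7 feature map:

    features = new_model(torch.randn(4, 1, 28, 28))
    print(features.shape)  # torch.Size([4, 32, 7, 7])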

    7. Using pretrained weights for some layers

    Note that if the saved model was wrapped in torch.nn.DataParallel, the current model needs to be wrapped in it as well. Passing strict=False lets load_state_dict ignore parameters whose names do not match, so only the matching layers are loaded from the pretrained weights:

    model.load_state_dict(torch.load('model.pth'), strict=False)
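    Conversely, to load a checkpoint saved from a DataParallel model into a plain, unwrapped model, a common workaround is to strip the 'module.' prefix that DataParallel adds to every parameter name (a sketch, assuming the checkpoint is a bare state dict):

    state_dict = torch.load('model.pth', map_location='cpu')
    # DataParallel prefixes every parameter name with 'module.'.
    state_dict = {k[len('module.'):] if k.startswith('module.') else k: v
                  for k, v in state_dict.items()}
    model.load_state_dict(state_dict, strict=False)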

    8. Loading a GPU-saved model on the CPU

    model.load_state_dict(torch.load('model.pth', map_location='cpu'))
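    map_location also accepts a device, which is handy for loading a checkpoint directly onto a particular GPU:

    # Load the checkpoint straight onto GPU 0 instead of the CPU.
    model.load_state_dict(torch.load('model.pth', map_location=torch.device('cuda:0')))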
    Original article: https://www.cnblogs.com/xiximayou/p/12437191.html