  • Ultralytics YOLOv3 explained --->> the network

    The network
    The network is defined in /models/yolov3.yaml, as follows:

    # parameters
    nc: 80  # number of classes
    depth_multiple: 1.0  # model depth multiple
    width_multiple: 1.0  # layer channel multiple
    
    # anchors
    anchors:
      - [10,13, 16,30, 33,23]  # P3/8
      - [30,61, 62,45, 59,119]  # P4/16
      - [116,90, 156,198, 373,326]  # P5/32
    
    # darknet53 backbone
    backbone:
      # [from, number, module, args]
      [[-1, 1, Conv, [32, 3, 1]],  # 0                   ## 0 
       [-1, 1, Conv, [64, 3, 2]],  # 1-P1/2              ## 1 
       [-1, 1, Bottleneck, [64]],                        ## 2
       [-1, 1, Conv, [128, 3, 2]],  # 3-P2/4             ## 3 
       [-1, 2, Bottleneck, [128]],                       ## 4 
       [-1, 1, Conv, [256, 3, 2]],  # 5-P3/8             ## 5 
       [-1, 8, Bottleneck, [256]],                       ## 6 
       [-1, 1, Conv, [512, 3, 2]],  # 7-P4/16            ## 7 
       [-1, 8, Bottleneck, [512]],                       ## 8 
       [-1, 1, Conv, [1024, 3, 2]],  # 9-P5/32           ## 9 
       [-1, 4, Bottleneck, [1024]],  # 10                ## 10 
      ]
    
    # YOLOv3 head
    head:
      [[-1, 1, Bottleneck, [1024, False]],               ## 11
       [-1, 1, Conv, [512, [1, 1]]],
       [-1, 1, Conv, [1024, 3, 1]],
       [-1, 1, Conv, [512, 1, 1]],
       [-1, 1, Conv, [1024, 3, 1]],  # 15 (P5/32-large)
    
       [-2, 1, Conv, [256, 1, 1]],                              ## 16
       [-1, 1, nn.Upsample, [None, 2, 'nearest']],               ## 17
       [[-1, 8], 1, Concat, [1]],  # cat backbone P4             ## 18
       [-1, 1, Bottleneck, [512, False]],                        ## 19
       [-1, 1, Bottleneck, [512, False]],                         ## 20
       [-1, 1, Conv, [256, 1, 1]],                                 ## 21
       [-1, 1, Conv, [512, 3, 1]],  # 22 (P4/16-medium)              ## 22
    
       [-2, 1, Conv, [128, 1, 1]],                                 ## 23
       [-1, 1, nn.Upsample, [None, 2, 'nearest']],                 ## 24
       [[-1, 6], 1, Concat, [1]],  # cat backbone P3                ## 25
       [-1, 1, Bottleneck, [256, False]],                          ## 26
       [-1, 2, Bottleneck, [256, False]],  # 27 (P3/8-small)      ## 27
    
       [[27, 22, 15], 1, Detect, [nc, anchors]],   # Detect(P3, P4, P5)
      ]
    

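    At first glance this is baffling, but reading it patiently alongside the code makes it quite clear.
    The key is the row format # [from, number, module, args].
    from says which layer the input comes from: -1 means the previous layer, -2 the layer before that, and a non-negative number is the absolute index of that layer.
    The layer indices are the numbers in my ## comments, counted from 0.
    number is how many times the module is repeated: 8, Bottleneck means Bottleneck repeated 8 times, similar to the residual blocks in ResNet.
    args are the arguments passed to module.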
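    To make the from field concrete, here is a minimal sketch that turns relative from values into absolute layer indices. The helper name resolve_from is my own and is not part of the repository.

```python
def resolve_from(i, f):
    """Return the absolute indices of the layers that layer i reads from.

    i: index of the current layer (0-based, matching the ## comments)
    f: the 'from' field, an int or a list of ints; negative values are relative
    """
    sources = [f] if isinstance(f, int) else list(f)
    # A negative value counts back from the current layer; others are absolute.
    return [i + x if x < 0 else x for x in sources]


print(resolve_from(16, -2))            # [14]: layer 16 reads layer 14
print(resolve_from(18, [-1, 8]))       # [17, 8]: layer 18 concatenates layers 17 and 8
print(resolve_from(28, [27, 22, 15]))  # [27, 22, 15]: Detect reads the three heads
```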
    The code that parses yolov3.yaml is as follows:

    def parse_model(d, ch):  # model_dict, input_channels(3)
        logger.info('\n%3s%18s%3s%10s  %-40s%-30s' % ('', 'from', 'n', 'params', 'module', 'arguments'))
        anchors, nc, gd, gw = d['anchors'], d['nc'], d['depth_multiple'], d['width_multiple']
        na = (len(anchors[0]) // 2) if isinstance(anchors, list) else anchors  # number of anchors
        no = na * (nc + 5)  # number of outputs = anchors * (classes + 5)
    
        layers, save, c2 = [], [], ch[-1]  # layers, savelist, ch out
        # tmp_1 = d['backbone'] + d['head']
        for i, (f, n, m, args) in enumerate(d['backbone'] + d['head']):  # from, number, module, args
            m = eval(m) if isinstance(m, str) else m  # eval strings
            for j, a in enumerate(args):
                try:
                    args[j] = eval(a) if isinstance(a, str) else a  # eval strings
                except:
                    pass
    
            n = max(round(n * gd), 1) if n > 1 else n  # depth gain
            if m in [Conv, GhostConv, Bottleneck, GhostBottleneck, SPP, DWConv, MixConv2d, Focus, CrossConv, BottleneckCSP,
                     C3, C3TR]:
                c1, c2 = ch[f], args[0]
                if c2 != no:  # if not output
                    c2 = make_divisible(c2 * gw, 8)
    
                args = [c1, c2, *args[1:]]
                if m in [BottleneckCSP, C3, C3TR]:
                    args.insert(2, n)  # number of repeats
                    n = 1
            elif m is nn.BatchNorm2d:
                args = [ch[f]]
            elif m is Concat:
                c2 = sum([ch[x] for x in f])
            elif m is Detect:
                args.append([ch[x] for x in f])
                if isinstance(args[1], int):  # number of anchors
                    args[1] = [list(range(args[1] * 2))] * len(f)
            elif m is Contract:
                c2 = ch[f] * args[0] ** 2
            elif m is Expand:
                c2 = ch[f] // args[0] ** 2
            else:
                c2 = ch[f]
            m_ = nn.Sequential(*[m(*args) for _ in range(n)]) if n > 1 else m(*args)  # module
            t = str(m)[8:-2].replace('__main__.', '')  # module type
            np = sum([x.numel() for x in m_.parameters()])  # number params
            m_.i, m_.f, m_.type, m_.np = i, f, t, np  # attach index, 'from' index, type, number params
            logger.info('%3s%18s%3s%10.0f  %-40s%-30s' % (i, f, n, np, t, args))  # print
            save.extend(x % i for x in ([f] if isinstance(f, int) else f) if x != -1)  # append to savelist
            # if len(save) != 0:
            #     ii = i
            #     tmp = -2 % i   #  -2 % 16 =14
            #     aa = 0
            layers.append(m_)
            if i == 0:
                ch = []
            ch.append(c2)
        return nn.Sequential(*layers), sorted(save)
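
    The depth and width gains applied in parse_model above can be tried in isolation. This is only a sketch: make_divisible is written here as "round up to a multiple of the divisor", which I assume matches the repository helper, and scale_layer is a hypothetical name. It also skips the c2 != no check that parse_model performs.

```python
import math


def make_divisible(x, divisor):
    # Round x up to the nearest multiple of divisor (assumed behaviour of the repo helper).
    return math.ceil(x / divisor) * divisor


def scale_layer(n, c2, gd, gw):
    """Apply the depth gain gd to the repeat count and the width gain gw to the channels."""
    n = max(round(n * gd), 1) if n > 1 else n  # same expression as in parse_model
    c2 = make_divisible(c2 * gw, 8)            # keep channel counts a multiple of 8
    return n, c2


# yolov3.yaml sets depth_multiple = width_multiple = 1.0, so nothing changes.
print(scale_layer(8, 256, 1.0, 1.0))   # (8, 256)
# Hypothetical smaller multiples, for illustration only.
print(scale_layer(8, 256, 0.33, 0.5))  # (3, 128)
```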
    

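    save holds the indices of the layers whose feature maps must be kept, because a later layer reads them through a from value other than -1.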
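
    As a check, the save list for this yaml can be reproduced with the same x % i expression used in parse_model. The froms list below is transcribed by hand from the from column of yolov3.yaml above, so treat it as my reading of the file rather than output from the repository.

```python
# 'from' fields of the 29 rows in yolov3.yaml (layers 0-27 plus Detect as layer 28).
froms = [-1] * 16 + [-2, -1, [-1, 8], -1, -1, -1, -1,
                     -2, -1, [-1, 6], -1, -1, [27, 22, 15]]

save = []
for i, f in enumerate(froms):
    # Same expression as parse_model: -1 is skipped, other negative offsets
    # wrap around to absolute indices (for example -2 % 16 == 14).
    save.extend(x % i for x in ([f] if isinstance(f, int) else f) if x != -1)

print(sorted(save))  # [6, 8, 14, 15, 21, 22, 27]
```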
    https://blog.csdn.net/dz4543/article/details/90049377

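    The figure at the link above roughly shows the YOLOv3 network, except that its input size is 256. For an input size of 640, the Detect layer first reshapes its three outputs as shown next, and the layer-by-layer data flow is tabulated after that: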

            for i in range(self.nl):
                x[i] = self.m[i](x[i])  # conv
                bs, _, ny, nx = x[i].shape  # x(bs,255,20,20) to x(bs,3,20,20,85)
                x[i] = x[i].view(bs, self.na, self.no, ny, nx).permute(0, 1, 3, 4, 2).contiguous()
    

    [bs,255,80,80] ---> [bs,3,85,80,80] ---> [bs,3,80,80,85]

    [bs,255,40,40] ---> [bs,3,85,40,40] ---> [bs,3,40,40,85]

    [bs,255,20,20] ---> [bs,3,85,20,20] ---> [bs,3,20,20,85]
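
    The three lines above follow from 85 = nc + 5 values per anchor (4 box coordinates, 1 objectness score, 80 classes) and 3 anchors per scale, which gives 3 × 85 = 255 channels. With a 640 input and strides 8, 16 and 32, the grids are 80, 40 and 20 cells wide. A small pure-Python sketch of the same bookkeeping follows; the function name detect_shapes is hypothetical, and it tracks shapes only, not tensors.

```python
def detect_shapes(shape, na=3, nc=80):
    """Follow one Detect input through view() and permute(0, 1, 3, 4, 2)."""
    no = nc + 5                    # outputs per anchor
    bs, c, ny, nx = shape
    assert c == na * no, "channel count must equal na * (nc + 5)"
    viewed = (bs, na, no, ny, nx)  # shape after .view(bs, na, no, ny, nx)
    permuted = tuple(viewed[k] for k in (0, 1, 3, 4, 2))  # move 'no' to the last axis
    return viewed, permuted


for size in (80, 40, 20):          # 640 / 8, 640 / 16, 640 / 32
    print(detect_shapes((1, 255, size, size)))
```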

    stage     layer  module            in_num                       out_num
    backbone  0      Conv              3                            32
              1      Conv              32                           64
              2      Bottleneck(×1)    64                           64
              3      Conv              64                           128
              4      Bottleneck(×2)    128                          128
              5      Conv              128                          256
              6      Bottleneck(×8)    256                          256
              7      Conv              256                          512
              8      Bottleneck(×8)    512                          512
              9      Conv              512                          1024
              10     Bottleneck(×4)    1024                         1024
    head      11     Bottleneck(×1)    1024                         1024
              12     Conv              1024                         512
              13     Conv              512                          1024
              14     Conv              1024                         512
              15     Conv              512                          1024
    head      16     [-2]Conv          512                          256
              17     nn.Upsample       256                          256
              18     [-1,8]Concat      [256,40,40] + [512,40,40]    768
              19     Bottleneck(×1)    768                          512
              20     Bottleneck(×1)    512                          512
              21     Conv              512                          256
              22     Conv              256                          512
    head      23     [-2]Conv          256                          128
              24     nn.Upsample       128                          128
              25     [-1,6]Concat      [128,80,80] + [256,80,80]    384
              26     Bottleneck(×1)    384                          256
              27     Bottleneck(×2)    256                          256
    Detect    28     [27]Conv          256                          255
                     [22]Conv          512                          255
                     [15]Conv          1024                         255
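
    The channel counts in this table, together with the spatial sizes for a 640 input, can be recomputed with a short shape tracer. This is a sketch under stated assumptions: the layers list is my hand transcription of yolov3.yaml, Conv is assumed to use "same" padding so that only the stride changes the size, and trace is a hypothetical helper that tracks (channels, height, width) only.

```python
C, B, U, K = "Conv", "Bottleneck", "Upsample", "Concat"

# (from, module, arg): Conv -> (out_channels, stride), Bottleneck -> out_channels,
# Upsample -> scale factor, Concat -> None.
layers = [
    (-1, C, (32, 1)), (-1, C, (64, 2)), (-1, B, 64),
    (-1, C, (128, 2)), (-1, B, 128),
    (-1, C, (256, 2)), (-1, B, 256),
    (-1, C, (512, 2)), (-1, B, 512),
    (-1, C, (1024, 2)), (-1, B, 1024),                                  # 0-10 backbone
    (-1, B, 1024), (-1, C, (512, 1)), (-1, C, (1024, 1)),
    (-1, C, (512, 1)), (-1, C, (1024, 1)),                              # 11-15
    (-2, C, (256, 1)), (-1, U, 2), ([-1, 8], K, None),
    (-1, B, 512), (-1, B, 512), (-1, C, (256, 1)), (-1, C, (512, 1)),   # 16-22
    (-2, C, (128, 1)), (-1, U, 2), ([-1, 6], K, None),
    (-1, B, 256), (-1, B, 256),                                         # 23-27
]


def trace(layers, in_shape=(3, 640, 640)):
    """Return the (channels, height, width) output shape of every layer."""
    shapes = []
    for i, (f, module, arg) in enumerate(layers):
        srcs = [f] if isinstance(f, int) else f
        # Negative indices count back from the current layer, others are absolute.
        ins = [in_shape if i == 0 else shapes[s] for s in srcs]
        c, h, w = ins[0]
        if module == C:
            c2, stride = arg
            out = (c2, h // stride, w // stride)
        elif module == B:
            out = (arg, h, w)                      # spatial size unchanged
        elif module == U:
            out = (c, h * arg, w * arg)            # channels unchanged
        else:                                      # Concat along the channel axis
            out = (sum(x[0] for x in ins), h, w)
        shapes.append(out)
    return shapes


shapes = trace(layers)
for i in (27, 22, 15):                             # the three Detect inputs
    print(i, shapes[i])
```

    The three printed shapes are the inputs that Detect turns into 255-channel maps at 80×80, 40×40 and 20×20.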

    There is also another network diagram online, for a quick look:

  • Original post: https://www.cnblogs.com/yanghailin/p/15309495.html