  • Implementing YOLOv3 in PyTorch (2): parsing the configuration file and creating the layers

    Configuration file

    The configuration file yolov3.cfg defines the structure of the network. An excerpt:

    ....
    
    [convolutional]
    batch_normalize=1
    filters=64
    size=3
    stride=2
    pad=1
    activation=leaky
    
    [convolutional]
    batch_normalize=1
    filters=32
    size=1
    stride=1
    pad=1
    activation=leaky
    
    [convolutional]
    batch_normalize=1
    filters=64
    size=3
    stride=1
    pad=1
    activation=leaky
    
    [shortcut]
    from=-3
    activation=linear
    
    .....
    

    Each bracketed block describes one layer of the model.

    YOLOv3 layers

    YOLOv3 is built from the following kinds of layers:

    • Convolutional
    • Shortcut
    • Upsample
    • Route
    • YOLO

    Convolutional

    [convolutional]
    batch_normalize=1  
    filters=64  
    size=3  
    stride=1  
    pad=1  
    activation=leaky
    

    Shortcut

    [shortcut]
    from=-3  
    activation=linear  
    

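    This is similar to ResNet and lets the network be made deeper. The configuration above means that the output of the shortcut layer is the element-wise sum of the output of the previous layer and the output of the layer three positions before the shortcut layer.
    For a detailed explanation of ResNet skip connections, see https://zhuanlan.zhihu.com/p/28124810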
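
    To make the indexing concrete, here is a minimal pure-Python sketch (no PyTorch) of what a shortcut with from=-3 computes. The helper name shortcut_output and the toy 2x2 feature maps are illustrative assumptions, not part of the original code.

```python
def shortcut_output(outputs, index, from_offset):
    """Element-wise sum of the previous layer and the layer `from_offset` back.

    outputs     -- feature maps (nested lists) produced by layers 0..index-1
    index       -- position of the shortcut layer itself
    from_offset -- the (negative) value of `from` in the cfg, e.g. -3
    """
    prev = outputs[index - 1]
    skip = outputs[index + from_offset]
    return [[a + b for a, b in zip(row_p, row_s)]
            for row_p, row_s in zip(prev, skip)]


# Toy single-channel 2x2 outputs of layers 0..3; the shortcut is layer 4.
outputs = [
    [[0, 0], [0, 0]],      # layer 0
    [[1, 2], [3, 4]],      # layer 1  <- index 4 + (-3)
    [[9, 9], [9, 9]],      # layer 2
    [[10, 20], [30, 40]],  # layer 3  <- index 4 - 1
]
result = shortcut_output(outputs, index=4, from_offset=-3)
print(result)  # [[11, 22], [33, 44]]
```

    Because the two maps are added rather than concatenated, they must have the same shape, and the depth of the output equals the depth of the inputs.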

    Upsample

    [upsample]
    stride=2
    

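    In this implementation the layer uses bilinear interpolation to turn an N*N feature map into a (stride*N) * (stride*N) feature map. (Darknet's own upsample layer uses nearest-neighbor replication instead.) This imitates a feature pyramid: it produces feature maps at several scales, which improves the detection of small objects.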
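
    The following pure-Python sketch shows what bilinear upsampling by an integer factor does to a single-channel feature map. It assumes the pixel-centre convention (an output pixel i maps back to (i + 0.5) / stride - 0.5, clamped at the border); whether this matches nn.Upsample exactly depends on its align_corners setting, so treat it as an illustration rather than a drop-in replacement. The function name upsample_bilinear is made up for this example.

```python
def upsample_bilinear(fmap, stride):
    """Bilinear upsampling of a 2-D list by an integer factor `stride`."""
    in_h, in_w = len(fmap), len(fmap[0])

    def source(dst, size):
        # Map an output pixel centre back to input coordinates, clamped at 0.
        pos = max((dst + 0.5) / stride - 0.5, 0.0)
        lo = min(int(pos), size - 1)
        hi = min(lo + 1, size - 1)
        return lo, hi, pos - lo

    out = []
    for i in range(in_h * stride):
        y0, y1, wy = source(i, in_h)
        row = []
        for j in range(in_w * stride):
            x0, x1, wx = source(j, in_w)
            # Interpolate horizontally on two rows, then vertically between them.
            top = fmap[y0][x0] * (1 - wx) + fmap[y0][x1] * wx
            bottom = fmap[y1][x0] * (1 - wx) + fmap[y1][x1] * wx
            row.append(top * (1 - wy) + bottom * wy)
        out.append(row)
    return out


small = [[0.0, 1.0], [2.0, 3.0]]
big = upsample_bilinear(small, stride=2)
print(len(big), len(big[0]))  # 4 4
```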

    Route

    [route]
    layers = -4
    
    [route]
    layers = -1, 61
    

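    Taking the configurations above as examples:
    When layers has a single value, the route layer outputs the feature map of the layer 4 positions before it (route layer index - 4).
    When layers has two values, the route layer outputs the feature maps of the previous layer (route layer index - 1) and of layer 61, concatenated along the depth dimension. (For example, 3*3*100 and 3*3*200 are concatenated into 3*3*300.)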
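
    The 3*3*100 plus 3*3*200 example can be checked with a small pure-Python sketch. Feature maps are stored here as nested lists in [channel][row][col] order; the helpers concat_depth and zeros are hypothetical names used only for this illustration.

```python
def concat_depth(*feature_maps):
    """Concatenate [channel][row][col] feature maps along the channel axis."""
    height, width = len(feature_maps[0][0]), len(feature_maps[0][0][0])
    merged = []
    for fmap in feature_maps:
        # Spatial sizes must match; only the depth may differ.
        assert len(fmap[0]) == height and len(fmap[0][0]) == width
        merged.extend(fmap)
    return merged


def zeros(depth, height, width):
    """Build an all-zero feature map of the given shape."""
    return [[[0.0] * width for _ in range(height)] for _ in range(depth)]


a = zeros(100, 3, 3)
b = zeros(200, 3, 3)
merged = concat_depth(a, b)
print(len(merged), len(merged[0]), len(merged[0][0]))  # 300 3 3
```

    Unlike a shortcut, which adds values and keeps the depth, a route with two layers grows the depth to the sum of both inputs.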

    YOLO

    [yolo]
    mask = 0,1,2
    anchors = 10,13,  16,30,  33,23,  30,61,  62,45,  59,119,  116,90,  156,198,  373,326
    classes=80
    num=9
    jitter=.3
    ignore_thresh = .5
    truth_thresh = 1
    random=1
    

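    The yolo layer is responsible for the predictions. anchors lists 9 anchors obtained beforehand by clustering; they represent the most likely box shapes.
    mask says which of these anchors the layer uses. For example, mask=0,1,2 selects the first three pairs: (10,13), (16,30) and (33,23). As explained in the earlier article on the principles, each cell predicts 3 bounding boxes; with three scales that gives 9 anchors in total.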
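
    As a quick check of how mask indexes into anchors, the sketch below pairs up the anchor values and selects by mask. The function name select_anchors is an assumption for illustration; the anchor string is copied from the block above.

```python
anchors_cfg = "10,13,  16,30,  33,23,  30,61,  62,45,  59,119,  116,90,  156,198,  373,326"


def select_anchors(anchors_str, mask_str):
    """Group the flat anchor list into (w, h) pairs and keep those named by the mask."""
    values = [int(v) for v in anchors_str.split(",")]  # int() tolerates the extra spaces
    pairs = [(values[i], values[i + 1]) for i in range(0, len(values), 2)]
    return [pairs[int(m)] for m in mask_str.split(",")]


for mask in ("0,1,2", "3,4,5", "6,7,8"):
    print(mask, "->", select_anchors(anchors_cfg, mask))
```

    Each of the three masks picks a disjoint group of three anchors, one group per detection scale.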

    Net

    [net]
    # Testing
    batch=1
    subdivisions=1
    # Training
    # batch=64
    # subdivisions=16
    width= 320
    height = 320
    channels=3
    momentum=0.9
    decay=0.0005
    angle=0
    saturation = 1.5
    exposure = 1.5
    hue=.1
    

    The [net] block defines the model's input size, the batch size and other hyperparameters; it is not a layer.
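
    A parser that simply splits each line on = keeps every value as a string, so width is "320" rather than 320. One possible way to get numbers out of the [net] block is sketched below; typed_net_info is a hypothetical helper, not part of the original code.

```python
def typed_net_info(net_block):
    """Return a copy of the [net] block with numeric strings converted."""
    typed = {}
    for key, value in net_block.items():
        for cast in (int, float):
            try:
                typed[key] = cast(value)
                break
            except ValueError:
                continue
        else:
            typed[key] = value  # leave non-numeric values (e.g. "net") untouched
    return typed


net_block = {"type": "net", "batch": "1", "width": "320", "height": "320",
             "channels": "3", "momentum": "0.9", "hue": ".1"}
info = typed_net_info(net_block)
print(info["width"], info["momentum"], info["hue"])  # 320 0.9 0.1
```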

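    Now let's start writing code.

    Parsing the configuration file

    This step parses the configuration file and stores the contents of each block in a dict.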

    def parse_cfg(cfgfile):
        """
        Takes a configuration file

        Returns a list of blocks. Each block describes a block in the neural
        network to be built. Block is represented as a dictionary in the list

        """
        with open(cfgfile, 'r') as file:
            # store the lines in a list
            lines = file.read().split('\n')
        # strip surrounding whitespace first, so that indented comments
        # and whitespace-only lines are caught by the filters below
        lines = [x.strip() for x in lines]
        # get rid of the empty lines
        lines = [x for x in lines if len(x) > 0]
        lines = [x for x in lines if x[0] != '#']              # get rid of comments

        block = {}
        blocks = []

        for line in lines:
            if line[0] == "[":               # This marks the start of a new block
                # If block is not empty, implies it is storing values of previous block.
                if len(block) != 0:
                    blocks.append(block)     # add it to the blocks list
                    block = {}               # re-init the block
                block["type"] = line[1:-1].rstrip()
            else:
                key, value = line.split("=")
                block[key.rstrip()] = value.lstrip()
        blocks.append(block)

        return blocks
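
    To see what the resulting list of dicts looks like without needing a cfg file on disk, the sketch below applies the same steps to an inline string. parse_cfg_text is a hypothetical string-based variant written for this example, and the sample text is a shortened, made-up excerpt.

```python
def parse_cfg_text(text):
    """Same steps as parse_cfg, but reading from a string instead of a file."""
    lines = [line.strip() for line in text.split("\n")]
    lines = [line for line in lines if line and not line.startswith("#")]
    blocks, block = [], {}
    for line in lines:
        if line.startswith("["):          # start of a new block
            if block:
                blocks.append(block)
                block = {}
            block["type"] = line[1:-1].strip()
        else:
            key, value = line.split("=")
            block[key.strip()] = value.strip()
    blocks.append(block)
    return blocks


sample = """
[net]
# Testing
batch=1
width= 320

[convolutional]
batch_normalize=1
filters=64

[shortcut]
from=-3
activation=linear
"""
blocks = parse_cfg_text(sample)
for b in blocks:
    print(b)
```

    Every value stays a string ("64", "-3"), which is why the layer-building code wraps them in int(...).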
    

    Creating the layers with PyTorch

    The layers are created one by one. The overall skeleton is:

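    def create_modules(blocks):
        # Captures the information about the input and pre-processing
        net_info = blocks[0]
        module_list = nn.ModuleList()
        # A convolution needs to know the depth of its kernels. The kernel size is
        # given in the cfg; the depth is the depth of the previous layer's output
        # (3 for the RGB input image).
        prev_filters = 3
        output_filters = []  # output depth (number of channels) of every layer built so far

        # index is the position of the current layer in the network
        for index, x in enumerate(blocks[1:]):
            # ... create `module` for this layer and set `filters` ...

            module_list.append(module)
            prev_filters = filters
            output_filters.append(filters)

        return (net_info, module_list)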
    
    • Convolutional layer
    [convolutional]
    batch_normalize=1
    filters=32
    size=3
    stride=1
    pad=1
    activation=leaky
    

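    Besides the convolution itself, this block also includes batch normalization and the leaky activation. Batch normalization is practically standard nowadays; it helps against vanishing gradients (gradients shrinking as they are multiplied together during backpropagation). leaky is the Leaky ReLU activation function.
    This is why nn.Sequential() is used to chain the three: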

    module = nn.Sequential()
    module.add_module("conv_{0}".format(index), conv)
    module.add_module("batch_norm_{0}".format(index), bn)
    module.add_module("leaky_{0}".format(index), activn)
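
    The layer-building code in this post creates the convolution with bias=False whenever batch_normalize is set. The pure-Python sketch below illustrates why that is reasonable: batch normalization subtracts the mean, so a constant bias added by the convolution would cancel out. It normalises a single channel without the learned scale and shift that nn.BatchNorm2d has, and the function names are made up for this illustration.

```python
import math


def batch_norm(values, eps=1e-5):
    """Normalise activations to zero mean and unit variance (no learned scale/shift)."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return [(v - mean) / math.sqrt(var + eps) for v in values]


def leaky_relu(values, slope=0.1):
    """Leaky ReLU: pass positives through, scale negatives by `slope`."""
    return [v if v > 0 else slope * v for v in values]


def sequential(*stages):
    """Chain stages in order, in the spirit of nn.Sequential."""
    def run(values):
        for stage in stages:
            values = stage(values)
        return values
    return run


block = sequential(batch_norm, leaky_relu)
conv_out = [1.0, 2.0, 3.0, 4.0]
with_bias = [v + 5.0 for v in conv_out]   # same activations plus a constant bias

print(block(conv_out))
print(block(with_bias))   # same values: the bias is removed by the normalisation
```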
    

    Complete code for creating the convolutional layer
    The code uses Python's built-in enumerate, which pairs each element of a list with its index and yields (index, element) tuples.

    >>> seasons = ['Spring', 'Summer', 'Fall', 'Winter']
    >>> list(enumerate(seasons))
    [(0, 'Spring'), (1, 'Summer'), (2, 'Fall'), (3, 'Winter')]
    >>> list(enumerate(seasons, start=1))       # indices start at 1
    [(1, 'Spring'), (2, 'Summer'), (3, 'Fall'), (4, 'Winter')]
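
    Applied to the parsed cfg, enumerate(blocks[1:]) skips the [net] block, so index 0 refers to the first real layer. A small sketch with a made-up blocks list:

```python
blocks = [
    {"type": "net"},            # blocks[0]: network settings, not a layer
    {"type": "convolutional"},
    {"type": "convolutional"},
    {"type": "shortcut"},
]

layers = [(index, block["type"]) for index, block in enumerate(blocks[1:])]
print(layers)  # [(0, 'convolutional'), (1, 'convolutional'), (2, 'shortcut')]
```

    This is the numbering that the absolute layer references in route blocks (such as 61) rely on.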
    

    Creating the convolutional layer

        # index is the position of the current layer in the network
        for index, x in enumerate(blocks[1:]):
            module = nn.Sequential()

            # check the type of block
            # create a new module for the block
            # append to module_list

            if (x["type"] == "convolutional"):
                # Get the info about the layer
                activation = x["activation"]
                try:
                    batch_normalize = int(x["batch_normalize"])
                    bias = False
                except KeyError:
                    batch_normalize = 0
                    bias = True

                filters = int(x["filters"])
                padding = int(x["pad"])
                kernel_size = int(x["size"])
                stride = int(x["stride"])

                if padding:
                    pad = (kernel_size - 1) // 2
                else:
                    pad = 0

                # Add the convolutional layer
                # prev_filters is the depth of the previous layer's output feature map.
                # For example, if the previous layer has 64 kernels, its output is m*n*64.
                conv = nn.Conv2d(prev_filters, filters, kernel_size, stride, pad, bias = bias)
                module.add_module("conv_{0}".format(index), conv)

                # Add the Batch Norm Layer
                if batch_normalize:
                    bn = nn.BatchNorm2d(filters)
                    module.add_module("batch_norm_{0}".format(index), bn)

                # Check the activation.
                # It is either Linear or a Leaky ReLU for YOLO
                if activation == "leaky":
                    activn = nn.LeakyReLU(0.1, inplace = True)
                    module.add_module("leaky_{0}".format(index), activn)
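
    The padding rule pad = (kernel_size - 1) // 2 keeps the spatial size unchanged at stride 1 and halves it at stride 2. The sketch below checks this with the usual convolution output-size formula (dilation 1); conv_output_size is a hypothetical helper, and 320 is the input width from the [net] block.

```python
def conv_output_size(n, kernel_size, stride, pad_flag):
    """Spatial output size of a convolution using the cfg's pad convention."""
    pad = (kernel_size - 1) // 2 if pad_flag else 0
    return (n + 2 * pad - kernel_size) // stride + 1


print(conv_output_size(320, kernel_size=3, stride=1, pad_flag=1))  # 320
print(conv_output_size(320, kernel_size=3, stride=2, pad_flag=1))  # 160
print(conv_output_size(320, kernel_size=1, stride=1, pad_flag=1))  # 320
```

    Note that for a 1x1 kernel the computed padding is 0 even though the cfg says pad=1, which is why the code derives pad from the kernel size instead of using the cfg value directly.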
    
    • Upsample layer
            # If it's an upsampling layer
            # We use bilinear upsampling (nn.Upsample)
            elif (x["type"] == "upsample"):
                stride = int(x["stride"])
                # use the stride from the cfg as the scale factor (2 in yolov3.cfg)
                upsample = nn.Upsample(scale_factor = stride, mode = "bilinear")
                module.add_module("upsample_{}".format(index), upsample)
    
    • Route layer
    [route]
    layers = -4
    
    [route]
    layers = -1, 61
    

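    First the layers value is parsed from the configuration; the feature maps of the referenced layers are then concatenated to form the output. At this stage only the output depth is computed: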

            # If it is a route layer
            elif (x["type"] == "route"):
                x["layers"] = x["layers"].split(',')
                # Start of a route
                start = int(x["layers"][0])
                # end, if there exists one.
                try:
                    end = int(x["layers"][1])
                except IndexError:
                    end = 0
                # Positive annotation: an absolute layer number
                if start > 0:
                    start = start - index   # convert start to an offset relative to the current layer
                if end > 0:
                    end = end - index       # convert end to an offset relative to the current layer
                route = EmptyLayer()
                module.add_module("route_{0}".format(index), route)
                # A route can only refer to layers before the current one,
                # so after the conversion a valid end offset is always negative.
                if end < 0:
                    filters = output_filters[index + start] + output_filters[index + end]
                else:
                    filters = output_filters[index + start]
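
    The index arithmetic in the route branch can be tried in isolation. The sketch below mirrors it in a standalone function; route_filters is a hypothetical name, and the depths in output_filters are made-up values rather than the real YOLOv3 channel counts.

```python
def route_filters(index, layers_str, output_filters):
    """Mirror the route branch's index arithmetic and return the output depth."""
    layers = [int(v) for v in layers_str.split(",")]
    start = layers[0]
    end = layers[1] if len(layers) > 1 else 0
    # Absolute (positive) layer numbers become offsets relative to this layer.
    if start > 0:
        start -= index
    if end > 0:
        end -= index
    if end < 0:
        return output_filters[index + start] + output_filters[index + end]
    return output_filters[index + start]


# Made-up depths of layers 0..5; the route layer being built is layer 6.
output_filters = [32, 64, 128, 256, 512, 256]
print(route_filters(6, "-4", output_filters))     # 128  (layer 2)
print(route_filters(6, "-1, 3", output_filters))  # 512  (layer 5 + layer 3)
```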
    

    Here we define a custom EmptyLayer:

    class EmptyLayer(nn.Module):
        def __init__(self):
            super(EmptyLayer, self).__init__()
    

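    EmptyLayer is defined only to keep the code simple. To define a custom layer in PyTorch, you normally write a class that inherits from nn.Module and implement its forward method.
    For how to define a custom layer, see the link below; its example is reproduced here.
    https://pytorch.org/tutorials/beginner/examples_nn/two_layer_net_module.html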

    import torch
    
    
    class TwoLayerNet(torch.nn.Module):
        def __init__(self, D_in, H, D_out):
            """
            In the constructor we instantiate two nn.Linear modules and assign them as
            member variables.
            """
            super(TwoLayerNet, self).__init__()
            self.linear1 = torch.nn.Linear(D_in, H)
            self.linear2 = torch.nn.Linear(H, D_out)
    
        def forward(self, x):
            """
            In the forward function we accept a Tensor of input data and we must return
            a Tensor of output data. We can use Modules defined in the constructor as
            well as arbitrary operators on Tensors.
            """
            h_relu = self.linear1(x).clamp(min=0)
            y_pred = self.linear2(h_relu)
            return y_pred
    
    
    # N is batch size; D_in is input dimension;
    # H is hidden dimension; D_out is output dimension.
    N, D_in, H, D_out = 64, 1000, 100, 10
    
    # Create random Tensors to hold inputs and outputs
    x = torch.randn(N, D_in)
    y = torch.randn(N, D_out)
    
    # Construct our model by instantiating the class defined above
    model = TwoLayerNet(D_in, H, D_out)
    
    # Construct our loss function and an Optimizer. The call to model.parameters()
    # in the SGD constructor will contain the learnable parameters of the two
    # nn.Linear modules which are members of the model.
    criterion = torch.nn.MSELoss(reduction='sum')
    optimizer = torch.optim.SGD(model.parameters(), lr=1e-4)
    for t in range(500):
        # Forward pass: Compute predicted y by passing x to the model
        y_pred = model(x)
    
        # Compute and print loss
        loss = criterion(y_pred, y)
        print(t, loss.item())
    
        # Zero gradients, perform a backward pass, and update the weights.
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    
    

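    Our route layer has a very simple job: concatenating the feature maps of two layers, which is a single call to torch.cat. There is therefore no need to define a RouteLayer; the concatenation can be done directly in the forward method of the nn.Module that represents the Darknet network.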

    • Shortcut layer
            #shortcut corresponds to skip connection
            elif x["type"] == "shortcut":
                shortcut = EmptyLayer()
                module.add_module("shortcut_{}".format(index), shortcut)
    

    As with the route layer, an EmptyLayer is used as a placeholder. The operation a shortcut performs is the element-wise addition of two feature maps.

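    • YOLO layer
      The yolo layer makes predictions from the feature map.
      First the anchors selected by mask are parsed out; they are then stored in a layer that we define ourselves, and that layer is added to a module.
      The custom layer uses Python's super.
      For details see http://www.runoob.com/python/python-func-super.html. In short, it calls the parent class's initializer so that inheritance works safely. Remembering the usage pattern is enough here.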
            #Yolo is the detection layer
            elif x["type"] == "yolo":
                mask = x["mask"].split(",")
                mask = [int(m) for m in mask]   # use m so the block variable x is not shadowed
    
                anchors = x["anchors"].split(",")
                anchors = [int(a) for a in anchors]
                anchors = [(anchors[i], anchors[i+1]) for i in range(0, len(anchors),2)]
                anchors = [anchors[i] for i in mask]
    
                detection = DetectionLayer(anchors)
                module.add_module("Detection_{}".format(index), detection)
    
    # Our own custom YOLO (detection) layer
    class DetectionLayer(nn.Module):
        def __init__(self, anchors):
            super(DetectionLayer, self).__init__()
            self.anchors = anchors      
    
    

    Test code

    blocks = parse_cfg("cfg/yolov3.cfg")
    print(create_modules(blocks))
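
    Printing the whole module list produces a long dump. As a lighter sanity check, one could count the block types in the parsed cfg before building any modules. The sketch below assumes a blocks list shaped like the parser's output; the summarize helper and the sample list are made up for illustration.

```python
from collections import Counter


def summarize(blocks):
    """Count layer types, skipping the leading [net] block."""
    return Counter(block["type"] for block in blocks[1:])


blocks = [
    {"type": "net"},
    {"type": "convolutional"},
    {"type": "convolutional"},
    {"type": "shortcut"},
    {"type": "upsample"},
    {"type": "route"},
    {"type": "yolo"},
]
summary = summarize(blocks)
print(dict(summary))
```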
    

    This prints the (net_info, module_list) tuple returned by create_modules.

    The complete code is as follows:

    #coding=utf-8
        
    from __future__ import division
    
    import torch
    import torch.nn as nn
    import torch.nn.functional as F
    from torch.autograd import Variable
    import numpy as np
    
    
    def parse_cfg(cfgfile):
        """
        Takes a configuration file
    
        Returns a list of blocks. Each block describes a block in the neural
        network to be built. Block is represented as a dictionary in the list

        """
        with open(cfgfile, 'r') as file:
            # store the lines in a list
            lines = file.read().split('\n')
        # strip surrounding whitespace first, so that indented comments
        # and whitespace-only lines are caught by the filters below
        lines = [x.strip() for x in lines]
        # get rid of the empty lines
        lines = [x for x in lines if len(x) > 0]
        lines = [x for x in lines if x[0] != '#']              # get rid of comments
    
        block = {}
        blocks = []
    
        for line in lines:
            if line[0] == "[":               # This marks the start of a new block
                # If block is not empty, implies it is storing values of previous block.
                if len(block) != 0:
                    blocks.append(block)     # add it to the blocks list
                    block = {}               # re-init the block
                block["type"] = line[1:-1].rstrip()
            else:
                key, value = line.split("=")
                block[key.rstrip()] = value.lstrip()
        blocks.append(block)
    
        return blocks
    
    
    class EmptyLayer(nn.Module):
        def __init__(self):
            super(EmptyLayer, self).__init__()
            
    
    class DetectionLayer(nn.Module):
        def __init__(self, anchors):
            super(DetectionLayer, self).__init__()
            self.anchors = anchors
    
    
    
    def create_modules(blocks):
        # Captures the information about the input and pre-processing
        net_info = blocks[0]
        module_list = nn.ModuleList()
        prev_filters = 3
        output_filters = []
    
        # index is the position of the current layer in the network
        for index, x in enumerate(blocks[1:]):
            module = nn.Sequential()
    
            #check the type of block
            #create a new module for the block
            #append to module_list
    
            if (x["type"] == "convolutional"):
                #Get the info about the layer
                activation = x["activation"]
                try:
                    batch_normalize = int(x["batch_normalize"])
                    bias = False
                except KeyError:
                    batch_normalize = 0
                    bias = True
    
                filters= int(x["filters"])
                padding = int(x["pad"])
                kernel_size = int(x["size"])
                stride = int(x["stride"])
    
                if padding:
                    pad = (kernel_size - 1) // 2
                else:
                    pad = 0
    
                #Add the convolutional layer
                # prev_filters is the depth of the previous layer's output feature map.
                # For example, if the previous layer has 64 kernels, its output is m*n*64.
                conv = nn.Conv2d(prev_filters, filters, kernel_size, stride, pad, bias = bias)
                module.add_module("conv_{0}".format(index), conv)
    
                #Add the Batch Norm Layer
                if batch_normalize:
                    bn = nn.BatchNorm2d(filters)
                    module.add_module("batch_norm_{0}".format(index), bn)
    
                #Check the activation. 
                #It is either Linear or a Leaky ReLU for YOLO
                if activation == "leaky":
                    activn = nn.LeakyReLU(0.1, inplace = True)
                    module.add_module("leaky_{0}".format(index), activn)
    
            # If it's an upsampling layer
            # We use bilinear upsampling (nn.Upsample)
            elif (x["type"] == "upsample"):
                stride = int(x["stride"])
                # use the stride from the cfg as the scale factor (2 in yolov3.cfg)
                upsample = nn.Upsample(scale_factor = stride, mode = "bilinear")
                module.add_module("upsample_{}".format(index), upsample)

            # If it is a route layer
            elif (x["type"] == "route"):
                x["layers"] = x["layers"].split(',')
                #Start  of a route
                start = int(x["layers"][0])
                #end, if there exists one.
                try:
                    end = int(x["layers"][1])
                except IndexError:
                    end = 0
                # Positive annotation: an absolute layer number
                if start > 0: 
                    start = start - index
                if end > 0:
                    end = end - index
                route = EmptyLayer()
                module.add_module("route_{0}".format(index), route)
                if end < 0:
                    filters = output_filters[index + start] + output_filters[index + end]
                else:
                    filters= output_filters[index + start]
    
            #shortcut corresponds to skip connection
            elif x["type"] == "shortcut":
                shortcut = EmptyLayer()
                module.add_module("shortcut_{}".format(index), shortcut)
            
            #Yolo is the detection layer
            elif x["type"] == "yolo":
                mask = x["mask"].split(",")
                mask = [int(m) for m in mask]   # use m so the block variable x is not shadowed
                
                anchors = x["anchors"].split(",")
                anchors = [int(a) for a in anchors]
                anchors = [(anchors[i], anchors[i+1]) for i in range(0, len(anchors),2)]
                anchors = [anchors[i] for i in mask]
    
                detection = DetectionLayer(anchors)
                module.add_module("Detection_{}".format(index), detection)  
    
            module_list.append(module)
            prev_filters = filters   # was misspelled prev_filter, which left prev_filters stuck at 3
            output_filters.append(filters)
            
        return (net_info,module_list)
    
            
    blocks = parse_cfg("/home/suchang/work_codes/keepgoing/yolov3-torch/cfg/yolov3.cfg")
    print(create_modules(blocks))
    
    
  • Original article: https://www.cnblogs.com/sdu20112013/p/11099244.html