  • Caffe (1): The Convolution Layer

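    In Caffe, the structure of a network is defined in a prototxt file as a sequence of Layers. Commonly used layers include data-loading layers, convolution layers, pooling layers, non-linearity layers, inner-product layers, normalization layers, and loss layers. This post covers the convolution layer.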

    Reference

    1. Overview of the convolution layer

    First, here is a small example of how a convolution layer is configured (defined in a .prototxt file):

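    layer {

      name: "conv1"        # name of this layer
      type: "Convolution"  # type of this layer (here, Convolution)
      bottom: "data"       # name of the input Blob
      top: "conv1"         # name of the output Blob

      # learning-rate settings for this layer's weights and bias
      param {
        lr_mult: 1  # learning-rate multiplier for the weights
      }
      param {
        lr_mult: 2  # learning-rate multiplier for the bias
      }

      # parameters of the convolution operation itself
      convolution_param {
        num_output: 20
        kernel_size: 5
        stride: 1
        weight_filler {
          type: "xavier"  # initialization method for the weights
        }
        bias_filler {
          type: "constant"  # initialization method for the bias
        }
      }

    }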
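
    To make the effect of kernel_size, stride, and pad concrete, the following is a minimal Python sketch of the usual output-size formula for a convolution along one spatial dimension. The function name conv_output_size and the 28x28 input size are assumptions chosen for illustration; they do not come from the example above.

```python
def conv_output_size(input_size, kernel_size, stride=1, pad=0, dilation=1):
    """Spatial output size of a convolution along one dimension."""
    # Extent covered by the (possibly dilated) kernel along this dimension.
    kernel_extent = dilation * (kernel_size - 1) + 1
    # Integer division: positions where the kernel does not fully fit are dropped.
    return (input_size + 2 * pad - kernel_extent) // stride + 1


# conv1 from the example: kernel_size 5, stride 1, no padding.
# The input size of 28 is an assumption (for instance an MNIST-sized image).
print(conv_output_size(28, kernel_size=5, stride=1))
```

    Under these assumptions a 28x28 input yields a 24x24 output per filter, so the top Blob would hold 20 feature maps of that size.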

    Note: in Caffe's original proto file, the convolution layer's parameter message, ConvolutionParameter, is defined as follows:

    message ConvolutionParameter {
      optional uint32 num_output = 1; // The number of outputs for the layer
      optional bool bias_term = 2 [default = true]; // whether to have bias terms
    
      // Pad, kernel size, and stride are all given as a single value for equal dimensions in all spatial dimensions, or once per spatial dimension.
      repeated uint32 pad = 3; // The padding size; defaults to 0
      repeated uint32 kernel_size = 4; // The kernel size
      repeated uint32 stride = 6; // The stride; defaults to 1
      // Factor used to dilate the kernel, (implicitly) zero-filling the resulting holes. (Kernel dilation is sometimes referred to by its use in the algorithme à trous from Holschneider et al. 1987.)
      repeated uint32 dilation = 18; // The dilation; defaults to 1
    
      // For 2D convolution only, the *_h and *_w versions may also be used to specify both spatial dimensions.
      optional uint32 pad_h = 9 [default = 0]; // The padding height (2D only)
      optional uint32 pad_w = 10 [default = 0]; // The padding width (2D only)
      optional uint32 kernel_h = 11; // The kernel height (2D only)
      optional uint32 kernel_w = 12; // The kernel width (2D only)
      optional uint32 stride_h = 13; // The stride height (2D only)
      optional uint32 stride_w = 14; // The stride width (2D only)
    
      optional uint32 group = 5 [default = 1]; // The group size for group conv
    
      optional FillerParameter weight_filler = 7; // The filler for the weight
      optional FillerParameter bias_filler = 8; // The filler for the bias
      enum Engine {
        DEFAULT = 0;
        CAFFE = 1;
        CUDNN = 2;
      }
      optional Engine engine = 15 [default = DEFAULT];
    
      // The axis to interpret as "channels" when performing convolution.
      // Preceding dimensions are treated as independent inputs;
      // succeeding dimensions are treated as "spatial".
      // With (N, C, H, W) inputs, and axis == 1 (the default), we perform
      // N independent 2D convolutions, sliding C-channel (or (C/g)-channels, for
      // groups g>1) filters across the spatial axes (H, W) of the input.
      // With (N, C, D, H, W) inputs, and axis == 1, we perform
      // N independent 3D convolutions, sliding (C/g)-channels
      // filters across the spatial axes (D, H, W) of the input.
      optional int32 axis = 16 [default = 1];
    
      // Whether to force use of the general ND convolution, even if a specific
      // implementation for blobs of the appropriate number of spatial dimensions
      // is available. (Currently, there is only a 2D-specific convolution
      // implementation; for input blobs with num_axes != 2, this option is
      // ignored and the ND implementation will be used.)
      optional bool force_nd_im2col = 17 [default = false];
    }
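
    The comment in the definition above says that pad, kernel_size, stride, and dilation may be given either as a single value for all spatial dimensions or once per spatial dimension. The sketch below illustrates that rule in Python. It is not Caffe's own code, and the helper name expand_spatial is hypothetical.

```python
def expand_spatial(values, num_spatial_axes, default=None):
    """Expand a repeated field to one value per spatial axis.

    values: the list given in the prototxt (possibly empty).
    default: value used when the field is omitted; None means "required".
    """
    if len(values) == 0:
        if default is None:
            raise ValueError("this field has no default and must be set")
        return [default] * num_spatial_axes
    if len(values) == 1:
        # A single value applies to every spatial dimension.
        return [values[0]] * num_spatial_axes
    if len(values) == num_spatial_axes:
        # One value per spatial dimension.
        return list(values)
    raise ValueError("expected 1 or %d values, got %d"
                     % (num_spatial_axes, len(values)))


print(expand_spatial([5], 2))            # kernel_size: 5 for a 2D convolution
print(expand_spatial([], 2, default=1))  # stride omitted, defaults to 1
```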

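    2. Convolution layer parameters

    The parameters of the convolution layer are described one by one below.

    (By definition, the learned parameters of a convolution layer are the filter values and the bias values. All other parameters are hyper-parameters that must be given when the model is defined.)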
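
    As a rough illustration of how many values are actually learned, here is a minimal Python sketch that counts the filter weights and biases of a convolution layer. The function name conv_param_count is hypothetical, and the single input channel is an assumption; the other numbers are taken from the conv1 example.

```python
def conv_param_count(num_output, in_channels, kernel_h, kernel_w,
                     group=1, bias_term=True):
    """Return (number of weights, number of biases) of a convolution layer."""
    # Each filter spans in_channels / group input channels.
    weights = num_output * (in_channels // group) * kernel_h * kernel_w
    # One bias per output channel, if bias terms are enabled.
    biases = num_output if bias_term else 0
    return weights, biases


# conv1: num_output 20, kernel_size 5; one input channel is assumed.
weights, biases = conv_param_count(num_output=20, in_channels=1,
                                   kernel_h=5, kernel_w=5)
print(weights, biases)
```

    With these values the layer has 500 weights and 20 biases.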

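    lr_mult: learning-rate multiplier

    Placed inside param {}.

    This multiplier scales the learning rate. During training, the layer's parameters are updated with a learning rate equal to this multiplier times the base_lr value set in the solver.prototxt configuration file,

    i.e. learning rate = lr_mult * base_lr.

    If the layer has two lr_mult entries in the network definition, the first is the learning-rate multiplier for the filter weights and the second is the one for the bias term (the bias multiplier is usually twice the weight multiplier).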
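
    The relation between lr_mult and base_lr can be sketched in a few lines of Python. The base_lr value of 0.01 is an assumption for illustration, the function name effective_lr is hypothetical, and the sketch ignores any learning-rate schedule that the solver may apply during training.

```python
def effective_lr(base_lr, lr_mult):
    """Learning rate applied to a parameter blob: lr_mult * base_lr."""
    return lr_mult * base_lr


base_lr = 0.01  # assumed value of base_lr in solver.prototxt

weight_lr = effective_lr(base_lr, lr_mult=1)  # first param { } block
bias_lr = effective_lr(base_lr, lr_mult=2)    # second param { } block
print(weight_lr, bias_lr)
```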

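    convolution_param: the remaining parameters of the convolution layer

    Placed inside convolution_param {}.

    This block sets the remaining parameters of the convolution layer. Some of them must be set; others are optional because their default values can be used directly.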

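    • Required parameters

    1. num_output: the number of filters in this convolution layer

    2. kernel_size: the size of each filter (used on its own, this gives a filter of equal height and width; in the 2D case the two can also differ, in which case they are set with kernel_h and kernel_w instead)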
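    • Other, optional parameters

    1. stride: the step of the filter; defaults to 1

    2. pad: the amount of padding added to the input image; defaults to 0, i.e. no padding (note that padding may bring in uninformative values, so it seems less suitable when the input image is small)
    3. weight_filler: the initialization method for the weights, used as follows
      weight_filler {
            type: "xavier"  # xavier is one initialization algorithm; "gaussian" can also be used; the default is "constant", i.e. all zeros
      }
    4. bias_filler: the initialization method for the bias term
      bias_filler {
            type: "xavier"  # xavier is one initialization algorithm; "gaussian" can also be used; the default is "constant", i.e. all zeros
      }
    5. bias_term: whether to use a bias term; defaults to true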
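
    To show what the "xavier" and "constant" fillers do in practice, here is a minimal pure-Python sketch. It assumes the fan-in variant of Xavier initialization (uniform values in [-sqrt(3 / fan_in), sqrt(3 / fan_in)]), which is believed to be Caffe's default behavior but is stated here as an assumption. The function names xavier_fill and constant_fill are hypothetical, and this is not Caffe's implementation.

```python
import math
import random


def xavier_fill(num_output, in_channels, kernel_h, kernel_w, seed=0):
    """Draw filter weights uniformly from [-scale, scale] (fan-in Xavier)."""
    rng = random.Random(seed)
    # fan_in: number of inputs feeding one output value.
    fan_in = in_channels * kernel_h * kernel_w
    scale = math.sqrt(3.0 / fan_in)
    count = num_output * fan_in
    return [rng.uniform(-scale, scale) for _ in range(count)], scale


def constant_fill(count, value=0.0):
    """The "constant" filler: every entry gets the same value (0 by default)."""
    return [value] * count


# conv1: 20 filters of size 5x5; one input channel is assumed.
weights, scale = xavier_fill(num_output=20, in_channels=1, kernel_h=5, kernel_w=5)
biases = constant_fill(20)
print(len(weights), scale, biases[:3])
```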
  • Original article: https://www.cnblogs.com/lutingting/p/5240629.html