  • Paper notes: Squeeze-and-Excitation Networks

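    The authors note that existing work has shown the value of strengthening spatial encoding to improve a network's representational power. This paper instead focuses on channel-wise information and proposes the "Squeeze-and-Excitation" (SE) block, which explicitly makes the network attend to the relationships between channels ("adaptively recalibrates channel-wise feature responses by explicitly modelling interdependencies between channels"). SENets took first place in ILSVRC 2017 with a top-5 error of 2.251%.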

    Some earlier architecture designs focus on spatial dependencies:
    Inception architectures: embedding multi-scale processes in its modules
    ResNet, stacked hourglass
    Spatial attention: Spatial Transformer Networks

    The authors' design idea:
    we investigate a different aspect of architectural design - the channel relationship

    Our goal is to improve the representational power of a network by explicitly
    modelling the interdependencies between the channels of its
    convolutional features. To achieve this, we propose a mechanism that allows the network to perform feature recalibration, through which it can learn to use global information
    to selectively emphasise informative features and suppress
    less useful ones.
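    The authors want to recalibrate the convolutional features. Judging from the rest of the paper, my understanding is that this amounts to weighting the channels.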

    Related work
    Network architectures:
    VGGNets, Inception models, BN, ResNet, DenseNet, Dual Path Network
    Other approaches: grouped convolution, multi-branch convolution, cross-channel correlations
    This approach reflects an assumption that channel relationships can
    be formulated as a composition of instance-agnostic functions with local receptive fields.

    Attention, gating mechanisms

    SE block

    \( F_{tr}: X \rightarrow U, \quad X \in \mathbb{R}^{W' \times H' \times C'}, \quad U \in \mathbb{R}^{W \times H \times C} \)
    \( V = [v_1, v_2, \ldots, v_C] \) denotes the learned filter kernels, where \( v_c \) holds the parameters of the c-th filter. The output of \( F_{tr} \) is then \( U = [u_1, u_2, \ldots, u_C] \):

    \[ u_c = v_c * X = \sum_{s=1}^{C'} v_c^s * x^s \]

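    \( v_c^s \) is the 2D kernel that acts on a single input channel: each newly produced channel is the sum, over all input channels, of that channel convolved with its corresponding kernel. The relationships between channels are implicitly contained in \( v_c \), but this information is entangled with spatial correlation. The authors' goal is to make the network pay more attention to the useful information, which is done in two steps: Squeeze and Excitation.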
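To make the sum over input channels concrete, here is a minimal pure-Python sketch of one output channel \( u_c \). It assumes "valid" padding, stride 1 and no bias, and it uses cross-correlation (the usual deep-learning meaning of "convolution"). The function names and the toy values are illustrative and do not come from the paper.

```python
def correlate2d_valid(x, k):
    """'Valid' 2D cross-correlation of one channel x with one kernel k."""
    kh, kw = len(k), len(k[0])
    out_h = len(x) - kh + 1
    out_w = len(x[0]) - kw + 1
    return [
        [
            sum(k[a][b] * x[i + a][j + b] for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def output_channel(x, v_c):
    """u_c = sum over input channels s of (v_c^s * x^s)."""
    maps = [correlate2d_valid(x_s, v_s) for x_s, v_s in zip(x, v_c)]
    return [
        [sum(m[i][j] for m in maps) for j in range(len(maps[0][0]))]
        for i in range(len(maps[0]))
    ]


# Two input channels (2 x 2) and one filter with a 1 x 1 kernel per input channel.
x = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[10.0, 20.0], [30.0, 40.0]],
]
v_c = [[[1.0]], [[0.5]]]
print(output_channel(x, v_c))  # [[6.0, 12.0], [18.0, 24.0]]
```

Every output value already mixes both input channels, but the mixing weights are fixed after training and tied to the spatial kernel. That is the entanglement the SE block is meant to address.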
    Squeeze
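    Problem with existing networks: because convolution operates on a local receptive field, each convolutional unit can only use the spatial information inside that field.
    To mitigate this, the Squeeze operation encodes global spatial information into a channel descriptor. Concretely, this is done with global average pooling.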

    \[ z_c = F_{sq}(u_c) = \frac{1}{W \times H} \sum_{i=1}^{W} \sum_{j=1}^{H} u_c(i, j) \]

    In other words, the mean of each channel is taken as its global description.
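The Squeeze step can be sketched in a few lines of plain Python. This is only an illustration of the formula above for a single sample, with a feature map stored as nested lists. The function name `squeeze` and the toy values are assumptions made for this example.

```python
def squeeze(u):
    """Global average pooling.

    u is a list of C channels, each an H x W nested list.
    Returns the channel descriptor z as a list of C floats.
    """
    z = []
    for channel in u:
        height = len(channel)
        width = len(channel[0])
        total = sum(sum(row) for row in channel)
        z.append(total / (width * height))
    return z


# Two channels of size 2 x 2 (toy values chosen for illustration).
u = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[0.0, 0.0], [0.0, 8.0]],
]
print(squeeze(u))  # [2.5, 2.0]
```

Note that the second channel's single strong response is reduced to the same kind of scalar as the first channel's uniform activity: the descriptor keeps no spatial information.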
    Excitation: Adaptive Recalibration
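    To make use of the information obtained by Squeeze, a second operation is proposed. It must meet two requirements: it must be flexible enough to learn nonlinear relationships between channels, and it must be able to learn a non-mutually-exclusive relationship. My reading of that term is "non-exclusive": several channels can be emphasised at the same time, rather than a single one being selected as in a one-hot activation.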

    \[ s = F_{ex}(z, W) = \sigma(g(z, W)) = \sigma(W_2 \delta(W_1 z)) \]

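    \( \delta \) is ReLU and \( \sigma \) is the sigmoid, with \( W_1 \in \mathbb{R}^{\frac{C}{r} \times C} \) and \( W_2 \in \mathbb{R}^{C \times \frac{C}{r}} \). \( W_1 \) is the bottleneck that reduces the number of channels, \( W_2 \) restores it, and the reduction ratio \( r \) is set to 16. Finally, \( U \) is scaled by \( s \), which amounts to weighting each channel. This gives the output of the block.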
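The shapes of the two weight matrices give a quick estimate of what one SE block costs in parameters: roughly \( 2C^2 / r \). The sketch below computes this. The helper name `se_extra_params` and the channel counts in the loop are illustrative choices, and the bias option is an assumption (the formula above has no bias terms, whereas a fully connected layer in a framework usually adds them by default).

```python
def se_extra_params(channels, r=16, with_bias=False):
    """Parameters added by one SE block: W1 is (C/r x C), W2 is (C x C/r)."""
    hidden = channels // r  # bottleneck width C/r (assumes r divides C)
    params = hidden * channels + channels * hidden
    if with_bias:
        params += hidden + channels  # one bias per output unit of each FC layer
    return params


# Illustrative channel counts, not taken from the notes above.
for c in (64, 256, 2048):
    print(c, se_extra_params(c), se_extra_params(c, with_bias=True))
```

With r = 16, a 256-channel block adds 8192 weights (8464 with biases), so the cost grows quadratically with the channel count and is dominated by the widest stages.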

    \[ x_c = F_{scale}(u_c, s_c) = s_c \cdot u_c \]

    \( F_{scale} \) denotes channel-wise multiplication between the feature map \( u_c \in \mathbb{R}^{W \times H} \) and the scalar \( s_c \).
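The Excitation and scaling formulas can be checked by hand on a tiny case. The following is a minimal pure-Python sketch for a single sample, with C = 2 and r = 2 so that the bottleneck has one unit. The weights `w1` and `w2`, the descriptor `z` and the feature map `u` are made-up values for illustration, not learned parameters, and biases are omitted as in the formula.

```python
import math


def relu(v):
    return [max(0.0, a) for a in v]


def sigmoid(v):
    return [1.0 / (1.0 + math.exp(-a)) for a in v]


def matvec(m, v):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(w * a for w, a in zip(row, v)) for row in m]


def excite(z, w1, w2):
    """s = sigmoid(W2 * relu(W1 * z))."""
    return sigmoid(matvec(w2, relu(matvec(w1, z))))


def scale(u, s):
    """x_c = s_c * u_c for every channel c."""
    return [
        [[s_c * value for value in row] for row in channel]
        for channel, s_c in zip(u, s)
    ]


u = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[0.0, 0.0], [0.0, 8.0]],
]
z = [2.5, 2.0]       # per-channel means of u
w1 = [[1.0, -1.0]]   # shape (C/r, C) = (1, 2)
w2 = [[0.0], [2.0]]  # shape (C, C/r) = (2, 1)

s = excite(z, w1, w2)
x = scale(u, s)
print(s)
print(x)
```

Here W1 z = 0.5, so the pre-sigmoid values are [0.0, 1.0] and the gates are 0.5 and about 0.731. Both channels receive a weight between 0 and 1 independently of each other, which is the non-exclusive behaviour the sigmoid provides.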

    The activations act as channel weights
    adapted to the input-specific descriptor z. In this regard,
    SE blocks intrinsically introduce dynamics conditioned on
    the input, helping to boost feature discriminability

    1. Example
      [figure omitted]
      The SE block can easily be added to other network architectures.
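    2. MXNet code
    import mxnet as mx
    # Fragment from inside a residual unit: bn3 (shape N x num_filter x H x W), name,
    # num_filter and ratio (= 1/r, e.g. 1/16) are assumed to be defined by the enclosing code.
    # Squeeze: global average pooling (kernel is ignored when global_pool=True), then flatten to (N, num_filter).
    squeeze = mx.sym.Pooling(data=bn3, global_pool=True, kernel=(7, 7), pool_type='avg', name=name + '_squeeze')
    squeeze = mx.symbol.Flatten(data=squeeze, name=name + '_flatten')
    # Excitation: bottleneck FC + ReLU, then FC back to num_filter + sigmoid.
    excitation = mx.symbol.FullyConnected(data=squeeze, num_hidden=int(num_filter*ratio), name=name + '_excitation1')  # bottleneck
    excitation = mx.sym.Activation(data=excitation, act_type='relu', name=name + '_excitation1_relu')
    excitation = mx.symbol.FullyConnected(data=excitation, num_hidden=num_filter, name=name + '_excitation2')
    excitation = mx.sym.Activation(data=excitation, act_type='sigmoid', name=name + '_excitation2_sigmoid')
    # Scale: reshape the channel weights to (N, num_filter, 1, 1) and multiply channel-wise.
    bn3 = mx.symbol.broadcast_mul(bn3, mx.symbol.reshape(data=excitation, shape=(-1, num_filter, 1, 1)))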
    
    3. Network architecture
      [figure omitted]

    4. Experiments
      [figure omitted]

    References:
    [1] Hu, Jie, Li Shen, and Gang Sun. "Squeeze-and-excitation networks." arXiv preprint arXiv:1709.01507 (2017).


  • Original article: https://www.cnblogs.com/haoliuhust/p/8196572.html