  • [CBAM] 2018-ECCV-CBAM: Convolutional Block Attention Module (paper reading notes)

    CBAM

    2018-ECCV-CBAM: Convolutional block attention module

    Source: ChenBong, cnblogs (博客园)

    Introduction

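    The paper proposes an attention module that operates along both the channel and the spatial dimensions. It can be embedded in any CNN and significantly improves model performance at a small additional computational cost.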

    Motivation

    • Human vision attends to the important parts of an image rather than to every pixel

    Contribution

    • A simple and efficient attention module (CBAM) that can be embedded in any CNN architecture

    Method

    [Image omitted]

    Feature map: \(\mathbf{F} \in \mathbb{R}^{C \times H \times W}\)

    1D channel attention map: \(\mathbf{M}_{\mathbf{c}} \in \mathbb{R}^{C \times 1 \times 1}\)

    2D spatial attention map: \(\mathbf{M}_{\mathbf{s}} \in \mathbb{R}^{1 \times H \times W}\)

    The feature map is first multiplied by the 1D channel attention map and then by the 2D spatial attention map (\(\otimes\) denotes element-wise multiplication, with the attention values broadcast to the shape of the feature map):

    \(\mathbf{F}^{\prime}=\mathbf{M}_{\mathbf{c}}(\mathbf{F}) \otimes \mathbf{F}\)
    \(\mathbf{F}^{\prime\prime}=\mathbf{M}_{\mathbf{s}}\left(\mathbf{F}^{\prime}\right) \otimes \mathbf{F}^{\prime}\)
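The two-step refinement above relies on broadcasting: a map of shape (C, 1, 1) is stretched over H and W, and a map of shape (1, H, W) is stretched over C. The following is a minimal NumPy sketch of that data flow only. The function name `apply_cbam` and the constant placeholder attention maps are hypothetical and serve only to check shapes; they are not the learned maps from the paper.

```python
import numpy as np

def apply_cbam(F, channel_attn, spatial_attn):
    """Apply channel attention, then spatial attention, to one feature map.

    F            : array of shape (C, H, W)
    channel_attn : callable mapping (C, H, W) -> (C, 1, 1)
    spatial_attn : callable mapping (C, H, W) -> (1, H, W)
    """
    Mc = channel_attn(F)    # (C, 1, 1)
    F1 = Mc * F             # F'  : broadcast over H and W
    Ms = spatial_attn(F1)   # (1, H, W), computed from the refined F'
    F2 = Ms * F1            # F'' : broadcast over C
    return F2

# Placeholder attention maps (hypothetical): every weight is 0.5.
C, H, W = 4, 3, 5
F = np.ones((C, H, W))
half_channel = lambda x: np.full((x.shape[0], 1, 1), 0.5)
half_spatial = lambda x: np.full((1, x.shape[1], x.shape[2]), 0.5)

out = apply_cbam(F, half_channel, half_spatial)
print(out.shape)     # (4, 3, 5)
print(out[0, 0, 0])  # 0.25
```

The output keeps the input shape, which is what allows the module to be dropped between existing layers of a network.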

    Channel attention module

    [Image omitted]

    \(\begin{aligned}
    \mathbf{M}_{\mathbf{c}}(\mathbf{F}) &= \sigma(\operatorname{MLP}(\operatorname{AvgPool}(\mathbf{F})) + \operatorname{MLP}(\operatorname{MaxPool}(\mathbf{F}))) \\
    &= \sigma\left(\mathbf{W}_{\mathbf{1}}\left(\mathbf{W}_{\mathbf{0}}\left(\mathbf{F}_{\mathbf{avg}}^{\mathbf{c}}\right)\right) + \mathbf{W}_{\mathbf{1}}\left(\mathbf{W}_{\mathbf{0}}\left(\mathbf{F}_{\mathbf{max}}^{\mathbf{c}}\right)\right)\right)
    \end{aligned}\)

    where \(\mathbf{W}_{\mathbf{0}}\) and \(\mathbf{W}_{\mathbf{1}}\) are the weights of the two-layer shared MLP, which is applied to both the average-pooled and the max-pooled descriptor.
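The channel attention formula can be sketched in a few lines of NumPy. This is an illustrative sketch, not the authors' code: the function names are hypothetical, the MLP has no bias terms (matching the formula as written), and the ReLU after the first layer together with the hidden size C / r (reduction ratio r) follow the paper's description rather than anything stated in these notes.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(F, W0, W1):
    """Channel attention map M_c(F).

    F  : feature map, shape (C, H, W)
    W0 : first MLP layer, shape (C // r, C)   (r is an assumed reduction ratio)
    W1 : second MLP layer, shape (C, C // r)
    Returns an array of shape (C, 1, 1) with values in (0, 1).
    """
    f_avg = F.mean(axis=(1, 2))  # global average pooling over H, W -> (C,)
    f_max = F.max(axis=(1, 2))   # global max pooling over H, W     -> (C,)

    def mlp(v):
        # Shared two-layer MLP; ReLU on the hidden layer (assumption, see lead-in).
        return W1 @ np.maximum(W0 @ v, 0.0)

    Mc = sigmoid(mlp(f_avg) + mlp(f_max))  # same weights for both branches
    return Mc.reshape(-1, 1, 1)

# Usage with random weights, only to check shapes and value range.
rng = np.random.default_rng(0)
C, r = 8, 4
F = rng.standard_normal((C, 6, 6))
W0 = rng.standard_normal((C // r, C)) * 0.1
W1 = rng.standard_normal((C, C // r)) * 0.1
Mc = channel_attention(F, W0, W1)
print(Mc.shape)  # (8, 1, 1)
```

Sharing `W0` and `W1` between the two pooled descriptors keeps the parameter count at 2 * C * (C / r) regardless of how many pooling branches feed the MLP.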

    Spatial attention module

    [Image omitted]

    \(\begin{aligned}
    \mathbf{M}_{\mathbf{s}}(\mathbf{F}) &= \sigma\left(f^{7 \times 7}([\operatorname{AvgPool}(\mathbf{F}) ; \operatorname{MaxPool}(\mathbf{F})])\right) \\
    &= \sigma\left(f^{7 \times 7}\left(\left[\mathbf{F}_{\mathbf{avg}}^{\mathbf{s}} ; \mathbf{F}_{\mathbf{max}}^{\mathbf{s}}\right]\right)\right)
    \end{aligned}\)
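In the spatial attention formula, the pooling runs along the channel axis (so each pooled map has shape 1 × H × W), the two maps are concatenated, and \(f^{7 \times 7}\) is a convolution with a 7 × 7 kernel. A minimal NumPy sketch follows. The helper names are hypothetical, the convolution is a plain loop with no bias, and zero padding of 3 pixels is an assumption made here so that the output keeps the H × W size.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def conv2d_same(x, kernel):
    """Single-output-channel convolution (cross-correlation, as in CNNs).

    x      : shape (C_in, H, W)
    kernel : shape (C_in, k, k) with odd k
    Returns shape (H, W); zero padding keeps the spatial size (assumption).
    """
    _, h, w = x.shape
    k = kernel.shape[-1]
    p = k // 2
    xp = np.pad(x, ((0, 0), (p, p), (p, p)))  # zero-pad H and W only
    out = np.empty((h, w))
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(xp[:, i:i + k, j:j + k] * kernel)
    return out

def spatial_attention(F, kernel):
    """Spatial attention map M_s(F); F: (C, H, W), kernel: (2, 7, 7)."""
    f_avg = F.mean(axis=0, keepdims=True)             # (1, H, W)
    f_max = F.max(axis=0, keepdims=True)              # (1, H, W)
    stacked = np.concatenate([f_avg, f_max], axis=0)  # (2, H, W)
    return sigmoid(conv2d_same(stacked, kernel))[None]  # (1, H, W)

# Usage: constant input and a uniform kernel, only to check shapes.
F = np.ones((3, 9, 9))
kernel = np.full((2, 7, 7), 1.0 / 98.0)
Ms = spatial_attention(F, kernel)
print(Ms.shape)  # (1, 9, 9)
```

With this toy input the centre pixel sees a full 7 × 7 window, while border pixels see zero padding and therefore get a smaller attention value.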

    Arrangement of attention modules

    Three arrangements are compared: parallel, channel-first, and spatial-first.

    Of these, channel-first performs best.
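The three arrangements can be written side by side to make the difference concrete. This is a sketch under stated assumptions: the function names and the toy attention functions are hypothetical, and the `parallel` variant (both maps computed from the same input and multiplied together) is only one plausible reading, since these notes do not spell out how the parallel outputs are combined.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_first(F, channel_attn, spatial_attn):
    F1 = channel_attn(F) * F      # channel attention on the input
    return spatial_attn(F1) * F1  # spatial attention on the refined map

def spatial_first(F, channel_attn, spatial_attn):
    F1 = spatial_attn(F) * F      # spatial attention on the input
    return channel_attn(F1) * F1  # channel attention on the refined map

def parallel(F, channel_attn, spatial_attn):
    # Assumption: both maps are computed from the same input F.
    return channel_attn(F) * spatial_attn(F) * F

# Toy, parameter-free attention functions (hypothetical stand-ins).
toy_channel = lambda x: sigmoid(x.mean(axis=(1, 2), keepdims=True))  # (C, 1, 1)
toy_spatial = lambda x: sigmoid(x.mean(axis=0, keepdims=True))       # (1, H, W)

F = np.arange(24, dtype=float).reshape(2, 3, 4) / 10.0
cf = channel_first(F, toy_channel, toy_spatial)
sf = spatial_first(F, toy_channel, toy_spatial)
pl = parallel(F, toy_channel, toy_spatial)
print(cf.shape, sf.shape, pl.shape)
```

All three variants preserve the input shape; they differ only in which feature map each attention function sees, which is why the ordering can change the result.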

    Experiments

    Ablation studies

    Channel attention

    [Image omitted]

    Spatial attention

    [Image omitted]

    Arrangement

    [Image omitted]

    Main results

    Image Classification on ImageNet

    [Two images omitted]

    Object detection on COCO and VOC

    [Two images omitted]

    Attention Visualization (Grad-CAM)

    [Image omitted]

    Conclusion

    Summary

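    Pros:

    • The method is simple and uniform: (AvgPool + MaxPool) followed by an MLP or a convolution
    • Strong results (nearly 2 points of improvement on ResNet-50); a general-purpose module that is independent of architecture and task
    • The attention visualizations are well presented, and the softmax scores improve noticeably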

    To Read

    Reference

    Long-form article: feature visualization techniques (CAM) https://zhuanlan.zhihu.com/p/269702192

    CAM and Grad-CAM https://bindog.github.io/blog/2018/02/10/model-explanation/

  • Original post: https://www.cnblogs.com/chenbong/p/14609467.html