  • [Repost] How a convolutional neural network learns translation-invariant features

    After some thought, I do not believe that pooling operations are responsible for the translation-invariance property of CNNs. I believe that invariance (at least to translation) is due to the convolution filters (not specifically the pooling) and to the fully-connected layer.

    For instance, let's use Fig. 1 as a reference:

    The blue volume represents the input image, while the green and yellow volumes represent layer 1 and layer 2 output activation volumes (see CS231n Convolutional Neural Networks for Visual Recognition if you are not familiar with these volumes). At the end, we have a fully-connected layer that is connected to all activation points of the yellow volume.

    These volumes are built using a convolution followed by a pooling operation. The pooling operation reduces the height and width of these volumes, while the increasing number of filters in each layer increases the volume depth.
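
    As a rough sketch of how the volume shapes evolve, the snippet below tracks height, width and depth through two convolution-plus-pooling layers. The input size, the filter counts, the "same" padding and the 2x2 non-overlapping pooling are all assumptions chosen for illustration; none of them come from the figures.

```python
def conv_pool_shape(height, width, num_filters, pool=2):
    """Shape (H, W, D) after a 'same'-padded convolution followed by
    non-overlapping pool x pool pooling (both are assumptions).

    The convolution keeps H and W; the pooling divides them by `pool`;
    the depth equals the number of filters in the layer.
    """
    return height // pool, width // pool, num_filters


# Hypothetical sizes, for illustration only.
input_shape = (32, 32, 3)                                        # blue volume
layer1 = conv_pool_shape(input_shape[0], input_shape[1], 8)      # green volume
layer2 = conv_pool_shape(layer1[0], layer1[1], 16)               # yellow volume

# The fully-connected layer sees every activation of the last volume.
fc_inputs = layer2[0] * layer2[1] * layer2[2]

print(layer1)     # (16, 16, 8)
print(layer2)     # (8, 8, 16)
print(fc_inputs)  # 1024
```

    Height and width shrink at each layer while depth grows, which matches the shape of the volumes described above.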

    For the sake of the argument, let's suppose that we have very "ludic" filters, as shown in Fig. 2:

    • The first-layer filters (which generate the green volume) detect eyes, noses and other basic shapes (in real CNNs, first-layer filters match lines and very basic textures);
    • The second-layer filters (which generate the yellow volume) detect faces, legs and other objects that are aggregations of the first-layer filters. Again, this is only an example: real-life convolution filters may detect objects that have no meaning to humans.

    Now suppose that there is a face at one of the corners of the image (represented by two red points and a magenta point). The two eyes are detected by the first filter, and therefore produce two activations in the first slice of the green volume. The same happens for the nose, except that it is detected by the second filter and appears in the second slice. Next, the face filter finds two eyes and a nose next to each other and generates an activation in the yellow volume (within the same region as the face in the input image). Finally, the fully-connected layer detects that there is a face (and maybe a leg and an arm detected by other filters) and outputs that it has detected a human body.

    Now suppose that the face has moved to another corner of the image, as shown in Fig. 3:

    The same number of activations occurs in this example; however, they occur in a different region of the green and yellow volumes. Therefore, any activation point in the first slice of the yellow volume means that a face was detected, INDEPENDENTLY of the face location. The fully-connected layer is then responsible for "translating" a face and two arms into a human body. In both examples, an activation was received at one of the fully-connected neurons. However, in each example the activation path inside the FC layer was different, meaning that correct learning at the FC layer is essential to ensure the invariance property.
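
    The behaviour described here can be checked on a toy example. The sketch below is not a real CNN: it uses a single hand-written "face" filter (a made-up 2x3 pattern), a hypothetical 8x8 binary image, and a plain-Python cross-correlation. It only illustrates that the same filter fires with the same strength wherever the pattern is placed, while the position of the activation moves with it.

```python
def correlate2d(image, kernel):
    """Valid-mode 2D cross-correlation, the operation CNN layers call convolution."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def place(pattern, size, top, left):
    """Return a size x size image of zeros with `pattern` pasted at (top, left)."""
    image = [[0] * size for _ in range(size)]
    for a, row in enumerate(pattern):
        for b, value in enumerate(row):
            image[top + a][left + b] = value
    return image


def peak(response):
    """Return (value, (row, col)) of the strongest activation."""
    return max(
        (value, (i, j))
        for i, row in enumerate(response)
        for j, value in enumerate(row)
    )


# Made-up pattern: two "eyes" on top, a "nose" below. The filter matches it exactly.
face = [[1, 0, 1],
        [0, 1, 0]]
face_filter = face

response_a = correlate2d(place(face, 8, 0, 0), face_filter)  # face in the top-left corner
response_b = correlate2d(place(face, 8, 5, 4), face_filter)  # face moved towards the bottom right

value_a, where_a = peak(response_a)
value_b, where_b = peak(response_b)

# Index of each peak once the response map is flattened for an FC layer.
out_w = len(response_a[0])
index_a = where_a[0] * out_w + where_a[1]
index_b = where_b[0] * out_w + where_b[1]

print(value_a, where_a, index_a)  # 3 (0, 0) 0
print(value_b, where_b, index_b)  # 3 (5, 4) 34
```

    The peak value is identical in both cases, but it reaches the flattened FC input at a different index, which is the "different activation path" mentioned above.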

    It must be noted that the pooling operation only "compresses" the activation volumes: if there were no pooling in this example, an activation in the first slice of the yellow volume would still mean a face.
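
    A minimal sketch of this point, assuming 2x2 non-overlapping max pooling and a hypothetical 4x4 "face" slice with a single activation: the activation survives pooling, only on a coarser grid.

```python
def max_pool(grid, pool=2):
    """Non-overlapping pool x pool max pooling over a 2D list."""
    return [
        [
            max(grid[i + a][j + b] for a in range(pool) for b in range(pool))
            for j in range(0, len(grid[0]) - pool + 1, pool)
        ]
        for i in range(0, len(grid) - pool + 1, pool)
    ]


# Hypothetical "face" slice: one activation at row 3, column 2.
face_slice = [
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 1, 0],
]

pooled = max_pool(face_slice)
print(pooled)  # [[0, 0], [0, 1]]

# With or without pooling, the slice still reports "a face was found".
face_found_without_pooling = max(max(row) for row in face_slice) > 0
face_found_with_pooling = max(max(row) for row in pooled) > 0
print(face_found_without_pooling, face_found_with_pooling)  # True True
```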

    In conclusion, what makes a CNN invariant to object translation is the architecture of the neural network: the convolution filters and the fully-connected layer. Additionally, I believe that if a CNN is trained with faces shown only at one corner, the fully-connected layer may become insensitive to faces in other corners during the learning process.
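
    To make the last remark concrete, here is a small sketch with one fully-connected neuron and hand-set weights. No training is performed: the two weight vectors are assumptions standing in for what learning might produce when faces are seen everywhere versus in a single corner, and the 4x4 "face" slice is hypothetical.

```python
def fc_neuron(activations, weights, bias=0.0):
    """One fully-connected neuron (no non-linearity): weighted sum of all inputs."""
    return sum(a * w for a, w in zip(activations, weights)) + bias


def one_hot_slice(size, row, col):
    """Flattened size x size 'face' slice with a single activation at (row, col)."""
    return [
        1.0 if (i, j) == (row, col) else 0.0
        for i in range(size)
        for j in range(size)
    ]


size = 4
face_top_left = one_hot_slice(size, 0, 0)
face_bottom_right = one_hot_slice(size, 3, 3)

# Hand-set weights (assumptions, not learned values).
weights_seen_everywhere = [1.0] * (size * size)               # responds to a face anywhere
weights_seen_one_corner = [1.0] + [0.0] * (size * size - 1)   # responds to the top-left only

invariant = (fc_neuron(face_top_left, weights_seen_everywhere),
             fc_neuron(face_bottom_right, weights_seen_everywhere))
corner_only = (fc_neuron(face_top_left, weights_seen_one_corner),
               fc_neuron(face_bottom_right, weights_seen_one_corner))

print(invariant)    # (1.0, 1.0)
print(corner_only)  # (1.0, 0.0)
```

    The convolutional part produces a face activation in both cases; whether the output is invariant depends on the weights the fully-connected layer ends up with.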

     

    Source:

    https://www.quora.com/How-is-a-convolutional-neural-network-able-to-learn-invariant-features/answer/Jean-Da-Rolt

     

    An Intuitive Explanation of Convolutional Neural Networks

  • Original article: https://www.cnblogs.com/objectDetect/p/5992283.html