  • Caffe notes: learning from the example programs

    While working through the notebook example "Classification with HDF5 data" I ran into a few problems, so I went through the model file carefully.

    One point in the model definition is easy to misunderstand: data flows through the directed graph from bottom to top, not from top to bottom.

    A layer is defined with the following fields:

    name:   the layer's name
    type:   the layer's type
    top:    the layer's output blob(s) (where data exits)
    bottom: the layer's input blob(s) (where data enters)

    Each layer type defines three critical computations: setup, forward, and backward.

    • Setup: initialize the layer and its connections once at model initialization.
    • Forward: given input from the bottom, compute the output and send it to the top.
    • Backward: given the gradient w.r.t. the top output, compute the gradient w.r.t. the input and send it to the bottom. A layer with parameters computes the gradient w.r.t. its parameters and stores it internally.
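
    As a rough illustration of these three computations, here is a toy layer sketched in plain Python. This is only a conceptual sketch, not Caffe's actual layer interface: a layer that adds a learnable bias to its bottom blob.

    import numpy as np

    class ToyBiasLayer:
        def setup(self, dim):
            # Setup: runs once at model initialization; allocate the layer's parameters.
            self.b = np.zeros(dim)

        def forward(self, bottom):
            # Forward: take the bottom blob, compute the top blob.
            return bottom + self.b

        def backward(self, top_diff):
            # Backward: given the gradient w.r.t. the top, store the parameter
            # gradient internally and pass the gradient w.r.t. the bottom down.
            self.b_diff = top_diff.sum(axis=0)
            return top_diff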

    From: caffe_root/examples/hdf5_classification/train_val.prototxt

    # From: caffe_root/examples/hdf5_classification/train_val.prototxt
    # logistic-regression network
    name: "LogisticRegressionNet"
    # training data layer: reads HDF5 data
    layer {
      name: "data"
      type: "HDF5Data"
      top: "data"
      top: "label"
      include {
        phase: TRAIN
      }
      hdf5_data_param {
        source: "examples/hdf5_classification/data/train.txt"
        batch_size: 10
      }
    }
    # test data layer: reads HDF5 data
    layer {
      name: "data"
      type: "HDF5Data"
      top: "data"
      top: "label"
      include {
        phase: TEST
      }
      hdf5_data_param {
        source: "examples/hdf5_classification/data/test.txt"
        batch_size: 10
      }
    }
    # fully connected (inner product) layer
    layer {
      name: "fc1"
      type: "InnerProduct"
      bottom: "data"
      top: "fc1"
      param {
        lr_mult: 1      # learning rate multiplier for the weights
        decay_mult: 1   # weight decay multiplier for the weights
      }
      param {
        lr_mult: 2      # learning rate multiplier for the biases
        decay_mult: 0   # weight decay multiplier for the biases
      }
      inner_product_param {
        num_output: 2
        weight_filler {
          type: "gaussian"
          std: 0.01
        }
        bias_filler {
          type: "constant"
          value: 0
        }
      }
    }
    # loss layer
    layer {
      name: "loss"
      type: "SoftmaxWithLoss"    # how the loss is computed
      bottom: "fc1"
      bottom: "label"
      top: "loss"
    }
    # accuracy layer
    layer {
      name: "accuracy"
      type: "Accuracy"
      bottom: "fc1"
      bottom: "label"
      top: "accuracy"
      include {
        phase: TEST
      }
    }
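
    For reference, this net is normally trained through a solver rather than run directly. A minimal pycaffe sketch, assuming the example's solver file sits at examples/hdf5_classification/solver.prototxt:

    import caffe

    caffe.set_mode_cpu()
    # The solver prototxt points at the train_val.prototxt above and sets the
    # optimization hyperparameters (learning rate, number of iterations, ...).
    solver = caffe.get_solver('examples/hdf5_classification/solver.prototxt')
    solver.solve()                              # runs the TRAIN/TEST phases defined above
    print(solver.net.blobs['loss'].data)        # loss blob of the training net after solving

    The same can be done from the shell with a built caffe binary: ./build/tools/caffe train -solver=examples/hdf5_classification/solver.prototxt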

    Details of the model, layer by layer (honestly too lazy to translate these):

    1. HDF5 Input:

    HDF5 Input

        LayerType: HDF5_DATA
        Parameters
            Required
                source: the name of the file to read from
                batch_size
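
    The source here is not an HDF5 file itself but a text file listing HDF5 file paths, one per line; each listed file holds datasets named after the layer's top blobs ("data" and "label"). A minimal sketch of producing such input with h5py (the file names and toy data are made up for illustration):

    import h5py
    import numpy as np

    X = np.random.randn(1000, 10).astype(np.float32)    # 1000 samples, 10 features
    y = (X.sum(axis=1) > 0).astype(np.float32)          # toy binary labels

    # one HDF5 file holding the 'data' and 'label' datasets
    with h5py.File('examples/hdf5_classification/data/train.h5', 'w') as f:
        f.create_dataset('data', data=X)
        f.create_dataset('label', data=y)

    # the 'source' text file: one HDF5 file path per line
    with open('examples/hdf5_classification/data/train.txt', 'w') as f:
        f.write('examples/hdf5_classification/data/train.h5\n')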

    2. Inner Product (fully connected layer; computes vector inner products):

    Inner Product

        LayerType: INNER_PRODUCT
        CPU implementation: ./src/caffe/layers/inner_product_layer.cpp
        CUDA GPU implementation: ./src/caffe/layers/inner_product_layer.cu
        Parameters (InnerProductParameter inner_product_param)
            Required
                num_output (c_o): the number of filters
            Strongly recommended
                weight_filler [default type: 'constant' value: 0]
            Optional
                bias_filler [default type: 'constant' value: 0]
                bias_term [default true]: specifies whether to learn and apply a set of additive biases to the filter outputs
        Input
            n * c_i * h_i * w_i
        Output
            n * c_o * 1 * 1

        Sample (note: this sample uses the older prototxt syntax, where blobs_lr / weight_decay play the role of lr_mult / decay_mult in the net above)

        layers {
          name: "fc8"
          type: INNER_PRODUCT
          blobs_lr: 1          # learning rate multiplier for the filters
          blobs_lr: 2          # learning rate multiplier for the biases
          weight_decay: 1      # weight decay multiplier for the filters
          weight_decay: 0      # weight decay multiplier for the biases
          inner_product_param {
            num_output: 1000
            weight_filler {
              type: "gaussian"
              std: 0.01
            }
            bias_filler {
              type: "constant"
              value: 0
            }
          }
          bottom: "fc7"
          top: "fc8"
        }

    The INNER_PRODUCT layer (also usually referred to as the fully connected layer) treats the input as a simple vector and produces an output in the form of a single vector (with the blob's height and width set to 1).
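
    In other words, whatever the input blob's shape, it is flattened into one vector per sample before the matrix multiply. A rough numpy picture of the forward computation (shapes only; not Caffe's implementation):

    import numpy as np

    n, c_i, h_i, w_i, c_o = 10, 3, 8, 8, 1000             # batch, input dims, num_output
    bottom = np.random.randn(n, c_i, h_i, w_i)

    W = np.random.randn(c_o, c_i * h_i * w_i) * 0.01       # weight_filler: gaussian, std 0.01
    b = np.zeros(c_o)                                       # bias_filler: constant 0

    flat = bottom.reshape(n, -1)       # each input treated as a simple vector
    top = flat @ W.T + b               # output: n x c_o (i.e. n * c_o * 1 * 1)
    print(top.shape)                   # (10, 1000)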

    3. Loss (the most basic loss function):

    Loss

    In Caffe, as in most of machine learning, learning is driven by a loss function (also known as an error, cost, or objective function). A loss function specifies the goal of learning by mapping parameter settings (i.e., the current network weights) to a scalar value specifying the "badness" of these parameter settings. Hence, the goal of learning is to find a setting of the weights that minimizes the loss function.

    The loss in Caffe is computed by the Forward pass of the network. Each layer takes a set of input (bottom) blobs and produces a set of output (top) blobs. Some of these layers' outputs may be used in the loss function. A typical choice of loss function for one-versus-all classification tasks is the SOFTMAX_LOSS function, used in a network definition as follows, for example:

    layers {
      name: "loss"
      type: SOFTMAX_LOSS
      bottom: "pred"
      bottom: "label"
      top: "loss"
    }

    With SOFTMAX_LOSS, the top blob is a scalar (dimensions 1×1×1×1) which averages the loss (computed from predicted labels pred and actual labels label) over the entire mini-batch.
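
    What that scalar actually is can be written out in a few lines of numpy: roughly, the softmax loss is the mean negative log-probability assigned to the true label. A sketch, not Caffe's implementation:

    import numpy as np

    def softmax_loss(pred, label):
        # pred: n x k class scores (the "pred" bottom), label: n integer class ids
        shifted = pred - pred.max(axis=1, keepdims=True)       # for numerical stability
        prob = np.exp(shifted) / np.exp(shifted).sum(axis=1, keepdims=True)
        nll = -np.log(prob[np.arange(len(label)), label.astype(int)])
        return nll.mean()                                      # scalar, averaged over the mini-batch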

    4. Accuracy (plays a role similar to the loss, but it only scores predictions against the labels; it is not a true loss and has no backward step):

    Accuracy and Top-k

    ACCURACY scores the output as the accuracy of output with respect to target – it is not actually a loss and has no backward step.

    Activation / Neuron Layers

    In general, activation / neuron layers are element-wise operators, taking one bottom blob and producing one top blob of the same size. In the layers below, we will ignore the input and output sizes as they are identical:

        Input
            n * c * h * w
        Output
            n * c * h * w
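
    For comparison with the loss sketch above, the ACCURACY computation is just the fraction of samples whose top-scoring class matches the label; a short numpy sketch:

    import numpy as np

    def accuracy(pred, label):
        # pred: n x k class scores (bottom "fc1"), label: n integer class ids (bottom "label")
        return float((pred.argmax(axis=1) == label).mean())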

    Well, this is the simplest model of all the examples. It's almost dawn; time to go wash my face.

    References:

    1.caffe.berkeleyvision.org/tutorial/forward_backward.html

    2.caffe.berkeleyvision.org/tutorial/layers.html

    3.nbviewer.ipython.org/github/BVLC/caffe/blob/master/examples/hdf5_classification.ipynb
