Using Caffe to classify your own data

    This post walks through a single example: training AlexNet on your own data.

    Running AlexNet on your own data
    Reference 1: http://blog.csdn.net/gybheroin/article/details/54095399
    Reference 2: http://www.cnblogs.com/alexcai/p/5469436.html
    1. Prepare the data
    Create a folder under data in the Caffe root; any name works, I called mine food. Inside food, create two folders to hold the train and val data.
    Under train, create one folder per class to be distinguished (toast, pizza, etc.) and put the corresponding images in each. (Alternatively, all images can sit in one folder, with the classes distinguished only in the label file.)
    ./data (food) -> ./data/food (train val) -> ./data/food/train (pizza sandwich etc.) ./data/food/val (pizza sandwich etc.)
    Then create train.txt, val.txt, and category.txt under the food directory (a generation sketch follows at the end of this step).
    --- train.txt and val.txt look like:
    toast/62.jpg 0
    toast/107.jpg 0
    toast/172.jpg 0
    pizza/62.jpg 1
    pizza/107.jpg 1
    pizza/172.jpg 1
    --- category.txt looks like:
    0 toast
    1 pizza
    
    
    Note: the images must be split into two sets, a training set (train) and a test set (val). The train:test ratio is typically at least 5:1, and no class should have too few images; here I used roughly 5000 training images plus 1000 test images per class.
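    
    The label files can be written by hand, but with thousands of images a small script helps. A minimal sketch, assuming the data/food/train/<class>/*.jpg layout above (val.txt is built the same way from val/):
    
    #!/usr/bin/env sh
    # Sketch: build train.txt ("relative/path label") and category.txt
    # ("label class") from the folder layout above. Assumes .jpg images.
    cd data/food
    : > train.txt
    : > category.txt
    label=0
    for dir in train/*/; do
      class=$(basename "$dir")
      echo "$label $class" >> category.txt
      for img in "$dir"*.jpg; do
        echo "$class/$(basename "$img") $label" >> train.txt
      done
      label=$((label + 1))
    done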
    
    2. Build the LMDB (optional: you can also skip LMDB and train directly on images by changing the data layer's type in the train prototxt to type: "ImageData"; see section 4)
    After a successful build, the bin folder under the Caffe root contains convert_imageset.exe (build/tools/convert_imageset on Linux), which converts the data. Create a script create_foodnet.sh in the food folder, modeled on examples/imagenet/create_imagenet.sh:
    
    #!/usr/bin/env sh
    # Create the imagenet lmdb inputs
    # N.B. set the path to the imagenet train + val data dirs
    set -e
    
    EXAMPLE=data/food  # the path of generated lmdb data
    DATA=data/food  # the txt path of train and test data
    TOOLS=build/tools
    
    TRAIN_DATA_ROOT=/path/to/imagenet/train/    # set to your train image root, e.g. data/food/train/
    VAL_DATA_ROOT=/path/to/imagenet/val/        # set to your val image root, e.g. data/food/val/
    
    # Set RESIZE=true to resize the images to 256x256. Leave as false if images have
    # already been resized using another tool.
    RESIZE=false
    if $RESIZE; then
      RESIZE_HEIGHT=256
      RESIZE_WIDTH=256
    else
      RESIZE_HEIGHT=0
      RESIZE_WIDTH=0
    fi
    
    if [ ! -d "$TRAIN_DATA_ROOT" ]; then
      echo "Error: TRAIN_DATA_ROOT is not a path to a directory: $TRAIN_DATA_ROOT"
      echo "Set the TRAIN_DATA_ROOT variable in create_imagenet.sh to the path" 
           "where the ImageNet training data is stored."
      exit 1
    fi
    
    if [ ! -d "$VAL_DATA_ROOT" ]; then
      echo "Error: VAL_DATA_ROOT is not a path to a directory: $VAL_DATA_ROOT"
      echo "Set the VAL_DATA_ROOT variable in create_imagenet.sh to the path" 
           "where the ImageNet validation data is stored."
      exit 1
    fi
    
    echo "Creating train lmdb..."
    
    GLOG_logtostderr=1 $TOOLS/convert_imageset \
        --resize_height=$RESIZE_HEIGHT \
        --resize_width=$RESIZE_WIDTH \
        --shuffle \
        $TRAIN_DATA_ROOT \
        $DATA/train.txt \
        $EXAMPLE/food_train_lmdb   # path of the generated lmdb
    
    echo "Creating val lmdb..."
    
    GLOG_logtostderr=1 $TOOLS/convert_imageset \
        --resize_height=$RESIZE_HEIGHT \
        --resize_width=$RESIZE_WIDTH \
        --shuffle \
        $VAL_DATA_ROOT \
        $DATA/val.txt \
        $EXAMPLE/food_val_lmdb     # path of the generated lmdb
    
    echo "Done."
    
    
    3. Generate the mean binaryproto
    
    Now compute the image mean (mean_file) from the training LMDB; it is used during training:
    EXAMPLE=data/food
    DATA=data/food
    TOOLS=build/tools
    $TOOLS/compute_image_mean $EXAMPLE/food_train_lmdb $DATA/foodnet_mean.binaryproto
    
    4. Modify the solver and the train network
    
    ------ solver.prototxt explained:
    # Number of test iterations. Each test iteration pushes one batch of images
    # through the network, so to cover every image in the validation set this
    # value times the TEST batch_size should equal the number of validation
    # images: test_iter * batch_size = val_num.
    test_iter: 299  
    
    # How many training iterations between tests. One iteration is a full
    # forward + backward pass over one batch. With 224 here, accuracy is
    # validated every 224 iterations. Typically the whole training set should
    # pass through the network once between tests, so this value times the
    # TRAIN data layer's batch_size should equal the number of training
    # images: test_interval * batch_size = train_num.
    test_interval: 224
    
    # Base learning rate. Too high a rate can leave the loss stuck at a constant
    # value (e.g. 86.33333) or prevent convergence entirely; too low a rate makes
    # convergence slow and can cause vanishing gradients. 0.01 is a common default.
    base_lr: 0.01  
    display: 20  
    max_iter: 6720  
    lr_policy: "step"  
    gamma: 0.1  
    momentum: 0.9   # momentum: the weight given to the previous parameter update
    weight_decay: 0.0001  
    stepsize: 2218  # lower the learning rate every stepsize iterations
    snapshot: 224   # how often to save a snapshot (.caffemodel) of the weights
    snapshot_prefix: "food/food_net/food_alex_snapshot"     # snapshot path and prefix
    solver_mode: GPU  
    net: "train_val.prototxt"  # 网络结构的文件路径。
    solver_type: SGD  
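    
    As a concrete check of the two formulas above, here is the arithmetic for the dataset sizes mentioned in step 1 (2 classes with ~5000 train and ~1000 val images each; a sketch using this post's example numbers, which do not match the solver values above):
    
    # val_num   = 2 * 1000 = 2000,  TEST batch_size  = 50
    #   => test_iter     = 2000 / 50   = 40
    # train_num = 2 * 5000 = 10000, TRAIN batch_size = 256
    #   => test_interval = 10000 / 256 ≈ 39, round up to 40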
    
    ----- train_val.prototxt changes
    ###### Data layer taking raw images (the main change is the data layer: raw images as input)
    layer {
      name: "data"
      type: "ImageData" ###注意是ImageData,可以直接使用图像训练
      top: "data"
      top: "label"
      include {
        phase: TRAIN
      }
    
      image_data_param { ###
        source: "examples/finetune_myself/train.txt"  ###
        batch_size: 50
        new_height: 256 ###
        new_width: 256 ###
      }
    }
      
    ##### Data layer taking LMDB input (the LMDB built above as input)
    layer {
      name: "data"
      type: "Data" ###这里是data,使用转换为lmdb的图像之后训练
      top: "data"
      top: "label"
      include {
        phase: TRAIN
      }
    
      data_param {  ###
        source: "examples/imagenet/car_train_lmdb" ###
        batch_size: 256
        backend: LMDB ###
      }
    }
      
    The full network definition:
    name: "AlexNet"
    layer {
      name: "data"
      type: "Data"
      top: "data"
      top: "label"
      include {
        phase: TRAIN
      }
      transform_param {
        mirror: true
        crop_size: 227
        mean_file: "mimg_mean.binaryproto" #均值文件
      }
      data_param {
        source: "mtrainldb"  #训练数据
        batch_size: 256
        backend: LMDB
      }
    }
    layer {
      name: "data"
      type: "Data"
      top: "data"
      top: "label"
      include {
        phase: TEST
      }
      transform_param {
        mirror: false
        crop_size: 227
        mean_file: "mimg_mean.binaryproto"  #均值文件
      }
      data_param {
        source: "mvaldb"   #验证数据
        batch_size: 50
        backend: LMDB
      }
    }
    layer {
      name: "conv1"
      type: "Convolution"
      bottom: "data"
      top: "conv1"
      param {
        lr_mult: 1
        decay_mult: 1
      }
      param {
        lr_mult: 2
        decay_mult: 0
      }
      convolution_param {
        num_output: 96
        kernel_size: 11
        stride: 4
        weight_filler {
          type: "gaussian"
          std: 0.01
        }
        bias_filler {
          type: "constant"
          value: 0
        }
      }
    }
    layer {
      name: "relu1"
      type: "ReLU"
      bottom: "conv1"
      top: "conv1"
    }
    layer {
      name: "norm1"
      type: "LRN"
      bottom: "conv1"
      top: "norm1"
      lrn_param {
        local_size: 5
        alpha: 0.0001
        beta: 0.75
      }
    }
    layer {
      name: "pool1"
      type: "Pooling"
      bottom: "norm1"
      top: "pool1"
      pooling_param {
        pool: MAX
        kernel_size: 3
        stride: 2
      }
    }
    layer {
      name: "conv2"
      type: "Convolution"
      bottom: "pool1"
      top: "conv2"
      param {
        lr_mult: 1
        decay_mult: 1
      }
      param {
        lr_mult: 2
        decay_mult: 0
      }
      convolution_param {
        num_output: 256
        pad: 2
        kernel_size: 5
        group: 2
        weight_filler {
          type: "gaussian"
          std: 0.01
        }
        bias_filler {
          type: "constant"
          value: 0.1
        }
      }
    }
    layer {
      name: "relu2"
      type: "ReLU"
      bottom: "conv2"
      top: "conv2"
    }
    layer {
      name: "norm2"
      type: "LRN"
      bottom: "conv2"
      top: "norm2"
      lrn_param {
        local_size: 5
        alpha: 0.0001
        beta: 0.75
      }
    }
    layer {
      name: "pool2"
      type: "Pooling"
      bottom: "norm2"
      top: "pool2"
      pooling_param {
        pool: MAX
        kernel_size: 3
        stride: 2
      }
    }
    layer {
      name: "conv3"
      type: "Convolution"
      bottom: "pool2"
      top: "conv3"
      param {
        lr_mult: 1
        decay_mult: 1
      }
      param {
        lr_mult: 2
        decay_mult: 0
      }
      convolution_param {
        num_output: 384
        pad: 1
        kernel_size: 3
        weight_filler {
          type: "gaussian"
          std: 0.01
        }
        bias_filler {
          type: "constant"
          value: 0
        }
      }
    }
    layer {
      name: "relu3"
      type: "ReLU"
      bottom: "conv3"
      top: "conv3"
    }
    layer {
      name: "conv4"
      type: "Convolution"
      bottom: "conv3"
      top: "conv4"
      param {
        lr_mult: 1
        decay_mult: 1
      }
      param {
        lr_mult: 2
        decay_mult: 0
      }
      convolution_param {
        num_output: 384
        pad: 1
        kernel_size: 3
        group: 2
        weight_filler {
          type: "gaussian"
          std: 0.01
        }
        bias_filler {
          type: "constant"
          value: 0.1
        }
      }
    }
    layer {
      name: "relu4"
      type: "ReLU"
      bottom: "conv4"
      top: "conv4"
    }
    layer {
      name: "conv5"
      type: "Convolution"
      bottom: "conv4"
      top: "conv5"
      param {
        lr_mult: 1
        decay_mult: 1
      }
      param {
        lr_mult: 2
        decay_mult: 0
      }
      convolution_param {
        num_output: 256
        pad: 1
        kernel_size: 3
        group: 2
        weight_filler {
          type: "gaussian"
          std: 0.01
        }
        bias_filler {
          type: "constant"
          value: 0.1
        }
      }
    }
    layer {
      name: "relu5"
      type: "ReLU"
      bottom: "conv5"
      top: "conv5"
    }
    layer {
      name: "pool5"
      type: "Pooling"
      bottom: "conv5"
      top: "pool5"
      pooling_param {
        pool: MAX
        kernel_size: 3
        stride: 2
      }
    }
    layer {
      name: "fc6"
      type: "InnerProduct"
      bottom: "pool5"
      top: "fc6"
      param {
        lr_mult: 1
        decay_mult: 1
      }
      param {
        lr_mult: 2
        decay_mult: 0
      }
      inner_product_param {
        num_output: 4096
        weight_filler {
          type: "gaussian"
          std: 0.005
        }
        bias_filler {
          type: "constant"
          value: 0.1
        }
      }
    }
    layer {
      name: "relu6"
      type: "ReLU"
      bottom: "fc6"
      top: "fc6"
    }
    layer {
      name: "drop6"
      type: "Dropout"
      bottom: "fc6"
      top: "fc6"
      dropout_param {
        dropout_ratio: 0.5
      }
    }
    layer {
      name: "fc7"
      type: "InnerProduct"
      bottom: "fc6"
      top: "fc7"
      param {
        lr_mult: 1
        decay_mult: 1
      }
      param {
        lr_mult: 2
        decay_mult: 0
      }
      inner_product_param {
        num_output: 4096
        weight_filler {
          type: "gaussian"
          std: 0.005
        }
        bias_filler {
          type: "constant"
          value: 0.1
        }
      }
    }
    layer {
      name: "relu7"
      type: "ReLU"
      bottom: "fc7"
      top: "fc7"
    }
    layer {
      name: "drop7"
      type: "Dropout"
      bottom: "fc7"
      top: "fc7"
      dropout_param {
        dropout_ratio: 0.5
      }
    }
    layer {
      name: "fc8"
      type: "InnerProduct"
      bottom: "fc7"
      top: "fc8"
      param {
        lr_mult: 1
        decay_mult: 1
      }
      param {
        lr_mult: 2
        decay_mult: 0
      }
      inner_product_param {
        num_output: 2       # note: change this to your number of classes
        weight_filler {
          type: "gaussian"
          std: 0.01
        }
        bias_filler {
          type: "constant"
          value: 0
        }
      }
    }
    layer {
      name: "accuracy"
      type: "Accuracy"
      bottom: "fc8"
      bottom: "label"
      top: "accuracy"
      include {
        phase: TEST
      }
    }
    layer {
      name: "loss"
      type: "SoftmaxWithLoss"
      bottom: "fc8"
      bottom: "label"
      top: "loss"
    }
    
    Run the following script to start training:
    #!/usr/bin/env sh
    set -e
    
    ./build/tools/caffe train \
        --solver=food/food_alexnet/solver.prototxt
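    
    Training can later be resumed from a saved snapshot with caffe's --snapshot flag; a sketch, assuming the snapshot_prefix set in the solver above (the iteration number 448 is hypothetical):
    
    ./build/tools/caffe train \
        --solver=food/food_alexnet/solver.prototxt \
        --snapshot=food/food_net/food_alex_snapshot_iter_448.solverstate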
        
    5. Testing
    Testing likewise needs a class label file, category.txt, with the same content as above. Modify deploy.prototxt accordingly and run:
    ./bin/classification "food/foodnet/deploy.prototxt" "food/foodnet/food_iter_100000.caffemodel" "ming_mean.binaryproto" "test001.jpg"
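    
    If you have no custom classification binary, the cpp_classification example bundled with Caffe does the same job; a sketch using this post's paths (the mean file name assumes the one generated in step 3, and note the extra label-file argument):
    
    ./build/examples/cpp_classification/classification.bin \
        food/foodnet/deploy.prototxt \
        food/foodnet/food_iter_100000.caffemodel \
        data/food/foodnet_mean.binaryproto \
        data/food/category.txt \
        test001.jpg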
    
    ------------------------------------    
    ---------------- FineTune:
    http://www.cnblogs.com/denny402/p/5074212.html
    http://www.cnblogs.com/alexcai/p/5469478.html
    1. When fine-tuning, the final fully connected layer must be renamed and its class count changed, and its learning rate should be relatively large, because only this layer's weights are retrained while all the others are already trained (see the prototxt sketch after these notes).
    2. When launching training, pass the model to be fine-tuned as the initial weights:
    ./build/tools/caffe train -solver examples/money_test/fine_tune/solver.prototxt -weights models/bvlc_reference_caffenet/bvlc_reference_caffenet.caffemodel
    Here -weights points to the pretrained CaffeNet model.
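    
    A minimal prototxt sketch of point 1, with a hypothetical new name fc8_food replacing fc8 (the bottom of the accuracy and loss layers must be updated to match). The boosted lr_mult values make the freshly initialized layer learn much faster than the copied, pretrained layers:
    
    layer {
      name: "fc8_food"        # renamed, so the pretrained fc8 weights are not copied
      type: "InnerProduct"
      bottom: "fc7"
      top: "fc8_food"
      param { lr_mult: 10 decay_mult: 1 }   # 10x base lr for the new weights
      param { lr_mult: 20 decay_mult: 0 }   # 20x base lr for the new biases
      inner_product_param {
        num_output: 2         # your class count
        weight_filler { type: "gaussian" std: 0.01 }
        bias_filler { type: "constant" value: 0 }
      }
    }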
Original post: https://www.cnblogs.com/hansjorn/p/7496327.html