  • Intel Caffe vs. Native Caffe

    1. First install Docker, then pull the Intel Caffe image:

    $ docker pull bvlc/caffe:intel
    Try running it:
    $ docker run -it bvlc/caffe:intel /bin/bash
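
    To confirm the container works, you can also query the Caffe version directly (assuming the intel tag puts the caffe binary on PATH, as the cpu tag used later in this post does):

    $ docker run -it bvlc/caffe:intel caffe --version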

    2. Clone the Intel Caffe source:

    git clone https://github.com/intel/caffe
    cd caffe
    git checkout 1.0
    

    Or download the source archive:

    wget https://github.com/intel/caffe/archive/1.0.zip
    unzip 1.0.zip
    

    3. Build Intel Caffe

    sudo apt-get install -y python-dev libboost-all-dev cmake python-numpy \
      libgflags-dev libgoogle-glog-dev libprotobuf-dev protobuf-compiler \
      libhdf5-serial-dev liblmdb-dev libleveldb-dev libsnappy-dev libopencv-dev
    
    cp Makefile.config.example Makefile.config
    # Adjust Makefile.config (for example, if using Anaconda Python, or if cuDNN is desired)
    

     vim Makefile.config

    # Intel(r) Machine Learning Scaling Library (uncomment to build with MLSL)
    USE_MLSL := 1
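
    For a CPU-only build you would also typically uncomment the CPU_ONLY switch in the same file (present in the stock Makefile.config.example; verify against your copy):

    # CPU-only switch (uncomment to build without CUDA support)
    CPU_ONLY := 1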

    Build with multiple threads:

    $ make -j <number_of_physical_cores> -k
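
    If you are unsure of the physical core count, it can be derived from lscpu (a sketch; it counts unique core,socket pairs):

    # Count physical cores (one line per unique core,socket pair), then build
    CORES=$(lscpu -p=Core,Socket | grep -v '^#' | sort -u | wc -l)
    make -j "$CORES" -k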

    During the build, MKL and MKL-DNN are downloaded automatically:

    Download mklml_lnx_2018.0.1.20171227.tgz
    git clone --no-checkout https://github.com/01org/mkl-dnn.git /home/ubuntu/yuntong/caffe-master/external/mkldnn/tmp
    

    Test the build:

    make test
    make runtest

    4. Download and create the MNIST dataset:

    cd $CAFFE_ROOT
    ./data/mnist/get_mnist.sh
    ./examples/mnist/create_mnist.sh
    Creating lmdb...
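
    To verify that the LMDBs were created (these are the paths create_mnist.sh produces and lenet_train_test.prototxt expects):

    ls examples/mnist/mnist_train_lmdb examples/mnist/mnist_test_lmdb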

    5. Edit the training script to point at the Caffe binary path inside the Intel Caffe Docker image:

    vim /home/ubuntu/yuntong/caffe-1.0/examples/mnist/train_lenet.sh

    #!/usr/bin/env sh
    set -e
    #./build/tools/caffe train --solver=examples/mnist/lenet_solver.prototxt $@
    /opt/caffe/build/tools/caffe train --solver=examples/mnist/lenet_solver.prototxt $@

    6. Set CPU mode: vim examples/mnist/lenet_solver.prototxt

    # solver mode: CPU or GPU
    #solver_mode: GPU
    solver_mode: CPU
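
    The same edit can be scripted, e.g. with sed (a one-liner sketch):

    sed -i 's/^solver_mode: GPU/solver_mode: CPU/' examples/mnist/lenet_solver.prototxt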

    7. Start the Docker container and run the MNIST training inside it:

    sudo docker run -v "/home/ubuntu/yuntong/:/opt/caffe/share"  -it bvlc/caffe:intel /bin/bash
    cd /opt/caffe/share/caffe-1.0
    ./examples/mnist/train_lenet.sh

    The output looks like this:

    ubuntu@k8s-1:~$ sudo docker run -v "/home/ubuntu/yuntong/:/opt/caffe/share"  -it bvlc/caffe:intel /bin/bash
    root@19eaccc415e1:/workspace# cd /opt/caffe/share/caffe-1.0
    root@19eaccc415e1:/opt/caffe/share/caffe-1.0# ./examples/mnist/train_lenet.sh
    I0408 01:33:10.509523    12 caffe.cpp:285] Use CPU.
    I0408 01:33:10.510561    12 solver.cpp:107] Initializing solver from parameters:
    test_iter: 100
    test_interval: 500
    base_lr: 0.01
    display: 100
    max_iter: 10000
    lr_policy: "inv"
    gamma: 0.0001
    power: 0.75
    momentum: 0.9
    weight_decay: 0.0005
    snapshot: 5000
    snapshot_prefix: "examples/mnist/lenet"
    solver_mode: CPU
    net: "examples/mnist/lenet_train_test.prototxt"
    train_state {
      level: 0
      stage: ""
    }
    I0408 01:33:10.511216    12 solver.cpp:153] Creating training net from net file: examples/mnist/lenet_train_test.prototxt
    I0408 01:33:10.523326    12 cpu_info.cpp:453] Processor speed [MHz]: 0
    I0408 01:33:10.523360    12 cpu_info.cpp:456] Total number of sockets: 8
    I0408 01:33:10.523373    12 cpu_info.cpp:459] Total number of CPU cores: 8
    I0408 01:33:10.523385    12 cpu_info.cpp:462] Total number of processors: 8
    I0408 01:33:10.523396    12 cpu_info.cpp:465] GPU is used: no
    I0408 01:33:10.523406    12 cpu_info.cpp:468] OpenMP environmental variables are specified: no
    I0408 01:33:10.523427    12 cpu_info.cpp:471] OpenMP thread bind allowed: yes
    I0408 01:33:10.523437    12 cpu_info.cpp:474] Number of OpenMP threads: 8
    I0408 01:33:10.524194    12 net.cpp:1052] The NetState phase (0) differed from the phase (1) specified by a rule in layer mnist
    I0408 01:33:10.524220    12 net.cpp:1052] The NetState phase (0) differed from the phase (1) specified by a rule in layer accuracy
    I0408 01:33:10.524510    12 net.cpp:207] Initializing net from parameters:
    I0408 01:33:10.524531    12 net.cpp:208]
    name: "LeNet"
    state {
      phase: TRAIN
      level: 0
      stage: ""
    }
    engine: "MKLDNN"
    compile_net_state {
      bn_scale_remove: false
      bn_scale_merge: false
    }
    layer {
      name: "mnist"
      type: "Data"
      top: "data"
      top: "label"
      include {
        phase: TRAIN
      }
      transform_param {
        scale: 0.00390625
      }
      data_param {
        source: "examples/mnist/mnist_train_lmdb"
        batch_size: 64
        backend: LMDB
      }
    }
    layer {
      name: "conv1"
      type: "Convolution"
      bottom: "data"
      top: "conv1"
      param {
        lr_mult: 1
      }
      param {
        lr_mult: 2
      }
      convolution_param {
        num_output: 20
        kernel_size: 5
        stride: 1
        weight_filler {
          type: "xavier"
        }
        bias_filler {
          type: "constant"
        }
      }
    }
    layer {
      name: "pool1"
      type: "Pooling"
      bottom: "conv1"
      top: "pool1"
      pooling_param {
        pool: MAX
        kernel_size: 2
        stride: 2
      }
    }
    layer {
      name: "conv2"
      type: "Convolution"
      bottom: "pool1"
      top: "conv2"
      param {
        lr_mult: 1
      }
      param {
        lr_mult: 2
      }
      convolution_param {
        num_output: 50
        kernel_size: 5
        stride: 1
        weight_filler {
          type: "xavier"
        }
        bias_filler {
          type: "constant"
        }
      }
    }
    layer {
      name: "pool2"
      type: "Pooling"
      bottom: "conv2"
      top: "pool2"
      pooling_param {
        pool: MAX
        kernel_size: 2
        stride: 2
      }
    }
    layer {
      name: "ip1"
      type: "InnerProduct"
      bottom: "pool2"
      top: "ip1"
      param {
        lr_mult: 1
      }
      param {
        lr_mult: 2
      }
      inner_product_param {
        num_output: 500
        weight_filler {
          type: "xavier"
        }
        bias_filler {
          type: "constant"
        }
      }
    }
    layer {
      name: "relu1"
      type: "ReLU"
      bottom: "ip1"
      top: "ip1"
    }
    layer {
      name: "ip2"
      type: "InnerProduct"
      bottom: "ip1"
      top: "ip2"
      param {
        lr_mult: 1
      }
      param {
        lr_mult: 2
      }
      inner_product_param {
        num_output: 10
        weight_filler {
          type: "xavier"
        }
        bias_filler {
          type: "constant"
        }
      }
    }
    layer {
      name: "loss"
      type: "SoftmaxWithLoss"
      bottom: "ip2"
      bottom: "label"
      top: "loss"
    }
    …………………………………………….
    …………………………………………….
    …………………………………………….
    
    I0408 01:36:28.103435    12 solver.cpp:312] Iteration 7300, loss = 0.0219446
    I0408 01:36:28.103497    12 solver.cpp:333]     Train net output #0: loss = 0.0219446 (* 1 = 0.0219446 loss)
    I0408 01:36:28.103519    12 sgd_solver.cpp:215] Iteration 7300, lr = 0.00662927
    I0408 01:36:30.492499    12 solver.cpp:312] Iteration 7400, loss = 0.00484636
    I0408 01:36:30.492563    12 solver.cpp:333]     Train net output #0: loss = 0.00484634 (* 1 = 0.00484634 loss)
    I0408 01:36:30.492584    12 sgd_solver.cpp:215] Iteration 7400, lr = 0.00660067
    I0408 01:36:32.912159    12 solver.cpp:474] Iteration 7500, Testing net (#0)
    I0408 01:36:33.992708    12 solver.cpp:563]     Test net output #0: accuracy = 0.9905
    I0408 01:36:33.992983    12 solver.cpp:563]     Test net output #1: loss = 0.0301301 (* 1 = 0.0301301 loss)
    I0408 01:36:34.019621    12 solver.cpp:312] Iteration 7500, loss = 0.00250706
    I0408 01:36:34.019706    12 solver.cpp:333]     Train net output #0: loss = 0.00250702 (* 1 = 0.00250702 loss)
    I0408 01:36:34.020164    12 sgd_solver.cpp:215] Iteration 7500, lr = 0.00657236
    I0408 01:36:36.432328    12 solver.cpp:312] Iteration 7600, loss = 0.00537509
    I0408 01:36:36.432528    12 solver.cpp:333]     Train net output #0: loss = 0.00537505 (* 1 = 0.00537505 loss)
    I0408 01:36:36.432566    12 sgd_solver.cpp:215] Iteration 7600, lr = 0.00654433
    I0408 01:36:39.159704    12 solver.cpp:312] Iteration 7700, loss = 0.034624
    I0408 01:36:39.159781    12 solver.cpp:333]     Train net output #0: loss = 0.0346239 (* 1 = 0.0346239 loss)
    I0408 01:36:39.159811    12 sgd_solver.cpp:215] Iteration 7700, lr = 0.00651658
    I0408 01:36:41.873411    12 solver.cpp:312] Iteration 7800, loss = 0.00424178
    I0408 01:36:41.873672    12 solver.cpp:333]     Train net output #0: loss = 0.00424175 (* 1 = 0.00424175 loss)
    I0408 01:36:41.873694    12 sgd_solver.cpp:215] Iteration 7800, lr = 0.00648911
    I0408 01:36:44.552800    12 solver.cpp:312] Iteration 7900, loss = 0.00208136
    I0408 01:36:44.553073    12 solver.cpp:333]     Train net output #0: loss = 0.00208134 (* 1 = 0.00208134 loss)
    I0408 01:36:44.553095    12 sgd_solver.cpp:215] Iteration 7900, lr = 0.0064619
    I0408 01:36:47.132925    12 solver.cpp:474] Iteration 8000, Testing net (#0)
    I0408 01:36:48.254405    12 solver.cpp:563]     Test net output #0: accuracy = 0.9905
    I0408 01:36:48.254935    12 solver.cpp:563]     Test net output #1: loss = 0.0278543 (* 1 = 0.0278543 loss)
    I0408 01:36:48.279563    12 solver.cpp:312] Iteration 8000, loss = 0.0065576
    I0408 01:36:48.279626    12 solver.cpp:333]     Train net output #0: loss = 0.00655758 (* 1 = 0.00655758 loss)
    I0408 01:36:48.279647    12 sgd_solver.cpp:215] Iteration 8000, lr = 0.00643496
    I0408 01:36:50.693308    12 solver.cpp:312] Iteration 8100, loss = 0.0102435
    I0408 01:36:50.694417    12 solver.cpp:333]     Train net output #0: loss = 0.0102435 (* 1 = 0.0102435 loss)
    I0408 01:36:50.694447    12 sgd_solver.cpp:215] Iteration 8100, lr = 0.00640827
    I0408 01:36:53.059345    12 solver.cpp:312] Iteration 8200, loss = 0.0111062
    I0408 01:36:53.059619    12 solver.cpp:333]     Train net output #0: loss = 0.0111061 (* 1 = 0.0111061 loss)
    I0408 01:36:53.059643    12 sgd_solver.cpp:215] Iteration 8200, lr = 0.00638185
    I0408 01:36:55.439267    12 solver.cpp:312] Iteration 8300, loss = 0.0255548
    I0408 01:36:55.439332    12 solver.cpp:333]     Train net output #0: loss = 0.0255548 (* 1 = 0.0255548 loss)
    I0408 01:36:55.439357    12 sgd_solver.cpp:215] Iteration 8300, lr = 0.00635567
    I0408 01:36:57.821687    12 solver.cpp:312] Iteration 8400, loss = 0.00810484
    I0408 01:36:57.821768    12 solver.cpp:333]     Train net output #0: loss = 0.00810483 (* 1 = 0.00810483 loss)
    I0408 01:36:57.821794    12 sgd_solver.cpp:215] Iteration 8400, lr = 0.00632975
    I0408 01:37:00.229344    12 solver.cpp:474] Iteration 8500, Testing net (#0)
    I0408 01:37:01.341504    12 solver.cpp:563]     Test net output #0: accuracy = 0.991
    I0408 01:37:01.341583    12 solver.cpp:563]     Test net output #1: loss = 0.028333 (* 1 = 0.028333 loss)
    I0408 01:37:01.368783    12 solver.cpp:312] Iteration 8500, loss = 0.00672253
    I0408 01:37:01.368850    12 solver.cpp:333]     Train net output #0: loss = 0.00672251 (* 1 = 0.00672251 loss)
    I0408 01:37:01.368876    12 sgd_solver.cpp:215] Iteration 8500, lr = 0.00630407
    I0408 01:37:03.789499    12 solver.cpp:312] Iteration 8600, loss = 0.000701985
    I0408 01:37:03.789630    12 solver.cpp:333]     Train net output #0: loss = 0.000701961 (* 1 = 0.000701961 loss)
    I0408 01:37:03.789660    12 sgd_solver.cpp:215] Iteration 8600, lr = 0.00627864
    I0408 01:37:06.311506    12 solver.cpp:312] Iteration 8700, loss = 0.00329251
    I0408 01:37:06.311738    12 solver.cpp:333]     Train net output #0: loss = 0.00329248 (* 1 = 0.00329248 loss)
    I0408 01:37:06.311763    12 sgd_solver.cpp:215] Iteration 8700, lr = 0.00625344
    I0408 01:37:08.734477    12 solver.cpp:312] Iteration 8800, loss = 0.0011685
    I0408 01:37:08.734781    12 solver.cpp:333]     Train net output #0: loss = 0.00116848 (* 1 = 0.00116848 loss)
    I0408 01:37:08.734805    12 sgd_solver.cpp:215] Iteration 8800, lr = 0.00622847
    I0408 01:37:11.223204    12 solver.cpp:312] Iteration 8900, loss = 0.000881624
    I0408 01:37:11.223266    12 solver.cpp:333]     Train net output #0: loss = 0.000881607 (* 1 = 0.000881607 loss)
    I0408 01:37:11.223289    12 sgd_solver.cpp:215] Iteration 8900, lr = 0.00620374
    I0408 01:37:13.565495    12 solver.cpp:474] Iteration 9000, Testing net (#0)
    I0408 01:37:14.642087    12 solver.cpp:563]     Test net output #0: accuracy = 0.99
    I0408 01:37:14.642159    12 solver.cpp:563]     Test net output #1: loss = 0.0268256 (* 1 = 0.0268256 loss)
    I0408 01:37:14.666667    12 solver.cpp:312] Iteration 9000, loss = 0.011516
    I0408 01:37:14.666734    12 solver.cpp:333]     Train net output #0: loss = 0.011516 (* 1 = 0.011516 loss)
    I0408 01:37:14.666755    12 sgd_solver.cpp:215] Iteration 9000, lr = 0.00617924
    I0408 01:37:17.068984    12 solver.cpp:312] Iteration 9100, loss = 0.00914626
    I0408 01:37:17.069262    12 solver.cpp:333]     Train net output #0: loss = 0.00914625 (* 1 = 0.00914625 loss)
    I0408 01:37:17.069284    12 sgd_solver.cpp:215] Iteration 9100, lr = 0.00615496
    I0408 01:37:19.455351    12 solver.cpp:312] Iteration 9200, loss = 0.00317596
    I0408 01:37:19.455596    12 solver.cpp:333]     Train net output #0: loss = 0.00317595 (* 1 = 0.00317595 loss)
    I0408 01:37:19.455623    12 sgd_solver.cpp:215] Iteration 9200, lr = 0.0061309
    I0408 01:37:21.834389    12 solver.cpp:312] Iteration 9300, loss = 0.00890829
    I0408 01:37:21.835710    12 solver.cpp:333]     Train net output #0: loss = 0.00890827 (* 1 = 0.00890827 loss)
    I0408 01:37:21.835734    12 sgd_solver.cpp:215] Iteration 9300, lr = 0.00610706
    I0408 01:37:24.199872    12 solver.cpp:312] Iteration 9400, loss = 0.0232409
    I0408 01:37:24.199946    12 solver.cpp:333]     Train net output #0: loss = 0.0232409 (* 1 = 0.0232409 loss)
    I0408 01:37:24.199970    12 sgd_solver.cpp:215] Iteration 9400, lr = 0.00608343
    I0408 01:37:26.601363    12 solver.cpp:474] Iteration 9500, Testing net (#0)
    I0408 01:37:27.673274    12 solver.cpp:563]     Test net output #0: accuracy = 0.989
    I0408 01:37:27.673359    12 solver.cpp:563]     Test net output #1: loss = 0.0323742 (* 1 = 0.0323742 loss)
    I0408 01:37:27.698536    12 solver.cpp:312] Iteration 9500, loss = 0.00388906
    I0408 01:37:27.698603    12 solver.cpp:333]     Train net output #0: loss = 0.00388905 (* 1 = 0.00388905 loss)
    I0408 01:37:27.698628    12 sgd_solver.cpp:215] Iteration 9500, lr = 0.00606002
    I0408 01:37:30.146077    12 solver.cpp:312] Iteration 9600, loss = 0.00205984
    I0408 01:37:30.146361    12 solver.cpp:333]     Train net output #0: loss = 0.00205983 (* 1 = 0.00205983 loss)
    I0408 01:37:30.146386    12 sgd_solver.cpp:215] Iteration 9600, lr = 0.00603682
    I0408 01:37:32.567978    12 solver.cpp:312] Iteration 9700, loss = 0.00330913
    I0408 01:37:32.568212    12 solver.cpp:333]     Train net output #0: loss = 0.00330913 (* 1 = 0.00330913 loss)
    I0408 01:37:32.568235    12 sgd_solver.cpp:215] Iteration 9700, lr = 0.00601382
    I0408 01:37:34.955097    12 solver.cpp:312] Iteration 9800, loss = 0.0134696
    I0408 01:37:34.955363    12 solver.cpp:333]     Train net output #0: loss = 0.0134696 (* 1 = 0.0134696 loss)
    I0408 01:37:34.955386    12 sgd_solver.cpp:215] Iteration 9800, lr = 0.00599102
    I0408 01:37:37.377465    12 solver.cpp:312] Iteration 9900, loss = 0.00235391
    I0408 01:37:37.377655    12 solver.cpp:333]     Train net output #0: loss = 0.0023539 (* 1 = 0.0023539 loss)
    I0408 01:37:37.377678    12 sgd_solver.cpp:215] Iteration 9900, lr = 0.00596843
    I0408 01:37:39.850847    12 solver.cpp:707] Snapshot begin
    I0408 01:37:39.859346    12 solver.cpp:769] Snapshotting to binary proto file examples/mnist/lenet_iter_10000.caffemodel
    I0408 01:37:39.869576    12 sgd_solver.cpp:754] Snapshotting solver state to binary proto file examples/mnist/lenet_iter_10000.solverstate
    I0408 01:37:39.878753    12 solver.cpp:734] Snapshot end
    I0408 01:37:39.888120    12 solver.cpp:436] Iteration 10000, loss = 0.00251002
    I0408 01:37:39.888172    12 solver.cpp:474] Iteration 10000, Testing net (#0)
    I0408 01:37:41.067348    12 solver.cpp:563]     Test net output #0: accuracy = 0.9913
    I0408 01:37:41.067407    12 solver.cpp:563]     Test net output #1: loss = 0.0267652 (* 1 = 0.0267652 loss)
    I0408 01:37:41.067422    12 solver.cpp:443] Optimization Done.
    I0408 01:37:41.067432    12 caffe.cpp:345] Optimization Done.

    Elapsed time: 01:37:41.067432 − 01:33:10.509523 ≈ 4 min 31 s

    CPU and I/O utilization:

    All 8 CPUs run at essentially 100%.

    I/O is negligible: the MNIST dataset is only a few tens of MB, so the data is fully cached in memory.
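
    These utilization figures can be checked from a second shell while training runs (a sketch; the pgrep pattern is an assumption about the process command line):

    # Show per-thread CPU usage of the running training process
    top -H -p "$(pgrep -f 'caffe train' | head -n 1)"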

    8. Run with the MKL2017 engine

    ./examples/mnist/train_lenet.sh -engine "MKL2017"
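
    The script forwards its arguments via $@ (see step 5), so this is equivalent to invoking the binary directly:

    /opt/caffe/build/tools/caffe train --solver=examples/mnist/lenet_solver.prototxt -engine "MKL2017"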


    First run: 03:01:35.904659 − 02:58:30.774215 ≈ 3 min 5 s

    Second run: 03:05:15.134409 − 03:02:13.449990 ≈ 3 min 2 s

    For native Caffe:

    docker run -ti bvlc/caffe:cpu caffe --version

    sudo docker run -v "/home/ubuntu/yuntong/:/opt/caffe/share"  -it bvlc/caffe:cpu /bin/bash
    cd /opt/caffe/share/caffe-1.0
    ./examples/mnist/train_lenet.sh

    Run time: 24 minutes.


    Native Caffe runs on only a single thread.
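
    Intel Caffe, by contrast, parallelizes across cores with OpenMP (the MNIST log above reports 8 OpenMP threads). Thread count and placement can be tuned through standard OpenMP/Intel runtime variables; the values below are illustrative, not tuned settings:

    export OMP_NUM_THREADS=8                          # number of OpenMP worker threads
    export KMP_AFFINITY=granularity=fine,compact,1,0  # pin threads to cores (Intel runtime)
    ./examples/mnist/train_lenet.sh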

    Running CIFAR-10

    The dataset consists of 60,000 32×32 color images in 10 classes, with 6,000 images per class. 50,000 images are used for training, divided into 5 training batches of 10,000 images each; the remaining 10,000 form a single test batch. The test batch contains exactly 1,000 randomly selected images from each class; the remaining images are randomly arranged to form the training batches. Note that an individual training batch is not guaranteed to contain the same number of images from each class, but across all training batches each class has exactly 5,000 images.

    (Figure not reproduced: the 10 classes, each illustrated with 10 random sample images.)

    1. Download and create the CIFAR-10 dataset:

    cd $CAFFE_ROOT
    ./data/cifar10/get_cifar10.sh
    ./examples/cifar10/create_cifar10.sh
    Creating lmdb...

    2. Edit the training script to point at the Caffe binary path inside the Intel Caffe Docker image: ~/yuntong/caffe-1.0/examples$ vim cifar10/train_quick.sh

    TOOLS=/opt/caffe/build/tools
    

    3. Set CPU mode: set solver_mode: CPU in both solver files (the same edit as step 6 of the MNIST run):

    vim cifar10_quick_solver_lr1.prototxt  cifar10_quick_solver.prototxt
    

    4. Run inside the Intel Caffe container:

    sudo docker run -v "/home/ubuntu/yuntong/:/opt/caffe/share"  -it bvlc/caffe:intel /bin/bash
    cd /opt/caffe/share/caffe-1.0
    
    ./examples/cifar10/train_quick.sh 
    


    Training time: 07:20:47.795905 − 07:08:08.193487 ≈ 12 min 40 s

    5. Run inside the native Caffe container:

    sudo docker run -v "/home/ubuntu/yuntong/:/opt/caffe/share"  -it bvlc/caffe:cpu /bin/bash
    cd /opt/caffe/share/caffe-1.0
    
    ./examples/cifar10/train_quick.sh 
    
    07:26:23.522944
    ……………
    I0408 08:56:53.116524    18 solver.cpp:310] Iteration 5000, loss = 0.449847
    I0408 08:56:53.117141    18 solver.cpp:330] Iteration 5000, Testing net (#0)
    I0408 08:57:30.968313    21 data_layer.cpp:73] Restarting data prefetching from start.
    I0408 08:57:32.527096    18 solver.cpp:397]     Test net output #0: accuracy = 0.7561
    I0408 08:57:32.527354    18 solver.cpp:397]     Test net output #1: loss = 0.72683 (* 1 = 0.72683 loss)
    I0408 08:57:32.527364    18 solver.cpp:315] Optimization Done.
    I0408 08:57:32.527381    18 caffe.cpp:259] Optimization Done.
    

    Elapsed time: about 1.5 hours (07:26:23 to 08:57:32 ≈ 1 h 31 min).

    Machine configuration:

    CPU: 8 * Intel(R) Core(TM) i5-3427U CPU @ 1.80GHz
    
    Memory: 16G
    
    Storage: SATA SSD