  • Intel Caffe vs. Native Caffe

    1. First install Docker, then pull the Intel Caffe image:

    $ docker pull bvlc/caffe:intel
    Try running it:
    $ docker run -it bvlc/caffe:intel /bin/bash
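
    To confirm the caffe binary inside the container works, print its version (the same check is used for the native image later in this post):

    $ caffe --version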

    2. Pull the Intel Caffe source:

    git clone https://github.com/intel/caffe
    cd caffe
    git checkout 1.0
    

    Or download a source archive instead:

    wget https://github.com/intel/caffe/archive/1.0.zip
    unzip 1.0.zip
    

    3. Build Intel Caffe

    sudo apt-get -y install build-essential cmake python-dev python-numpy \
      libboost-all-dev libgflags-dev libgoogle-glog-dev \
      libprotobuf-dev protobuf-compiler libhdf5-serial-dev \
      liblmdb-dev libleveldb-dev libsnappy-dev libopencv-dev
    
    cp Makefile.config.example Makefile.config
    # Adjust Makefile.config (for example, if using Anaconda Python, or if cuDNN is desired)
    

     vim Makefile.config

    # Intel(r) Machine Learning Scaling Library (uncomment to build with MLSL)
    USE_MLSL := 1
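
    Since everything in this walkthrough runs on the CPU, it may also help to uncomment the CPU-only switch in the same Makefile.config (a standard option in Makefile.config.example; verify against your copy):

    # CPU-only switch (uncomment to build without GPU support)
    CPU_ONLY := 1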

    Build with multiple threads:

    $ make -j <number_of_physical_cores> -k
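
    If you are unsure of the physical core count, lscpu reports it; alternatively, using all logical cores is a reasonable default (hyper-threaded siblings add little for this workload):

    $ make -j"$(nproc)" -k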

    The build downloads MKL and MKL-DNN automatically:

    Download mklml_lnx_2018.0.1.20171227.tgz
    git clone --no-checkout https://github.com/01org/mkl-dnn.git /home/ubuntu/yuntong/caffe-master/external/mkldnn/tmp
    

    Test the build:

    make test
    make runtest

    4. Download and create the MNIST dataset:

    cd $CAFFE_ROOT
    ./data/mnist/get_mnist.sh
    ./examples/mnist/create_mnist.sh
    Creating lmdb...
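
    If the scripts succeed, the LMDB directories referenced by the net definition (see the data_param source in the log below) should now exist:

    $ ls examples/mnist/
    # expect mnist_train_lmdb/ and mnist_test_lmdb/ among the entries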

    5. Point the training script at the caffe binary path inside the Intel Caffe docker image:

    vim /home/ubuntu/yuntong/caffe-1.0/examples/mnist/train_lenet.sh

    #!/usr/bin/env sh
    set -e
    #./build/tools/caffe train --solver=examples/mnist/lenet_solver.prototxt $@
    /opt/caffe/build/tools/caffe train --solver=examples/mnist/lenet_solver.prototxt $@

    6. Set CPU mode: vim examples/mnist/lenet_solver.prototxt

    # solver mode: CPU or GPU
    #solver_mode: GPU
    solver_mode: CPU

    7. Run docker and start the MNIST training inside it:

    sudo docker run -v "/home/ubuntu/yuntong/:/opt/caffe/share"  -it bvlc/caffe:intel /bin/bash
    cd /opt/caffe/share/caffe-1.0
    ./examples/mnist/train_lenet.sh

    The output looks like this:

    ubuntu@k8s-1:~$ sudo docker run -v "/home/ubuntu/yuntong/:/opt/caffe/share"  -it bvlc/caffe:intel /bin/bash
    root@19eaccc415e1:/workspace# cd /opt/caffe/share/caffe-1.0
    root@19eaccc415e1:/opt/caffe/share/caffe-1.0# ./examples/mnist/train_lenet.sh
    I0408 01:33:10.509523    12 caffe.cpp:285] Use CPU.
    I0408 01:33:10.510561    12 solver.cpp:107] Initializing solver from parameters:
    test_iter: 100
    test_interval: 500
    base_lr: 0.01
    display: 100
    max_iter: 10000
    lr_policy: "inv"
    gamma: 0.0001
    power: 0.75
    momentum: 0.9
    weight_decay: 0.0005
    snapshot: 5000
    snapshot_prefix: "examples/mnist/lenet"
    solver_mode: CPU
    net: "examples/mnist/lenet_train_test.prototxt"
    train_state {
      level: 0
      stage: ""
    }
    I0408 01:33:10.511216    12 solver.cpp:153] Creating training net from net file: examples/mnist/lenet_train_test.prototxt
    I0408 01:33:10.523326    12 cpu_info.cpp:453] Processor speed [MHz]: 0
    I0408 01:33:10.523360    12 cpu_info.cpp:456] Total number of sockets: 8
    I0408 01:33:10.523373    12 cpu_info.cpp:459] Total number of CPU cores: 8
    I0408 01:33:10.523385    12 cpu_info.cpp:462] Total number of processors: 8
    I0408 01:33:10.523396    12 cpu_info.cpp:465] GPU is used: no
    I0408 01:33:10.523406    12 cpu_info.cpp:468] OpenMP environmental variables are specified: no
    I0408 01:33:10.523427    12 cpu_info.cpp:471] OpenMP thread bind allowed: yes
    I0408 01:33:10.523437    12 cpu_info.cpp:474] Number of OpenMP threads: 8
    I0408 01:33:10.524194    12 net.cpp:1052] The NetState phase (0) differed from the phase (1) specified by a rule in layer mnist
    I0408 01:33:10.524220    12 net.cpp:1052] The NetState phase (0) differed from the phase (1) specified by a rule in layer accuracy
    I0408 01:33:10.524510    12 net.cpp:207] Initializing net from parameters:
    I0408 01:33:10.524531    12 net.cpp:208]
    name: "LeNet"
    state {
      phase: TRAIN
      level: 0
      stage: ""
    }
    engine: "MKLDNN"
    compile_net_state {
      bn_scale_remove: false
      bn_scale_merge: false
    }
    layer {
      name: "mnist"
      type: "Data"
      top: "data"
      top: "label"
      include {
        phase: TRAIN
      }
      transform_param {
        scale: 0.00390625
      }
      data_param {
        source: "examples/mnist/mnist_train_lmdb"
        batch_size: 64
        backend: LMDB
      }
    }
    layer {
      name: "conv1"
      type: "Convolution"
      bottom: "data"
      top: "conv1"
      param {
        lr_mult: 1
      }
      param {
        lr_mult: 2
      }
      convolution_param {
        num_output: 20
        kernel_size: 5
        stride: 1
        weight_filler {
          type: "xavier"
        }
        bias_filler {
          type: "constant"
        }
      }
    }
    layer {
      name: "pool1"
      type: "Pooling"
      bottom: "conv1"
      top: "pool1"
      pooling_param {
        pool: MAX
        kernel_size: 2
        stride: 2
      }
    }
    layer {
      name: "conv2"
      type: "Convolution"
      bottom: "pool1"
      top: "conv2"
      param {
        lr_mult: 1
      }
      param {
        lr_mult: 2
      }
      convolution_param {
        num_output: 50
        kernel_size: 5
        stride: 1
        weight_filler {
          type: "xavier"
        }
        bias_filler {
          type: "constant"
        }
      }
    }
    layer {
      name: "pool2"
      type: "Pooling"
      bottom: "conv2"
      top: "pool2"
      pooling_param {
        pool: MAX
        kernel_size: 2
        stride: 2
      }
    }
    layer {
      name: "ip1"
      type: "InnerProduct"
      bottom: "pool2"
      top: "ip1"
      param {
        lr_mult: 1
      }
      param {
        lr_mult: 2
      }
      inner_product_param {
        num_output: 500
        weight_filler {
          type: "xavier"
        }
        bias_filler {
          type: "constant"
        }
      }
    }
    layer {
      name: "relu1"
      type: "ReLU"
      bottom: "ip1"
      top: "ip1"
    }
    layer {
      name: "ip2"
      type: "InnerProduct"
      bottom: "ip1"
      top: "ip2"
      param {
        lr_mult: 1
      }
      param {
        lr_mult: 2
      }
      inner_product_param {
        num_output: 10
        weight_filler {
          type: "xavier"
        }
        bias_filler {
          type: "constant"
        }
      }
    }
    layer {
      name: "loss"
      type: "SoftmaxWithLoss"
      bottom: "ip2"
      bottom: "label"
      top: "loss"
    }
    ... (log truncated) ...
    
    I0408 01:36:28.103435    12 solver.cpp:312] Iteration 7300, loss = 0.0219446
    I0408 01:36:28.103497    12 solver.cpp:333]     Train net output #0: loss = 0.0219446 (* 1 = 0.0219446 loss)
    I0408 01:36:28.103519    12 sgd_solver.cpp:215] Iteration 7300, lr = 0.00662927
    I0408 01:36:30.492499    12 solver.cpp:312] Iteration 7400, loss = 0.00484636
    I0408 01:36:30.492563    12 solver.cpp:333]     Train net output #0: loss = 0.00484634 (* 1 = 0.00484634 loss)
    I0408 01:36:30.492584    12 sgd_solver.cpp:215] Iteration 7400, lr = 0.00660067
    I0408 01:36:32.912159    12 solver.cpp:474] Iteration 7500, Testing net (#0)
    I0408 01:36:33.992708    12 solver.cpp:563]     Test net output #0: accuracy = 0.9905
    I0408 01:36:33.992983    12 solver.cpp:563]     Test net output #1: loss = 0.0301301 (* 1 = 0.0301301 loss)
    I0408 01:36:34.019621    12 solver.cpp:312] Iteration 7500, loss = 0.00250706
    I0408 01:36:34.019706    12 solver.cpp:333]     Train net output #0: loss = 0.00250702 (* 1 = 0.00250702 loss)
    I0408 01:36:34.020164    12 sgd_solver.cpp:215] Iteration 7500, lr = 0.00657236
    I0408 01:36:36.432328    12 solver.cpp:312] Iteration 7600, loss = 0.00537509
    I0408 01:36:36.432528    12 solver.cpp:333]     Train net output #0: loss = 0.00537505 (* 1 = 0.00537505 loss)
    I0408 01:36:36.432566    12 sgd_solver.cpp:215] Iteration 7600, lr = 0.00654433
    I0408 01:36:39.159704    12 solver.cpp:312] Iteration 7700, loss = 0.034624
    I0408 01:36:39.159781    12 solver.cpp:333]     Train net output #0: loss = 0.0346239 (* 1 = 0.0346239 loss)
    I0408 01:36:39.159811    12 sgd_solver.cpp:215] Iteration 7700, lr = 0.00651658
    I0408 01:36:41.873411    12 solver.cpp:312] Iteration 7800, loss = 0.00424178
    I0408 01:36:41.873672    12 solver.cpp:333]     Train net output #0: loss = 0.00424175 (* 1 = 0.00424175 loss)
    I0408 01:36:41.873694    12 sgd_solver.cpp:215] Iteration 7800, lr = 0.00648911
    I0408 01:36:44.552800    12 solver.cpp:312] Iteration 7900, loss = 0.00208136
    I0408 01:36:44.553073    12 solver.cpp:333]     Train net output #0: loss = 0.00208134 (* 1 = 0.00208134 loss)
    I0408 01:36:44.553095    12 sgd_solver.cpp:215] Iteration 7900, lr = 0.0064619
    I0408 01:36:47.132925    12 solver.cpp:474] Iteration 8000, Testing net (#0)
    I0408 01:36:48.254405    12 solver.cpp:563]     Test net output #0: accuracy = 0.9905
    I0408 01:36:48.254935    12 solver.cpp:563]     Test net output #1: loss = 0.0278543 (* 1 = 0.0278543 loss)
    I0408 01:36:48.279563    12 solver.cpp:312] Iteration 8000, loss = 0.0065576
    I0408 01:36:48.279626    12 solver.cpp:333]     Train net output #0: loss = 0.00655758 (* 1 = 0.00655758 loss)
    I0408 01:36:48.279647    12 sgd_solver.cpp:215] Iteration 8000, lr = 0.00643496
    I0408 01:36:50.693308    12 solver.cpp:312] Iteration 8100, loss = 0.0102435
    I0408 01:36:50.694417    12 solver.cpp:333]     Train net output #0: loss = 0.0102435 (* 1 = 0.0102435 loss)
    I0408 01:36:50.694447    12 sgd_solver.cpp:215] Iteration 8100, lr = 0.00640827
    I0408 01:36:53.059345    12 solver.cpp:312] Iteration 8200, loss = 0.0111062
    I0408 01:36:53.059619    12 solver.cpp:333]     Train net output #0: loss = 0.0111061 (* 1 = 0.0111061 loss)
    I0408 01:36:53.059643    12 sgd_solver.cpp:215] Iteration 8200, lr = 0.00638185
    I0408 01:36:55.439267    12 solver.cpp:312] Iteration 8300, loss = 0.0255548
    I0408 01:36:55.439332    12 solver.cpp:333]     Train net output #0: loss = 0.0255548 (* 1 = 0.0255548 loss)
    I0408 01:36:55.439357    12 sgd_solver.cpp:215] Iteration 8300, lr = 0.00635567
    I0408 01:36:57.821687    12 solver.cpp:312] Iteration 8400, loss = 0.00810484
    I0408 01:36:57.821768    12 solver.cpp:333]     Train net output #0: loss = 0.00810483 (* 1 = 0.00810483 loss)
    I0408 01:36:57.821794    12 sgd_solver.cpp:215] Iteration 8400, lr = 0.00632975
    I0408 01:37:00.229344    12 solver.cpp:474] Iteration 8500, Testing net (#0)
    I0408 01:37:01.341504    12 solver.cpp:563]     Test net output #0: accuracy = 0.991
    I0408 01:37:01.341583    12 solver.cpp:563]     Test net output #1: loss = 0.028333 (* 1 = 0.028333 loss)
    I0408 01:37:01.368783    12 solver.cpp:312] Iteration 8500, loss = 0.00672253
    I0408 01:37:01.368850    12 solver.cpp:333]     Train net output #0: loss = 0.00672251 (* 1 = 0.00672251 loss)
    I0408 01:37:01.368876    12 sgd_solver.cpp:215] Iteration 8500, lr = 0.00630407
    I0408 01:37:03.789499    12 solver.cpp:312] Iteration 8600, loss = 0.000701985
    I0408 01:37:03.789630    12 solver.cpp:333]     Train net output #0: loss = 0.000701961 (* 1 = 0.000701961 loss)
    I0408 01:37:03.789660    12 sgd_solver.cpp:215] Iteration 8600, lr = 0.00627864
    I0408 01:37:06.311506    12 solver.cpp:312] Iteration 8700, loss = 0.00329251
    I0408 01:37:06.311738    12 solver.cpp:333]     Train net output #0: loss = 0.00329248 (* 1 = 0.00329248 loss)
    I0408 01:37:06.311763    12 sgd_solver.cpp:215] Iteration 8700, lr = 0.00625344
    I0408 01:37:08.734477    12 solver.cpp:312] Iteration 8800, loss = 0.0011685
    I0408 01:37:08.734781    12 solver.cpp:333]     Train net output #0: loss = 0.00116848 (* 1 = 0.00116848 loss)
    I0408 01:37:08.734805    12 sgd_solver.cpp:215] Iteration 8800, lr = 0.00622847
    I0408 01:37:11.223204    12 solver.cpp:312] Iteration 8900, loss = 0.000881624
    I0408 01:37:11.223266    12 solver.cpp:333]     Train net output #0: loss = 0.000881607 (* 1 = 0.000881607 loss)
    I0408 01:37:11.223289    12 sgd_solver.cpp:215] Iteration 8900, lr = 0.00620374
    I0408 01:37:13.565495    12 solver.cpp:474] Iteration 9000, Testing net (#0)
    I0408 01:37:14.642087    12 solver.cpp:563]     Test net output #0: accuracy = 0.99
    I0408 01:37:14.642159    12 solver.cpp:563]     Test net output #1: loss = 0.0268256 (* 1 = 0.0268256 loss)
    I0408 01:37:14.666667    12 solver.cpp:312] Iteration 9000, loss = 0.011516
    I0408 01:37:14.666734    12 solver.cpp:333]     Train net output #0: loss = 0.011516 (* 1 = 0.011516 loss)
    I0408 01:37:14.666755    12 sgd_solver.cpp:215] Iteration 9000, lr = 0.00617924
    I0408 01:37:17.068984    12 solver.cpp:312] Iteration 9100, loss = 0.00914626
    I0408 01:37:17.069262    12 solver.cpp:333]     Train net output #0: loss = 0.00914625 (* 1 = 0.00914625 loss)
    I0408 01:37:17.069284    12 sgd_solver.cpp:215] Iteration 9100, lr = 0.00615496
    I0408 01:37:19.455351    12 solver.cpp:312] Iteration 9200, loss = 0.00317596
    I0408 01:37:19.455596    12 solver.cpp:333]     Train net output #0: loss = 0.00317595 (* 1 = 0.00317595 loss)
    I0408 01:37:19.455623    12 sgd_solver.cpp:215] Iteration 9200, lr = 0.0061309
    I0408 01:37:21.834389    12 solver.cpp:312] Iteration 9300, loss = 0.00890829
    I0408 01:37:21.835710    12 solver.cpp:333]     Train net output #0: loss = 0.00890827 (* 1 = 0.00890827 loss)
    I0408 01:37:21.835734    12 sgd_solver.cpp:215] Iteration 9300, lr = 0.00610706
    I0408 01:37:24.199872    12 solver.cpp:312] Iteration 9400, loss = 0.0232409
    I0408 01:37:24.199946    12 solver.cpp:333]     Train net output #0: loss = 0.0232409 (* 1 = 0.0232409 loss)
    I0408 01:37:24.199970    12 sgd_solver.cpp:215] Iteration 9400, lr = 0.00608343
    I0408 01:37:26.601363    12 solver.cpp:474] Iteration 9500, Testing net (#0)
    I0408 01:37:27.673274    12 solver.cpp:563]     Test net output #0: accuracy = 0.989
    I0408 01:37:27.673359    12 solver.cpp:563]     Test net output #1: loss = 0.0323742 (* 1 = 0.0323742 loss)
    I0408 01:37:27.698536    12 solver.cpp:312] Iteration 9500, loss = 0.00388906
    I0408 01:37:27.698603    12 solver.cpp:333]     Train net output #0: loss = 0.00388905 (* 1 = 0.00388905 loss)
    I0408 01:37:27.698628    12 sgd_solver.cpp:215] Iteration 9500, lr = 0.00606002
    I0408 01:37:30.146077    12 solver.cpp:312] Iteration 9600, loss = 0.00205984
    I0408 01:37:30.146361    12 solver.cpp:333]     Train net output #0: loss = 0.00205983 (* 1 = 0.00205983 loss)
    I0408 01:37:30.146386    12 sgd_solver.cpp:215] Iteration 9600, lr = 0.00603682
    I0408 01:37:32.567978    12 solver.cpp:312] Iteration 9700, loss = 0.00330913
    I0408 01:37:32.568212    12 solver.cpp:333]     Train net output #0: loss = 0.00330913 (* 1 = 0.00330913 loss)
    I0408 01:37:32.568235    12 sgd_solver.cpp:215] Iteration 9700, lr = 0.00601382
    I0408 01:37:34.955097    12 solver.cpp:312] Iteration 9800, loss = 0.0134696
    I0408 01:37:34.955363    12 solver.cpp:333]     Train net output #0: loss = 0.0134696 (* 1 = 0.0134696 loss)
    I0408 01:37:34.955386    12 sgd_solver.cpp:215] Iteration 9800, lr = 0.00599102
    I0408 01:37:37.377465    12 solver.cpp:312] Iteration 9900, loss = 0.00235391
    I0408 01:37:37.377655    12 solver.cpp:333]     Train net output #0: loss = 0.0023539 (* 1 = 0.0023539 loss)
    I0408 01:37:37.377678    12 sgd_solver.cpp:215] Iteration 9900, lr = 0.00596843
    I0408 01:37:39.850847    12 solver.cpp:707] Snapshot begin
    I0408 01:37:39.859346    12 solver.cpp:769] Snapshotting to binary proto file examples/mnist/lenet_iter_10000.caffemodel
    I0408 01:37:39.869576    12 sgd_solver.cpp:754] Snapshotting solver state to binary proto file examples/mnist/lenet_iter_10000.solverstate
    I0408 01:37:39.878753    12 solver.cpp:734] Snapshot end
    I0408 01:37:39.888120    12 solver.cpp:436] Iteration 10000, loss = 0.00251002
    I0408 01:37:39.888172    12 solver.cpp:474] Iteration 10000, Testing net (#0)
    I0408 01:37:41.067348    12 solver.cpp:563]     Test net output #0: accuracy = 0.9913
    I0408 01:37:41.067407    12 solver.cpp:563]     Test net output #1: loss = 0.0267652 (* 1 = 0.0267652 loss)
    I0408 01:37:41.067422    12 solver.cpp:443] Optimization Done.
    I0408 01:37:41.067432    12 caffe.cpp:345] Optimization Done.

    Elapsed time: 01:37:41.067432 - 01:33:10.509523 ≈ 4 min 31 s

    CPU and I/O utilization:

    All 8 CPUs stay at essentially 100%.

    I/O is negligible: the MNIST dataset is only a few tens of MB, so it is entirely cached in memory.
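
    One simple way to confirm this on the host while training runs is to watch per-core CPU usage and per-device I/O (iostat assumes the sysstat package is installed):

    $ top            # press 1 to show per-core CPU usage
    $ iostat -x 1    # per-device I/O utilization, refreshed every second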

    8. With the MKL2017 engine

    ./examples/mnist/train_lenet.sh -engine "MKL2017"
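
    The -engine flag is forwarded through the script's $@ to the caffe binary; the default engine in this build is MKLDNN, as the engine: "MKLDNN" line in the log above shows. To make the default run explicit you could equally write:

    ./examples/mnist/train_lenet.sh -engine "MKLDNN"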

     

    First run: 03:01:35.904659 - 02:58:30.774215 ≈ 3 min 5 s

    Second run: 03:05:15.134409 - 03:02:13.449990 ≈ 3 min 2 s

    For native Caffe:

    docker run -ti bvlc/caffe:cpu caffe --version
    
    sudo docker run -v "/home/ubuntu/yuntong/:/opt/caffe/share"  -it bvlc/caffe:cpu /bin/bash
    cd /opt/caffe/share/caffe-1.0
    ./examples/mnist/train_lenet.sh

    Run time: 24 minutes.

     

    Native Caffe runs on only one thread.
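
    That is largely because the stock CPU build links a BLAS that defaults to a single thread. If the image uses OpenBLAS (an assumption worth checking inside the container), you can let BLAS use more threads via an environment variable, though this mainly accelerates the GEMM-heavy layers:

    sudo docker run -e OPENBLAS_NUM_THREADS=8 -v "/home/ubuntu/yuntong/:/opt/caffe/share" -it bvlc/caffe:cpu /bin/bash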

    Running CIFAR-10

    The dataset contains 60,000 32×32 color images in 10 classes, 6,000 images per class. 50,000 images are used for training, split into 5 training batches of 10,000 each; the remaining 10,000 form a single test batch. The test batch contains exactly 1,000 randomly selected images from each class; the rest are shuffled into the training batches. An individual training batch may therefore hold unequal numbers of images per class, but across all training batches each class has 5,000 images.

    (Figure omitted: the 10 classes, each illustrated with 10 random images.)

    1. Download and create the CIFAR-10 dataset:

    cd $CAFFE_ROOT
    ./data/cifar10/get_cifar10.sh
    ./examples/cifar10/create_cifar10.sh
    Creating lmdb...

    2. Point the training script at the caffe path inside the Intel Caffe docker image: ~/yuntong/caffe-1.0/examples$ vim cifar10/train_quick.sh

    TOOLS=/opt/caffe/build/tools
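
    In the stock train_quick.sh this variable is then used to invoke the binary, roughly as follows (paraphrased from the upstream script; check your copy):

    TOOLS=/opt/caffe/build/tools
    $TOOLS/caffe train --solver=examples/cifar10/cifar10_quick_solver.prototxt $@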
    

    3. Set CPU mode (set solver_mode: CPU in both solver files):

    vim cifar10_quick_solver_lr1.prototxt  cifar10_quick_solver.prototxt
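
    Instead of editing both files by hand, a sed one-liner performs the same substitution (a convenience sketch; check the files afterwards):

    sed -i 's/^solver_mode: GPU/solver_mode: CPU/' \
      examples/cifar10/cifar10_quick_solver.prototxt \
      examples/cifar10/cifar10_quick_solver_lr1.prototxt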
    

    4. Run inside Intel Caffe:

    sudo docker run -v "/home/ubuntu/yuntong/:/opt/caffe/share"  -it bvlc/caffe:intel /bin/bash
    cd /opt/caffe/share/caffe-1.0
    
    ./examples/cifar10/train_quick.sh 
    

     

    Training time: 07:20:47.795905 - 07:08:08.193487 ≈ 12 min 40 s

    5. Run inside native Caffe:

    sudo docker run -v "/home/ubuntu/yuntong/:/opt/caffe/share"  -it bvlc/caffe:cpu /bin/bash
    cd /opt/caffe/share/caffe-1.0
    
    ./examples/cifar10/train_quick.sh 
    
    07:26:23.522944
    ... (log truncated) ...
    I0408 08:56:53.116524    18 solver.cpp:310] Iteration 5000, loss = 0.449847
    I0408 08:56:53.117141    18 solver.cpp:330] Iteration 5000, Testing net (#0)
    I0408 08:57:30.968313    21 data_layer.cpp:73] Restarting data prefetching from start.
    I0408 08:57:32.527096    18 solver.cpp:397]     Test net output #0: accuracy = 0.7561
    I0408 08:57:32.527354    18 solver.cpp:397]     Test net output #1: loss = 0.72683 (* 1 = 0.72683 loss)
    I0408 08:57:32.527364    18 solver.cpp:315] Optimization Done.
    I0408 08:57:32.527381    18 caffe.cpp:259] Optimization Done.
    

    Elapsed: 08:57:32 - 07:26:23 ≈ 1 h 31 min, i.e. about 1.5 hours.

    Machine configuration:

    CPU: 8 * Intel(R) Core(TM) i5-3427U CPU @ 1.80GHz
    
    Memory: 16G
    
    Storage: SATA SSD
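
    Putting the timings above side by side:

    Workload    Intel Caffe (MKLDNN)    Intel Caffe (MKL2017)    Native Caffe (CPU)
    MNIST       ~4 min 31 s             ~3 min                   ~24 min
    CIFAR-10    ~12 min 40 s            (not measured)           ~1.5 h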