  • Tutorial: Building the Distributed Parallel Version of Caffe (Open MPI)

    Caffe version: https://github.com/yjxiong/caffe

    Environment:

        CentOS release 6.6 (Final)
        CUDA 8.0
        cuDNN 6.0
        Open MPI 3.1.3
        OpenCV 3.1.0

    CUDA 8.0, cuDNN 6.0, OpenCV 3.1.0, and Caffe's other dependencies are already installed, so only Open MPI 3.1.3 needs to be installed here. The steps are as follows:

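    Installing Open MPI 3.1.3

    1. Extract openmpi-3.1.3, change into the extracted openmpi-3.1.3 directory, and run the following commands in a terminal:

        ./configure --prefix=/storage/student5/usr/local/openmpi --with-cuda --enable-mpi-thread-multiple
        # The path after --prefix is the Open MPI installation path
        sudo make all install
        # Run make all install with sudo; otherwise the installation may run into problems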

    2. Test whether the installation succeeded:

        cd openmpi-3.1.3/examples
        make
        mpirun -np 4 hello_c
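
    The test above is only meaningful if mpirun resolves to the freshly installed Open MPI rather than to a system copy. The following is a minimal sketch of that check, assuming the install prefix from the configure step. OMPI_PREFIX and in_prefix are hypothetical names introduced here for illustration; they are not part of Open MPI.

```shell
# Assumed install prefix (taken from the --prefix used in the configure step).
OMPI_PREFIX="${OMPI_PREFIX:-/storage/student5/usr/local/openmpi}"

# Hypothetical helper: succeed if path $1 lies inside directory $2.
in_prefix() {
    case "$1" in
        "$2"/*) return 0 ;;
        *)      return 1 ;;
    esac
}

# Put the new installation first on the search paths for this shell session.
export PATH="$OMPI_PREFIX/bin:$PATH"
export LD_LIBRARY_PATH="$OMPI_PREFIX/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"

# Report which mpirun the shell would actually run.
mpirun_path="$(command -v mpirun || true)"
if in_prefix "$mpirun_path" "$OMPI_PREFIX"; then
    echo "mpirun resolves to $mpirun_path"
else
    echo "mpirun not found under $OMPI_PREFIX (got: ${mpirun_path:-none})"
fi
```

    If the second message appears, hello_c would be launched by a different MPI installation (or none at all), so the result of the test says little about the new build.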

    Installing Caffe

    1. Download Caffe, save a copy of Makefile.config.example as Makefile.config, and edit it to look like this:

        ## Refer to http://caffe.berkeleyvision.org/installation.html
        # Contributions simplifying and improving our build system are welcome!

        # cuDNN acceleration switch (uncomment to build with cuDNN).
        USE_CUDNN := 1

        # CPU-only switch (uncomment to build without GPU support).
        # CPU_ONLY := 1

        # uncomment to disable IO dependencies and corresponding data layers
        USE_OPENCV := 1
        USE_LEVELDB := 1
        USE_LMDB := 1

        # Uncomment if you're using OpenCV 3
        OPENCV_VERSION := 3

        # To customize your choice of compiler, uncomment and set the following.
        # N.B. the default for Linux is g++ and the default for OSX is clang++
        # CUSTOM_CXX := g++

        # CUDA directory contains bin/ and lib/ directories that we need.
        CUDA_DIR := /usr/local/cuda
        # On Ubuntu 14.04, if cuda tools are installed via
        # "sudo apt-get install nvidia-cuda-toolkit" then use this instead:
        # CUDA_DIR := /usr

        # CUDA architecture setting: going with all of them.
        # For CUDA < 6.0, comment the *_50 lines for compatibility.
        CUDA_ARCH := -gencode arch=compute_30,code=sm_30 \
                -gencode arch=compute_35,code=sm_35 \
                -gencode arch=compute_50,code=sm_50 \
                -gencode arch=compute_50,code=compute_50

        # BLAS choice:
        # atlas for ATLAS (default)
        # mkl for MKL
        # open for OpenBlas
        BLAS := atlas
        # Custom (MKL/ATLAS/OpenBLAS) include and lib directories.
        # Leave commented to accept the defaults for your choice of BLAS
        # (which should work)!
        BLAS_INCLUDE := /usr/include
        BLAS_LIB := /usr/lib64/atlas

        # Homebrew puts openblas in a directory that is not on the standard search path
        # BLAS_INCLUDE := $(shell brew --prefix openblas)/include
        # BLAS_LIB := $(shell brew --prefix openblas)/lib

        # This is required only if you will compile the matlab interface.
        # MATLAB directory should contain the mex binary in /bin.
        MATLAB_DIR := /usr/local/MATLAB/R2014a
        # MATLAB_DIR := /Applications/MATLAB_R2012b.app

        # NOTE: this is required only if you will compile the python interface.
        # We need to be able to find Python.h and numpy/arrayobject.h.
        PYTHON_INCLUDE := /usr/include/python2.7 \
                /usr/lib/python2.7/dist-packages/numpy/core/include
        # Anaconda Python distribution is quite popular. Include path:
        # Verify anaconda location, sometimes it's in root.
        # ANACONDA_HOME := $(HOME)/anaconda
        # PYTHON_INCLUDE := $(ANACONDA_HOME)/include \
                # $(ANACONDA_HOME)/include/python2.7 \
                # $(ANACONDA_HOME)/lib/python2.7/site-packages/numpy/core/include \

        # We need to be able to find libpythonX.X.so or .dylib.
        PYTHON_LIB := /usr/lib
        # PYTHON_LIB := $(ANACONDA_HOME)/lib

        # Homebrew installs numpy in a non standard path (keg only)
        # PYTHON_INCLUDE += $(dir $(shell python -c 'import numpy.core; print(numpy.core.__file__)'))/include
        # PYTHON_LIB += $(shell brew --prefix numpy)/lib

        # Uncomment to support layers written in Python (will link against Python libs)
        WITH_PYTHON_LAYER := 1

        # Whatever else you find you need goes here.
        INCLUDE_DIRS := $(PYTHON_INCLUDE) /usr/local/include
        LIBRARY_DIRS := $(PYTHON_LIB) /usr/local/lib /usr/lib

        # If Homebrew is installed at a non standard location (for example your home directory) and you use it for general dependencies
        # INCLUDE_DIRS += $(shell brew --prefix)/include
        # LIBRARY_DIRS += $(shell brew --prefix)/lib

        # Uncomment to use `pkg-config` to specify OpenCV library paths.
        # (Usually not necessary -- OpenCV libraries are normally installed in one of the above $LIBRARY_DIRS.)
        # USE_PKG_CONFIG := 1

        BUILD_DIR := build
        DISTRIBUTE_DIR := distribute

        # Uncomment for debugging. Does not work on OSX due to https://github.com/BVLC/caffe/issues/171
        # DEBUG := 1

        # The ID of the GPU that 'make runtest' will use to run unit tests.
        TEST_GPUID := 0

        # enable pretty build (comment to see full commands)
        Q ?= @

    2. In the Caffe directory, run:

        mkdir build && cd build

    3. Build Caffe (configure with CMake)

      To enable the MATLAB interface, first edit line 24 of CMakeLists.txt in the Caffe root directory:

        caffe_option(BUILD_matlab "Build Matlab wrapper" OFF IF UNIX OR APPLE)

      Change it to:

        caffe_option(BUILD_matlab "Build Matlab wrapper" ON IF UNIX OR APPLE)

      Then (or right away, if the MATLAB interface is not needed) run the following in caffe/build:

        cmake -DUSE_MPI=ON -DMPI_CXX_COMPILER=/path/to/your/openmpi/bin/mpicxx ..
        # USE_MPI=ON enables Open MPI support
        # The path after -DMPI_CXX_COMPILER must be the mpicxx in the bin directory of your Open MPI installation. /usr/bin also contains an mpicxx, so take care not to point at the wrong one
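
    Because a wrong mpicxx path is easy to type, it can help to validate the path before handing it to cmake. The sketch below only prints the command so it can be reviewed first. mpi_cmake_cmd is a hypothetical helper name, and the default prefix is assumed to be the one used when installing Open MPI.

```shell
# Hypothetical helper: print the cmake command for an MPI-enabled build,
# refusing to continue if mpicxx is not an executable file under the prefix.
mpi_cmake_cmd() {
    prefix="$1"
    mpicxx="$prefix/bin/mpicxx"
    if [ ! -x "$mpicxx" ]; then
        echo "error: $mpicxx is missing or not executable" >&2
        return 1
    fi
    echo "cmake -DUSE_MPI=ON -DMPI_CXX_COMPILER=$mpicxx .."
}

# Example: show the command for the assumed install prefix.
# The "|| true" keeps the shell going if the prefix does not exist here.
mpi_cmake_cmd "${OMPI_PREFIX:-/storage/student5/usr/local/openmpi}" || true
```

    The printed line can then be pasted into the terminal from caffe/build once it points at the intended compiler wrapper.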

    4. Install Caffe. In the Caffe root directory, run:

        make all -j8
        make install
        # In my case, make install was not needed after make all
        make runtest
        # As in the reference tutorials, two tests failed

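    5. Build the Python interface:

      a. Open ~/.bashrc to add an environment variable:

        gedit ~/.bashrc

      b. Add the following line to it:

        export PYTHONPATH=$PYTHONPATH:/path/to/your/caffe/python

      c. Apply the change:

        source ~/.bashrc

      d. In the Caffe root directory:

        make pycaffe
        # The reference tutorial uses sudo here, but it worked for me without sudo

      e. Test the Python interface by entering the following in a terminal:

        python
        import caffe
        # If there is no error, the Python interface was built successfully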
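
    Steps a to c and the import test in step e can also be done without an editor or an interactive Python session. The following is a rough sketch under that assumption. append_once is a hypothetical helper, and the Caffe path is the same placeholder as in step b.

```shell
# Hypothetical helper: append a line to a file only if it is not already there,
# so that re-running the setup does not duplicate the export.
append_once() {
    line="$1"
    file="$2"
    touch "$file"
    # -x: match the whole line, -F: treat the line as a fixed string, not a regex
    if ! grep -qxF -- "$line" "$file"; then
        printf '%s\n' "$line" >> "$file"
    fi
}

# Usage (placeholder path from step b; not run here):
#   append_once 'export PYTHONPATH=$PYTHONPATH:/path/to/your/caffe/python' ~/.bashrc
#   source ~/.bashrc
#   python -c 'import caffe; print(caffe.__file__)'
```

    Printing caffe.__file__ also shows which copy of the module was imported, which helps when more than one Caffe checkout is on the machine.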

    Problems encountered:

    1. While building Caffe, the following command failed:

        cmake -DUSE_MPI=ON -DMPI_CXX_COMPILER=/path/to/your/openmpi/bin/mpicxx ..

      Problem 1:

        CMake Warning at /usr/local/opencv-3.1.0/cmake/OpenCVConfig.cmake:166 (message):
          Found OpenCV Windows Pack but it has no binaries compatible with your
          configuration.

          You should manually point CMake variable OpenCV_DIR to your build of OpenCV
          library.
        Call Stack (most recent call first):
          cmake/Dependencies.cmake:62 (find_package)
          CMakeLists.txt:31 (include)


        CMake Error at cmake/Dependencies.cmake:62 (find_package):
          Found package configuration file:

            /usr/local/opencv-3.1.0/cmake/OpenCVConfig.cmake

          but it set OpenCV_FOUND to FALSE so package "OpenCV" is considered to be
          NOT FOUND.
        Call Stack (most recent call first):
          CMakeLists.txt:31 (include)


        -- Configuring incomplete, errors occurred!
        See also "/storage/student5/usr/local/caffe/build/CMakeFiles/CMakeOutput.log".
        See also "/storage/student5/usr/local/caffe/build/CMakeFiles/CMakeError.log".

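      Solution:

        Attempt 1: add set(OpenCV_DIR /path/to/your/OpenCV/build) to CMakeLists.txt. This did not work.

        Attempt 2: go back to the Caffe root directory, run make clean, set the following environment variable for the current session, and start over from mkdir build && cd build. This worked.

            export OpenCV_DIR=/path/to/your/opencv/build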
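
    The error above comes from a config file that announces itself as the "OpenCV Windows Pack", so one way to sanity-check a candidate directory before exporting it is to look for an OpenCVConfig.cmake that does not contain that message. This is only a sketch based on the error text shown above. has_opencv_config is a hypothetical helper and the default path is the placeholder from the text.

```shell
# Hypothetical helper: succeed only if the directory holds an OpenCVConfig.cmake
# that is not the "Windows Pack" stub named in the CMake warning.
has_opencv_config() {
    cfg="$1/OpenCVConfig.cmake"
    [ -f "$cfg" ] && ! grep -q 'OpenCV Windows Pack' "$cfg"
}

# Placeholder path; replace it with the real OpenCV build directory.
candidate="${OpenCV_DIR:-/path/to/your/opencv/build}"
if has_opencv_config "$candidate"; then
    export OpenCV_DIR="$candidate"
    echo "OpenCV_DIR=$OpenCV_DIR"
else
    echo "no usable OpenCVConfig.cmake in $candidate" >&2
fi
```

    The check is deliberately shallow: it does not verify that the OpenCV build is complete, only that cmake would not pick up the same stub file again.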

      Problem 2:

        CMake Error at /usr/share/cmake/Modules/FindPackageHandleStandardArgs.cmake:108 (message):
          Could NOT find Atlas (missing: Atlas_LAPACK_LIBRARY)
        Call Stack (most recent call first):
          /usr/share/cmake/Modules/FindPackageHandleStandardArgs.cmake:315 (_FPHSA_FAILURE_MESSAGE)
          cmake/Modules/FindAtlas.cmake:43 (find_package_handle_standard_args)
          cmake/Dependencies.cmake:74 (find_package)
          CMakeLists.txt:31 (include)

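      Solution:

        Attempt 1: specify the Atlas path. Go back to the Caffe root directory, run make clean, set the environment variable export Atlas_ROOT_DIR=/your/Atlas/Root for the current session, and start over from mkdir build && cd build. This did not work.

        Attempt 2: go back to the Caffe root directory, run make clean, start over from mkdir build && cd build, and run the following command in a terminal before continuing. This worked.

            cmake -DBLAS=open .
            # Note: "." must contain CMakeLists.txt or an existing CMakeCache.txt; from a freshly created build directory use ".." instead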
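
    After re-running cmake it can be useful to confirm which values actually ended up in the build configuration. CMake stores them in CMakeCache.txt as NAME:TYPE=VALUE entries, so a small lookup is enough. This is a sketch; cache_value is a hypothetical helper name.

```shell
# Hypothetical helper: print the value of one entry from a CMakeCache.txt.
# Cache entries have the form NAME:TYPE=VALUE.
cache_value() {
    name="$1"
    cache="$2"
    sed -n "s/^${name}:[A-Z]*=//p" "$cache"
}

# Usage from the build directory (not run here):
#   cache_value BLAS CMakeCache.txt
#   cache_value USE_MPI CMakeCache.txt
```

    If the BLAS entry still shows the old value, the -DBLAS=open option was not applied to this build directory.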

    2. When running make all -j8:

      Problem 1:

        /usr/bin/ld: .build_release/examples/cpp_classification/classification.o: undefined reference to symbol '_ZN2cv6imreadERKNS_6StringEi'
        /usr/local/lib/libopencv_imgcodecs.so.3.1: error adding symbols: DSO missing from command line
        collect2: error: ld returned 1 exit status
        make: *** [.build_release/examples/cpp_classification/classification.bin] Error 1
        make: *** Waiting for unfinished jobs....

      Solution: with opencv-3.x, libopencv_imgcodecs.so must also be linked. At line 172 of the Makefile, change:

        LIBRARIES += glog gflags protobuf leveldb snappy \
            lmdb boost_system hdf5_hl hdf5 m \
            opencv_core opencv_highgui opencv_imgproc

      to:

        LIBRARIES += glog gflags protobuf leveldb snappy \
            lmdb boost_system hdf5_hl hdf5 m \
            opencv_core opencv_highgui opencv_imgproc opencv_imgcodecs
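
    The same Makefile change can be scripted so that it is repeatable on another checkout. The sketch below assumes, as in the snippet above, that the LIBRARIES entry ends with opencv_imgproc at the end of a line; it would also touch any other line ending that way. add_imgcodecs is a hypothetical helper name.

```shell
# Hypothetical helper: append opencv_imgcodecs to a line that ends with
# opencv_imgproc. Running it twice changes nothing the second time, because
# the patched line no longer ends with opencv_imgproc.
add_imgcodecs() {
    makefile="$1"
    tmp="$makefile.tmp.$$"
    sed 's/opencv_imgproc[[:space:]]*$/opencv_imgproc opencv_imgcodecs/' \
        "$makefile" > "$tmp" && mv "$tmp" "$makefile"
}

# Usage from the Caffe root, after backing up the file (not run here):
#   cp Makefile Makefile.bak
#   add_imgcodecs Makefile
```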

      Problem 2:

        nvcc warning : The 'compute_20', 'sm_20', and 'sm_21' architectures are deprecated, and may be
        removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).

      Solution: delete the following entries from Makefile.config:

        -gencode arch=compute_20,code=sm_20 \
        -gencode arch=compute_20,code=sm_21 \
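
    When removing these entries by hand, it is easy to delete the CUDA_ARCH := prefix along with them if they share a line. The following sketch removes only the two -gencode entries and leaves the rest of each line, including the line-continuation backslashes, in place. strip_compute_20 is a hypothetical helper name.

```shell
# Hypothetical helper: read Makefile.config text on stdin and drop the
# compute_20 -gencode entries, keeping everything else on each line.
strip_compute_20() {
    sed 's/-gencode arch=compute_20,code=sm_2[01][[:space:]]*//'
}

# Usage (writes to a new file so the original can be compared; not run here):
#   strip_compute_20 < Makefile.config > Makefile.config.new
#   diff Makefile.config Makefile.config.new
```

    A line that held only one of these entries is left with just its trailing backslash, which make still treats as a plain continuation.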

    Reference tutorials:

    1. https://blog.csdn.net/whyerdiku/article/details/78842498 (Python and MATLAB interfaces)

    2. http://www.cnblogs.com/beihaidao/p/6866342.html (Python and MATLAB interfaces)

    3. https://blog.csdn.net/qq_21368481/article/details/81257265?tdsourcetag=s_pctim_aiomsg (MATLAB interface)

  • Original article: https://www.cnblogs.com/mantha/p/10278525.html