  • Ubuntu 16.04 + GTX 1060 + CUDA 8.0 + cuDNN 5.1 + Caffe + Theano + TensorFlow

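    Based on the post "ubuntu16.04+gtx1060+cuda8.0+caffe installation and testing notes", with some differences in the details.

    To be clear up front: this was done on a desktop PC. Windows 10 was installed first, then Ubuntu 16.04 as a dual boot. The graphics card is a GTX 1060.
    The monitor is connected to the HDMI port of the GTX 1060, and on Windows 10 I first installed the then-latest GTX 1060 driver, version 375.


    Enough preamble, let's get started.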

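    I. Install the NVIDIA graphics driver

    1. I have a 1080p monitor, and before the graphics driver is installed Ubuntu runs at a very low resolution. You can raise it by editing the grub configuration by hand. In a terminal, run

      sudo vim /etc/default/grub
      Find the following lines
      # The resolution used on graphical terminal
      # note that you can use only modes which your graphic card supports via VBE
      # you can see them in real GRUB with the command 'vbeinfo'
      # GRUB_GFXMODE=640x480
      Press a to enter insert mode and add the following line
      GRUB_GFXMODE=1920x1080 # set the resolution to suit your monitor
      Press Esc to leave insert mode, then type :wq to save and quit
      Then, in the terminal, run
      sudo update-grub
      to regenerate the grub configuration
      Reboot Ubuntu for the change to take effect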
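
    The manual edit above can also be scripted. The following is only a sketch: the helper name set_gfxmode is made up for illustration, and it works on the text of /etc/default/grub rather than on the file itself. It also rejects a mode written with the multiplication sign, since the setting expects an ASCII x between width and height.

```python
import re


def set_gfxmode(grub_text, mode="1920x1080"):
    """Return grub_text with exactly one active GRUB_GFXMODE=<mode> line.

    Hypothetical helper: commented-out template lines are left untouched.
    """
    if not re.fullmatch(r"\d+x\d+", mode):
        raise ValueError("mode must look like 1920x1080 (ASCII 'x')")
    new_line = "GRUB_GFXMODE=" + mode
    active = re.compile(r"^GRUB_GFXMODE=.*$", re.MULTILINE)
    if active.search(grub_text):
        # An uncommented setting already exists: replace it in place.
        return active.sub(new_line, grub_text)
    if not grub_text.endswith("\n"):
        grub_text += "\n"
    # Otherwise append the setting at the end of the file.
    return grub_text + new_line + "\n"
```

    To use it for real you would read /etc/default/grub, write the returned text back as root, and still run sudo update-grub afterwards.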

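    2. Open Ubuntu's System Settings > Software & Updates > Additional Drivers


      After installing the driver there, reboot so that the GTX 1060 driver takes effect

    3. Test

      In a terminal, run
      nvidia-smi
      If it prints a table listing the GPU and the driver version, the driver was installed successfully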
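
    If you want to check the driver from a script rather than by eye, something like the following Python 3 sketch may help. It assumes the banner printed by nvidia-smi contains a "Driver Version: ..." field, which is how the 375-series output looks; the function names are my own.

```python
import re
import shutil
import subprocess


def parse_driver_version(smi_output):
    """Extract the 'Driver Version: X.Y' field from nvidia-smi's banner."""
    match = re.search(r"Driver Version:\s*([0-9.]+)", smi_output)
    return match.group(1) if match else None


def installed_driver_version():
    """Return the driver version string, or None if nvidia-smi is absent or fails."""
    if shutil.which("nvidia-smi") is None:
        return None
    result = subprocess.run(
        ["nvidia-smi"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    if result.returncode != 0:
        return None
    return parse_driver_version(result.stdout)
```

    Calling installed_driver_version() on a machine without a working driver returns None instead of raising, which makes it easy to use as a precondition before the CUDA steps.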



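    II. Install CUDA

    1. Download cuda_8.0.61_375.26_linux.run and cudnn-8.0-linux-x64-v5.1.tgz

      I have shared both files on Baidu Netdisk. I downloaded them under Windows 10 first and copied them with a USB stick into Ubuntu's Downloads directory

    2. Install CUDA 8.0

      In a terminal, run
      cd 下载/   # the Downloads directory on a Chinese-locale system
      sh cuda_8.0.61_375.26_linux.run --override
      This starts the installer. Keep pressing Space until the end of the license (or press Q), then type accept to accept the terms
      Enter n to skip the NVIDIA graphics driver, since it was already installed above
      Enter y to install the CUDA 8.0 toolkit
      Press Enter to accept the default CUDA install path: /usr/local/cuda-8.0
      Enter y to run the installation with sudo, then enter your password
      Enter y or n to create, or not create, the symbolic link at /usr/local/cuda (the later steps use /usr/local/cuda, so y is the convenient choice)
      Enter y to install the CUDA 8.0 Samples, which are used for testing later
      Press Enter to accept the default CUDA 8.0 Samples path: /home/yt (yt is my username). This directory can be deleted once testing is done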

    3. Install cuDNN v5.1

      In a terminal, run

      cd 下载/
      tar zxvf cudnn-8.0-linux-x64-v5.1.tgz

      Extraction creates a cuda directory inside the Downloads directory

      cd cuda
      sudo cp lib64/* /usr/local/cuda/lib64/         # copy the shared libraries
      sudo cp include/cudnn.h /usr/local/cuda/include/  # copy the header file
      
      sudo chmod a+r /usr/local/cuda/include/cudnn.h
      sudo chmod a+r /usr/local/cuda/lib64
      sudo chmod a+r /usr/local/cuda/lib64/libcudnn*   # give all users read permission on these files

    4. Create the symbolic links

      In a terminal, run

      cd /usr/local/cuda/lib64/
      sudo rm -rf libcudnn.so libcudnn.so.5
      sudo ln -s libcudnn.so.5.1.10 libcudnn.so.5   # use the version you actually have
      sudo ln -s libcudnn.so.5 libcudnn.so
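
    The exact file name in the first ln command depends on the cuDNN release. One way to find it, sketched below, is to read the version macros from cudnn.h. This assumes the header defines CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL, as the cuDNN 5.x headers do; both function names are hypothetical.

```python
import re


def cudnn_version(header_text):
    """Return 'major.minor.patch' built from the CUDNN_* defines in cudnn.h."""
    parts = []
    for name in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        pattern = r"^[ \t]*#[ \t]*define[ \t]+" + name + r"[ \t]+(\d+)"
        match = re.search(pattern, header_text, re.MULTILINE)
        if match is None:
            raise ValueError(name + " not found in header")
        parts.append(match.group(1))
    return ".".join(parts)


def link_commands(version):
    """Build the two ln commands from step 4 for a given cuDNN version."""
    major = version.split(".")[0]
    return [
        "sudo ln -s libcudnn.so.{0} libcudnn.so.{1}".format(version, major),
        "sudo ln -s libcudnn.so.{0} libcudnn.so".format(major),
    ]
```

    For example, passing the contents of /usr/local/cuda/include/cudnn.h to cudnn_version and the result to link_commands gives the two commands to run inside /usr/local/cuda/lib64/.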

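      Set the environment variables. In a terminal, run

      sudo gedit /etc/profile

      Append at the end

      PATH=/usr/local/cuda/bin:$PATH
      export PATH

      After saving, create the linker configuration file

      sudo vim /etc/ld.so.conf.d/cuda.conf

      Press a to enter insert mode and add the following line

      /usr/local/cuda/lib64

      Press Esc to leave insert mode, then type :wq to save and quit
      Finally, run sudo ldconfig in the terminal so that the new library path takes effect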
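
    A quick way to see whether the PATH and ldconfig changes are visible is a small Python 3 check such as the one below. It is a sketch only: cuda_env_report is a made-up name, and it merely reports what the current process can find.

```python
import ctypes.util
import shutil


def cuda_env_report():
    """Report what the current process can see of the CUDA installation.

    Each value is a path or library name, or None when nothing was found.
    """
    return {
        "nvcc": shutil.which("nvcc"),                        # found via PATH
        "libcudart": ctypes.util.find_library("cudart"),    # found via the linker cache
        "libcudnn": ctypes.util.find_library("cudnn"),
    }
```

    Because /etc/profile is read by login shells, nvcc will usually show up only after you log out and back in, whereas the library entries depend on sudo ldconfig having been run.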

    5. Test with the CUDA Samples

      Go to the default install path of the CUDA 8.0 Samples. In a terminal, run
      cd /home/yt/NVIDIA_CUDA-8.0_Samples   # yt is my username
      sudo make all -j4   # 4 cores
      This fails with "unsupported GNU version! gcc versions later than 5.3 are not supported!" because the installed GCC is too new. In a terminal, run
      cd /usr/local/cuda-8.0/include
      sudo cp host_config.h host_config.h.bak
      sudo gedit host_config.h
      Press Ctrl+F and search for "5.3". There is only one occurrence, shown here
      #if __GNUC__ > 5 || (__GNUC__ == 5 && __GNUC_MINOR__ > 3)
      #error -- unsupported GNU version! gcc versions later than 5.3 are not supported!
      Change both 5s to 6, giving
      #if __GNUC__ > 6 || (__GNUC__ == 6 && __GNUC_MINOR__ > 3)
      Save and quit, then continue in the terminal
      cd /home/yt/NVIDIA_CUDA-8.0_Samples   # yt is my username
      sudo make all -j4   # 4 cores
      When the build finishes, run
      cd bin/x86_64/linux/release
      ./deviceQuery
      If the output lists the GTX 1060 and ends with Result = PASS, CUDA was installed successfully
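
    To see why the host_config.h edit makes the error go away, the preprocessor condition can be mirrored in a few lines of Python. This is an illustration only; gcc_rejected is a hypothetical name, and the 5.4 used in the example is what I believe Ubuntu 16.04 ships as its default GCC.

```python
def gcc_rejected(major, minor, limit_major=5, limit_minor=3):
    """Mirror of the check in host_config.h:

    #if __GNUC__ > L || (__GNUC__ == L && __GNUC_MINOR__ > M)

    Returns True when the header would stop with '#error'.
    """
    return major > limit_major or (major == limit_major and minor > limit_minor)


# Original header (limit 5.3): GCC 5.4 is refused.
print(gcc_rejected(5, 4))
# After changing both 5s to 6 (limit 6.3): GCC 5.4 is let through.
print(gcc_rejected(5, 4, limit_major=6, limit_minor=3))
```

    Note that the edit only relaxes the version check. It does not make a newer GCC officially supported by CUDA 8.0, so keeping the .bak copy as the steps above do is a sensible precaution.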



    III. Install the dependency packages

    1. sudo apt-get install build-essential # essential build tools

    2. sudo apt-get install libprotobuf-dev libleveldb-dev libsnappy-dev libopencv-dev libhdf5-serial-dev protobuf-compiler

    3. sudo apt-get install --no-install-recommends libboost-all-dev

    4. sudo apt-get install libatlas-base-dev

    5. sudo apt-get install libgflags-dev libgoogle-glog-dev liblmdb-dev



    IV. Install pip and easy_install for Python to make installing packages easier

      In a terminal, run
      cd 
      wget --no-check-certificate https://bootstrap.pypa.io/ez_setup.py 
      sudo python ez_setup.py --insecure 
      wget https://bootstrap.pypa.io/get-pip.py 
      sudo python get-pip.py



    V. Install some of the libraries needed for scientific computing and Python

      In a terminal, run
      sudo apt-get install libblas-dev liblapack-dev libatlas-base-dev gfortran python-numpy



    VI. Install git and fetch the source

      In a terminal, run
      sudo apt-get install git 
      git clone https://github.com/BVLC/caffe.git



    VII. Install the Python dependencies

      In a terminal, run
      sudo apt-get install python-pip   # install pip
      cd /home/yt/caffe/python
      sudo su
      for req in $(cat "requirements.txt"); do pip install -i https://pypi.tuna.tsinghua.edu.cn/simple $req; done
      Press Ctrl+D to leave the root shell started by sudo su
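
    The shell loop installs the requirements one at a time, so a single failing package does not abort the rest. A rough Python equivalent is sketched below. pip_commands is a hypothetical helper, and unlike the shell loop it splits on lines rather than on whitespace and skips comments and blank lines.

```python
def pip_commands(requirements_text,
                 index_url="https://pypi.tuna.tsinghua.edu.cn/simple"):
    """Build one 'pip install' argument list per requirement line."""
    commands = []
    for line in requirements_text.splitlines():
        requirement = line.split("#", 1)[0].strip()  # drop comments and blanks
        if requirement:
            commands.append(["pip", "install", "-i", index_url, requirement])
    return commands
```

    Each returned list can be passed to subprocess.call, which keeps a specifier such as Cython>=0.19.2 as a single argument and lets you collect the return codes of the packages that failed.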



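    VIII. Build Caffe (MATLAB support is not covered here)

    1. In a terminal, run
      cd /home/yt/caffe
      cp Makefile.config.example Makefile.config
      gedit Makefile.config

      ① Uncomment the line USE_CUDNN := 1

      ② After INCLUDE_DIRS := $(PYTHON_INCLUDE) /usr/local/include, add a space and then /usr/include/hdf5/serial. Without this, the build may fail with an error that hdf5.h cannot be found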
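
    The two Makefile.config edits are mechanical, so they can be applied with a short script instead of an editor. The sketch below assumes the file still looks like Makefile.config.example, that is, it contains a commented "# USE_CUDNN := 1" line and an "INCLUDE_DIRS := ..." line; patch_makefile_config is a name I made up.

```python
import re

HDF5_INCLUDE = "/usr/include/hdf5/serial"


def patch_makefile_config(text):
    """Apply the two edits from step 1 to the text of Makefile.config."""
    # Edit 1: "# USE_CUDNN := 1" becomes "USE_CUDNN := 1".
    text = re.sub(
        r"^#[ \t]*(USE_CUDNN[ \t]*:=[ \t]*1)[ \t]*$",
        r"\1",
        text,
        flags=re.MULTILINE,
    )

    # Edit 2: append the hdf5 include directory, unless it is already there.
    def add_hdf5(match):
        line = match.group(0).rstrip()
        return line if HDF5_INCLUDE in line else line + " " + HDF5_INCLUDE

    return re.sub(r"^INCLUDE_DIRS[ \t]*:=.*$", add_hdf5, text, flags=re.MULTILINE)
```

    Running the function twice gives the same result as running it once, so it is safe to re-apply after copying a fresh Makefile.config.example.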

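    2. In a terminal, run
      make all -j4
      During make, the linker reports that it cannot find -lhdf5_hl and -lhdf5.
      Fix:
      Search the computer for libhdf5_serial.so.10.1.0 (on 64-bit Ubuntu 16.04 it is normally in /usr/lib/x86_64-linux-gnu), then right-click the result and open its containing folder
      Right-click an empty area of that folder, choose Open in Terminal, and in the new terminal run
      sudo ln libhdf5_serial.so.10.1.0 libhdf5.so
      sudo ln libhdf5_serial_hl.so.10.0.2 libhdf5_hl.so
      Finally, run sudo ldconfig in the terminal so that the links take effect
      In the original terminal, run make clean to clear the first build
      Then run make all -j4 again to rebuild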

    3. In a terminal, run

      make test -j4
      make runtest -j4
      make pycaffe -j4
      make distribute   # build the distributable package

    4. Test the Python interface. In a terminal, run
      pip install protobuf -i https://pypi.tuna.tsinghua.edu.cn/simple
      cd /home/yt/caffe/python
      python
      import caffe
      If no error is raised, the build succeeded
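
    The import works in the steps above because the interpreter is started inside caffe/python. From any other directory, that folder has to be put on the module search path first. The sketch below shows one way to do that; try_import_caffe is a hypothetical helper and /home/yt/caffe/python is the path used in this post.

```python
import sys


def try_import_caffe(caffe_python_dir):
    """Put caffe_python_dir on sys.path and report whether 'import caffe' works.

    Returns (True, module file path) on success or (False, error message).
    """
    if caffe_python_dir not in sys.path:
        sys.path.insert(0, caffe_python_dir)
    try:
        import caffe
    except ImportError as exc:
        return False, str(exc)
    return True, str(caffe.__file__)


ok, detail = try_import_caffe("/home/yt/caffe/python")
print(ok, detail)
```

    Run it with the same Python that pycaffe was built against (Python 2.7 in this setup). Exporting PYTHONPATH=/home/yt/caffe/python in your shell profile has the same effect without touching sys.path in code.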



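    IX. MNIST test

      1. Download the MNIST dataset. In a terminal, run
        cd /home/yt/caffe/data/mnist/
        ./get_mnist.sh   # fetch the MNIST dataset
        Four new files appear in /home/yt/caffe/data/mnist/: the training images, training labels, test images and test labels

      2. Convert the MNIST data format. In a terminal, run
        cd /home/yt/caffe/
        ./examples/mnist/create_mnist.sh
        The second line must be run after the first, i.e. create_mnist.sh has to be run from the Caffe root directory
        This creates mnist_test_lmdb and mnist_train_lmdb, the test and training sets in LMDB format, under /caffe/examples/mnist/

      3. The LeNet-5 model is described in /caffe/examples/mnist/lenet_train_test.prototxt

      4. The solver configuration is in /caffe/examples/mnist/lenet_solver.prototxt

      5. Train on MNIST. The script is /caffe/examples/mnist/train_lenet.sh
        In a terminal, run
        cd /home/yt/caffe/
        ./examples/mnist/train_lenet.sh
        The test results are printed at the end of the training log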
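
    If the conversion in step 2 fails, it can help to confirm that the four downloaded files are intact. MNIST is distributed in the IDX format, whose header is a few big-endian 32-bit integers: a magic number (2051 for images, 2049 for labels), the item count and, for images, the row and column sizes. The reader below is a minimal sketch; the function name is mine, and the file name in the usage note is an assumption about what get_mnist.sh leaves behind.

```python
import struct

IDX_KINDS = {2049: "labels", 2051: "images"}


def read_idx_header(data):
    """Decode the header of an MNIST IDX file from its first bytes."""
    magic, count = struct.unpack(">II", data[:8])
    if magic not in IDX_KINDS:
        raise ValueError("not an MNIST IDX file (magic=%d)" % magic)
    header = {"kind": IDX_KINDS[magic], "count": count}
    if magic == 2051:
        # Image files carry two more fields: rows and columns per image.
        rows, cols = struct.unpack(">II", data[8:16])
        header["rows"] = rows
        header["cols"] = cols
    return header
```

    For example, opening data/mnist/train-images-idx3-ubyte in binary mode and passing its first 16 bytes to read_idx_header should report an images file; a ValueError usually means the download was truncated or is still compressed.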

    X. Install Theano

    1. Run the command directly:

    sudo pip install theano 

    2. Create the configuration file .theanorc:

    gedit ~/.theanorc
    [global]  
    floatX=float32  
    device=gpu  
    base_compiledir=~/external/.theano/  
    allow_gc=False  
    warn_float64=warn  
    mode=FAST_RUN  
      
    [nvcc]  
    fastmath=True  
      
    [cuda]  
    root=/usr/local/cuda 
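
    Theano reads .theanorc as an INI-style file, so a malformed line can be hard to spot. As a sanity check, the file can be parsed with Python 3's configparser before Theano ever sees it. The sketch below uses a shortened copy of the settings, with mode placed as an ordinary key under [global], which is my assumption about the intended layout; theano_flags is a hypothetical helper.

```python
import configparser

THEANORC = """\
[global]
floatX=float32
device=gpu
mode=FAST_RUN

[nvcc]
fastmath=True

[cuda]
root=/usr/local/cuda
"""


def theano_flags(text):
    """Parse .theanorc-style text into {section: {option: value}}."""
    parser = configparser.ConfigParser()
    parser.optionxform = str  # keep option names case-sensitive (floatX)
    parser.read_string(text)
    return {section: dict(parser[section]) for section in parser.sections()}


flags = theano_flags(THEANORC)
print(flags["global"]["device"])
```

    To check the real file, replace THEANORC with the contents of ~/.theanorc; a parsing error or a missing key points at the line to fix.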

    3. Run a test example:

    from theano import function, config, shared, sandbox  
    import theano.tensor as T  
    import numpy  
    import time  
      
    vlen = 10 * 30 * 768  # 10 x #cores x # threads per core  
    iters = 1000  
      
    rng = numpy.random.RandomState(22)  
    x = shared(numpy.asarray(rng.rand(vlen), config.floatX))  
    f = function([], T.exp(x))  
    print(f.maker.fgraph.toposort())  
    t0 = time.time()  
    for i in range(iters):  
        r = f()  
    t1 = time.time()  
    print("Looping %d times took %f seconds" % (iters, t1 - t0))  
    print("Result is %s" % (r,))  
    if numpy.any([isinstance(x.op, T.Elemwise) for x in f.maker.fgraph.toposort()]):  
        print('Used the cpu')  
    else:  
        print('Used the gpu') 

    XI. Install TensorFlow

    1. Install the Bazel build tool

    echo "deb [arch=amd64] http://storage.googleapis.com/bazel-apt stable jdk1.8" | sudo tee /etc/apt/sources.list.d/bazel.list 
    curl https://storage.googleapis.com/bazel-apt/doc/apt-key.pub.gpg | sudo apt-key add -  
    sudo apt-get update && sudo apt-get install bazel  
    sudo apt-get upgrade bazel

    2. Download and build TensorFlow

    git clone https://github.com/tensorflow/tensorflow
    cd tensorflow
    git checkout r1.0   # check out the branch you want to build; r1.0 is used here
    sudo apt-get install python-numpy python-dev python-pip python-wheel
    sudo apt-get install python3-numpy python3-dev python3-pip python3-wheel
    sudo apt-get install libcupti-dev 
    ./configure   # an example session follows
    Please specify the location of python. [Default is /usr/bin/python]: y
    Invalid python path. y cannot be found
    Please specify the location of python. [Default is /usr/bin/python]: 
    Please specify optimization flags to use during compilation when bazel option "--config=opt" is specified [Default is -march=native]: 
    Do you wish to use jemalloc as the malloc implementation? [Y/n] y
    jemalloc enabled
    Do you wish to build TensorFlow with Google Cloud Platform support? [y/N] n
    No Google Cloud Platform support will be enabled for TensorFlow
    Do you wish to build TensorFlow with Hadoop File System support? [y/N] y
    Hadoop File System support will be enabled for TensorFlow
    Do you wish to build TensorFlow with the XLA just-in-time compiler (experimental)? [y/N] n
    No XLA JIT support will be enabled for TensorFlow
    Found possible Python library paths:
      /usr/local/lib/python2.7/dist-packages
      /usr/lib/python2.7/dist-packages
    Please input the desired Python library path to use.  Default is [/usr/local/lib/python2.7/dist-packages]
    
    Using python library path: /usr/local/lib/python2.7/dist-packages
    Do you wish to build TensorFlow with OpenCL support? [y/N] n
    No OpenCL support will be enabled for TensorFlow
    Do you wish to build TensorFlow with CUDA support? [y/N] y
    CUDA support will be enabled for TensorFlow
    Please specify which gcc should be used by nvcc as the host compiler. [Default is /usr/bin/gcc]: 
    Please specify the CUDA SDK version you want to use, e.g. 7.0. [Leave empty to use system default]: 8.0
    Please specify the location where CUDA 8.0 toolkit is installed. Refer to README.md for more details. [Default is /usr/local/cuda]: 
    Please specify the Cudnn version you want to use. [Leave empty to use system default]: 5
    Please specify the location where cuDNN 5 library is installed. Refer to README.md for more details. [Default is /usr/local/cuda]: 
    Please specify a list of comma-separated Cuda compute capabilities you want to build with.
    You can find the compute capability of your device at: https://developer.nvidia.com/cuda-gpus.
    Please note that each additional compute capability significantly increases your build time and binary size.
    [Default is: "3.5,5.2"]: 6.1
    INFO: Starting clean (this may take a while). Consider using --expunge_async if the clean takes more than several minutes.
    ........
    INFO: All external dependencies fetched successfully.
    Configuration finished

    Build:

    bazel build --config=opt --config=cuda //tensorflow/tools/pip_package:build_pip_package --cxxopt="-D_GLIBCXX_USE_CXX11_ABI=0"
    bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg 

    Check the name of the .whl file generated under /tmp/tensorflow_pkg, then install it:

    sudo pip install /tmp/tensorflow_pkg/tensorflow-1.0.1-cp27-cp27mu-linux_x86_64.whl
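
    The wheel file name encodes which interpreter the package was built for, which is worth checking before installing. The sketch below splits a name of the usual five-field form name-version-python-abi-platform.whl; parse_wheel_name is a made-up helper and it deliberately ignores the optional build tag that some wheels carry.

```python
def parse_wheel_name(filename):
    """Split 'name-version-python-abi-platform.whl' into its fields."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel file: " + filename)
    parts = filename[:-len(".whl")].split("-")
    if len(parts) != 5:
        raise ValueError("expected 5 dash-separated fields, got %d" % len(parts))
    keys = ("name", "version", "python", "abi", "platform")
    return dict(zip(keys, parts))


info = parse_wheel_name("tensorflow-1.0.1-cp27-cp27mu-linux_x86_64.whl")
print(info["python"], info["platform"])
```

    Here cp27 means the wheel targets CPython 2.7, so it has to be installed with the pip that belongs to Python 2.7. If your build produced a different name, use that name in the pip install command instead.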

    3. Test

    python
    import tensorflow as tf
    sess = tf.Session()

  • Original article: https://www.cnblogs.com/xuanyuyt/p/6769394.html