  • Installing CUDA 8 and CUDA 9 side by side

    Reposted from: https://blog.csdn.net/lovebyz/article/details/80704800

    To use all of the algorithms in the TensorFlow Object Detection API I wanted to upgrade CUDA so that tf-gpu 1.5+ would work, but my existing projects are all based on tf-gpu 1.4 (tf-gpu below 1.5 only works with CUDA 8.0), so I installed CUDA 9.0 while keeping CUDA 8.0 in place.


    System information:

    byz@ubuntu:~$ nvidia-smi   <-- (the driver was already installed back when CUDA 8 was set up, so choose No at the driver prompt during the install below



    First: shut down the X session (the lightdm desktop)

    $ sudo /etc/init.d/lightdm stop
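
    If your machine runs a different display manager, the equivalent commands (a hedged guess, assuming a systemd-based system where the desktop is lightdm or gdm) would be:

    $ sudo systemctl stop lightdm    # systemd-managed lightdm
    $ sudo systemctl stop gdm        # if the desktop is gdm instead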


    1. Download CUDA

    Download it from https://developer.nvidia.com/cuda-90-download-archive?target_os=Linux&target_arch=x86_64&target_distro=Ubuntu&target_version=1604&target_type=runfilelocal and pick the runfile (local) installer (the original post shows the selection in a screenshot). It is best not to use the deb method, which fails fairly often.
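
    On a headless server the runfile can also be fetched from the command line. The direct URL below is an assumption based on NVIDIA's usual local_installers layout, so confirm it against the archive page before relying on it:

    $ wget https://developer.nvidia.com/compute/cuda/9.0/Prod/local_installers/cuda_9.0.176_384.81_linux-run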



    2. Create a cuda_9 directory and move the runfile downloaded in step 1 into it

    mkdir cuda_9
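
    A minimal sketch of this step, assuming the runfile sits in the current directory and the target is ~/app/cuda_9 (the same path later used for the samples):

    mkdir -p ~/app/cuda_9
    mv cuda_9.0.176_384.81_linux-run ~/app/cuda_9/
    cd ~/app/cuda_9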

    3. Make the installer executable

    chmod 777 cuda_9.0.176_384.81_linux-run

    4. Run the installer

    ./cuda_9.0.176_384.81_linux-run

    5. Answer the installer prompts in turn

    -----------------
    Do you accept the previously read EULA?
    accept/decline/quit: accept

    Install NVIDIA Accelerated Graphics Driver for Linux-x86_64 384.81?
    (y)es/(n)o/(q)uit: n  <-- (do not install the driver

    Install the CUDA 9.0 Toolkit?
    (y)es/(n)o/(q)uit: y

    Enter Toolkit Location
     [ default is /usr/local/cuda-9.0 ]:

    /usr/local/cuda-9.0 is not writable.
    Do you wish to run the installation with 'sudo'?
    (y)es/(n)o: y

    Please enter your password:
    Do you want to install a symbolic link at /usr/local/cuda?
    (y)es/(n)o/(q)uit: n  <-- (the /usr/local/cuda symlink can be created afterwards, pointed at cuda-9.0

    Install the CUDA 9.0 Samples?
    (y)es/(n)o/(q)uit: y

    Enter CUDA Samples Location
     [ default is /home/byz ]: /home/byz/app/cuda_9   <-- (the directory created earlier

    Installing the CUDA Toolkit in /usr/local/cuda-9.0 ...
    Installing the CUDA Samples in /home/byz/app/cuda_9 ...
    Copying samples to /home/byz/app/cuda_9/NVIDIA_CUDA-9.0_Samples now...
    Finished copying samples.

    ===========
    = Summary =
    ===========

    Driver:   Not Selected
    Toolkit:  Installed in /usr/local/cuda-9.0
    Samples:  Installed in /home/byz/app/cuda_9

    Please make sure that
     -   PATH includes /usr/local/cuda-9.0/bin
     -   LD_LIBRARY_PATH includes /usr/local/cuda-9.0/lib64, or, add /usr/local/cuda-9.0/lib64 to /etc/ld.so.conf and run ldconfig as root

    To uninstall the CUDA Toolkit, run the uninstall script in /usr/local/cuda-9.0/bin

    Please see CUDA_Installation_Guide_Linux.pdf in /usr/local/cuda-9.0/doc/pdf for detailed information on setting up CUDA.

    ***WARNING: Incomplete installation! This installation did not install the CUDA Driver. A driver of version at least 384.00 is required for CUDA 9.0 functionality to work.
    To install the driver using this installer, run the following command, replacing <CudaInstaller> with the name of this run file:
        sudo <CudaInstaller>.run -silent -driver

    Logfile is /tmp/cuda_install_2684.log

    6. Update the shell configuration
    vim ~/.bashrc

    export PATH="/usr/local/cuda/bin:$PATH"
    export LD_LIBRARY_PATH="/usr/local/cuda/lib64:$LD_LIBRARY_PATH"
    export LIBRARY_PATH="/usr/local/cuda/lib64:$LIBRARY_PATH"

    source ~/.bashrc
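
    A quick way to confirm the paths took effect (note that /usr/local/cuda still points at the old toolkit until the symlink is switched in step 7, so nvcc may still report release 8.0 at this point):

    which nvcc              # should resolve to /usr/local/cuda/bin/nvcc
    nvcc --version          # reports whichever release /usr/local/cuda points to
    echo $LD_LIBRARY_PATH   # should contain /usr/local/cuda/lib64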


    7. Switch from cuda-8.0 to cuda-9.0

    sudo rm -rf /usr/local/cuda

    sudo ln -s /usr/local/cuda-9.0 /usr/local/cuda

    (Switching back to 8.0 later is the same recipe, just with cuda-8.0 as the link target.)
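
    Since switching versions is just re-pointing one symlink, a small helper makes it repeatable. This is only a sketch, and the switch_cuda name is made up here; it assumes both toolkits live under /usr/local/cuda-<version>:

    # add to ~/.bashrc, then run e.g.  switch_cuda 9.0  or  switch_cuda 8.0
    switch_cuda () {
        sudo rm -f /usr/local/cuda
        sudo ln -s "/usr/local/cuda-$1" /usr/local/cuda
        ls -l /usr/local/cuda
    }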
    8. Go into the CUDA 9.0 samples directory and make all the examples
    byz@ubuntu:~/app/cuda_9/NVIDIA_CUDA-9.0_Samples$ make
    The end of the build output looks something like this:

    ...
    cp simpleCUFFT_2d_MGPU ../../bin/x86_64/linux/release
    make[1]: Leaving directory '/home/byz/app/cuda_9/NVIDIA_CUDA-9.0_Samples/7_CUDALibraries/simpleCUFFT_2d_MGPU'
    make[1]: Entering directory '/home/byz/app/cuda_9/NVIDIA_CUDA-9.0_Samples/7_CUDALibraries/MC_EstimatePiP'
    "/usr/local/cuda-9.0"/bin/nvcc -ccbin g++ -I../../common/inc  -m64    -gencode arch=compute_30,code=sm_30 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_37,code=sm_37 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_70,code=compute_70 -o main.o -c src/main.cpp
    "/usr/local/cuda-9.0"/bin/nvcc -ccbin g++ -I../../common/inc  -m64    -gencode arch=compute_30,code=sm_30 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_37,code=sm_37 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_70,code=compute_70 -o piestimator.o -c src/piestimator.cu
    "/usr/local/cuda-9.0"/bin/nvcc -ccbin g++ -I../../common/inc  -m64    -gencode arch=compute_30,code=sm_30 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_37,code=sm_37 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_70,code=compute_70 -o test.o -c src/test.cpp
    "/usr/local/cuda-9.0"/bin/nvcc -ccbin g++   -m64      -gencode arch=compute_30,code=sm_30 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_37,code=sm_37 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_70,code=compute_70 -o MC_EstimatePiP main.o piestimator.o test.o  -lcurand
    mkdir -p ../../bin/x86_64/linux/release
    cp MC_EstimatePiP ../../bin/x86_64/linux/release
    make[1]: Leaving directory '/home/byz/app/cuda_9/NVIDIA_CUDA-9.0_Samples/7_CUDALibraries/MC_EstimatePiP'
    make[1]: Entering directory '/home/byz/app/cuda_9/NVIDIA_CUDA-9.0_Samples/7_CUDALibraries/randomFog'
    "/usr/local/cuda-9.0"/bin/nvcc -ccbin g++ -I../../common/inc  -m64    -Xcompiler -fpermissive -gencode arch=compute_30,code=compute_30 -o randomFog.o -c randomFog.cpp
    "/usr/local/cuda-9.0"/bin/nvcc -ccbin g++ -I../../common/inc  -m64    -Xcompiler -fpermissive -gencode arch=compute_30,code=compute_30 -o rng.o -c rng.cpp
    "/usr/local/cuda-9.0"/bin/nvcc -ccbin g++   -m64      -gencode arch=compute_30,code=compute_30 -o randomFog randomFog.o rng.o  -L/usr/lib/nvidia-387 -lGL -lGLU -lX11 -lglut -lcurand
    mkdir -p ../../bin/x86_64/linux/release
    cp randomFog ../../bin/x86_64/linux/release
    make[1]: Leaving directory '/home/byz/app/cuda_9/NVIDIA_CUDA-9.0_Samples/7_CUDALibraries/randomFog'
    Finished building CUDA samples
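
    Building every sample takes a while. If you only need the check used in the next step, building just deviceQuery should be enough (each sample directory has its own Makefile, so this shortcut is reasonable, though the full build above is what the original post did):

    cd ~/app/cuda_9/NVIDIA_CUDA-9.0_Samples
    make -C 1_Utilities/deviceQuery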

    9. Test whether the installation succeeded

    byz@ubuntu:~/app/cuda_9/NVIDIA_CUDA-9.0_Samples$ ll
    total 132
    drwxrwxr-x 12 byz byz  4096 Jun 15 14:12 ./
    drwxrwxr-x  3 byz byz  4096 Jun 15 12:53 ../
    drwxr-xr-x 50 byz byz  4096 Jun 15 12:53 0_Simple/
    drwxr-xr-x  7 byz byz  4096 Jun 15 12:53 1_Utilities/
    drwxr-xr-x 12 byz byz  4096 Jun 15 12:53 2_Graphics/
    drwxr-xr-x 22 byz byz  4096 Jun 15 12:53 3_Imaging/
    drwxr-xr-x 10 byz byz  4096 Jun 15 12:53 4_Finance/
    drwxr-xr-x 10 byz byz  4096 Jun 15 12:53 5_Simulations/
    drwxr-xr-x 34 byz byz  4096 Jun 15 12:53 6_Advanced/
    drwxr-xr-x 38 byz byz  4096 Jun 15 12:53 7_CUDALibraries/
    drwxrwxr-x  3 byz byz  4096 Jun 15 14:12 bin/
    drwxr-xr-x  6 byz byz  4096 Jun 15 12:53 common/
    -rw-r--r--  1 byz byz 81262 Jun 15 12:53 EULA.txt
    -rw-r--r--  1 byz byz  2652 Jun 15 12:53 Makefile

    byz@ubuntu:~/app/cuda_9/NVIDIA_CUDA-9.0_Samples$ cd 1_Utilities/deviceQuery
    byz@ubuntu:~/app/cuda_9/NVIDIA_CUDA-9.0_Samples/1_Utilities/deviceQuery$ ./deviceQuery

    ./deviceQuery Starting...

     CUDA Device Query (Runtime API) version (CUDART static linking)

    Detected 8 CUDA Capable device(s)

    Device 0: "TITAN Xp"
      CUDA Driver Version / Runtime Version          9.1 / 9.0
      CUDA Capability Major/Minor version number:    6.1
      Total amount of global memory:                 12190 MBytes (12782075904 bytes)
      (30) Multiprocessors, (128) CUDA Cores/MP:     3840 CUDA Cores
      GPU Max Clock rate:                            1582 MHz (1.58 GHz)
      Memory Clock rate:                             5705 Mhz
      Memory Bus Width:                              384-bit
      L2 Cache Size:                                 3145728 bytes
      Maximum Texture Dimension Size (x,y,z)         1D=(131072), 2D=(131072, 65536), 3D=(16384, 16384, 16384)
      Maximum Layered 1D Texture Size, (num) layers  1D=(32768), 2048 layers
      Maximum Layered 2D Texture Size, (num) layers  2D=(32768, 32768), 2048 layers
      Total amount of constant memory:               65536 bytes
      Total amount of shared memory per block:       49152 bytes
      Total number of registers available per block: 65536
      Warp size:                                     32
      Maximum number of threads per multiprocessor:  2048
      Maximum number of threads per block:           1024
      Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
      Max dimension size of a grid size    (x,y,z): (2147483647, 65535, 65535)
      Maximum memory pitch:                          2147483647 bytes
      Texture alignment:                             512 bytes
      Concurrent copy and kernel execution:          Yes with 2 copy engine(s)
      Run time limit on kernels:                     No
      Integrated GPU sharing Host Memory:            No
      Support host page-locked memory mapping:       Yes
      Alignment requirement for Surfaces:            Yes
      Device has ECC support:                        Disabled
      Device supports Unified Addressing (UVA):      Yes
      Supports Cooperative Kernel Launch:            Yes
      Supports MultiDevice Co-op Kernel Launch:      Yes
      Device PCI Domain ID / Bus ID / location ID:   0 / 4 / 0
      Compute Mode:
         < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

    Device 1: "TITAN Xp"
      CUDA Driver Version / Runtime Version          9.1 / 9.0
      CUDA Capability Major/Minor version number:    6.1
      Total amount of global memory:                 12190 MBytes (12782075904 bytes)
      (30) Multiprocessors, (128) CUDA Cores/MP:     3840 CUDA Cores
      GPU Max Clock rate:                            1582 MHz (1.58 GHz)
      Memory Clock rate:                             5705 Mhz
      Memory Bus Width:                              384-bit
      L2 Cache Size:                                 3145728 bytes
      Maximum Texture Dimension Size (x,y,z)         1D=(131072), 2D=(131072, 65536), 3D=(16384, 16384, 16384)
      Maximum Layered 1D Texture Size, (num) layers  1D=(32768), 2048 layers
      Maximum Layered 2D Texture Size, (num) layers  2D=(32768, 32768), 2048 layers
      Total amount of constant memory:               65536 bytes
      Total amount of shared memory per block:       49152 bytes
      Total number of registers available per block: 65536
      Warp size:                                     32
      Maximum number of threads per multiprocessor:  2048
      Maximum number of threads per block:           1024
      Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
      Max dimension size of a grid size    (x,y,z): (2147483647, 65535, 65535)
      Maximum memory pitch:                          2147483647 bytes
      Texture alignment:                             512 bytes
      Concurrent copy and kernel execution:          Yes with 2 copy engine(s)
      Run time limit on kernels:                     No
      Integrated GPU sharing Host Memory:            No
      Support host page-locked memory mapping:       Yes
      Alignment requirement for Surfaces:            Yes
      Device has ECC support:                        Disabled
      Device supports Unified Addressing (UVA):      Yes
      Supports Cooperative Kernel Launch:            Yes
      Supports MultiDevice Co-op Kernel Launch:      Yes
      Device PCI Domain ID / Bus ID / location ID:   0 / 5 / 0
      Compute Mode:
         < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

    ... (the same block is printed for each of the GPUs)
    ...
    > Peer access from TITAN Xp (GPU6) -> TITAN Xp (GPU7) : Yes
    > Peer access from TITAN Xp (GPU7) -> TITAN Xp (GPU0) : No
    > Peer access from TITAN Xp (GPU7) -> TITAN Xp (GPU1) : No
    > Peer access from TITAN Xp (GPU7) -> TITAN Xp (GPU2) : No
    > Peer access from TITAN Xp (GPU7) -> TITAN Xp (GPU3) : No
    > Peer access from TITAN Xp (GPU7) -> TITAN Xp (GPU4) : Yes
    > Peer access from TITAN Xp (GPU7) -> TITAN Xp (GPU5) : Yes
    > Peer access from TITAN Xp (GPU7) -> TITAN Xp (GPU6) : Yes

    deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 9.1, CUDA Runtime Version = 9.0, NumDevs = 8
    Result = PASS
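
    The installation guide also suggests bandwidthTest as a second check (this assumes the samples were all built in step 8):

    cd ~/app/cuda_9/NVIDIA_CUDA-9.0_Samples/1_Utilities/bandwidthTest
    ./bandwidthTest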

    Next up:

    Clone a conda environment in which to install tf 1.8:

    conda create --name tf14 --clone py35

    Install tf-gpu 1.8:

    (py35) byz@ubuntu:~/app$ pip install tensorflow_gpu-1.8.0-cp35-cp35m-manylinux1_x86_64.whl

    A quick test run shows the cuDNN version does not match; it errors out like this:

    >>> import tensorflow
    Traceback (most recent call last):
      File "/home/byz/anaconda3/envs/py35/lib/python3.5/site-packages/tensorflow/python/pywrap_tensorflow.py", line 58, in <module>
        from tensorflow.python.pywrap_tensorflow_internal import *
      File "/home/byz/anaconda3/envs/py35/lib/python3.5/site-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 28, in <module>
        _pywrap_tensorflow_internal = swig_import_helper()
      File "/home/byz/anaconda3/envs/py35/lib/python3.5/site-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 24, in swig_import_helper
        _mod = imp.load_module('_pywrap_tensorflow_internal', fp, pathname, description)
      File "/home/byz/anaconda3/envs/py35/lib/python3.5/imp.py", line 243, in load_module
        return load_dynamic(name, filename, file)
      File "/home/byz/anaconda3/envs/py35/lib/python3.5/imp.py", line 343, in load_dynamic
        return _load(spec)
    ImportError: libcudnn.so.7: cannot open shared object file: No such file or directory

    Failed to load the native TensorFlow runtime.

    See https://www.tensorflow.org/install/install_sources#common_installation_problems

    for some common reasons and solutions.  Include the entire stack trace
    above this error message when asking for help.

    10. Install cuDNN 7.1
    Download page: https://developer.nvidia.com/rdp/cudnn-download

    I downloaded the "cuDNN v7.1 Library for Linux" package. Do not pick "cuDNN v7.1 Developer Library for Ubuntu16.04 Power8 (Deb)": that one targets POWER8 processors, not amd64.

    Extract and install:

    (py35) byz@ubuntu:~/app$ tar -xvf cudnn-9.0-linux-x64-v7.1.tgz
    cuda/include/cudnn.h
    cuda/NVIDIA_SLA_cuDNN_Support.txt
    cuda/lib64/libcudnn.so
    cuda/lib64/libcudnn.so.7 <-- (this is the file that was missing above
    cuda/lib64/libcudnn.so.7.1.4
    cuda/lib64/libcudnn_static.a

    After extraction, just copy the files into the matching directories of the CUDA install:

        sudo cp cuda/include/cudnn.h /usr/local/cuda/include/
        sudo cp cuda/lib64/libcudnn* /usr/local/cuda/lib64/
        sudo chmod a+r /usr/local/cuda/include/cudnn.h
        sudo chmod a+r /usr/local/cuda/lib64/libcudnn*
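
    To double-check which cuDNN version the toolkit now sees, you can grep the header that was just copied in; it should print something like this (these macros are how cuDNN 7.x records its version):

    grep -A 2 "define CUDNN_MAJOR" /usr/local/cuda/include/cudnn.h
    #define CUDNN_MAJOR 7
    #define CUDNN_MINOR 1
    #define CUDNN_PATCHLEVEL 4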

    Run TF again and the error is gone:

    (py35) byz@ubuntu:~/app$ python
    Python 3.5.5 |Anaconda, Inc.| (default, Mar 12 2018, 23:12:44)
    [GCC 7.2.0] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import tensorflow
    /home/byz/anaconda3/envs/py35/lib/python3.5/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
      from ._conv import register_converters as _register_converters
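
    As a final smoke test, you can ask TensorFlow whether it actually sees the GPUs (tf.test.is_gpu_available() is part of the 1.x API; it opens a session, so the TITAN Xp discovery log lines should scroll past before it prints True):

    python -c "import tensorflow as tf; print(tf.test.is_gpu_available())"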


    And that's it, all done!

    One last note (a problem when moving the old tf 1.4 projects onto the new setup):


    ImportError: libcudart.so.8.0: cannot open shared object file: No such file or directory

    Fix:

    byz@ubuntu:/usr/local/cuda/lib64$ sudo ln -s libcudart.so.9.0.176 libcudart.so.8.0

    That is, make a soft link so the 8.0 name resolves to the 9.0 library.
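
    Note that this only satisfies the loader: the file behind libcudart.so.8.0 is really the 9.0 runtime, which is not guaranteed to be ABI-compatible with a wheel built against CUDA 8. Since cuda-8.0 was kept around, a safer alternative (a sketch, assuming it still lives at /usr/local/cuda-8.0) is to put its real lib64 on the library path in the environment that runs the old tf 1.4 projects:

    export LD_LIBRARY_PATH="/usr/local/cuda-8.0/lib64:$LD_LIBRARY_PATH"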

     
    ---------------------  
    Author: Raini.闭雨哲
    Source: CSDN
    Original: https://blog.csdn.net/lovebyz/article/details/80704800
    Copyright notice: this is the blogger's original article; please include a link to the original post when reposting.
