  • Attempts to fix GpuArrayException: No cuda device available

    Problem:

    The following error appears when running import keras or import theano:

    >>> import keras
    Using Theano backend.
    ERROR (theano.gpuarray): Could not initialize pygpu, support disabled
    Traceback (most recent call last):
      File "/data_d/old_home/home/.conda/envs/lib/python2.7/site-packages/theano/gpuarray/__init__.py", line 227, in <module>
        use(config.device)
      File "/data_d/old_home/home/.conda/envs/lib/python2.7/site-packages/theano/gpuarray/__init__.py", line 214, in use
        init_dev(device, preallocate=preallocate)
      File "/data_d/old_home/home/.conda/envs/lib/python2.7/site-packages/theano/gpuarray/__init__.py", line 99, in init_dev
        **args)
      File "pygpu/gpuarray.pyx", line 658, in pygpu.gpuarray.init
      File "pygpu/gpuarray.pyx", line 587, in pygpu.gpuarray.pygpu_init
    GpuArrayException: No cuda device available

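    I ran pip uninstall theano and reinstalled with conda install theano, which produced an even stranger error. Searching showed it was caused by an incompatibility between theano 1.0.4 and numpy 1.16, so I uninstalled it again.

    After reinstalling with pip install theano, the same error still appeared: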

    >>> import theano
    ERROR (theano.gpuarray): Could not initialize pygpu, support disabled
    Traceback (most recent call last):
      File "/data_d/old_home/home/.conda/envs/ib/python2.7/site-packages/theano/gpuarray/__init__.py", line 227, in <module>
        use(config.device)
      File "/data_d/old_home/home/.conda/envs/lib/python2.7/site-packages/theano/gpuarray/__init__.py", line 214, in use
        init_dev(device, preallocate=preallocate)
      File "/data_d/old_home/home/.conda/envs/lib/python2.7/site-packages/theano/gpuarray/__init__.py", line 99, in init_dev
        **args)
      File "pygpu/gpuarray.pyx", line 658, in pygpu.gpuarray.init
      File "pygpu/gpuarray.pyx", line 587, in pygpu.gpuarray.pygpu_init
    GpuArrayException: No cuda device available

    The rest of the configuration is as follows:

    [global]
    floatX = float32
    device =cuda
    [cuda]
    root=/usr/local/cuda-8.0
    
    ## .theanorc file (the lines above)
    echo $PATH
    /data_d/old_home/home/.conda/envs/bin:/usr/local/cuda-8.0/bin:/data_d/public/miniconda2/bin:/usr/local/cuda-9.0/bin:/usr/local/sbin:
    /usr/local/bin:/usr/sbin:/usr/bin:/s:/usr/local/cuda-8.0/bin/local/games:/snap/bin:/usr/local/cuda-8.0/bin
    CUDA_VISIBLE_DEVICES=1
    CUDA_HOME=/usr/local/cuda-8.0
    PATH="$PATH:/usr/local/cuda-8.0/bin"
    LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/usr/local/cuda-8.0/lib64:/usr/local/cuda-8.0/extras/CUPTI/lib64"
    
    # .bashrc file (the four assignments above)
    cat /usr/local/cuda/include/cudnn.h | grep CUDNN_MAJOR -A 2
    #define CUDNN_MAJOR      6
    #define CUDNN_MINOR      0
    #define CUDNN_PATCHLEVEL 21
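
    The .theanorc shown above is plain INI text, so it can be read back to confirm what Theano will be asked to use. This is a minimal sketch, assuming Python 3 and the standard configparser module; the helper name read_theanorc is hypothetical, and the fallbacks are Theano's documented defaults (device=cpu, floatX=float64).

```python
import configparser

# Contents of the .theanorc shown above, inlined so the sketch is self-contained.
THEANORC_TEXT = """
[global]
floatX = float32
device =cuda
[cuda]
root=/usr/local/cuda-8.0
"""


def read_theanorc(text):
    """Return (device, floatX, cuda_root) from .theanorc-style INI text."""
    parser = configparser.ConfigParser()
    parser.optionxform = str  # keep option names such as floatX case-sensitive
    parser.read_string(text)
    device = parser.get("global", "device", fallback="cpu")
    float_x = parser.get("global", "floatX", fallback="float64")
    cuda_root = parser.get("cuda", "root", fallback=None)
    return device, float_x, cuda_root


print(read_theanorc(THEANORC_TEXT))
```

    The uneven spacing in "device =cuda" is harmless here, because whitespace around the key and the value is stripped by the parser.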

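    The theano version in use is 1.0.4, with the corresponding pygpu 0.7.6.

    Had the owner of the cuda-8.0 folder been changed? Checking that did not help.

    Running a test program produces the same error: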

    Using Theano backend.
    ERROR (theano.gpuarray): Could not initialize pygpu, support disabled
    Traceback (most recent call last):
      File "/data_d/old_home/home/.conda/envs/lib/python2.7/site-packages/theano/gpuarray/__init__.py", line 227, in <module>
        use(config.device)
      File "/data_d/old_home/home/.conda/envs/lib/python2.7/site-packages/theano/gpuarray/__init__.py", line 214, in use
        init_dev(device, preallocate=preallocate)
      File "/data_d/old_home/home/.conda/envs/lib/python2.7/site-packages/theano/gpuarray/__init__.py", line 99, in init_dev
        **args)
      File "pygpu/gpuarray.pyx", line 658, in pygpu.gpuarray.init
      File "pygpu/gpuarray.pyx", line 587, in pygpu.gpuarray.pygpu_init
    GpuArrayException: No cuda device available
    Training -----------
    ('train cost: ', array(4.1908903, dtype=float32))
    ('train cost: ', array(0.10415509, dtype=float32))
    ('train cost: ', array(0.01151281, dtype=float32))
    ('train cost: ', array(0.00458441, dtype=float32))
    
    Testing ------------
    40/40 [==============================] - 0s 5us/step
    ('test cost:', 0.005374030210077763)
    ('Weights=', array([[0.56634265]], dtype=float32), '\nbiases=', array([2.001063], dtype=float32))

    Attempt 1:

    I changed the device in the config file to cuda0. The result of import theano was then:

    [global]
    floatX = float32
    device =cuda0
    [cuda]
    root=/usr/local/cuda-8.0
    >>> import theano
    ERROR (theano.gpuarray): Could not initialize pygpu, support disabled
    Traceback (most recent call last):
      File "/data_d/old_home/home/.conda/env/lib/python2.7/site-packages/theano/gpuarray/__init__.py", line 227, in <module>
        use(config.device)
      File "/data_d/old_home/home/.conda/envs/lib/python2.7/site-packages/theano/gpuarray/__init__.py", line 214, in use
        init_dev(device, preallocate=preallocate)
      File "/data_d/old_home/home/.conda/envs/lib/python2.7/site-packages/theano/gpuarray/__init__.py", line 99, in init_dev
        **args)
      File "pygpu/gpuarray.pyx", line 658, in pygpu.gpuarray.init
      File "pygpu/gpuarray.pyx", line 587, in pygpu.gpuarray.pygpu_init
    GpuArrayException: GPU is too old for CUDA version

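    According to https://blog.csdn.net/qq_33200967/article/details/80689543, the next step is to check whether CUDA was installed correctly. Running make directly on the samples failed (see https://devtalk.nvidia.com/default/topic/1048902/cuda-setup-and-installation/cuda-samples-ubuntu-make-file-errors/),

    so I used sudo make -k instead. The deviceQuery output was: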

    ./deviceQuery Starting...
    
     CUDA Device Query (Runtime API) version (CUDART static linking)
    
    Detected 1 CUDA Capable device(s)
    
    Device 0: ""
      CUDA Driver Version / Runtime Version          9.0 / 8.0
      CUDA Capability Major/Minor version number:    2.1
      Total amount of global memory:                 963 MBytes (1010040832 bytes)
      ( 1) Multiprocessors, ( 48) CUDA Cores/MP:     48 CUDA Cores
      GPU Max Clock rate:                            1046 MHz (1.05 GHz)
      Memory Clock rate:                             875 Mhz
      Memory Bus Width:                              64-bit
      L2 Cache Size:                                 65536 bytes
      Maximum Texture Dimension Size (x,y,z)         1D=(65536), 2D=(65536, 65535), 3D=(2048, 2048, 2048)
      Maximum Layered 1D Texture Size, (num) layers  1D=(16384), 2048 layers
      Maximum Layered 2D Texture Size, (num) layers  2D=(16384, 16384), 2048 layers
      Total amount of constant memory:               65536 bytes
      Total amount of shared memory per block:       49152 bytes
      Total number of registers available per block: 32768
      Warp size:                                     32
      Maximum number of threads per multiprocessor:  1536
      Maximum number of threads per block:           1024
      Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
      Max dimension size of a grid size    (x,y,z): (65535, 65535, 65535)
      Maximum memory pitch:                          2147483647 bytes
      Texture alignment:                             512 bytes
      Concurrent copy and kernel execution:          Yes with 1 copy engine(s)
      Run time limit on kernels:                     No
      Integrated GPU sharing Host Memory:            No
      Support host page-locked memory mapping:       Yes
      Alignment requirement for Surfaces:            Yes
      Device has ECC support:                        Disabled
      Device supports Unified Addressing (UVA):      Yes
      Device PCI Domain ID / Bus ID / location ID:   0 / 2 / 0
      Compute Mode:
         < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >
    
    deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 9.0, CUDA Runtime Version = 8.0, NumDevs = 1, Device0 = NVS 315
    Result = PASS
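
    The deviceQuery output above reports compute capability 2.1 together with driver version 9.0 and runtime version 8.0. CUDA 9.0 removed support for compute capability 2.x (Fermi) GPUs, while CUDA 8.0 still supports them. One possible reading, which this post does not confirm, is that the "GPU is too old for CUDA version" message comes from the 9.0 driver-side version rather than from the 8.0 toolkit. The sketch below only extracts the three values and applies that rule; the function names are hypothetical.

```python
import re

# Two lines copied from the deviceQuery output above.
DEVICE_QUERY = """
  CUDA Driver Version / Runtime Version          9.0 / 8.0
  CUDA Capability Major/Minor version number:    2.1
"""


def parse_device_query(text):
    """Extract driver version, runtime version, and compute capability."""
    versions = re.search(
        r"Driver Version / Runtime Version\s+(\d+\.\d+) / (\d+\.\d+)", text)
    capability = re.search(
        r"Capability Major/Minor version number:\s+(\d+)\.(\d+)", text)
    driver, runtime = versions.group(1), versions.group(2)
    cc = (int(capability.group(1)), int(capability.group(2)))
    return driver, runtime, cc


def fermi_with_cuda9_or_newer(cuda_version, cc):
    """True when a compute capability 2.x (or older) GPU meets CUDA >= 9.0."""
    major = int(cuda_version.split(".")[0])
    return cc[0] < 3 and major >= 9


driver, runtime, cc = parse_device_query(DEVICE_QUERY)
print("driver side too new for this GPU: ", fermi_with_cuda9_or_newer(driver, cc))
print("runtime side too new for this GPU:", fermi_with_cuda9_or_newer(runtime, cc))
```

    With the values above, the check is true for the driver version and false for the runtime version, which matches the observation that deviceQuery built against CUDA 8.0 passes while pygpu refuses the device.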

    Check the NVIDIA driver version (see https://blog.csdn.net/s_sunnyy/article/details/64121826):

    cat /proc/driver/nvidia/version
    NVRM version: NVIDIA UNIX x86_64 Kernel Module  384.130  Wed Mar 21 03:37:26 PDT 2018
    GCC version:  gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.10) 

    Check the NVIDIA devices on this machine:

    :/dev$ ls -l nvidia*
    crw-rw-rw- 1 root root 195,   0 5月  17 12:53 nvidia0
    crw-rw-rw- 1 root root 195,   1 5月  17 12:53 nvidia1
    crw-rw-rw- 1 root root 195, 255 5月  17 12:53 nvidiactl
    crw-rw-rw- 1 root root 195, 254 5月  17 12:53 nvidia-modeset
    crw-rw-rw- 1 root root 240,   0 5月  17 12:53 nvidia-uvm
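
    The listing shows two device nodes (nvidia0 and nvidia1), deviceQuery reported a single device, and .bashrc sets CUDA_VISIBLE_DEVICES=1. CUDA only exposes the devices named in that variable and renumbers them from 0. Whether deviceQuery was run with the variable set, and which physical card is the NVS 315, is not stated in this post, so the sketch below uses placeholder device names and only illustrates the filtering rule for integer indices (UUID entries are not handled). CUDA's own index order also does not necessarily match the /dev/nvidiaN numbering.

```python
def visible_devices(physical_devices, cuda_visible_devices):
    """Return the devices a CUDA process would see, in logical order.

    physical_devices: list of device names in CUDA enumeration order.
    cuda_visible_devices: value of CUDA_VISIBLE_DEVICES, or None if unset.
    """
    if cuda_visible_devices is None:
        return list(physical_devices)
    visible = []
    for token in cuda_visible_devices.split(","):
        token = token.strip()
        if not token.isdigit() or int(token) >= len(physical_devices):
            break  # CUDA ignores everything from the first invalid entry on
        visible.append(physical_devices[int(token)])
    return visible


# Placeholder names: the post does not say which index each card has.
cards = ["gpu-a", "gpu-b"]
print(visible_devices(cards, None))  # variable unset
print(visible_devices(cards, "1"))   # the setting from .bashrc
```

    With CUDA_VISIBLE_DEVICES=1, only the second card is visible and it is addressed as device 0, which is the device that device = cuda0 in .theanorc would then select.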

    Check the cuDNN version with conda list -n username:

    cudatoolkit               10.0.130                      0  
    cudnn                     7.3.1                cuda10.0_0  

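    These versions seem too high (see https://blog.csdn.net/li57681522/article/details/82491617).

    The installed cudatoolkit and cudnn packages are built for CUDA 10.0,

    but I never actually installed CUDA 10.0 on this machine.

    So I tried uninstalling them: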

    conda uninstall cudnn
    Fetching package metadata ...........
    Solving package specifications: .
    
    Package plan for package removal in environment /data_d/old_home/home/.conda/envs:
    
    The following packages will be REMOVED:
    
        cudnn: 7.3.1-cuda10.0_0
    
    Proceed ([y]/n)? y
    conda uninstall cudatoolkit
    Fetching package metadata ...........
    Solving package specifications: .
    
    Package plan for package removal in environment /data_d/old_home/home/.conda/envs:
    
    The following packages will be REMOVED:
    
        cudatoolkit: 10.0.130-0
        cupti:       10.0.130-0
    
    Proceed ([y]/n)? y

    Then I installed the versions matching CUDA 8.0:

    conda install cudatoolkit=8.0
    Fetching package metadata ...........
    Solving package specifications: .
    
    Package plan for installation in environment /data_d/old_home/home/.conda/envs:
    
    The following NEW packages will be INSTALLED:
    
        cudatoolkit: 8.0-3
    
    Proceed ([y]/n)? y
    conda install cudnn=6.0
    Fetching package metadata ...........
    Solving package specifications: .
    
    Package plan for installation in environment /data_d/old_home/home/.conda/env:
    
    The following NEW packages will be INSTALLED:
    
        cudnn: 6.0.21-cuda8.0_0
    
    Proceed ([y]/n)? y
    cudatoolkit               8.0                           3  
    cudnn                     6.0.21                cuda8.0_0

    The conda list output after reinstalling is shown above.

    The result was still the same error:

    GpuArrayException: No cuda device available
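
    The comparison done by eye in this step (conda's cudatoolkit and cudnn rows against the CUDA root in .theanorc) can be written as a small check. This is a sketch only: the helper names are hypothetical, the input is the two-row excerpt of conda list quoted in this post, and matching versions evidently did not resolve the error here.

```python
def conda_cuda_versions(conda_list_text):
    """Map package name to version for the cudatoolkit and cudnn rows."""
    found = {}
    for line in conda_list_text.splitlines():
        parts = line.split()
        if len(parts) >= 2 and parts[0] in ("cudatoolkit", "cudnn"):
            found[parts[0]] = parts[1]
    return found


def matches_system_cuda(toolkit_version, cuda_root):
    """Compare cudatoolkit major.minor with the version in a cuda-X.Y path."""
    system = cuda_root.rstrip("/").rsplit("-", 1)[-1]
    return toolkit_version.split(".")[:2] == system.split(".")[:2]


before = """
cudatoolkit               10.0.130                      0
cudnn                     7.3.1                cuda10.0_0
"""
after = """
cudatoolkit               8.0                           3
cudnn                     6.0.21                cuda8.0_0
"""

for label, text in (("before", before), ("after", after)):
    versions = conda_cuda_versions(text)
    ok = matches_system_cuda(versions["cudatoolkit"], "/usr/local/cuda-8.0")
    print(label, versions, "matches cuda-8.0:", ok)
```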

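    I then tried reinstalling CUDA and the other components in a new environment (see https://blog.csdn.net/lyy14011305/article/details/59500819).

    While installing numpy, theano, and the other packages following http://deeplearning.net/software/theano/install_ubuntu.html, the following problem appeared: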

    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/data_d/old_home/home/.conda/envs/lib/python2.7/site-packages/theano/__init__.py", line 156, in <module>
        import theano.gpuarray
    
    ...
    AttributeError: ('The following error happened while compiling the node', DnnVersion(), '\n', "'module' object has no attribute '_get_ndarray_c_version'")

    The fix suggested in https://github.com/pymc-devs/pymc3/issues/3340 is to upgrade theano to 1.0.4 (conda had installed 1.0.3), but the upgrade ran into a problem:

     conda install theano=1.0.4
    Fetching package metadata ...........
    
    PackageNotFoundError: Packages missing in current channels:
                
      - theano 1.0.4*
    
    We have searched for the packages in the following channels:
                
      - https://repo.continuum.io/pkgs/main/linux-64
      - https://repo.continuum.io/pkgs/main/noarch
      - https://repo.continuum.io/pkgs/free/linux-64
      - https://repo.continuum.io/pkgs/free/noarch
      - https://repo.continuum.io/pkgs/r/linux-64
      - https://repo.continuum.io/pkgs/r/noarch
      - https://repo.continuum.io/pkgs/pro/linux-64
      - https://repo.continuum.io/pkgs/pro/noarch

    I tried downgrading numpy to 1.15 instead:

    conda install numpy=1.15
    Fetching package metadata ...........
    Solving package specifications: .
    
    Package plan for installation in environment /data_d/old_home/home/.conda/envs/xhs2:
    
    The following NEW packages will be INSTALLED:
    
        mkl_fft:    1.0.12-py27ha843d7b_0
        numpy:      1.15.4-py27h7e9f1db_0
    
    The following packages will be DOWNGRADED:
    
        numpy-base: 1.16.4-py27hde5b4d6_0 --> 1.15.4-py27hde5b4d6_0
    
    Proceed ([y]/n)? y
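
    The version combination behind the _get_ndarray_c_version error can be captured in a small check. It is based only on what this post and the linked pymc3 issue report (theano 1.0.3 failing with numpy 1.16, and either theano 1.0.4 or numpy 1.15 avoiding it); the function names are hypothetical and the rule is not taken from Theano's documentation.

```python
def version_tuple(version):
    """Turn '1.16.4' into (1, 16, 4), keeping only purely numeric parts."""
    return tuple(int(part) for part in version.split(".") if part.isdigit())


def hits_ndarray_c_version_error(theano_version, numpy_version):
    """True for theano older than 1.0.4 combined with numpy 1.16 or newer."""
    return (version_tuple(theano_version) < (1, 0, 4)
            and version_tuple(numpy_version) >= (1, 16))


print(hits_ndarray_c_version_error("1.0.3", "1.16.4"))  # state before the downgrade
print(hits_ndarray_c_version_error("1.0.3", "1.15.4"))  # state after the downgrade
```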

    The AttributeError above was gone, but the errors that followed were exactly the same as before. With device = cuda0 in .theanorc, the error is:

    GpuArrayException: GPU is too old for CUDA version

    With device = cuda, the error is:

    GpuArrayException: No cuda device available
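
    In the test run earlier in this post, training still completed after the pygpu error, which suggests Theano carried on without the GPU. If the GPU cannot be used, the device can be overridden per process through the THEANO_FLAGS environment variable, which takes precedence over .theanorc. The sketch below is a workaround under that assumption, not a fix for the GPU problem; the helper name theano_flags is hypothetical.

```python
import os


def theano_flags(device, float_x="float32"):
    """Build a THEANO_FLAGS value such as 'device=cpu,floatX=float32'."""
    return "device={},floatX={}".format(device, float_x)


# THEANO_FLAGS is read when theano is first imported, so set it beforehand.
os.environ["THEANO_FLAGS"] = theano_flags("cpu")

try:
    import theano
    print("theano device:", theano.config.device)
except Exception as exc:
    # ImportError if theano is absent; other errors if the install is broken.
    print("theano could not be imported:", exc)
```

    The same helper can produce "device=cuda0,floatX=float32" to retry the GPU without editing .theanorc each time.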
  • Original article: https://www.cnblogs.com/BlueBlueSea/p/10989917.html