  • TensorFlow fails under crontab and sudo: ImportError libcublas.so.9.0

    I recently had to reinstall Python for unrelated reasons, and that caused a serious problem: TensorFlow stopped working.

    Take the following script, test.py, as an example:

    import tensorflow as tf
    print(tf.__version__)
    

    The result:

    ubuntu@ubuntu:~/workspace$ sudo python test.py
    Traceback (most recent call last):
      File "/home/ubuntu/.local/lib/python3.6/site-packages/tensorflow/python/pywrap_tensorflow.py", line 58, in <module>
        from tensorflow.python.pywrap_tensorflow_internal import *
      File "/home/ubuntu/.local/lib/python3.6/site-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 28, in <module>
        _pywrap_tensorflow_internal = swig_import_helper()
      File "/home/ubuntu/.local/lib/python3.6/site-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 24, in swig_import_helper
        _mod = imp.load_module('_pywrap_tensorflow_internal', fp, pathname, description)
      File "/usr/local/python3/lib/python3.6/imp.py", line 243, in load_module
        return load_dynamic(name, filename, file)
      File "/usr/local/python3/lib/python3.6/imp.py", line 343, in load_dynamic
        return _load(spec)
    ImportError: libcublas.so.9.0: cannot open shared object file: No such file or directory
    
    During handling of the above exception, another exception occurred:
    
    Traceback (most recent call last):
      File "test.py", line 1, in <module>
        import tensorflow as tf
      File "/home/ubuntu/.local/lib/python3.6/site-packages/tensorflow/__init__.py", line 24, in <module>
        from tensorflow.python import pywrap_tensorflow  # pylint: disable=unused-import
      File "/home/ubuntu/.local/lib/python3.6/site-packages/tensorflow/python/__init__.py", line 49, in <module>
        from tensorflow.python import pywrap_tensorflow
      File "/home/ubuntu/.local/lib/python3.6/site-packages/tensorflow/python/pywrap_tensorflow.py", line 74, in <module>
        raise ImportError(msg)
    ImportError: Traceback (most recent call last):
      File "/home/ubuntu/.local/lib/python3.6/site-packages/tensorflow/python/pywrap_tensorflow.py", line 58, in <module>
        from tensorflow.python.pywrap_tensorflow_internal import *
      File "/home/ubuntu/.local/lib/python3.6/site-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 28, in <module>
        _pywrap_tensorflow_internal = swig_import_helper()
      File "/home/ubuntu/.local/lib/python3.6/site-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 24, in swig_import_helper
        _mod = imp.load_module('_pywrap_tensorflow_internal', fp, pathname, description)
      File "/usr/local/python3/lib/python3.6/imp.py", line 243, in load_module
        return load_dynamic(name, filename, file)
      File "/usr/local/python3/lib/python3.6/imp.py", line 343, in load_dynamic
        return _load(spec)
    ImportError: libcublas.so.9.0: cannot open shared object file: No such file or directory
    
    
    Failed to load the native TensorFlow runtime.
    
    See https://www.tensorflow.org/install/errors
    
    for some common reasons and solutions.  Include the entire stack trace
    above this error message when asking for help.
    

    Running the script directly with python works, but it fails under sudo and in crontab scheduled jobs.
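
    A plain run and a sudo or cron run do not necessarily see the same process environment: sudo usually resets it, and cron starts jobs with a minimal one. The following is a minimal sketch for comparing the two, assuming the difference shows up in LD_LIBRARY_PATH; the helper names split_library_path and loader_environment are illustrative and not from the original post.

```python
import os


def split_library_path(value):
    """Split an LD_LIBRARY_PATH-style string into its non-empty entries."""
    if not value:
        return []
    return [entry for entry in value.split(":") if entry]


def loader_environment():
    """Collect settings that may differ between a normal run and sudo/cron."""
    return {
        "euid": os.geteuid(),  # 0 when running as root (e.g. under sudo)
        "library_path": split_library_path(os.environ.get("LD_LIBRARY_PATH")),
    }


if __name__ == "__main__":
    info = loader_environment()
    print("effective uid:", info["euid"])
    print("LD_LIBRARY_PATH entries:", info["library_path"] or "(not set)")
```

    Running this once with python and once with sudo python shows whether the CUDA directory is visible in both cases.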

    Use find to locate the file:

    find / -name libcublas.so.9.0
    

    It turns out to be in my installation directory: /usr/local/cuda-9.0/lib64
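
    Searching from / can be slow and noisy. As a rough alternative, the same lookup can be sketched in Python over a few likely roots. The function name find_library_files and the starting directories /usr/local and /usr/lib are assumptions for illustration, not part of the original post.

```python
import os


def find_library_files(name, roots):
    """Return every path under the given roots whose file name equals `name`."""
    matches = []
    for root in roots:
        # os.walk silently skips directories it cannot read
        for dirpath, _dirnames, filenames in os.walk(root):
            if name in filenames:
                matches.append(os.path.join(dirpath, name))
    return matches


if __name__ == "__main__":
    # /usr/local and /usr/lib are assumed starting points, not from the post
    for path in find_library_files("libcublas.so.9.0", ["/usr/local", "/usr/lib"]):
        print(path)
```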

    After some googling, I found an issue that looks very much like my case: https://github.com/tensorflow/tensorflow/issues/15604

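    It turns out the dynamic linker could not find the shared library. The fix is to add its directory either through an environment variable or through a configuration file. I had already set the environment variable, but it still did not help, presumably because sudo and cron do not inherit the environment of my login shell.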
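
    Whether the dynamic linker can resolve the library in a given context can be checked without importing TensorFlow at all. Below is a minimal sketch using ctypes from the standard library; the helper name can_load is made up for this example.

```python
import ctypes


def can_load(soname):
    """Try to load a shared library by soname, as the dynamic linker would resolve it."""
    try:
        ctypes.CDLL(soname)
    except OSError as exc:
        return False, str(exc)
    return True, "loaded"


if __name__ == "__main__":
    ok, detail = can_load("libcublas.so.9.0")
    print("libcublas.so.9.0:", "OK" if ok else "FAILED", "-", detail)
```

    If this succeeds under python but fails under sudo python, the library search path is what differs between the two runs.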

    So I tried the configuration file instead:

    cat /etc/ld.so.conf
    
    include /etc/ld.so.conf.d/*.conf
    

    Then create a new configuration file:

    vi /etc/ld.so.conf.d/cuda.conf
    
    Add the following line:
    /usr/local/cuda-9.0/lib64
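
    To double-check that the directory really ended up in one of the files pulled in by /etc/ld.so.conf, the entries can be listed with a short script. This is only a sketch: configured_library_dirs is a hypothetical helper, and it assumes the usual one-directory-per-line layout with optional # comments.

```python
import glob


def configured_library_dirs(pattern="/etc/ld.so.conf.d/*.conf"):
    """List the directories named in ld.so.conf.d-style files matching `pattern`."""
    dirs = []
    for conf_path in sorted(glob.glob(pattern)):
        with open(conf_path, encoding="utf-8", errors="replace") as conf:
            for line in conf:
                entry = line.split("#", 1)[0].strip()  # drop comments and blanks
                if entry.startswith("/"):  # keep only absolute directory paths
                    dirs.append(entry)
    return dirs


if __name__ == "__main__":
    wanted = "/usr/local/cuda-9.0/lib64"
    print(wanted, "configured:", wanted in configured_library_dirs())
```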
    

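    Then run ldconfig -v | grep libcu to check that the CUDA libraries are picked up (rebuilding the cache itself requires root, e.g. sudo ldconfig):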

    ubuntu@ubuntu:/usr/local/cuda-9.0/lib64$ ldconfig -v | grep libcu
    /sbin/ldconfig.real: Path `/usr/lib/nvidia-384' given more than once
    /sbin/ldconfig.real: Path `/usr/lib32/nvidia-384' given more than once
    /sbin/ldconfig.real: Path `/lib/x86_64-linux-gnu' given more than once
    /sbin/ldconfig.real: Path `/usr/lib/x86_64-linux-gnu' given more than once
    /sbin/ldconfig.real: /lib/x86_64-linux-gnu/ld-2.23.so is the dynamic linker, ignoring
    
    	libcufft.so.9.0 -> libcufft.so.9.0.176
    	libcuinj64.so.9.0 -> libcuinj64.so.9.0.176
    	libcurand.so.9.0 -> libcurand.so.9.0.176
    	libcufftw.so.9.0 -> libcufftw.so.9.0.176
    	libcudart.so.9.0 -> libcudart.so.9.0.176
    	libcublas.so.9.0 -> libcublas.so.9.0.176
    	libcusparse.so.9.0 -> libcusparse.so.9.0.176
    	libcusolver.so.9.0 -> libcusolver.so.9.0.176
    	libcudnn.so.7 -> libcudnn.so.7.4.1
    	libcups.so.2 -> libcups.so.2
    
    

    After that, sudo python test.py runs without problems.
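
    The same check can be scripted, for example as a guard at the start of a cron job. The sketch below parses the output of ldconfig -p, which prints the current linker cache; the names parse_ldconfig_cache and read_ldconfig_cache are illustrative only.

```python
import re
import subprocess

# Matches lines such as:
#   libcublas.so.9.0 (libc6,x86-64) => /usr/local/cuda-9.0/lib64/libcublas.so.9.0
CACHE_LINE = re.compile(r"^\s*(\S+)\s+\(.*?\)\s+=>\s+(\S+)\s*$")


def parse_ldconfig_cache(text):
    """Map each soname in `ldconfig -p` output to the first path listed for it."""
    cache = {}
    for line in text.splitlines():
        match = CACHE_LINE.match(line)
        if match:
            cache.setdefault(match.group(1), match.group(2))
    return cache


def read_ldconfig_cache():
    """Run `ldconfig -p`; return an empty mapping if the command is unavailable."""
    try:
        result = subprocess.run(["ldconfig", "-p"], stdout=subprocess.PIPE,
                                universal_newlines=True, check=True)
    except (OSError, subprocess.CalledProcessError):
        return {}
    return parse_ldconfig_cache(result.stdout)


if __name__ == "__main__":
    print(read_ldconfig_cache().get("libcublas.so.9.0", "not in the linker cache"))
```

    Because the cache is system-wide, the result should be the same for a normal user, sudo, and cron.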

  • Original article: https://www.cnblogs.com/xing901022/p/10211407.html