  • CUDA error: "Cannot create Cublas handle. Cublas won't be available." followed by "Check failed: status == CUBLAS_STATUS_SUCCESS (1 vs. 0) CUBLAS_STATUS_NOT_INITIALIZED"

    Error description:

    aita@aita-Alienware-Area-51-R5:~/AITA2/daisida/ssd-github/caffe$ make runtest -j8
    .build_release/tools/caffe
    caffe: command line brew
    usage: caffe <command> <args>
    
    commands:
      train           train or finetune a model
      test            score a model
      device_query    show GPU diagnostic information
      time            benchmark model execution time
    
      Flags from tools/caffe.cpp:
        -gpu (Optional; run in GPU mode on given device IDs separated by ','.Use
          '-gpu all' to run on all available GPUs. The effective training batch
          size is multiplied by the number of devices.) type: string default: ""
        -iterations (The number of iterations to run.) type: int32 default: 50
        -level (Optional; network level.) type: int32 default: 0
        -model (The model definition protocol buffer text file.) type: string
          default: ""
        -phase (Optional; network phase (TRAIN or TEST). Only used for 'time'.)
          type: string default: ""
        -sighup_effect (Optional; action to take when a SIGHUP signal is received:
          snapshot, stop or none.) type: string default: "snapshot"
        -sigint_effect (Optional; action to take when a SIGINT signal is received:
          snapshot, stop or none.) type: string default: "stop"
        -snapshot (Optional; the snapshot solver state to resume training.)
          type: string default: ""
        -solver (The solver definition protocol buffer text file.) type: string
          default: ""
        -stage (Optional; network stages (not to be confused with phase), separated
          by ','.) type: string default: ""
        -weights (Optional; the pretrained weights to initialize finetuning,
          separated by ','. Cannot be set simultaneously with snapshot.)
          type: string default: ""
    .build_release/test/test_all.testbin 0 --gtest_shuffle
    Cuda number of devices: 3
    Setting to use device 0
    Current device id: 0
    Current device name: GeForce GTX 1080 Ti
    Note: Randomizing tests' orders with a seed of 48866 .
    [==========] Running 2361 tests from 309 test cases.
    [----------] Global test environment set-up.
    [----------] 7 tests from DetectionOutputLayerTest/2, where TypeParam = caffe::GPUDevice<float>
    [ RUN      ] DetectionOutputLayerTest/2.TestForwardShareLocationTopK
    E0103 00:37:53.042623 19470 common.cpp:113] Cannot create Cublas handle. Cublas won't be available.
    [       OK ] DetectionOutputLayerTest/2.TestForwardShareLocationTopK (219 ms)
    [ RUN      ] DetectionOutputLayerTest/2.TestForwardNoShareLocationNeg0TopK
    [       OK ] DetectionOutputLayerTest/2.TestForwardNoShareLocationNeg0TopK (2 ms)
    [ RUN      ] DetectionOutputLayerTest/2.TestSetup
    [       OK ] DetectionOutputLayerTest/2.TestSetup (1 ms)
    [ RUN      ] DetectionOutputLayerTest/2.TestForwardNoShareLocationNeg0
    [       OK ] DetectionOutputLayerTest/2.TestForwardNoShareLocationNeg0 (2 ms)
    [ RUN      ] DetectionOutputLayerTest/2.TestForwardNoShareLocation
    [       OK ] DetectionOutputLayerTest/2.TestForwardNoShareLocation (2 ms)
    [ RUN      ] DetectionOutputLayerTest/2.TestForwardShareLocation
    [       OK ] DetectionOutputLayerTest/2.TestForwardShareLocation (1 ms)
    [ RUN      ] DetectionOutputLayerTest/2.TestForwardNoShareLocationTopK
    [       OK ] DetectionOutputLayerTest/2.TestForwardNoShareLocationTopK (2 ms)
    [----------] 7 tests from DetectionOutputLayerTest/2 (229 ms total)
    
    [----------] 2 tests from EuclideanLossLayerTest/2, where TypeParam = caffe::GPUDevice<float>
    [ RUN      ] EuclideanLossLayerTest/2.TestGradient
    F0103 00:37:53.068140 19470 math_functions.cu:110] Check failed: status == CUBLAS_STATUS_SUCCESS (1 vs. 0)  CUBLAS_STATUS_NOT_INITIALIZED
    *** Check failure stack trace: ***
        @     0x7f9210daf5cd  google::LogMessage::Fail()
        @     0x7f9210db1433  google::LogMessage::SendToLog()
        @     0x7f9210daf15b  google::LogMessage::Flush()
        @     0x7f9210db1e1e  google::LogMessageFatal::~LogMessageFatal()
        @     0x7f920c7ad43a  caffe::caffe_gpu_dot<>()
        @     0x7f920c7ec7c3  caffe::EuclideanLossLayer<>::Forward_gpu()
        @           0x48ae96  caffe::Layer<>::Forward()
        @           0x48d445  caffe::GradientChecker<>::CheckGradientSingle()
        @           0x4aea53  caffe::GradientChecker<>::CheckGradientExhaustive()
        @           0x848f0c  caffe::EuclideanLossLayerTest_TestGradient_Test<>::TestBody()
        @           0xa17c23  testing::internal::HandleExceptionsInMethodIfSupported<>()
        @           0xa1123a  testing::Test::Run()
        @           0xa11388  testing::TestInfo::Run()
        @           0xa11465  testing::TestCase::Run()
        @           0xa1273f  testing::internal::UnitTestImpl::RunAllTests()
        @           0xa12a63  testing::UnitTest::Run()
        @           0x47a98d  main
        @     0x7f920ba46830  __libc_start_main
        @           0x483b49  _start
        @              (nil)  (unknown)
    Makefile:526: recipe for target 'runtest' failed
    make: *** [runtest] Aborted (core dumped)
    

      


    Solution 1 (run from your home directory, where the NVIDIA cache directory .nv/ is located):

    sudo rm -rf .nv/
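A more cautious variant of the same idea is to move the cache directory aside instead of deleting it, so it can be restored if clearing it does not help. The sketch below assumes the cache is the `.nv` directory directly under the home directory; the function name `set_aside_nv_cache` and the backup naming scheme are hypothetical.

```python
import os
import time


def set_aside_nv_cache(home=None):
    """Rename <home>/.nv to a timestamped backup; return the backup path or None."""
    home = home or os.path.expanduser("~")
    cache = os.path.join(home, ".nv")
    if not os.path.isdir(cache):
        return None  # nothing to clear
    backup = cache + ".bak-" + time.strftime("%Y%m%d-%H%M%S")
    # Raises PermissionError if the directory is owned by another user (e.g. root).
    os.rename(cache, backup)
    return backup


if __name__ == "__main__":
    print(set_aside_nv_cache() or "no .nv directory found")
```

CUDA recreates the directory on the next run, so the backup can be deleted once the tests pass.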

    Solution 2:

    I realized that there was an error with my CUDA installation, specifically with the cuBLAS library. You can check if yours has the same problem by running the sample program simpleCUBLAS: 

    1. cd /usr/local/cuda/samples/7_CUDALibraries/simpleCUBLAS # check if your samples are in the same directory
    2. make
    3. ./simpleCUBLAS



    I was getting an error when I tried to run it, so I reinstalled CUDA 8.0 and it solved the issue.
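If the CUDA samples are not installed, a rough alternative check is to ask cuBLAS for a handle directly, which is the same step that fails in the log above. This is only a sketch: it assumes `libcublas` can be found by the system loader, it calls `cublasCreate_v2` and `cublasDestroy_v2` through `ctypes`, and the helper names `status_name` and `probe_cublas` are hypothetical. The status table lists only a few codes; code 1 is the `CUBLAS_STATUS_NOT_INITIALIZED` seen in the failed check ("1 vs. 0").

```python
import ctypes
import ctypes.util

# Subset of cublasStatus_t values; 1 is the code reported in the log above.
CUBLAS_STATUS_NAMES = {
    0: "CUBLAS_STATUS_SUCCESS",
    1: "CUBLAS_STATUS_NOT_INITIALIZED",
    3: "CUBLAS_STATUS_ALLOC_FAILED",
}


def status_name(code):
    """Translate a numeric cuBLAS status into its symbolic name."""
    return CUBLAS_STATUS_NAMES.get(code, "unknown cuBLAS status %d" % code)


def probe_cublas():
    """Try to create and destroy a cuBLAS handle; return a description of the result."""
    path = ctypes.util.find_library("cublas")
    if path is None:
        return "libcublas not found on the loader path"
    try:
        lib = ctypes.CDLL(path)
        create, destroy = lib.cublasCreate_v2, lib.cublasDestroy_v2
    except (OSError, AttributeError) as exc:
        return "could not use %s: %s" % (path, exc)
    create.restype = ctypes.c_int
    create.argtypes = [ctypes.POINTER(ctypes.c_void_p)]
    destroy.restype = ctypes.c_int
    destroy.argtypes = [ctypes.c_void_p]
    handle = ctypes.c_void_p()
    status = create(ctypes.byref(handle))
    if status == 0:
        destroy(handle)
    return status_name(status)


if __name__ == "__main__":
    print(probe_cublas())
```

A result other than `CUBLAS_STATUS_SUCCESS` points at the CUDA/cuBLAS installation or the GPU state rather than at Caffe itself.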


    Previously tried:

    CUDA_VISIBLE_DEVICES=2

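    The reason is that this demo uses every detected CUDA device by default, and the lab's CUDA devices are shared with many other users, which causes problems (possibly conflicts, insufficient resources, or perhaps a single user is not allowed to use that many devices).

    So the command has to be prefixed with a restriction: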

    CUDA_VISIBLE_DEVICES=2 ./build/examples/openpose/openpose.bin --net_resolution "160x80" --video examples/media/video.avi

    This limits the devices the program can detect to a single one: the GPU with index 2.
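To make the effect of `CUDA_VISIBLE_DEVICES` concrete, here is a simplified model of how the value selects devices. It is a sketch under assumptions, not NVIDIA's implementation: it handles only integer indices (not device UUIDs), it stops at the first invalid entry, and the names `visible_devices` and `env_for_devices` are hypothetical.

```python
import os


def visible_devices(value, total):
    """Physical GPU indices a process would see for a given CUDA_VISIBLE_DEVICES."""
    if value is None:
        return list(range(total))  # variable unset: every device is visible
    seen = []
    for token in value.split(","):
        token = token.strip()
        if not token.isdigit() or int(token) >= total:
            break  # entries after the first invalid index are ignored
        seen.append(int(token))
    return seen


def env_for_devices(value, base=None):
    """Copy of the environment with CUDA_VISIBLE_DEVICES set, for subprocess.run(env=...)."""
    env = dict(os.environ if base is None else base)
    env["CUDA_VISIBLE_DEVICES"] = value
    return env


if __name__ == "__main__":
    # The machine in the log above reports 3 devices.
    print(visible_devices("2", 3))    # only physical device 2
    print(visible_devices("0,2", 3))  # physical devices 0 and 2
```

Inside the restricted process the selected GPU is renumbered from 0, so physical device 2 appears to the program as device 0.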

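    After that it still would not run. Why?

    Because the terminal on OS X has no built-in X server, so it cannot display windows opened by a remote program.

    I therefore switched to MobaXterm (on Windows 10), and it worked, because that software ships with a built-in X server.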
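A quick way to tell whether the remote session has a display to draw on is to look at the `DISPLAY` environment variable, which SSH X11 forwarding normally sets on the remote side. The sketch below is only a first-pass check (a set variable does not guarantee the X server is reachable), and the function name `x_display_status` is hypothetical.

```python
import os


def x_display_status(env=None):
    """Report whether the session advertises an X display via DISPLAY."""
    env = os.environ if env is None else env
    display = env.get("DISPLAY", "")
    if not display:
        return "DISPLAY is not set: no X server is reachable from this session"
    return "DISPLAY=%s" % display


if __name__ == "__main__":
    print(x_display_status())
```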

  • Original article: https://www.cnblogs.com/sddai/p/10209502.html