  • Installing CUDA, cuDNN, and nvidia-docker on Ubuntu


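    This article is based on "Ubuntu18.04安装CUDA10.1和cuDNN v7.6.5" (Installing CUDA 10.1 and cuDNN v7.6.5 on Ubuntu 18.04).

    Before installing

    Run lspci | grep -i nvidia to list the available NVIDIA devices: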
    01:00.0 VGA compatible controller: NVIDIA Corporation GP106 [GeForce GTX 1060 6GB] (rev a1)
    01:00.1 Audio device: NVIDIA Corporation GP106 High Definition Audio Controller (rev a1)
    Run uname -m && cat /etc/*release to get the operating system details. Here it is 64-bit Ubuntu 20.04:

    x86_64
    DISTRIB_ID=Ubuntu
    DISTRIB_RELEASE=20.04
    DISTRIB_CODENAME=focal
    DISTRIB_DESCRIPTION="Ubuntu 20.04.2 LTS"
    NAME="Ubuntu"
    VERSION="20.04.2 LTS (Focal Fossa)"
    ID=ubuntu
    ID_LIKE=debian
    PRETTY_NAME="Ubuntu 20.04.2 LTS"
    VERSION_ID="20.04"
    HOME_URL="https://www.ubuntu.com/"
    SUPPORT_URL="https://help.ubuntu.com/"
    BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
    PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
    VERSION_CODENAME=focal
    UBUNTU_CODENAME=focal
    

    Run gcc --version to check whether gcc is installed. Version here: (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0
    Run uname -r to get the Linux kernel version. Here: 5.8.0-50-generic
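
    The checks above can be collected into one small script. This is only a sketch: has_nvidia_gpu and preflight are hypothetical helper names, not part of any NVIDIA tooling. It also accepts "3D controller" entries, on the assumption that some laptop GPUs are listed that way by lspci.

```shell
#!/usr/bin/env bash

# Hypothetical helper: read lspci output on stdin and succeed only if it
# lists an NVIDIA display device (not just the NVIDIA audio function).
has_nvidia_gpu() {
    grep -i 'nvidia' | grep -Eq 'VGA compatible controller|3D controller'
}

# Hypothetical helper: print a one-screen summary of the pre-install checks.
preflight() {
    if lspci | has_nvidia_gpu; then
        echo "GPU: NVIDIA device found"
    else
        echo "GPU: no NVIDIA device found"
    fi
    echo "Arch: $(uname -m)"
    echo "Kernel: $(uname -r)"
    if command -v gcc >/dev/null 2>&1; then
        echo "gcc: $(gcc --version | head -n 1)"
    else
        echo "gcc: not installed"
    fi
}
```

    After sourcing the file, calling preflight prints the summary.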

    CUDA and cuDNN versions to install

    Based on my trial and error on Windows, the GTX 1060 works with CUDA 10.1.105_418 and cuDNN v7.6.5.32 for CUDA 10.1.

    Installing CUDA

    Download CUDA 10.1.105_418. There is no build for Ubuntu 20.04, so I chose the 18.10 package. Following the download page, run these commands:

    sudo dpkg -i cuda-repo-ubuntu1810-10-1-local-10.1.105-418.39_1.0-1_amd64.deb
    # Output printed by the first command:
    # Selecting previously unselected package cuda-repo-ubuntu1810-10-1-local-10.1.105-418.39.
    # (Reading database ... 186150 files and directories currently installed.)
    # Preparing to unpack cuda-repo-ubuntu1810-10-1-local-10.1.105-418.39_1.0-1_amd64.deb ...
    # Unpacking cuda-repo-ubuntu1810-10-1-local-10.1.105-418.39 (1.0-1) ...
    # Setting up cuda-repo-ubuntu1810-10-1-local-10.1.105-418.39 (1.0-1) ...
    #
    # The public CUDA GPG key does not appear to be installed.
    # To install the key, run this command:
    # sudo apt-key add /var/cuda-repo-10-1-local-10.1.105-418.39/7fa2af80.pub
    sudo apt-key add /var/cuda-repo-10-1-local-10.1.105-418.39/7fa2af80.pub
    sudo apt-get update
    sudo apt-get install cuda
    

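    Then reboot.

    Checking the CUDA installation

    After rebooting, run nvidia-smi to get the GPU information. Running nvcc -V at this point makes the shell suggest “sudo apt install nvidia-cuda-toolkit”. Do not do that: the nvcc that matches this CUDA installation is already on disk, and installing nvidia-cuda-toolkit from the online repository can cause a version conflict between the toolkit and CUDA and break the CUDA environment. (I once carelessly installed nvidia-cuda-toolkit on a machine. nvidia-smi stopped working, the machine could not use the NVIDIA card at all, and I had to reinstall the CUDA environment.)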
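
    To check the driver from a script rather than by eye, the driver version can be compared against a minimum. This is a rough sketch: version_at_least is a hypothetical helper, it relies on GNU sort -V, and the 418.39 threshold is taken from the package name above on the assumption that it is the driver version bundled with this CUDA release.

```shell
#!/usr/bin/env bash

# Hypothetical helper: succeed if version $1 is greater than or equal to
# version $2. Uses GNU sort -V (version sort): the smaller of the two
# versions sorts first, so $1 >= $2 exactly when $2 comes out on top.
version_at_least() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}
```

    Example usage once the driver is running:

    driver=$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)
    version_at_least "$driver" 418.39 && echo "driver is new enough"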
    Next, add nvcc to the PATH environment variable:

    vim ~/.bashrc
    # Add this line: export PATH="/usr/local/cuda-10.1/bin:$PATH"
    source ~/.bashrc
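
    Editing ~/.bashrc by hand works, but repeating the step can leave duplicate lines. As an alternative, here is a minimal sketch that appends the line only when it is missing. append_once is a hypothetical helper name.

```shell
#!/usr/bin/env bash

# Hypothetical helper: append line $2 to file $1 unless an identical line
# is already present.
append_once() {
    touch "$1"
    # -F: treat the pattern as a fixed string, -x: match whole lines only,
    # -q: no output, just the exit status.
    grep -Fxq -- "$2" "$1" || printf '%s\n' "$2" >> "$1"
}
```

    Example usage:

    append_once ~/.bashrc 'export PATH="/usr/local/cuda-10.1/bin:$PATH"'
    source ~/.bashrc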
    

    Running nvcc -V then prints:
    nvcc: NVIDIA (R) Cuda compiler driver
    Copyright (c) 2005-2019 NVIDIA Corporation
    Built on Fri_Feb__8_19:08:17_PST_2019
    Cuda compilation tools, release 10.1, V10.1.105
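
    The release number in that last line can also be checked from a script. This is a sketch that assumes the "release X.Y," wording shown above. nvcc_release is a hypothetical helper name.

```shell
#!/usr/bin/env bash

# Hypothetical helper: read `nvcc -V` output on stdin and print only the
# release number (for example 10.1). Prints nothing if no line matches.
nvcc_release() {
    sed -n 's/.*release \([0-9][0-9]*\.[0-9][0-9]*\),.*/\1/p'
}
```

    Example usage:

    [ "$(nvcc -V | nvcc_release)" = "10.1" ] && echo "nvcc matches CUDA 10.1"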

    Installing cuDNN

    Download cudnn-10.1-linux-x64-v7.6.5.32.tgz (cuDNN for Linux) from the NVIDIA website.

    tar -xzvf cudnn-10.1-linux-x64-v7.6.5.32.tgz
    sudo cp cuda/include/cudnn*.h /usr/local/cuda/include
    sudo cp cuda/lib64/libcudnn* /usr/local/cuda/lib64
    sudo chmod a+r /usr/local/cuda/include/cudnn*.h /usr/local/cuda/lib64/libcudnn* # give all users read permission
    vim ~/.bashrc
    # Add this line: export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH
    source ~/.bashrc
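
    To confirm that the header was copied, the version macros in cudnn.h can be read back. This is a sketch that assumes the cuDNN 7.x layout, where CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL are defined directly in cudnn.h. cudnn_version is a hypothetical helper name.

```shell
#!/usr/bin/env bash

# Hypothetical helper: print MAJOR.MINOR.PATCHLEVEL from the cudnn.h file
# given as $1. Prints nothing if the macros are not found.
cudnn_version() {
    awk '
        $1 == "#define" && $2 == "CUDNN_MAJOR"      { major = $3 }
        $1 == "#define" && $2 == "CUDNN_MINOR"      { minor = $3 }
        $1 == "#define" && $2 == "CUDNN_PATCHLEVEL" { patch = $3 }
        END { if (major != "") print major "." minor "." patch }
    ' "$1"
}
```

    Example usage (for the archive used here the result should be 7.6.5):

    cudnn_version /usr/local/cuda/include/cudnn.h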
    

    Installing nvidia-docker

    Following the instructions in the Docker - Getting Started - Installing on Ubuntu and Debian documentation, run these commands:

    curl https://get.docker.com | sh \
      && sudo systemctl --now enable docker
    
    distribution=$(. /etc/os-release;echo $ID$VERSION_ID) \
      && curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add - \
      && curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
    
    sudo apt-get update
    sudo apt-get install -y nvidia-docker2
    sudo systemctl restart docker
    sudo docker run --rm --gpus all nvidia/cuda:11.0-base nvidia-smi
    sudo docker images
    # Output of sudo docker images:
    # REPOSITORY    TAG         IMAGE ID       CREATED        SIZE
    # nvidia/cuda   11.0-base   2ec708416bb8   8 months ago   122MB
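
    The repository URL in the commands above depends on the distribution string. With the os-release values shown earlier (ID=ubuntu, VERSION_ID="20.04") it evaluates to ubuntu20.04. The sketch below reproduces that step on its own, so the value can be inspected before fetching the list file. distribution_id is a hypothetical helper name.

```shell
#!/usr/bin/env bash

# Hypothetical helper: print $ID$VERSION_ID from the os-release style file
# given as $1 (pass a path containing a slash, such as /etc/os-release).
distribution_id() {
    # Source the file in a subshell so its variables do not leak into
    # the calling shell.
    ( . "$1" && printf '%s%s\n' "$ID" "$VERSION_ID" )
}
```

    Example usage:

    distribution_id /etc/os-release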
    

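    Trying it on a RedmiBook 14

    I followed "Win10+MX250+CUDA10.1+cuDNN+Pytorch1.4安装+测试全过程(吐血)" (the full Win10 + MX250 + CUDA 10.1 + cuDNN + PyTorch 1.4 install and test walkthrough) and used the same CUDA and cuDNN packages as in this post. The steps in this article gave the correct result, with one problem along the way: nvidia-smi failed with “NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.” Setting an administrator password in the BIOS and then disabling Secure Boot solved it.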
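
    A likely explanation (my assumption, not something verified in this post) is that with Secure Boot enabled the kernel refuses to load the unsigned NVIDIA module. The Secure Boot state can be checked before going into the BIOS. This is a sketch that assumes mokutil is installed. secure_boot_enabled is a hypothetical helper name.

```shell
#!/usr/bin/env bash

# Hypothetical helper: read `mokutil --sb-state` output on stdin and
# succeed if it reports that Secure Boot is enabled.
secure_boot_enabled() {
    grep -q '^SecureBoot enabled'
}
```

    Example usage:

    mokutil --sb-state | secure_boot_enabled && echo "Secure Boot is on"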
    This article was created on Wednesday, May 5, 2021, 19:41:19 CST and last modified on July 19, 2021, 14:44.

  • Original post: https://www.cnblogs.com/tellw/p/14732368.html