  • Installing CUDA, cuDNN, and nvidia-docker on Ubuntu


    This post follows the article "Ubuntu18.04安装CUDA10.1和cuDNN v7.6.5" (Installing CUDA 10.1 and cuDNN v7.6.5 on Ubuntu 18.04).

    Before installing

    Run lspci | grep -i nvidia to list the available NVIDIA devices:
    01:00.0 VGA compatible controller: NVIDIA Corporation GP106 [GeForce GTX 1060 6GB] (rev a1)
    01:00.1 Audio device: NVIDIA Corporation GP106 High Definition Audio Controller (rev a1)
    Run uname -m && cat /etc/*release to get the operating system information. Here it is 64-bit Ubuntu 20.04:

    x86_64
    DISTRIB_ID=Ubuntu
    DISTRIB_RELEASE=20.04
    DISTRIB_CODENAME=focal
    DISTRIB_DESCRIPTION="Ubuntu 20.04.2 LTS"
    NAME="Ubuntu"
    VERSION="20.04.2 LTS (Focal Fossa)"
    ID=ubuntu
    ID_LIKE=debian
    PRETTY_NAME="Ubuntu 20.04.2 LTS"
    VERSION_ID="20.04"
    HOME_URL="https://www.ubuntu.com/"
    SUPPORT_URL="https://help.ubuntu.com/"
    BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
    PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
    VERSION_CODENAME=focal
    UBUNTU_CODENAME=focal
    

    Run gcc --version to check whether gcc is installed. Here the version is (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0.
    Run uname -r to get the Linux kernel version. Here it is 5.8.0-50-generic.
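    The first check above can be scripted. This is a minimal sketch, not part of the original walkthrough: count_nvidia_gpus is a hypothetical helper name, and it only counts lines that lspci labels as an NVIDIA VGA or 3D controller, so the HDMI audio function on the same card is ignored.

```shell
# Hypothetical helper (not from the original post): read lspci-style text on
# stdin and print how many NVIDIA display controllers it lists.
count_nvidia_gpus() {
    # First keep NVIDIA lines, then count only the VGA / 3D controller entries.
    # "|| true" keeps the exit status 0 when the count is 0.
    grep -i 'nvidia' | grep -c -i -E 'VGA compatible controller|3D controller' || true
}

# Usage on a real machine:
#   lspci | count_nvidia_gpus
```

    A result of 0 suggests the card is not visible on the PCI bus, in which case installing CUDA will not help.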

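    Which CUDA and cuDNN versions to install

    Based on my earlier trial and error on Windows, the CUDA version that works with the GTX 1060 is 10.1.105_418, and the matching cuDNN version is v7.6.5.32 for CUDA 10.1.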

    Installing CUDA

    Download CUDA 10.1.105_418. There is no build for Ubuntu 20.04, so I chose the 18.10 package. Following the download page, run these commands:

    sudo dpkg -i cuda-repo-ubuntu1810-10-1-local-10.1.105-418.39_1.0-1_amd64.deb
    # Output printed by the first command:
    # Selecting previously unselected package cuda-repo-ubuntu1810-10-1-local-10.1.105-418.39.
    # (Reading database ... 186150 files and directories currently installed.)
    # Preparing to unpack cuda-repo-ubuntu1810-10-1-local-10.1.105-418.39_1.0-1_amd64.deb ...
    # Unpacking cuda-repo-ubuntu1810-10-1-local-10.1.105-418.39 (1.0-1) ...
    # Setting up cuda-repo-ubuntu1810-10-1-local-10.1.105-418.39 (1.0-1) ...
    #
    # The public CUDA GPG key does not appear to be installed.
    # To install the key, run this command:
    # sudo apt-key add /var/cuda-repo-10-1-local-10.1.105-418.39/7fa2af80.pub
    sudo apt-key add /var/cuda-repo-10-1-local-10.1.105-418.39/7fa2af80.pub
    sudo apt-get update
    sudo apt-get install cuda
    

    Reboot afterwards.

    Checking the CUDA installation

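    After rebooting, run nvidia-smi to get the GPU information. Running nvcc -V at this point makes the shell suggest "sudo apt install nvidia-cuda-toolkit". Do not do that: the local install already contains the nvcc that matches this CUDA version, and installing nvidia-cuda-toolkit from the online repositories can create a version conflict between the toolkit and CUDA and break the CUDA environment. (I once carelessly installed nvidia-cuda-toolkit on the host. After that nvidia-smi stopped working, the machine could no longer use the NVIDIA GPU, and I had to reinstall the CUDA environment.)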
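    Instead of installing the apt package, it may help to confirm that the local nvcc really exists. The sketch below is an illustration under assumptions: list_local_nvcc_dirs is a hypothetical helper name, and it assumes the toolkit lives in a cuda-* directory under /usr/local, as the /usr/local/cuda-10.1/bin path used in this post suggests.

```shell
# Hypothetical helper (not from the original post): print every
# PREFIX/cuda-*/bin directory that contains an executable nvcc.
# PREFIX defaults to /usr/local.
list_local_nvcc_dirs() {
    prefix="${1:-/usr/local}"
    for dir in "$prefix"/cuda-*/bin; do
        # If the glob matches nothing, $dir is the literal pattern and
        # the -x test simply fails.
        if [ -x "$dir/nvcc" ]; then
            printf '%s\n' "$dir"
        fi
    done
}

# Usage:
#   list_local_nvcc_dirs            # searches /usr/local
#   list_local_nvcc_dirs /opt       # searches another prefix
```

    If this prints a directory, nvcc is already installed and only missing from PATH.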
    Next, add nvcc to the PATH environment variable:

    vim ~/.bashrc
    # Add this line: export PATH="/usr/local/cuda-10.1/bin:$PATH"
    source ~/.bashrc
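    Editing ~/.bashrc by hand works, but repeating the install can leave duplicate export lines. A minimal sketch of a non-interactive alternative follows. append_line_once is a hypothetical helper name, not something from the original post.

```shell
# Hypothetical helper (not from the original post): append LINE to FILE
# only if FILE does not already contain exactly that line.
append_line_once() {
    file="$1"
    line="$2"
    touch "$file"
    # -x: whole-line match, -F: fixed string (no regex), -q: quiet
    if ! grep -qxF -- "$line" "$file"; then
        printf '%s\n' "$line" >> "$file"
    fi
}

# Usage (single quotes keep $PATH unexpanded inside ~/.bashrc):
#   append_line_once ~/.bashrc 'export PATH="/usr/local/cuda-10.1/bin:$PATH"'
#   . ~/.bashrc
```

    Running the usage line twice leaves a single export line in the file.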
    

    Running nvcc -V now prints:
    nvcc: NVIDIA (R) Cuda compiler driver
    Copyright (c) 2005-2019 NVIDIA Corporation
    Built on Fri_Feb__8_19:08:17_PST_2019
    Cuda compilation tools, release 10.1, V10.1.105
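    To check the release number in a script rather than by eye, the last line of that output can be parsed. This is a sketch under the assumption that the output keeps the "release X.Y," wording shown above. nvcc_release is a hypothetical helper name.

```shell
# Hypothetical helper (not from the original post): read `nvcc -V` output
# on stdin and print the release number, e.g. 10.1.
nvcc_release() {
    sed -n 's/.*release \([0-9][0-9]*\.[0-9][0-9]*\),.*/\1/p'
}

# Usage:
#   if [ "$(nvcc -V | nvcc_release)" = "10.1" ]; then
#       echo "nvcc matches the CUDA 10.1 install"
#   fi
```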

    Installing cuDNN

    Download cudnn-10.1-linux-x64-v7.6.5.32.tgz (cuDNN for Linux) from the NVIDIA website.

    tar -xzvf cudnn-10.1-linux-x64-v7.6.5.32.tgz
    sudo cp cuda/include/cudnn*.h /usr/local/cuda/include
    sudo cp cuda/lib64/libcudnn* /usr/local/cuda/lib64
    sudo chmod a+r /usr/local/cuda/include/cudnn*.h /usr/local/cuda/lib64/libcudnn* # give all users read permission
    vim ~/.bashrc
    # Add this line: export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH
    source ~/.bashrc
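    One way to confirm that the copied header is the expected cuDNN release is to read the version macros from cudnn.h. The sketch below assumes the cuDNN 7 layout, where CUDNN_MAJOR, CUDNN_MINOR, and CUDNN_PATCHLEVEL are defined in cudnn.h itself. As far as I know, later major versions keep them in a separate header, so treat the default path as an assumption. cudnn_version is a hypothetical helper name.

```shell
# Hypothetical helper (not from the original post): print MAJOR.MINOR.PATCH
# from the #define lines of a cuDNN header.
# Assumption: the macros live in cudnn.h, as in cuDNN 7.x.
cudnn_version() {
    header="${1:-/usr/local/cuda/include/cudnn.h}"
    awk '
        $1 == "#define" && $2 == "CUDNN_MAJOR"      { major = $3 }
        $1 == "#define" && $2 == "CUDNN_MINOR"      { minor = $3 }
        $1 == "#define" && $2 == "CUDNN_PATCHLEVEL" { patch = $3 }
        END { print major "." minor "." patch }
    ' "$header"
}

# Usage:
#   cudnn_version                                   # default header path
#   cudnn_version /usr/local/cuda/include/cudnn.h   # explicit path
```

    For the archive used here, the expected result is 7.6.5.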
    

    Installing nvidia-docker

    Following the "Docker > Getting Started > Installing on Ubuntu and Debian" documentation, run these commands:

    curl https://get.docker.com | sh \
      && sudo systemctl --now enable docker
    
    distribution=$(. /etc/os-release;echo $ID$VERSION_ID) \
      && curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add - \
      && curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
    
    sudo apt-get update
    sudo apt-get install -y nvidia-docker2
    sudo systemctl restart docker
    sudo docker run --rm --gpus all nvidia/cuda:11.0-base nvidia-smi
    sudo docker images
    # Output:
    # REPOSITORY    TAG         IMAGE ID       CREATED        SIZE
    # nvidia/cuda   11.0-base   2ec708416bb8   8 months ago   122MB
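    The distribution variable in the commands above decides which repository list is downloaded, so it can be worth printing it before fetching anything. A minimal sketch follows. distribution_id is a hypothetical helper name, and it sources the file in a subshell so that ID and VERSION_ID do not leak into the current shell.

```shell
# Hypothetical helper (not from the original post): print $ID$VERSION_ID
# from an os-release style file, defaulting to /etc/os-release.
distribution_id() {
    os_release="${1:-/etc/os-release}"
    # The parentheses run this in a subshell, so the sourced variables
    # disappear afterwards.
    ( . "$os_release" && printf '%s%s\n' "$ID" "$VERSION_ID" )
}

# Usage:
#   distribution_id
```

    With the os-release contents shown earlier in this post (ID=ubuntu, VERSION_ID="20.04"), this prints ubuntu20.04.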
    

    Trying it on a RedmiBook 14

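    Following the post "Win10+MX250+CUDA10.1+cuDNN+Pytorch1.4安装+测试全过程(吐血)" (a full Win10 + MX250 + CUDA 10.1 + cuDNN + PyTorch 1.4 install and test walkthrough), I used the same CUDA and cuDNN packages as in this post. The steps above gave the correct result, with one problem along the way: nvidia-smi failed with "NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running." Setting an administrator password in the BIOS and then disabling Secure Boot fixed it.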
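    Before going into the BIOS, the Secure Boot state can be checked from Linux. The sketch below assumes the mokutil tool is installed and that mokutil --sb-state prints a line such as "SecureBoot enabled" or "SecureBoot disabled". secure_boot_state is a hypothetical helper name that only parses that text.

```shell
# Hypothetical helper (not from the original post): read the output of
# `mokutil --sb-state` on stdin and print enabled, disabled, or unknown.
secure_boot_state() {
    awk '
        tolower($1) == "secureboot" { print tolower($2); found = 1; exit }
        END { if (!found) print "unknown" }
    '
}

# Usage:
#   mokutil --sb-state | secure_boot_state
```

    If this prints enabled on a machine where nvidia-smi cannot reach the driver, Secure Boot is a plausible cause, as it was on this laptop.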
    This post was created on Wednesday, 5 May 2021, 19:41:19 CST and revised on 19 July 2021 at 14:44.

  • Original article: https://www.cnblogs.com/tellw/p/14732368.html