1. Install CUDA
See https://developer.nvidia.cn/cuda-downloads for the available versions:
Download and install:
wget https://developer.download.nvidia.com/compute/cuda/11.5.1/local_installers/cuda_11.5.1_495.29.05_linux.run
sudo sh cuda_11.5.1_495.29.05_linux.run
The driver was already installed, so deselect Driver in the installer. After a short wait, the installation completes successfully:
Add the path variables:
export PATH="/usr/local/cuda-11.5/bin:$PATH"
export LD_LIBRARY_PATH="/usr/local/cuda-11.5/lib64:$LD_LIBRARY_PATH"
Test whether the installation succeeded:
# Build and run the deviceQuery sample:
cd /usr/local/cuda-11.5/samples/1_Utilities/deviceQuery
sudo make
./deviceQuery
deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 11.5, CUDA Runtime Version = 11.5, NumDevs = 1
Result = PASS
Add the following to .bashrc:
export CUDA_HOME=/usr/local/cuda-11.5
export LD_LIBRARY_PATH=/usr/local/cuda-11.5/lib64:$LD_LIBRARY_PATH
export PATH=/usr/local/cuda-11.5/bin:$PATH
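To confirm that a new shell actually picked up these three variables, a small check like the one below can help. It is only a sketch: the function name `check_cuda_env` is hypothetical, and the default install prefix is assumed to be the `/usr/local/cuda-11.5` used in these notes.

```python
import os


def check_cuda_env(environ, cuda_home="/usr/local/cuda-11.5"):
    """Report whether the CUDA-related variables point at cuda_home.

    environ is any mapping shaped like os.environ; cuda_home is an
    assumed install prefix taken from the notes above.
    """
    path_dirs = environ.get("PATH", "").split(os.pathsep)
    lib_dirs = environ.get("LD_LIBRARY_PATH", "").split(os.pathsep)
    return {
        "CUDA_HOME": environ.get("CUDA_HOME") == cuda_home,
        "PATH": os.path.join(cuda_home, "bin") in path_dirs,
        "LD_LIBRARY_PATH": os.path.join(cuda_home, "lib64") in lib_dirs,
    }


if __name__ == "__main__":
    # Inspect the environment of the current shell.
    for name, ok in check_cuda_env(os.environ).items():
        print(f"{name}: {'ok' if ok else 'missing or different'}")
```

Run it from a freshly opened terminal; any entry reported as missing suggests that .bashrc was not sourced for that session.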
Check that .profile runs it automatically at login. Then check the CUDA version again:
$ nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2021 NVIDIA Corporation
Built on Thu_Nov_18_09:45:30_PST_2021
Cuda compilation tools, release 11.5, V11.5.119
Build cuda_11.5.r11.5/compiler.30672275_0
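If the version needs to be checked from a script rather than by eye, the release string can be pulled out of the `nvcc -V` text with a regular expression. This is a minimal sketch; the helper name `parse_nvcc_release` is hypothetical, and the sample text is the output shown above.

```python
import re


def parse_nvcc_release(nvcc_output):
    """Return (release, full_version) from `nvcc -V` text, or None if not found."""
    match = re.search(r"release\s+(\d+\.\d+),\s+V(\d+\.\d+\.\d+)", nvcc_output)
    if match is None:
        return None
    return match.group(1), match.group(2)


# The output reported earlier in these notes.
SAMPLE = """nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2021 NVIDIA Corporation
Built on Thu_Nov_18_09:45:30_PST_2021
Cuda compilation tools, release 11.5, V11.5.119
Build cuda_11.5.r11.5/compiler.30672275_0"""

if __name__ == "__main__":
    print(parse_nvcc_release(SAMPLE))  # ('11.5', '11.5.119')
```

In practice the text would come from something like `subprocess.run(["nvcc", "-V"], capture_output=True, text=True).stdout`, provided `nvcc` is on the PATH.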
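2. Install cuDNN
Look up the version at https://developer.nvidia.cn/rdp/cudnn-archive#a-collapse742-10.
Downloading requires logging in to an account. The download is quite slow; the archive is 1.4 GB.
Extract the archive and copy the files:
tar zxvf cudnn-11.5-linux-x64-v8.3.0.98.tgz
sudo cp cuda/include/cudnn.h /usr/local/cuda/include/
sudo cp cuda/lib64/libcudnn* /usr/local/cuda/lib64/
sudo chmod a+r /usr/local/cuda/include/cudnn.h
sudo chmod a+r /usr/local/cuda/lib64/libcudnn*
Because of the version upgrade, the command used with earlier releases:
cat /usr/local/cuda/include/cudnn.h | grep CUDNN_MAJOR -A 2
no longer prints the version (since cuDNN 8 the version macros are defined in cudnn_version.h rather than in cudnn.h). See 3.2 for how to check the version.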
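As an alternative to the grep, the version macros can be read from whichever header defines them. The sketch below assumes the headers live under `/usr/local/cuda/include` and that `cudnn_version.h` has been copied there as well (the copy step above only copies `cudnn.h`). The function names and the sample header excerpt are illustrative, not taken from the actual file.

```python
import re
from pathlib import Path

MACROS = ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL")


def parse_cudnn_macros(header_text):
    """Collect the integer values of the cuDNN version macros found in the text."""
    found = {}
    for name in MACROS:
        match = re.search(rf"^\s*#define\s+{name}\s+(\d+)", header_text, re.MULTILINE)
        if match:
            found[name] = int(match.group(1))
    return found


def find_cudnn_version(include_dir="/usr/local/cuda/include"):
    """Try cudnn_version.h first (cuDNN 8 layout), then fall back to cudnn.h."""
    for filename in ("cudnn_version.h", "cudnn.h"):
        path = Path(include_dir) / filename
        if path.is_file():
            found = parse_cudnn_macros(path.read_text(errors="ignore"))
            if len(found) == len(MACROS):
                return found
    return None


# Illustrative excerpt only; the real header contains more than this.
SAMPLE_HEADER = """
#define CUDNN_MAJOR 8
#define CUDNN_MINOR 3
#define CUDNN_PATCHLEVEL 0
"""

if __name__ == "__main__":
    print(parse_cudnn_macros(SAMPLE_HEADER))
    print(find_cudnn_version())  # None when no header with the macros is present
```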
3. Install conda
wget -c https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh
chmod 777 Miniconda3-latest-Linux-x86_64.sh
sh Miniconda3-latest-Linux-x86_64.sh
export PATH="/home/gaoxiang/miniconda3/bin:"$PATH
The last line also needs to be added to .bashrc. Create a conda environment:
conda create -n sc_37 python=3.7
conda activate sc_37
Install PyTorch inside this conda environment:
3.1 Install PyTorch
There is no PyTorch build for CUDA 11.5, so I tried the CUDA 11.3 build to see whether it causes any problems. See https://pytorch.org/get-started/locally/.
conda install pytorch torchvision torchaudio cudatoolkit=11.3 -c pytorch
Check whether the GPU is usable:
>>> import torch
>>> torch.cuda.is_available()
True
>>> torch.cuda.device_count()
1
>>> torch.cuda.get_device_name(0)
'NVIDIA GeForce RTX 3090'
>>> torch.cuda.current_device()
0
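The same checks can be collected into a small script that is convenient to rerun after changing the environment. This is a sketch under the assumption that it is run inside the `sc_37` environment; the function name `gpu_summary` is hypothetical, and the import is guarded so the script still reports something when PyTorch is missing.

```python
def gpu_summary():
    """Gather the PyTorch GPU facts checked interactively above into one dict."""
    try:
        import torch
    except ImportError:
        # PyTorch is not installed in the active environment.
        return {"torch_installed": False}

    info = {
        "torch_installed": True,
        "torch_version": torch.__version__,
        "built_for_cuda": torch.version.cuda,  # CUDA version the build targets
        "cuda_available": torch.cuda.is_available(),
        "devices": [],
    }
    if info["cuda_available"]:
        for index in range(torch.cuda.device_count()):
            info["devices"].append(torch.cuda.get_device_name(index))
    return info


if __name__ == "__main__":
    for key, value in gpu_summary().items():
        print(f"{key}: {value}")
```

The `built_for_cuda` entry is useful here because the installed toolkit (11.5) and the toolkit the PyTorch build targets (11.3) differ on purpose.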
3.2 Install TensorFlow
cuDNN version:
>>> import torch
>>> torch.backends.cudnn.version()
8005
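The integer returned here packs the version into one number. For cuDNN 8 and earlier the encoding is MAJOR * 1000 + MINOR * 100 + PATCHLEVEL, so it can be unpacked as sketched below. The helper name `decode_cudnn_version` is hypothetical, and the formula is assumed to apply only to those releases; later major versions may use a different scheme.

```python
def decode_cudnn_version(code):
    """Split a cuDNN <= 8 version integer into (major, minor, patch).

    Assumes the encoding MAJOR * 1000 + MINOR * 100 + PATCHLEVEL.
    """
    major, rest = divmod(code, 1000)
    minor, patch = divmod(rest, 100)
    return major, minor, patch


if __name__ == "__main__":
    print(decode_cudnn_version(8005))  # (8, 0, 5)
```

So 8005 reads as cuDNN 8.0.5. Note that this is the cuDNN that PyTorch itself loads, which is not necessarily the 8.3.0 archive copied into /usr/local/cuda in section 2.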
Based on the figure above, install version 2.4.0:
pip install tensorflow-gpu==2.4.0
conda install -c conda-forge tensorboardx
Try it:
from torch.utils.tensorboard import SummaryWriter
OK.