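I. Mount the local repository image
1) Download the operating system image
All servers must run the same operating system; this platform supports only CentOS 7.3 (1611). Image download address.
2) Upload the image to the server. The steps below assume it was uploaded to /root.
3) Create a mount point and mount the image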
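mkdir -p /mnt/cdrom
# Mount the ISO image (it is a regular file, so it is mounted through a loop device)
mount -o loop /root/CentOS-7-x86_64-DVD-1611.iso /mnt/cdrom
# Unmount the image
umount /mnt/cdrom
# For a physical optical drive (for example /dev/hdc), use "umount /dev/hdc" or simply "eject"
# Force unmount when the device is busy:
(1) fuser -mk /dev/hdc
(2) eject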
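The mount step can also be wrapped in a small helper so that it is safe to run more than once. This is a minimal sketch rather than part of the original procedure: the function name ensure_iso_mounted is hypothetical, and it assumes the mountpoint tool from util-linux is available.

```shell
# Mount an ISO read-only through a loop device, unless it is already mounted.
# ensure_iso_mounted is a hypothetical helper name, not part of any package.
ensure_iso_mounted() {
    local iso="$1" mnt="$2"
    if [ ! -f "$iso" ]; then
        echo "ISO not found: $iso" >&2
        return 1
    fi
    mkdir -p "$mnt" || return 1
    if mountpoint -q "$mnt"; then
        echo "already mounted: $mnt"
        return 0
    fi
    mount -o loop,ro "$iso" "$mnt"
}

# Usage (as root):
#   ensure_iso_mounted /root/CentOS-7-x86_64-DVD-1611.iso /mnt/cdrom
```

Checking for the file first gives a clear message when the upload path is wrong, instead of a less obvious mount error.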
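4) Configure the local yum repository
1. In /etc/yum.repos.d, create a directory named backup, then run mv *.repo backup/ to move the existing yum repository files into it.
2. Create local.repo with the following content: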
[local]
name=CentOS-$releasever local
baseurl=file:///mnt/cdrom/
gpgcheck=0
enabled=1
3. Reinitialize the yum cache
yum clean all    # clear the cache
yum makecache    # build a new cache
4. Run yum list | wc -l to count the available packages.
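The repository setup above can be scripted as follows. This is a sketch under stated assumptions: backup_repos and write_local_repo are hypothetical helper names, and both take the repository directory as an argument so they can be tried on a scratch directory before touching /etc/yum.repos.d.

```shell
# Move every existing *.repo file into a backup/ subdirectory.
backup_repos() {
    local repo_dir="$1" f
    mkdir -p "$repo_dir/backup" || return 1
    for f in "$repo_dir"/*.repo; do
        [ -e "$f" ] || continue   # the glob matched nothing
        mv "$f" "$repo_dir/backup/"
    done
}

# Write local.repo pointing at the mounted ISO.
# $releasever is written literally; yum expands it at run time.
write_local_repo() {
    local repo_dir="$1" mnt="$2"
    printf '%s\n' \
        '[local]' \
        'name=CentOS-$releasever local' \
        "baseurl=file://$mnt/" \
        'gpgcheck=0' \
        'enabled=1' > "$repo_dir/local.repo"
}

# Usage (as root):
#   backup_repos /etc/yum.repos.d
#   write_local_repo /etc/yum.repos.d /mnt/cdrom
#   yum clean all && yum makecache
```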
II. Install the NVIDIA driver
1. Preparation before installation
yum -y install kernel-devel
yum -y install epel-release
yum -y install dkms
yum -y install gcc
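Note: run uname -r (or uname -a) to check the running kernel version. The kernel-devel package must match that version exactly, since dkms builds the module against it. If yum installs a non-matching version, install the packages from the local repository mounted above instead.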
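Whether a matching kernel-devel tree is present can be checked before going further. The sketch below assumes that kernel-devel installs its tree under /usr/src/kernels/<version>; has_matching_kernel_devel is a hypothetical helper name.

```shell
# Succeed if a kernel-devel tree exists for the given kernel version.
# Defaults: the running kernel and /usr/src/kernels.
has_matching_kernel_devel() {
    local kver="${1:-$(uname -r)}" src_root="${2:-/usr/src/kernels}"
    [ -d "$src_root/$kver" ]
}

# Usage:
#   has_matching_kernel_devel || echo "kernel-devel for $(uname -r) is missing"
```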
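2. Disable nouveau
The nouveau driver conflicts with the NVIDIA driver, so it must be disabled before the installation can continue.
- Disable it in the boot parameters
vim /etc/default/grub
# Append rd.driver.blacklist=nouveau nouveau.modeset=0 to GRUB_CMDLINE_LINUX
After the change, /etc/default/grub reads as follows: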
# Menu timeout in seconds (default 5)
GRUB_TIMEOUT=5
# Distribution name (for example CentOS Linux)
GRUB_DISTRIBUTOR="$(sed 's, release .*$,,g' /etc/system-release)"
# Boot the entry saved by grub2-set-default or grub2-reboot
GRUB_DEFAULT=saved
GRUB_DISABLE_SUBMENU=true
GRUB_TERMINAL_OUTPUT="console"
# Kernel arguments added to every boot entry
GRUB_CMDLINE_LINUX="crashkernel=auto rd.lvm.lv=cl/root rd.lvm.lv=cl/swap rhgb quiet rd.driver.blacklist=nouveau nouveau.modeset=0"
GRUB_DISABLE_RECOVERY="true"
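Editing GRUB_CMDLINE_LINUX by hand is easy to get wrong, so the change can be sketched as an idempotent helper. add_grub_arg is a hypothetical name, and the sketch assumes GNU sed (for -i) and a double-quoted GRUB_CMDLINE_LINUX value on a single line, as in the file shown above.

```shell
# Append one kernel argument to GRUB_CMDLINE_LINUX unless it is already there.
add_grub_arg() {
    local file="$1" arg="$2"
    # -F: match the argument literally; -w: match it as a whole word.
    if grep '^GRUB_CMDLINE_LINUX=' "$file" | grep -qwF -- "$arg"; then
        return 0
    fi
    # Insert the argument just before the closing quote of the value.
    sed -i "s|^\(GRUB_CMDLINE_LINUX=\"[^\"]*\)\"|\1 $arg\"|" "$file"
}

# Usage (as root):
#   add_grub_arg /etc/default/grub rd.driver.blacklist=nouveau
#   add_grub_arg /etc/default/grub nouveau.modeset=0
```

The edit only changes /etc/default/grub; the new arguments reach the kernel after the GRUB configuration is regenerated with grub2-mkconfig and the machine is rebooted.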
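- Add the driver to the modprobe blacklist
vim /etc/modprobe.d/blacklist.conf
# Open (or create) the file and add:
blacklist nouveau
# Or run the following command (note that > overwrites an existing file)
echo -e "blacklist nouveau\noptions nouveau modeset=0" > /etc/modprobe.d/blacklist.conf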
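When blacklist.conf may already contain other entries, appending is safer than overwriting. The following sketch adds the two lines only if they are missing; blacklist_module is a hypothetical helper name.

```shell
# Add "blacklist <mod>" and "options <mod> modeset=0" to a modprobe config
# file, leaving existing lines untouched and never adding duplicates.
blacklist_module() {
    local conf="$1" mod="$2"
    touch "$conf" || return 1
    grep -qxF "blacklist $mod" "$conf" \
        || printf 'blacklist %s\n' "$mod" >> "$conf"
    grep -qxF "options $mod modeset=0" "$conf" \
        || printf 'options %s modeset=0\n' "$mod" >> "$conf"
}

# Usage (as root):
#   blacklist_module /etc/modprobe.d/blacklist.conf nouveau
```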
- Rebuild the initramfs image with dracut:
* Back up the initramfs file
$ sudo mv /boot/initramfs-$(uname -r).img /boot/initramfs-$(uname -r).img.bak
* Rebuild the initramfs file
$ sudo dracut -v /boot/initramfs-$(uname -r).img $(uname -r)
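If the rebuild is repeated, a second mv would replace the original backup with an already modified image. One way to guard against that is to copy the file once and let dracut overwrite the image in place with --force. This is a sketch under that assumption; backup_once is a hypothetical helper name.

```shell
# Copy <file> to <file>.bak only if no backup exists yet, so the very first
# (known-good) image is never overwritten by a later run.
backup_once() {
    local src="$1" bak="$1.bak"
    if [ ! -f "$src" ]; then
        echo "missing: $src" >&2
        return 1
    fi
    if [ -e "$bak" ]; then
        echo "backup already exists: $bak"
        return 0
    fi
    cp -p "$src" "$bak"
}

# Usage (as root):
#   img=/boot/initramfs-$(uname -r).img
#   backup_once "$img" && dracut -v --force "$img" "$(uname -r)"
```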
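- After updating the configuration, regenerate the GRUB configuration and reboot
grub2-mkconfig -o /boot/grub2/grub.cfg
# On UEFI systems the file is /boot/efi/EFI/centos/grub.cfg
reboot
# or: init 3 (switches to text mode; nouveau stays loaded until the next reboot)
- Check that the nouveau driver is not loaded
lsmod | grep nouveau
# Expected: no output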
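For use in a script, the same check can return an exit status and match the module name exactly instead of as a substring. The sketch below reads /proc/modules, the file that lsmod formats; module_loaded is a hypothetical helper name.

```shell
# Succeed if the named kernel module is loaded.
# The first column of /proc/modules is the module name.
module_loaded() {
    local mod="$1" modules_file="${2:-/proc/modules}"
    awk -v m="$mod" '$1 == m { found = 1 } END { exit !found }' "$modules_file"
}

# Usage:
#   if module_loaded nouveau; then
#       echo "nouveau is still loaded; reboot before installing" >&2
#   fi
```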
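3. Install the NVIDIA driver
# Install the build environment first: gcc, kernel-devel, kernel-headers (skip if already installed)
# "kernel-devel-uname-r == $(uname -r)" makes yum install the kernel-devel package that matches the running kernel
yum -y install gcc kernel-devel "kernel-devel-uname-r == $(uname -r)" dkms
chmod +x NVIDIA-Linux-x86_64-390.87.run
sudo ./NVIDIA-Linux-x86_64-390.87.run
# Answer YES to every prompt
Look up any unfamiliar installer options as needed. The installation is now complete.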
3.1 ERROR1: gcc is missing
ERROR: Unable to find the development tool `cc` in your path; please make sure that you have the package 'gcc' installed. If gcc is installed on your system, then please check that `cc` is in your PATH.
Solution:
yum install gcc
3.2 ERROR2: kernel-devel is missing
ERROR: Unable to find the kernel source tree for the currently running kernel. Please make sure you have installed the kernel source files for your kernel and that they are properly configured; on Red Hat Linux systems, for example, be sure you have the 'kernel-source' or 'kernel-devel' RPM installed. If you know the correct kernel source files are installed, you may specify the kernel source path with the '--kernel-source-path' command line option.
Solution:
yum install kernel-devel
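Both errors above can be caught before the installer is started. The following preflight sketch reports which build tools are not on PATH; missing_tools is a hypothetical helper name, and checking for make in addition to cc is an assumption, since the source only mentions gcc.

```shell
# Print each named tool that cannot be found on PATH, one per line.
missing_tools() {
    local t
    for t in "$@"; do
        command -v "$t" >/dev/null 2>&1 || echo "$t"
    done
}

# Usage:
#   missing=$(missing_tools cc make)
#   [ -z "$missing" ] || echo "install these first: $missing" >&2
#   # Kernel sources for the running kernel (ERROR2 and ERROR3):
#   [ -d "/lib/modules/$(uname -r)/build" ] || echo "kernel-devel missing or build link broken" >&2
```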
3.3 ERROR3: kernel headers not found (the build link under /lib/modules is broken)
ERROR: Failed to run `/sbin/dkms build -m nvidia -v 390.87 -k 3.10.0-514.el7.x86_64`: Error! echo Your kernel headers for kernel 3.10.0-514.el7.x86_64 cannot be found at /lib/modules/3.10.0-514.el7.x86_64/build or /lib/modules/3.10.0-514.el7.x86_64/source.
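Solution:
# Check whether /usr/src/kernels/ contains a kernel development tree
cd /usr/src/kernels/ && ls -al
# A listed directory means one is installed
# Shows 3.10.0-957.10.1.el7.x86_64
cd /lib/modules/ && ls -al
# Shows 3.10.0-514.el7.x86_64
cd 3.10.0-514.el7.x86_64
# Equivalent to: cd /lib/modules/$(uname -r)
rm -f build
ln -s /usr/src/kernels/3.10.0-957.10.1.el7.x86_64 build
# Note: this points the running 514 kernel at the 957 sources. Installing the kernel-devel package that matches uname -r avoids the mismatch.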
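The relinking step can be written so that it refuses to create a dangling link. This is a sketch of the workaround described above, not a recommendation over installing the matching kernel-devel package; relink_kernel_build is a hypothetical helper name, and ln -sfn assumes GNU coreutils.

```shell
# Point <modules_dir>/build at <src_dir>, replacing any existing link.
# Fails without changing anything if <src_dir> does not exist.
relink_kernel_build() {
    local mod_dir="$1" src_dir="$2"
    if [ ! -d "$src_dir" ]; then
        echo "no such kernel source dir: $src_dir" >&2
        return 1
    fi
    # -n: replace the symlink itself instead of descending into its target.
    ln -sfn "$src_dir" "$mod_dir/build"
}

# Usage (as root):
#   relink_kernel_build "/lib/modules/$(uname -r)" /usr/src/kernels/3.10.0-957.10.1.el7.x86_64
```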
4. Check the status with nvidia-smi. If it prints its normal status output, the driver is installed.
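III. Install CUDA 9
1. One thing to note when installing CUDA 9: at the second prompt, which asks whether to install the driver, choose NO, because the NVIDIA driver was already installed in the steps above.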
chmod +x cuda_9.0.103_384.59_linux.run
sudo ./cuda_9.0.103_384.59_linux.run
Do you accept the previously read EULA?
accept/decline/quit: accept
Install NVIDIA Accelerated Graphics Driver for Linux-x86_64 384.59?
(y)es/(n)o/(q)uit: n
Do you want to install the OpenGL libraries?
(y)es/(n)o/(q)uit [ default is yes ]: y
Do you want to run nvidia-xconfig?
This will update the system X configuration file so that the NVIDIA X driver
is used. The pre-existing X configuration file will be backed up.
This option should not be used on systems that require a custom
X configuration, such as systems with multiple GPU vendors.
(y)es/(n)o/(q)uit [ default is no ]: y
Install the CUDA 9.0 Toolkit?
(y)es/(n)o/(q)uit: y
Enter Toolkit Location
[ default is /usr/local/cuda-9.0 ]:
Do you want to install a symbolic link at /usr/local/cuda?
(y)es/(n)o/(q)uit: y
Install the CUDA 9.0 Samples?
(y)es/(n)o/(q)uit: y
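The source stops at the installer prompts. A common follow-up, shown here as an assumption rather than as part of the original procedure, is to put the toolkit on PATH and its libraries on LD_LIBRARY_PATH through a profile script. The sketch relies on the /usr/local/cuda symbolic link accepted in the prompts above and on the toolkit's usual bin and lib64 subdirectories; write_cuda_profile is a hypothetical helper name.

```shell
# Write a profile snippet that exposes the CUDA toolkit to login shells.
# The variables are written unexpanded so they are evaluated at login time.
write_cuda_profile() {
    local out="$1" cuda_home="${2:-/usr/local/cuda}"
    printf '%s\n' \
        "export PATH=$cuda_home/bin:\$PATH" \
        "export LD_LIBRARY_PATH=$cuda_home/lib64\${LD_LIBRARY_PATH:+:\$LD_LIBRARY_PATH}" \
        > "$out"
}

# Usage (as root), then open a new login shell:
#   write_cuda_profile /etc/profile.d/cuda.sh
#   nvcc --version
```

The ${LD_LIBRARY_PATH:+...} form avoids a trailing colon when the variable was previously empty, which would otherwise add the current directory to the library search path.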