|
@@ -0,0 +1,112 @@
|
|
|
+## ONNX Runtime and CUDA Version Compatibility
|
|
|
+
|
|
|
+![onnxruntime-cuda-version-mapping](images/onnxruntime-cuda-version-mapping.png)
|
|
|
+
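Once everything in this guide is installed, a quick way to see whether the installed ONNX Runtime build can use CUDA is to list its execution providers. The snippet below is a minimal sketch: `cuda_provider_available` is a hypothetical helper name, and it assumes the GPU build of ONNX Runtime is installed in the active Python environment. A listed provider only shows that the build includes CUDA support; a session may still fall back to the CPU if the CUDA or cuDNN libraries do not match the mapping above.

```python
def cuda_provider_available(providers):
    """Return True if the CUDA execution provider appears in the given list."""
    return "CUDAExecutionProvider" in providers


if __name__ == "__main__":
    try:
        import onnxruntime as ort
    except ImportError:
        print("onnxruntime is not installed in this environment")
    else:
        providers = ort.get_available_providers()
        print("onnxruntime", ort.__version__, "providers:", providers)
        print("CUDA provider available:", cuda_provider_available(providers))
```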
|
|
|
+## Installing the NVIDIA Driver
|
|
|
+
|
|
|
+First, check the GPU model:
|
|
|
+
|
|
|
+```shell
|
|
|
+lspci | grep -i nvidia
|
|
|
+03:00.0 VGA compatible controller: NVIDIA Corporation GP104 [GeForce GTX 1060 3GB] (rev a1)
|
|
|
+03:00.1 Audio device: NVIDIA Corporation GP104 High Definition Audio Controller (rev a1)
|
|
|
+```
|
|
|
+
|
|
|
+Then download the matching driver from the NVIDIA website: https://www.nvidia.cn/drivers/lookup/
|
|
|
+
|
|
|
+![nvidia-driver-search](images/nvidia-driver-search.png)
|
|
|
+
|
|
|
+Run the installer:
|
|
|
+
|
|
|
+```shell
|
|
|
+sudo ./NVIDIA-1060-Linux-x86_64-550.100.run
|
|
|
+```
|
|
|
+
|
|
|
+After the installation finishes, reboot the machine. Then run `nvidia-smi` to confirm that the driver was installed successfully.
|
|
|
+
|
|
|
+![nvidia-driver](images/nvidia-driver.png)
|
|
|
+
|
|
|
+## Installing CUDA
|
|
|
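+
+The log below is the summary printed by the CUDA 11.4 runfile installer when the driver component is not selected (the driver was already installed in the previous section):
+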
|
|
|
+```log
|
|
|
+===========
|
|
|
+= Summary =
|
|
|
+===========
|
|
|
+
|
|
|
+Driver: Not Selected
|
|
|
+Toolkit: Installed in /usr/local/cuda-11.4/
|
|
|
+Samples: Installed in /home/yiidata/, but missing recommended libraries
|
|
|
+
|
|
|
+Please make sure that
|
|
|
+ - PATH includes /usr/local/cuda-11.4/bin
|
|
|
+ - LD_LIBRARY_PATH includes /usr/local/cuda-11.4/lib64, or, add /usr/local/cuda-11.4/lib64 to /etc/ld.so.conf and run ldconfig as root
|
|
|
+
|
|
|
+To uninstall the CUDA Toolkit, run cuda-uninstaller in /usr/local/cuda-11.4/bin
|
|
|
+***WARNING: Incomplete installation! This installation did not install the CUDA Driver. A driver of version at least 470.00 is required for CUDA 11.4 functionality to work.
|
|
|
+To install the driver using this installer, run the following command, replacing <CudaInstaller> with the name of this run file:
|
|
|
+ sudo <CudaInstaller>.run --silent --driver
|
|
|
+
|
|
|
+Logfile is /var/log/cuda-installer.log
|
|
|
+```
|
|
|
+
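The warning in the summary says CUDA 11.4 needs a driver of version 470.00 or later. The following is a minimal sketch of how that check could be scripted. The helper names are hypothetical, and it assumes `nvidia-smi` is on `PATH` and supports the `--query-gpu=driver_version` query.

```python
import shutil
import subprocess


def parse_version(text):
    """Turn a dotted version such as '550.100' into a tuple of ints (550, 100)."""
    return tuple(int(part) for part in text.strip().split("."))


def driver_is_sufficient(installed, required="470.00"):
    """Compare dotted driver versions numerically, component by component."""
    return parse_version(installed) >= parse_version(required)


def installed_driver_version():
    """Ask nvidia-smi for the driver version; return None if it is unavailable."""
    if shutil.which("nvidia-smi") is None:
        return None
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
        capture_output=True, text=True, check=False,
    )
    lines = result.stdout.strip().splitlines()
    return lines[0].strip() if result.returncode == 0 and lines else None


if __name__ == "__main__":
    version = installed_driver_version()
    if version is None:
        print("nvidia-smi not found or failed; is the driver installed?")
    elif driver_is_sufficient(version):
        print(f"Driver {version} meets the CUDA 11.4 minimum (470.00)")
    else:
        print(f"Driver {version} is older than 470.00")
```

Versions are compared as integer tuples rather than as strings, so that `550.100` is not ordered incorrectly against `470.00` by lexical comparison.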
|
|
|
+After the installation completes, set the environment variables.
|
|
|
+
|
|
|
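+Open the `.bashrc` file in your home directory and add the following lines. For example, my `.bashrc` is in `/home/wangyuanwei`; if you cannot find it in the file manager, press Ctrl+H to show hidden files. Afterwards, run `source ~/.bashrc` or open a new terminal so that the changes take effect.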
|
|
|
+
|
|
|
+```shell
|
|
|
+export CUDA_HOME="/usr/local/cuda-11.4"
|
|
|
+export LD_LIBRARY_PATH="${LD_LIBRARY_PATH:+$LD_LIBRARY_PATH:}$CUDA_HOME/lib64"
|
|
|
+export PATH="$PATH:$CUDA_HOME/bin"
|
|
|
+```
|
|
|
+
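To confirm that a new shell actually picked up these settings, the two directories named in the installer summary can be checked against the environment. This is a minimal sketch: `missing_cuda_entries` is a hypothetical helper, and the default `cuda_home` assumes the `/usr/local/cuda-11.4` install location shown above.

```python
import os


def missing_cuda_entries(env, cuda_home="/usr/local/cuda-11.4"):
    """Return (variable, directory) pairs for CUDA directories absent from env."""
    home = cuda_home.rstrip("/")
    expected = {
        "PATH": home + "/bin",
        "LD_LIBRARY_PATH": home + "/lib64",
    }
    missing = []
    for variable, directory in expected.items():
        # Linux search paths are colon-separated; ignore trailing slashes.
        entries = [entry.rstrip("/") for entry in env.get(variable, "").split(":")]
        if directory not in entries:
            missing.append((variable, directory))
    return missing


if __name__ == "__main__":
    problems = missing_cuda_entries(os.environ)
    for variable, directory in problems:
        print(f"{variable} does not contain {directory}")
    if not problems:
        print("PATH and LD_LIBRARY_PATH both include the CUDA directories")
```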
|
|
|
+## Installing cuDNN
|
|
|
+
|
|
|
+Choose "cuDNN Library for Linux" here; installing from the Deb packages is error-prone.
|
|
|
+
|
|
|
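+Download and extract the archive; you will see a `cuda` folder. Open a terminal in that directory and run the following commands, which copy the cuDNN files into the corresponding CUDA directories: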
|
|
|
+
|
|
|
+```shell
|
|
|
+sudo cp -v include/* /usr/local/cuda/include/
|
|
|
+sudo cp -v include/cudnn_version.h /usr/local/cuda/include/
|
|
|
+sudo cp -v lib/libcudnn* /usr/local/cuda/lib64/
|
|
|
+sudo chmod a+r /usr/local/cuda/include/cudnn*.h
|
|
|
+sudo chmod a+r /usr/local/cuda/lib64/libcudnn*
|
|
|
+```
|
|
|
+
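After copying the files, the installed cuDNN version can be read from `cudnn_version.h`, which defines the `CUDNN_MAJOR`, `CUDNN_MINOR`, and `CUDNN_PATCHLEVEL` macros. The snippet below is a minimal sketch: `parse_cudnn_version` is a hypothetical helper, and the header path assumes the `/usr/local/cuda/include/` destination used in the copy commands above.

```python
import re
from pathlib import Path


def parse_cudnn_version(header_text):
    """Extract (major, minor, patch) from the text of cudnn_version.h, or None."""
    values = []
    for name in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        match = re.search(rf"^\s*#define\s+{name}\s+(\d+)", header_text, re.MULTILINE)
        if match is None:
            return None
        values.append(int(match.group(1)))
    return tuple(values)


if __name__ == "__main__":
    header = Path("/usr/local/cuda/include/cudnn_version.h")
    if header.is_file():
        print("cuDNN version:", parse_cudnn_version(header.read_text()))
    else:
        print(f"{header} not found; were the headers copied?")
```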
|
|
|
+**Testing CUDA**
|
|
|
+
|
|
|
+In a terminal, activate the virtual environment and run `nvcc --version` to check whether CUDA is installed.
|
|
|
+
|
|
|
+Then start Python inside the configured virtual environment and run the following test:
|
|
|
+
|
|
|
+```python
|
|
|
+import torch
|
|
|
+from torch.backends import cudnn
|
|
|
+
|
|
|
+print(torch.cuda.is_available())
|
|
|
+torch.zeros(1).cuda()  # The line above may print True even when CUDA is not actually usable (e.g. a version mismatch), so check whether this line raises an error
|
|
|
+print(cudnn.is_available())
|
|
|
+```
|
|
|
+
|
|
|
+## Uninstalling CUDA
|
|
|
+
|
|
|
+Uninstalling CUDA takes a single command: run the uninstall script that ships with CUDA. Locate the script that matches your CUDA version:
|
|
|
+
|
|
|
+```shell
|
|
|
+sudo /usr/local/cuda-10.0/bin/uninstall_cuda_10.0.pl
|
|
|
+```
|
|
|
+
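The script name depends on the CUDA version: the CUDA 11.4 installer summary earlier points to `cuda-uninstaller`, while the CUDA 10.0 command uses `uninstall_cuda_10.0.pl`. The following is a minimal sketch that looks for either name. `find_uninstaller` is a hypothetical helper, and it assumes the toolkits live under `/usr/local/cuda-*`.

```python
from pathlib import Path


def find_uninstaller(cuda_home):
    """Return the uninstall script under <cuda_home>/bin, or None if absent."""
    bin_dir = Path(cuda_home) / "bin"
    # Name printed by the CUDA 11.4 installer summary.
    candidates = [bin_dir / "cuda-uninstaller"]
    # Older Perl script naming, as in the CUDA 10.0 command.
    candidates += sorted(bin_dir.glob("uninstall_cuda_*.pl"))
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    return None


if __name__ == "__main__":
    # Only prints what was found; run the script yourself with sudo.
    for home in sorted(Path("/usr/local").glob("cuda-*")):
        print(home, "->", find_uninstaller(home))
```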
|
|
|
+Alternatively, if CUDA was installed from packages, remove them with apt as described in the NVIDIA "Installation Guide Linux" (CUDA Toolkit Documentation, nvidia.com):
|
|
|
+
|
|
|
+```shell
|
|
|
+sudo apt-get --purge remove "*cublas*" "*cufft*" "*curand*"
|
|
|
+sudo apt-get --purge remove "*cusolver*" "*cusparse*" "*npp*" "*nvjpeg*" "cuda*" "nsight*"
|
|
|
+```
|
|
|
+
|
|
|
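+After uninstalling, some leftover directories may remain. In this case CUDA 10.0 was previously installed, so its directory can be removed as well: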
|
|
|
+
|
|
|
+```shell
|
|
|
+sudo rm -rf /usr/local/cuda-10.0/
|
|
|
+```
|