zhzhenqin 5 months ago
parent
commit
818ea01d63
5 changed files with 114 additions and 1 deletion
  1. 112 0
      docs/N卡驱动及CUDA安装.md
  2. BIN
      docs/images/nvidia-driver-search.png
  3. BIN
      docs/images/nvidia-driver.png
  4. BIN
      docs/images/onnxruntime-cuda-version-mapping.png
  5. 2 1
      requirements.txt

+ 112 - 0
docs/N卡驱动及CUDA安装.md

@@ -0,0 +1,112 @@
+## OnnxRuntime and CUDA Compatible Versions
+
+![onnxruntime-cuda-version-mapping](images/onnxruntime-cuda-version-mapping.png)
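As a complement to the version table, the short Python sketch below reports which execution providers the installed onnxruntime build exposes. It assumes onnxruntime may or may not be installed, and `has_cuda_provider` is a hypothetical helper name. A listed CUDA provider only means the build supports it; whether the matching CUDA and cuDNN libraries actually load still has to be checked separately.

```python
def has_cuda_provider(providers):
    """Return True if the CUDA execution provider appears in the list."""
    return "CUDAExecutionProvider" in providers


if __name__ == "__main__":
    try:
        import onnxruntime as ort
    except ImportError:
        # Nothing to report if the package is missing from this environment.
        print("onnxruntime is not installed in this environment")
    else:
        providers = ort.get_available_providers()
        print("onnxruntime", ort.__version__, "providers:", providers)
        print("CUDA provider listed:", has_cuda_provider(providers))
```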
+
+## Nvidia Driver Installation
+
+Check the graphics card model:
+
+```shell
+lspci | grep -i nvidia
+03:00.0 VGA compatible controller: NVIDIA Corporation GP104 [GeForce GTX 1060 3GB] (rev a1)
+03:00.1 Audio device: NVIDIA Corporation GP104 High Definition Audio Controller (rev a1)
+```
+
+Then download the matching driver from the Nvidia website: https://www.nvidia.cn/drivers/lookup/
+
+![nvidia-driver-search](images/nvidia-driver-search.png)
+
+Start the installation:
+
+```shell
+sudo ./NVIDIA-1060-Linux-x86_64-550.100.run
+```
+
+After the installation finishes, reboot the machine, then run `nvidia-smi` to check whether the driver was installed successfully.
+
+![nvidia-driver](images/nvidia-driver.png)
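The same check can be scripted. The following is a minimal sketch, assuming `nvidia-smi` is on the PATH once the driver is installed; `parse_gpu_rows` and `query_gpus` are hypothetical helper names. It uses the `--query-gpu` and `--format=csv,noheader` options of `nvidia-smi` and returns `None` when the tool is missing or fails.

```python
import shutil
import subprocess


def parse_gpu_rows(csv_text):
    """Parse 'name, driver_version' lines as printed by nvidia-smi CSV output."""
    rows = []
    for line in csv_text.strip().splitlines():
        if "," not in line:
            continue  # skip blank or unexpected lines
        # Split on the last comma so a comma inside the GPU name is preserved.
        name, driver = [field.strip() for field in line.rsplit(",", 1)]
        rows.append({"name": name, "driver_version": driver})
    return rows


def query_gpus():
    """Return the parsed GPU list, or None if nvidia-smi is missing or fails."""
    if shutil.which("nvidia-smi") is None:
        return None
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,driver_version", "--format=csv,noheader"],
        capture_output=True,
        text=True,
        check=False,
    )
    if result.returncode != 0:
        return None
    return parse_gpu_rows(result.stdout)


if __name__ == "__main__":
    gpus = query_gpus()
    if gpus is None:
        print("nvidia-smi not found or failed; the driver may not be installed")
    else:
        for gpu in gpus:
            print(gpu["name"], "driver", gpu["driver_version"])
```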
+
+## CUDA Installation
+
+```log
+===========
+= Summary =
+===========
+
+Driver:   Not Selected
+Toolkit:  Installed in /usr/local/cuda-11.4/
+Samples:  Installed in /home/yiidata/, but missing recommended libraries
+
+Please make sure that
+ -   PATH includes /usr/local/cuda-11.4/bin
+ -   LD_LIBRARY_PATH includes /usr/local/cuda-11.4/lib64, or, add /usr/local/cuda-11.4/lib64 to /etc/ld.so.conf and run ldconfig as root
+
+To uninstall the CUDA Toolkit, run cuda-uninstaller in /usr/local/cuda-11.4/bin
+***WARNING: Incomplete installation! This installation did not install the CUDA Driver. A driver of version at least 470.00 is required for CUDA 11.4 functionality to work.
+To install the driver using this installer, run the following command, replacing <CudaInstaller> with the name of this run file:
+    sudo <CudaInstaller>.run --silent --driver
+
+Logfile is /var/log/cuda-installer.log
+```
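The warning in the log says CUDA 11.4 needs a driver of at least version 470.00. A rough way to compare an installed driver version against that minimum is sketched below. `version_tuple` and `driver_meets_minimum` are hypothetical helpers, and the comparison assumes purely numeric, dot-separated version strings.

```python
def version_tuple(version):
    """Turn a dotted version string such as '550.100' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def driver_meets_minimum(installed, minimum="470.00"):
    """Return True if the installed driver version is at least the minimum."""
    return version_tuple(installed) >= version_tuple(minimum)


if __name__ == "__main__":
    # 550.100 is the driver version from the installer file name used earlier.
    print(driver_meets_minimum("550.100"))
```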
+
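+After the installation finishes, set the environment variables.
+
+Open the .bashrc file in your home directory and add the following paths. For example, my .bashrc is under /home/wangyuanwei; if you cannot find it in the file manager, press Ctrl+H to show hidden files.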
+
+```shell
+export CUDA_HOME="/usr/local/cuda-11.4"
+export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:$CUDA_HOME/lib64"
+export PATH="$PATH:$CUDA_HOME/bin"
+```
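After opening a new shell, the exports can be sanity-checked from Python. This is only a sketch: `check_cuda_env` is a hypothetical helper, and it assumes the `/usr/local/cuda-11.4` prefix from the installer summary and Linux-style `:` separators.

```python
import os


def check_cuda_env(env, cuda_home="/usr/local/cuda-11.4"):
    """Report whether each variable from the .bashrc snippet is set as expected."""
    path_entries = env.get("PATH", "").split(":")
    lib_entries = env.get("LD_LIBRARY_PATH", "").split(":")
    return {
        "CUDA_HOME": env.get("CUDA_HOME") == cuda_home,
        "PATH": cuda_home + "/bin" in path_entries,
        "LD_LIBRARY_PATH": cuda_home + "/lib64" in lib_entries,
    }


if __name__ == "__main__":
    for name, ok in check_cuda_env(dict(os.environ)).items():
        print(name, "OK" if ok else "missing or different")
```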
+
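+## cuDNN Installation
+
+Choose "cuDNN Library for Linux" here (the Deb installation is error-prone).
+
+Download the archive and extract it; you will see a cuda folder. Open a terminal in the current directory and run the following commands, which copy the downloaded cuDNN files into the corresponding CUDA directories: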
+
+```shell
+# Copy all cuDNN headers (this already includes cudnn_version.h) and libraries
+sudo cp -v include/* /usr/local/cuda/include/
+sudo cp -v lib/libcudnn* /usr/local/cuda/lib64/
+# Make every copied cuDNN header and library world-readable
+sudo chmod a+r /usr/local/cuda/include/cudnn*.h
+sudo chmod a+r /usr/local/cuda/lib64/libcudnn*
+```
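To confirm which cuDNN version ended up in the CUDA include directory, the copied `cudnn_version.h` can be inspected. The sketch below assumes the header defines the `CUDNN_MAJOR`, `CUDNN_MINOR`, and `CUDNN_PATCHLEVEL` macros and sits at the destination used by the copy commands; `parse_cudnn_version` is a hypothetical helper.

```python
import os
import re

# Destination used by the copy commands; adjust if CUDA lives elsewhere.
HEADER = "/usr/local/cuda/include/cudnn_version.h"


def parse_cudnn_version(header_text):
    """Extract (major, minor, patch) from cudnn_version.h text, or None."""
    parts = []
    for key in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        match = re.search(
            r"^\s*#\s*define\s+" + key + r"\s+(\d+)", header_text, re.MULTILINE
        )
        if match is None:
            return None
        parts.append(int(match.group(1)))
    return tuple(parts)


if __name__ == "__main__":
    if os.path.isfile(HEADER):
        with open(HEADER, encoding="utf-8", errors="replace") as handle:
            print("cuDNN version:", parse_cudnn_version(handle.read()))
    else:
        print("Header not found:", HEADER)
```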
+
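+**Test CUDA**
+
+In a terminal, activate the virtual environment and run `nvcc --version` to check whether CUDA is installed.
+
+Once the virtual environment is set up, run the following test in Python: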
+
+```python
+import torch
+from torch.backends import cudnn
+
+print(torch.cuda.is_available())
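+torch.zeros(1).cuda()  # The line above may print True even when CUDA is not actually usable (for example, due to a version mismatch); check whether this line raises an error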
+print(cudnn.is_available())
+```
+
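+## Uninstalling CUDA
+
+Uninstalling CUDA takes a single command that runs the uninstall script shipped with CUDA. Locate the script that matches your CUDA version; the example below is for CUDA 10.0, while the CUDA 11.4 installer log above points to cuda-uninstaller in /usr/local/cuda-11.4/bin: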
+
+```shell
+sudo /usr/local/cuda-10.0/bin/uninstall_cuda_10.0.pl
+```
+
+Or, as described in Installation Guide Linux :: CUDA Toolkit Documentation (nvidia.com):
+
+```shell
+sudo apt-get --purge remove "*cublas*" "*cufft*" "*curand*" 
+sudo apt-get --purge remove "*cusolver*" "*cusparse*" "*npp*" "*nvjpeg*" "cuda*" "nsight*" 
+```
+
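+After uninstalling, some leftover directories remain. The previous installation here was CUDA 10.0, so its directory can be deleted as well: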
+
+```shell
+sudo rm -rf /usr/local/cuda-10.0/
+```
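Before deleting anything, it can help to list which CUDA directories are still present. The sketch below only reads the filesystem and removes nothing; `find_cuda_dirs` is a hypothetical helper, and `/usr/local` is assumed to be the install prefix, as in the paths above.

```python
import glob
import os


def find_cuda_dirs(root="/usr/local"):
    """Return the sorted directories under root whose names start with 'cuda'."""
    candidates = glob.glob(os.path.join(root, "cuda*"))
    # isdir follows symlinks, so a /usr/local/cuda link to a real directory counts.
    return sorted(path for path in candidates if os.path.isdir(path))


if __name__ == "__main__":
    for path in find_cuda_dirs():
        print(path)
```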

BIN
docs/images/nvidia-driver-search.png


BIN
docs/images/nvidia-driver.png


BIN
docs/images/onnxruntime-cuda-version-mapping.png


+ 2 - 1
requirements.txt

@@ -1,2 +1,3 @@
 onnxsim
-tensorrt==10.2.0
+tensorrt==10.2.0
+#tensorrt-cu11==10.2.0