NVIDIA GPU Driver Upgrade (CentOS 7.6)

Procedure

  1. Uninstall the old driver

    nvidia-uninstall
    
  2. Rebuild the initramfs

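    cd /boot
    # Back up the current initramfs before regenerating it
    mv initramfs-$(uname -r).img /boot/initramfs-$(uname -r).img.bak
    # Regenerate the initramfs for the running kernel (-v verbose, -f overwrite)
    dracut -vf initramfs-$(uname -r).img $(uname -r)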
    
  3. Reboot the system

  4. Download the new driver

    CUDA Toolkit 11.2 Downloads

    wget https://developer.download.nvidia.com/compute/cuda/11.2.0/local_installers/cuda_11.2.0_460.27.04_linux.run
    
  5. Install the new driver

    [root@VM-1-43-centos gpu]# sh cuda_11.2.0_460.27.04_linux.run
    ===========
    = Summary =
    ===========
    
    Driver:   Installed
    Toolkit:  Installed in /usr/local/cuda-11.2/
    Samples:  Installed in /root/, but missing recommended libraries
    
    Please make sure that
     -   PATH includes /usr/local/cuda-11.2/bin
     -   LD_LIBRARY_PATH includes /usr/local/cuda-11.2/lib64, or, add /usr/local/cuda-11.2/lib64 to /etc/ld.so.conf and run ldconfig as root
    
    To uninstall the CUDA Toolkit, run cuda-uninstaller in /usr/local/cuda-11.2/bin
    To uninstall the NVIDIA Driver, run nvidia-uninstall
    Logfile is /var/log/cuda-installer.log
    
  6. Run nvidia-smi to verify the installation
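
As a quick check for step 6, the driver version bundled in the runfile can be compared with the version nvidia-smi reports. The sketch below assumes the runfile follows the cuda_<toolkit>_<driver>_linux.run naming seen in step 4. The helper names runfile_driver_version and compare_versions are hypothetical, chosen only for illustration.

```shell
#!/usr/bin/env bash
# Sketch: compare the driver version encoded in the runfile name with the
# version reported by nvidia-smi. Helper names are illustrative only.

# Extract the driver version from a name like cuda_11.2.0_460.27.04_linux.run
# (assumes the cuda_<toolkit>_<driver>_linux.run pattern).
runfile_driver_version() {
    local name="${1##*/}"        # strip any directory part
    name="${name%_linux.run}"    # -> cuda_11.2.0_460.27.04
    printf '%s\n' "${name##*_}"  # -> 460.27.04
}

# Print OK or MISMATCH for an expected and a reported version.
compare_versions() {
    if [ "$1" = "$2" ]; then
        echo "OK: driver $2"
    else
        echo "MISMATCH: expected $1, got ${2:-nothing}"
    fi
}

expected=$(runfile_driver_version cuda_11.2.0_460.27.04_linux.run)
if command -v nvidia-smi >/dev/null 2>&1; then
    # Query only the driver version and keep the first GPU's line.
    reported=$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)
    compare_versions "$expected" "$reported"
else
    echo "nvidia-smi not found; expected driver version is $expected"
fi
```

If nvidia-smi itself fails, its error text ends up in the MISMATCH line, which is usually enough to tell a version problem from a missing driver.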

Additional notes
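
The installer summary in step 5 asks that PATH include /usr/local/cuda-11.2/bin and that LD_LIBRARY_PATH include /usr/local/cuda-11.2/lib64. A minimal sketch of doing that for the current shell follows. The CUDA_HOME variable name is an assumption used here for readability; the installer does not set it.

```shell
# Sketch: add the CUDA 11.2 paths named in the installer summary to the
# current shell session. CUDA_HOME is an illustrative variable name.
CUDA_HOME=/usr/local/cuda-11.2

# Prepend each directory; the ${VAR:+:...} form avoids a trailing colon
# when the variable was previously empty or unset.
export PATH="$CUDA_HOME/bin${PATH:+:${PATH}}"
export LD_LIBRARY_PATH="$CUDA_HOME/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}"
```

These exports last only for the current session. To keep them across logins, the same lines could go in a profile script, or the library directory could be added to /etc/ld.so.conf followed by ldconfig, as the summary suggests.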

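Note that the initramfs must be rebuilt after the old driver is uninstalled. Otherwise, after installing the new driver and rebooting, nvidia-smi may fail with the error Failed to initialize NVML: Driver/library version mismatch. If the installation is already complete and this error appears after the reboot, it can be resolved as follows:

    rmmod nvidia_drm
    rmmod nvidia_modeset
    rmmod nvidia
    cd /boot
    mv initramfs-$(uname -r).img /boot/initramfs-$(uname -r).img.bak
    dracut -vf initramfs-$(uname -r).img $(uname -r)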
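
Before unloading modules, it can help to confirm that a stale kernel module is really the cause. The sketch below compares the version of the nvidia module currently loaded (read from /sys/module/nvidia/version) with the version installed on disk for the running kernel (from modinfo). This is only a proxy for the driver/library mismatch, and the function names and the overridable SYSFS_VERSION_FILE variable are hypothetical.

```shell
# Sketch: detect a stale NVIDIA kernel module by comparing the loaded
# version with the one installed on disk. Names are illustrative only.

SYSFS_VERSION_FILE="${SYSFS_VERSION_FILE:-/sys/module/nvidia/version}"

# Version of the module currently loaded; empty if it is not loaded.
loaded_module_version() {
    [ -r "$SYSFS_VERSION_FILE" ] && cat "$SYSFS_VERSION_FILE"
}

# Version of the module file installed for the running kernel; empty if none.
installed_module_version() {
    modinfo -F version nvidia 2>/dev/null
}

# Classify a (loaded, installed) version pair.
module_state() {
    local loaded="$1" installed="$2"
    if [ -z "$loaded" ]; then
        echo "not-loaded"
    elif [ "$loaded" = "$installed" ]; then
        echo "consistent"
    else
        echo "stale"   # an old module is still loaded, e.g. from an outdated initramfs
    fi
}

state=$(module_state "$(loaded_module_version)" "$(installed_module_version)")
echo "nvidia kernel module: $state"
```

A result of stale points to the rmmod and dracut steps above; consistent suggests the mismatch lies elsewhere, for example in leftover user-space libraries.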

References

https://forums.developer.nvidia.com/t/failed-to-initialize-nvml-driver-library-version-mismatch/50910
https://stackoverflow.com/questions/43022843/nvidia-nvml-driver-library-version-mismatch