Today I ran into a problem where the NVIDIA driver couldn’t communicate with the Tesla M60 card in an ESX 6.5 environment:
[root@esx001:~] nvidia-smi
NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.
[root@esx001:~] dmesg | grep NVIDIA
2017-07-25T14:08:11.328Z cpu29:70203)VisorFSTar: 2167: NVIDIA_V.v00 (8002303909659976911) as NVIDIA_V.v00 for 70077651 bytes
2017-07-25T14:08:11.497Z cpu28:70209)NVIDIA: Unloading nvidia module during vib install/upgrade.
2017-07-25T14:08:12.233Z cpu5:70218)ALERT: NVIDIA: module load failed during VIB install/upgrade.
2017-07-25T14:08:12.237Z cpu28:70219)NVIDIA: Starting vGPU Services.
2017-07-25T14:08:12.248Z cpu29:70222)NVIDIA: Starting Xorg during vib install/upgrade.
The host driver was installed successfully:
[root@esx001:~] esxcli software vib list | grep NVIDIA
NVIDIA-VMware_ESXi_6.5_Host_Driver 367.106-1OEM.618.104.22.16898673 NVIDIA VMwareAccepted 2017-07-25
The M60 card was also recognized by the ESX host:
[root@esx001:~] lspci | grep Display
0000:86:00.0 Display controller: NVIDIA Corporation NVIDIATesla M60 [vmgfx0]
0000:87:00.0 Display controller: NVIDIA Corporation NVIDIATesla M60 [vmgfx1]
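The fix is to disable “DirectPath I/O” for the card on the host. Navigate to Hardware -> PCI Devices and make sure the graphics card is not selected as a passthrough device. A device reserved for passthrough is handed directly to a VM, so the host driver cannot claim it.
Thanks to Simon Schaber from NVIDIA, who gave me the final clue.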
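If you prefer to check the passthrough setting from the ESX shell instead of the client, something like the following might help. This is only a rough sketch: the helper name passthru_devices is made up for illustration, and it assumes that devices marked for passthrough show up in /etc/vmware/esx.conf with an entry of the form owner = "passthru". I have not verified that this holds on every build.

```shell
# Hypothetical helper (not part of ESX): list esx.conf entries for PCI devices
# that are marked for passthrough.
# Assumption: such devices appear as  /device/<address>/owner = "passthru"
passthru_devices() {
    conf="${1:-/etc/vmware/esx.conf}"   # optional path argument, defaults to the host config
    grep -F 'owner = "passthru"' "$conf"
}

# Usage on the host (no output means no device is marked for passthrough):
#   passthru_devices
```

If the addresses of the M60 show up here, remove the passthrough selection in the client as described above. Afterwards `vmkload_mod -l | grep nvidia` should list the nvidia module and nvidia-smi should be able to talk to the card again.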