NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running

Solution 1

The solution by Markus led me to a better one. The problem does have to do with Secure Boot, but it is not necessary to deactivate it.

To fix the problem, do three steps: (1) deactivate the Nvidia driver by choosing X.Org in the Additional Drivers tool, then reboot; (2) activate the Nvidia driver again and reboot; (3) enroll the key in Secure Boot.

Usually when you activate the Nvidia driver with the Additional Drivers tool, you are asked for a (new) password for Secure Boot. After reboot, the PC jumps into Secure Boot settings and you are asked to enroll a new MOK key, which must be confirmed with that same password. Afterwards, the driver will get access to the Nvidia card and will work.
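To confirm that Secure Boot is the likely cause before going through these steps, the state can be queried from a terminal. The following is a minimal sketch, assuming `mokutil` is installed and prints `SecureBoot enabled` or `SecureBoot disabled`; the helper name `sb_hint` is made up for illustration and is not part of any tool.

```shell
#!/bin/sh
# sb_hint is a hypothetical helper: it takes the text printed by
# `mokutil --sb-state` and prints a hint about what to check next.
sb_hint() {
    case "$1" in
        *"SecureBoot enabled"*)
            echo "Secure Boot is on: the NVIDIA module must be signed with an enrolled key (MOK)" ;;
        *"SecureBoot disabled"*)
            echo "Secure Boot is off: look for another cause" ;;
        *)
            echo "Could not determine the Secure Boot state" ;;
    esac
}

# Only query the firmware state when mokutil is actually available.
if command -v mokutil >/dev/null 2>&1; then
    sb_hint "$(mokutil --sb-state 2>&1)"
fi
```

If Secure Boot is reported as enabled, `mokutil --list-enrolled` shows which keys are already enrolled, which can help tell whether the enrollment step after the reboot succeeded.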

Solution 2

If nvidia-smi fails to communicate with the driver even though you have installed the driver several times, check prime-select.

  1. Run prime-select query to see which profile is currently active. It prints nvidia, intel, or unknown.
  2. Choose prime-select nvidia.
  3. If it says nvidia is already selected, select a different one, e.g. prime-select intel, then switch back with prime-select nvidia.
  4. Reboot and check nvidia-smi.
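The steps above can be sketched as a small dry-run script. This is only an illustration, assuming the `nvidia-prime` package provides `prime-select` and that switching profiles needs `sudo`; the helper name `next_prime_steps` is hypothetical. It prints the commands instead of running them, so nothing changes until you run them yourself.

```shell
#!/bin/sh
# next_prime_steps is a hypothetical helper: given the profile reported by
# `prime-select query`, it prints the commands that the steps above call for.
next_prime_steps() {
    case "$1" in
        nvidia)
            # Already selected: switch away and back to force a re-selection.
            echo "sudo prime-select intel"
            echo "sudo prime-select nvidia" ;;
        *)
            echo "sudo prime-select nvidia" ;;
    esac
    echo "sudo reboot"
}

# Only query the active profile when prime-select is actually available.
if command -v prime-select >/dev/null 2>&1; then
    next_prime_steps "$(prime-select query)"
fi
```

After the reboot, run nvidia-smi again to see whether the driver responds.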

Solution 3

You may want to install the CUDA toolkit. Use the following command to install it:

sudo apt install nvidia-cuda-toolkit

Once the installation is done, reboot the machine. nvidia-smi should work.
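One commenter below reports that this package can replace a newer CUDA installation with the older version shipped in the Ubuntu archive. A minimal sketch for checking which version apt would install before running the command, assuming the usual `apt-cache policy` output with a `Candidate:` line; the helper name `candidate_version` is made up for illustration.

```shell
#!/bin/sh
# candidate_version is a hypothetical helper: it reads `apt-cache policy`
# output on stdin and prints the version from the "Candidate:" line.
candidate_version() {
    awk '/Candidate:/ { print $2; exit }'
}

# Only query apt when apt-cache is actually available.
if command -v apt-cache >/dev/null 2>&1; then
    version="$(apt-cache policy nvidia-cuda-toolkit 2>/dev/null | candidate_version)"
    echo "apt would install nvidia-cuda-toolkit version: ${version:-unknown}"
fi
```

If the reported version is older than the CUDA release you already have, installing the package is probably not what you want.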

Solution 4

I disabled Secure Boot and it worked fine.

@rod-smith answered another, more specific question explaining how to do it. Basically it is a setup configuration change, and he also wrote a good article about how to do that here.

Solution 5

Since I cannot comment on @Rodolfo's answer above (not enough reputation), I am adding a new answer.

On my machine I had to configure Secure Boot according to my OS. I have an ASUS mainboard running Ubuntu 18.04 and tried to install NVIDIA CUDA 10.1 Update 2 with the packaged NVIDIA driver. I faced the same issue as described above. As it turned out, Secure Boot was set to Windows UEFI mode. Changing it to Other OS fixed it for me.


Author: Rodolfo

Software engineer passionate for working with cutting-edge technologies and learning new things. For the last years, I worked mainly in developing distributed and parallel applications for optimization problems. More recently I'm working to improve the Neo blockchain technology and its ecosystem.

Updated on September 18, 2022

Comments

  • Rodolfo
    Rodolfo over 1 year

    I just installed CUDA in a notebook like this:

    sudo apt-get install cuda
    

    As described here.

    The compilation works just fine, but when I try to run the program I get the following problem: CUDA error at file.cu:128 code=35(cudaErrorInsufficientDriver) "cudaStreamCreate(&(stream[i]))"

    My nvcc version:

    nvcc: NVIDIA (R) Cuda compiler driver
    Copyright (c) 2005-2016 NVIDIA Corporation
    Built on Tue_Jan_10_13:22:03_CST_2017
    Cuda compilation tools, release 8.0, V8.0.61
    

    Graphics card info:

    lspci | egrep 'VGA|3D'
    00:02.0 VGA compatible controller: Intel Corporation Skylake Integrated Graphics (rev 06)
    02:00.0 3D controller: NVIDIA Corporation GM107M [GeForce GTX 960M] (rev a2)
    

    I also installed VirtualGL, bumblebee-nvidia, primus, and freeglut3-dev, following this.

    When I try to run something with bumblebee, I get this: optirun glxspheres64

    [   41.413478] [ERROR]Cannot access secondary GPU - error: Could not load GPU driver
    [   41.413520] [ERROR]Aborting because fallback start is disabled.
    

    The nvidia driver is not working.

    nvidia-smi
    NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.
    

    It looks like nvidia version 375 is installed, but I can't make it work.

    whereis nvidia
    nvidia: /usr/lib/nvidia /usr/share/nvidia /usr/src/nvidia-375-375.66/nvidia
    

    And some driver info.

    modinfo nvidia_375
    filename:       /lib/modules/4.8.0-54-generic/updates/dkms/nvidia_375.ko
    alias:          char-major-195-*
    version:        375.66
    supported:      external
    license:        NVIDIA
    srcversion:     68751AFD79A210CEFFB8758
    alias:          pci:v000010DEd00000E00sv*sd*bc04sc80i00*
    alias:          pci:v000010DEd*sv*sd*bc03sc02i00*
    alias:          pci:v000010DEd*sv*sd*bc03sc00i00*
    depends:        
    vermagic:       4.8.0-54-generic SMP mod_unload modversions 
    parm:           NVreg_Mobile:int
    parm:           NVreg_ResmanDebugLevel:int
    parm:           NVreg_RmLogonRC:int
    parm:           NVreg_ModifyDeviceFiles:int
    parm:           NVreg_DeviceFileUID:int
    parm:           NVreg_DeviceFileGID:int
    parm:           NVreg_DeviceFileMode:int
    parm:           NVreg_UpdateMemoryTypes:int
    parm:           NVreg_InitializeSystemMemoryAllocations:int
    parm:           NVreg_UsePageAttributeTable:int
    parm:           NVreg_MapRegistersEarly:int
    parm:           NVreg_RegisterForACPIEvents:int
    parm:           NVreg_CheckPCIConfigSpace:int
    parm:           NVreg_EnablePCIeGen3:int
    parm:           NVreg_EnableMSI:int
    parm:           NVreg_TCEBypassMode:int
    parm:           NVreg_UseThreadedInterrupts:int
    parm:           NVreg_MemoryPoolSize:int
    parm:           NVreg_RegistryDwords:charp
    parm:           NVreg_RmMsg:charp
    parm:           NVreg_AssignGpus:charp
    

    I think it may be a driver version problem:

    dpkg -l | grep nvidia
    ii  bumblebee-nvidia                            3.2.1-10                                      amd64        NVIDIA Optimus support using the proprietary NVIDIA driver
    ii  nvidia-375                                  375.66-0ubuntu0.16.04.1                       amd64        NVIDIA binary driver - version 375.66
    ii  nvidia-375-dev                              375.66-0ubuntu0.16.04.1                       amd64        NVIDIA binary Xorg driver development files
    ii  nvidia-modprobe                             375.51-0ubuntu1                               amd64        Load the NVIDIA kernel driver and create device files
    ii  nvidia-opencl-icd-375                       375.66-0ubuntu0.16.04.1                       amd64        NVIDIA OpenCL ICD
    ii  nvidia-prime                                0.8.2                                         amd64        Tools to enable NVIDIA's Prime
    

    What am I missing?

    • Charlie Parker
      Charlie Parker almost 5 years
      how do you install drivers?
    • darthbhyrava
      darthbhyrava over 4 years
      Faced the same error, and none of the answers worked. What did work was a simple: $ reboot now.
    • KansaiRobot
      KansaiRobot over 3 years
      @darthbhyrava that does not work
  • samu
    samu almost 6 years
    It helped me with nvidia driver 390 too! I never thought it might be because of Secure Boot, thx :)
  • Bill Kotsias
    Bill Kotsias over 4 years
    It didn't work here
  • Bill Kotsias
    Bill Kotsias over 4 years
    Thanks, I had to disable Secure Boot which was automatically re-enabled during a Windows/BIOS auto-update!!! Now nvidia works fine.
  • R. W. Prado
    R. W. Prado over 4 years
    UEFI mode with Secure Boot disabled is already set here. =) Looks like it does not work for everyone, unfortunately.
  • loretoparisi
    loretoparisi over 4 years
    don't do this if you have cuda >= 10. It will downgrade your cuda to 9, that is available currently on ubuntu without CUDA PPA.
  • Inspi
    Inspi over 4 years
    uh oh, prime-select query didn't even list intel, I guess I have 2 problems now...
  • Inspi
    Inspi over 4 years
    would you mind explaining how you changed the secure boot to Other OS ?
  • Dinari
    Dinari over 4 years
    Doing sudo apt-get purge nvidia-* before, then the above line fixed everything CUDA related for me, installed 430 driver with CUDA 10.1
  • Florin Andrei
    Florin Andrei about 4 years
    made no difference
  • TheMechanic
    TheMechanic almost 4 years
    query doesn't list all possible options. From the script: 'query: checks which version is currently active and writes "nvidia", "intel" or "unknown" to the standard output'.
  • w-sky
    w-sky almost 4 years
    Secure Boot is the thing, but no need to deactivate! When activating the Nvidia driver, you must enter a Secure Boot password; then reboot and import the new ROK key with that password in Secure Boot settings. Now the driver is active.
  • metsburg
    metsburg over 3 years
    Does not work. Same problem persists.
  • metsburg
    metsburg over 3 years
    Ubuntu asks for a MOK, and strangely enough, never recognizes the password.
  • w-sky
    w-sky over 3 years
    Sorry I mistyped above o.O – yes it's MOK not ROK. You have to enter a new password that you have to think up, and which you will then only need once for secure boot setup (enroll new MOK key in BIOS).
  • KansaiRobot
    KansaiRobot over 3 years
    message says the nvidia profile is already set (I have already rebooted so it does not work)
  • sam
    sam over 3 years
    @KansaiRobot did you solve it ?
  • Unknown artist
    Unknown artist over 2 years
    "My front door was getting stuck in the frame. I removed the lock and it worked pretty fine."
  • Gokul NC
    Gokul NC over 2 years
    If you're using Deep Learning VM and if nvidia-smi worked earlier, this might help to reinstall the default version again: sudo /opt/deeplearning/install-driver.sh
  • gui11aume
    gui11aume over 2 years
    That's the one that worked for me. I changed the CMOS battery of the motherboard; on reboot I got "bad" graphics and no CUDA. After disabling Secure Boot, everything went back to normal. Haven't tried the suggestion of @w-sky, though. Maybe it's better.
  • desmond13
    desmond13 over 2 years
    it did not work for me.
  • Pramesh Bajracharya
    Pramesh Bajracharya over 2 years
    Linus Torvalds once said "F*** You, NVIDIA !!" To this day, I support his statement.
  • SKT
    SKT about 2 years
    Rebooting did the trick