Issues with Nvidia graphics driver and CUDA after apt-get upgrade


Solution 1

A friend was able to solve it for me!

The solution he showed me was, after removing all nvidia packages as before, to install the driver from the graphics-drivers PPA:

$ sudo add-apt-repository ppa:graphics-drivers/ppa
$ sudo apt-get update
$ sudo apt-get install nvidia-364

Then download the .run CUDA installer from Nvidia (for me it was cuda_7.5.18_linux.run), run it, and be careful to answer "no" when asked whether you want to install the driver that is packaged with CUDA.
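
The steps above can be collected into a short script. This is only a sketch: the `run` helper and the `DRY_RUN` variable are made up for illustration, and the `--silent --toolkit` flags are assumed to make the CUDA 7.5 runfile install the toolkit without the bundled driver (answering "no" at the interactive prompt achieves the same thing). With `DRY_RUN=1`, the default here, it only prints the commands it would run.

```shell
#!/bin/sh
# Sketch of Solution 1. DRY_RUN and run() are illustrative helpers, not part of any tool.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

install_driver_then_toolkit() {
    installer="$1"   # e.g. cuda_7.5.18_linux.run, downloaded from Nvidia
    run sudo add-apt-repository ppa:graphics-drivers/ppa
    run sudo apt-get update
    run sudo apt-get install nvidia-364
    # Assumption: --silent --toolkit installs only the toolkit and skips the bundled driver.
    run sudo sh "$installer" --silent --toolkit
}

install_driver_then_toolkit cuda_7.5.18_linux.run
```

Set `DRY_RUN=0` only after the old nvidia packages have been removed, and check the printed commands first.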

Solution 2

I had a similar problem and was able to solve it by installing the recommended version of the Nvidia driver.

sudo apt-get install ubuntu-drivers-common

sudo ubuntu-drivers devices

sudo apt-get install <recommended version>
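
As a sketch of how the recommended package could be picked out automatically: `ubuntu-drivers devices` marks one of its `driver :` lines with the word `recommended`. The function name `recommended_driver` is made up here, and the parsing assumes that line format, which may differ between releases.

```shell
#!/bin/sh
# recommended_driver: hypothetical helper. Reads `ubuntu-drivers devices` output on stdin
# and prints the package name from the first driver line tagged "recommended".
# Assumes lines of the form: "driver   : <package> - <source> <license> recommended".
recommended_driver() {
    awk '$1 == "driver" && /recommended/ { print $3; exit }'
}

# Typical use (left as a comment because it changes the system):
#   pkg="$(ubuntu-drivers devices | recommended_driver)"
#   [ -n "$pkg" ] && sudo apt-get install "$pkg"
```

If the function prints nothing, no line was tagged as recommended and the package is best chosen by hand from the listing.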

Author: pterojacktyl

Updated on September 18, 2022

Comments

  • pterojacktyl
    pterojacktyl over 1 year

    I previously installed CUDA 7.5 on Ubuntu 14.04 using the "deb (network)" install from Nvidia. It worked for a few months, until I ran sudo apt-get upgrade today. After doing this, I encountered the following:

    $ nvidia-smi
    modprobe: ERROR: ../libkmod/libkmod-module.c:809 kmod_module_insert_module() could not find module by name='nvidia_352'
    modprobe: ERROR: could not insert 'nvidia_352': Function not implemented
    NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.
    

    Running sudo nvidia-smi is no different. I am unable to log in through the GUI (it just goes back to the login screen after I enter my password), but I can access the terminal.

    I have been able to restore graphical functionality; however, I am having difficulty re-installing CUDA after that. Can you please help me?

    Restoring graphics

    I have found that I can get the graphics to work again by running

    $ sudo apt-get remove --purge 'nvidia*'
    $ sudo apt-get autoremove
    

    and then editing /etc/apt/sources.list.d/cuda.list to remove all lines, then doing

    $ sudo apt-get install nvidia-352
    

    and rebooting the system. After this, nvidia-smi is working again. However, I still need to re-install CUDA.

    Trying to re-install CUDA

    I tried restoring the contents of /etc/apt/sources.list.d/cuda.list and then doing sudo apt-get install cuda. I noticed this error message:

    Loading new nvidia-352-352.93 DKMS files...
    Building only for 3.13.0-68-generic
    Building for architecture x86_64
    Building initial module for 3.13.0-68-generic
    ERROR: Cannot create report: [Errno 17] File exists: '/var/crash/nvidia-352.0.crash'
    Error! Bad return status for module build on kernel: 3.13.0-68-generic (x86_64)
    

    After doing this, the system returns to the broken state described at the start. For example, nvidia-smi prints the error message above, and after building and running deviceQuery I get a similar error:

    ./deviceQuery Starting...
    
     CUDA Device Query (Runtime API) version (CUDART static linking)
    
    modprobe: ERROR: ../libkmod/libkmod-module.c:809 kmod_module_insert_module() could not find module by name='nvidia_352'
    modprobe: ERROR: could not insert 'nvidia_352': Function not implemented
    cudaGetDeviceCount returned 38
    -> no CUDA-capable device is detected
    Result = FAIL
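
    A failed DKMS build like the one in the log earlier (Bad return status for module build) normally leaves its details in a make.log under /var/lib/dkms. The following is a hedged sketch for locating it; the helper name `dkms_make_log` is invented for illustration, and the path layout is the conventional DKMS one, assumed rather than confirmed for this system.

    ```shell
    # dkms_make_log: hypothetical helper that prints the conventional DKMS build-log path.
    dkms_make_log() {
        module="$1"    # e.g. nvidia-352
        version="$2"   # e.g. 352.93
        echo "/var/lib/dkms/${module}/${version}/build/make.log"
    }

    log="$(dkms_make_log nvidia-352 352.93)"
    if [ -r "$log" ]; then
        tail -n 20 "$log"      # the compiler error tends to be near the end
    else
        echo "no build log at $log"
    fi
    # Worth checking alongside it (both are standard commands):
    #   dkms status
    #   dpkg -s "linux-headers-$(uname -r)"
    ```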
    

    I seem to recall that when I first installed CUDA, it would only work if I did it without updating the nvidia-352 package from the Nvidia repositories. However, now I don't seem to have the option of doing that, because when I run sudo apt-get install cuda it automatically upgrades the nvidia-352 package:

    Unpacking nvidia-352 (352.93-0ubuntu1) over (352.63-0ubuntu0.14.04.1) ...
    

    If I try to set the versions explicitly, I get:

    $ sudo apt-get install cuda-drivers nvidia-352=352.63-0ubuntu0.14.04.1 nvidia-352-dev=352.63-0ubuntu0.14.04.1
    Some packages could not be installed. This may mean that you have
    requested an impossible situation or if you are using the unstable
    distribution that some required packages have not yet been created
    or been moved out of Incoming.
    The following information may help to resolve the situation:
    
    The following packages have unmet dependencies.
     cuda-drivers : Depends: nvidia-352 (>= 352.93) but 352.63-0ubuntu0.14.04.1 is to be installed
                    Depends: nvidia-352-dev (>= 352.93) but 352.63-0ubuntu0.14.04.1 is to be installed
    E: Unable to correct problems, you have held broken packages.
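
    The dependency failure above comes down to a version comparison: cuda-drivers requires nvidia-352 (>= 352.93), and 352.63-0ubuntu0.14.04.1 sorts lower. A small sketch of that check follows; `version_ge` is a made-up helper name, and the `sort -V` fallback only approximates Debian's ordering rules.

    ```shell
    # version_ge A B: hypothetical helper, succeeds when version A >= version B.
    version_ge() {
        if command -v dpkg >/dev/null 2>&1; then
            dpkg --compare-versions "$1" ge "$2"
        else
            # Fallback: GNU sort -V, an approximation of Debian version ordering.
            [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | tail -n 1)" = "$1" ]
        fi
    }

    if version_ge 352.63-0ubuntu0.14.04.1 352.93; then
        echo "pinned version satisfies cuda-drivers"
    else
        echo "pinned version is too old for cuda-drivers"
    fi
    ```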
    

    In fact, if I try to use version 352.63-0ubuntu1 instead of 352.63-0ubuntu0.14.04.1 by doing

    $ sudo apt-get install nvidia-352=352.63-0ubuntu1
    

    then this is enough to break the graphical login and cause nvidia-smi to display the error message above.

    Diagnostics

    $ lspci | grep -i vga
    01:00.0 VGA compatible controller: NVIDIA Corporation GM200 [GeForce GTX TITAN X] (rev a1)
    
    $ dpkg -l | grep -i nvidia
    ii  bbswitch-dkms                                         0.7-2ubuntu1                                        amd64        Interface for toggling the power on nVidia Optimus video cards
    ii  libcuda1-352                                          352.93-0ubuntu1                                     amd64        NVIDIA CUDA runtime library
    ii  nvidia-352                                            352.93-0ubuntu1                                     amd64        NVIDIA binary driver - version 352.93
    ii  nvidia-352-dev                                        352.93-0ubuntu1                                     amd64        NVIDIA binary Xorg driver development files
    ii  nvidia-352-uvm                                        352.93-0ubuntu1                                     amd64        Transitional package for nvidia-352
    ii  nvidia-modprobe                                       352.93-0ubuntu1                                     amd64        Load the NVIDIA kernel driver and create device files
    ii  nvidia-opencl-icd-352                                 352.93-0ubuntu1                                     amd64        NVIDIA OpenCL ICD
    ii  nvidia-prime                                          0.6.2                                               amd64        Tools to enable NVIDIA's Prime
    ii  nvidia-settings                                       352.93-0ubuntu1                                     amd64        Tool for configuring the NVIDIA graphics driver
    
  • MARK
    MARK almost 7 years
    I had to run "sudo modprobe nvidia" after the above commands; then everything worked.
  • aerin
    aerin almost 6 years
    @MARK I'm getting this error: modprobe: ERROR: could not insert 'nvidia_396': Required key not available. Any tips?