NVIDIA-SMI couldn't find libnvidia-ml.so library


Solution 1

LD_PRELOAD=/usr/lib/nvidia-367/libnvidia-ml.so nvidia-smi
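The directory name depends on the installed driver version (nvidia-367 above, nvidia-370 in the question), so the path has to be adjusted per machine. A minimal sketch, assuming the library lives in one of the directories you pass in; `find_nvml` is a hypothetical helper name, not part of any NVIDIA tool:

```shell
# Hypothetical helper: print the first libnvidia-ml.so (or .so.1) found
# in the directories given as arguments; return non-zero if none exists.
find_nvml() {
    for dir in "$@"; do
        for candidate in "$dir"/libnvidia-ml.so "$dir"/libnvidia-ml.so.1; do
            if [ -e "$candidate" ]; then
                printf '%s\n' "$candidate"
                return 0
            fi
        done
    done
    return 1
}

# Usage (nothing is executed until you run this yourself):
#   lib=$(find_nvml /usr/lib/nvidia-*) && LD_PRELOAD="$lib" nvidia-smi
```

If this finds the library but nvidia-smi then fails with a different message, the library path was probably not the only problem.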

Solution 2

My error was solved as follows.

This led me to another solution. In /etc/nvidia-container-runtime/config.toml, ldconfig is set to "@/sbin/ldconfig" by default. For some reason this does not seem to work, and it produces the error above:

root@banshee:/var/log# docker run --rm --gpus=all nvidia/cuda:11.4-base nvidia-smi
NVIDIA-SMI couldn't find libnvidia-ml.so library in your system. Please make sure that the NVIDIA Display Driver is properly installed and present in your system.
Please also try adding directory that contains libnvidia-ml.so to your system PATH.

Changing the ldconfig path to "/sbin/ldconfig" (instead of "@/sbin/ldconfig") does indeed fix the problem:

root@banshee:/var/log# docker run --rm --gpus=all nvidia/cuda:11.4-base nvidia-smi
Sun Jan  5 20:39:45 2020       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 430.64       Driver Version: 430.64       CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 970     On   | 00000000:01:00.0  On |                  N/A |
| 32%   39C    P8    16W / 170W |    422MiB /  4038MiB |      3%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
+-----------------------------------------------------------------------------+
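The edit described above can be sketched as a small script. This is only an illustration, assuming the file contains a line of the form ldconfig = "@..." as quoted above; `strip_ldconfig_at` is a hypothetical helper name, and it keeps a .bak copy so the change can be reverted:

```shell
# Hypothetical helper: remove the leading "@" from the ldconfig value
# in the given config file. The original is kept as <file>.bak.
strip_ldconfig_at() {
    config_file=$1
    cp -- "$config_file" "$config_file.bak" || return 1
    # Only touches uncommented lines of the form: ldconfig = "@<path>"
    sed 's|^\([[:space:]]*ldconfig[[:space:]]*=[[:space:]]*"\)@|\1|' \
        "$config_file.bak" > "$config_file"
}

# Usage, as root (nothing is executed until you run this yourself):
#   strip_ldconfig_at /etc/nvidia-container-runtime/config.toml
```

Afterwards, rerun the docker command above to check whether the error is gone.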

Source: GitHub

Solution 3

I encountered this problem after some nvidia-docker containers crashed. libnvidia-ml.so was present in /usr/lib/nvidia-<version>, but nvidia-smi kept complaining.

I fixed the problem by running sudo ldconfig.real.
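To see whether the linker cache is the issue before rebuilding it, you can check if the library is listed in the output of ldconfig -p. A minimal sketch; `cache_lists_lib` is a hypothetical helper name, and it only does a plain substring match on whatever listing it reads from standard input:

```shell
# Hypothetical helper: succeed if the given library name appears in
# `ldconfig -p` style output read from standard input.
cache_lists_lib() {
    grep -F -q -- "$1"
}

# Usage (nothing is executed until you run this yourself):
#   if ldconfig -p | cache_lists_lib libnvidia-ml.so; then
#       echo "libnvidia-ml.so is in the linker cache"
#   else
#       sudo ldconfig.real
#   fi
```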


Author: ant_1618

Updated on September 18, 2022

Comments

  • ant_1618
    ant_1618 almost 2 years

    I have the following NVIDIA graphics card in my laptop:

    ant@Anthill ~> lspci -k | grep -EA2 'VGA|3D'
    00:02.0 VGA compatible controller: Intel Corporation 4th Gen Core Processor Integrated Graphics Controller (rev 06)
        Subsystem: Lenovo 4th Gen Core Processor Integrated Graphics Controller
        Kernel driver in use: i915
    --
    07:00.0 3D controller: NVIDIA Corporation GK208M [GeForce GT 740M] (rev a1)
        Subsystem: Lenovo GK208M [GeForce GT 740M]
        Kernel modules: nvidiafb, nouveau
    

    I installed the drivers as follows:

    sudo apt-add-repository ppa:graphics-drivers/ppa
    sudo apt-get install nvidia-370 nvidia-prime
    

    I then installed the CUDA toolkit from the cuda-7.5 installer downloaded from NVIDIA's official site:

    sudo ./NVidia-cuda-7.5.run
    

    All of these installations were done after switching to a TTY and stopping Xorg:

    sudo service lightdm stop
    

    Now, after restarting:

    ant@Anthill ~> nvidia-smi
    NVIDIA-SMI couldn't find libnvidia-ml.so library in your system. Please make sure that the NVIDIA Display Driver is properly installed and present in your system.
    Please also try adding directory that contains libnvidia-ml.so to your system PATH.
    

    libnvidia-ml.so is present here:

    ant@Anthill ~> ls /usr/lib/nvidia-370
    alt_ld.so.conf                 libGLX_indirect.so.0@            libnvidia-fatbinaryloader.so.370.28
    bin/                           libGLX_nvidia.so.0@              libnvidia-fbc.so.370.28
    ld.so.conf                     libGLX_nvidia.so.370.28          libnvidia-glcore.so.370.28
    libEGL_nvidia.so.0@            libGLX.so@                       libnvidia-glsi.so.370.28
    libEGL_nvidia.so.370.28        libGLX.so.0                      libnvidia-ifr.so@
    libEGL.so@                     libnvcuvid.so@                   libnvidia-ifr.so.1@
    libEGL.so.1                    libnvcuvid.so.1@                 libnvidia-ifr.so.370.28
    libGLdispatch.so.0             libnvcuvid.so.370.28             libnvidia-ml.so@
    libGLESv1_CM_nvidia.so.1@      libnvidia-cfg.so@                libnvidia-ml.so.1@
    libGLESv1_CM_nvidia.so.370.28  libnvidia-cfg.so.1@              libnvidia-ml.so.370.28
    libGLESv1_CM.so@               libnvidia-cfg.so.370.28          libnvidia-ptxjitcompiler.so.370.28
    libGLESv1_CM.so.1              libnvidia-compiler.so@           libnvidia-tls.so.370.28
    libGLESv2_nvidia.so.2@         libnvidia-compiler.so.1@         libnvidia-wfb.so.370.28
    libGLESv2_nvidia.so.370.28     libnvidia-compiler.so.370.28     libOpenGL.so@
    libGLESv2.so@                  libnvidia-eglcore.so.370.28      libOpenGL.so.0
    libGLESv2.so.2                 libnvidia-egl-wayland.so.370.28  tls/
    libGL.so@                      libnvidia-encode.so@             vdpau/
    libGL.so.1@                    libnvidia-encode.so.1@           xorg/
    libGL.so.1.0.0                 libnvidia-encode.so.370.28
    

    I also tried adding this directory to both PATH and LD_LIBRARY_PATH. Neither worked.

    Also,

    ls /dev | grep nvidia
    

    yields nothing. That is, no /dev/nvidia* devices are present.

    Any suggestions for getting this working? Where does nvidia-smi look for libnvidia-ml.so?

    • Thomas
      Thomas over 7 years
      Not really sure, but I think you should check with the command prime-select. This command will alter the lookup paths for the graphics library for Intel and Nvidia.
  • Rodolfo
    Rodolfo about 7 years
    I tried LD_PRELOAD=/usr/lib/nvidia-375/libnvidia-ml.so nvidia-smi but got: "NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running."
  • pdoherty926
    pdoherty926 over 2 years
    This worked for me, too. (Debian 11, Nvidia 390.147 drivers and Cuda 11.6)
  • wbadart
    wbadart over 2 years
    +1 on Debian 11, NVIDIA-SMI 460.91.03, Driver Version: 460.91.03, CUDA Version: 11.2