No CUDA-capable device is detected although requirements are installed
Solution 1
It looks like you are on a laptop with NVIDIA Optimus. Have you switched to the NVIDIA GPU using prime-select nvidia?
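A minimal sketch of that check, assuming prime-select comes from the nvidia-prime package (as on Ubuntu) and that it reports the profiles as "intel" and "nvidia". The helper name needs_nvidia_switch is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: succeeds (exit 0) when the given PRIME profile is
# "intel", i.e. the onboard graphics are active and a switch is needed.
# Assumption: profile names are the ones printed by `prime-select query`.
needs_nvidia_switch() {
    [ "$1" = "intel" ]
}

# Only query PRIME if the tool is actually installed.
if command -v prime-select >/dev/null 2>&1; then
    profile=$(prime-select query)
    if needs_nvidia_switch "$profile"; then
        echo "Active profile is '$profile'; run: sudo prime-select nvidia"
    else
        echo "Active profile is '$profile'; no switch needed"
    fi
fi
```

After switching, you usually have to log out and back in (or reboot) before the NVIDIA driver is actually in use.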
Solution 2
Another potential cause of this behaviour is the CUDA_VISIBLE_DEVICES environment variable being set to an empty value.
I experienced similar issues, and it turned out the variable was accidentally being set in my bash environment files.
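A quick way to tell an unset variable from one that is set but empty, as a sketch. The helper name cuda_visible_devices_state and the list of startup files are assumptions for illustration, not something from the CUDA documentation.

```shell
#!/bin/sh
# Hypothetical helper: classify the state of CUDA_VISIBLE_DEVICES.
# Prints "unset", "empty", or "set".
cuda_visible_devices_state() {
    if [ -z "${CUDA_VISIBLE_DEVICES+x}" ]; then
        echo "unset"   # not defined: no restriction on visible GPUs
    elif [ -z "$CUDA_VISIBLE_DEVICES" ]; then
        echo "empty"   # defined but empty: CUDA sees no devices
    else
        echo "set"     # restricted to the listed devices
    fi
}

state=$(cuda_visible_devices_state)
echo "CUDA_VISIBLE_DEVICES is $state"
if [ "$state" = "empty" ]; then
    echo "Fix for this shell: unset CUDA_VISIBLE_DEVICES"
    # Look for the offending line in common bash startup files (guessed paths).
    grep -n "CUDA_VISIBLE_DEVICES" ~/.bashrc ~/.profile ~/.bash_profile 2>/dev/null || true
fi
```

If the state is "empty", unset the variable and remove the line that sets it, then run deviceQuery again.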
a_guest
Updated on September 18, 2022
Comments
- a_guest (over 1 year ago):
Problem
I just installed cuda following the official installation instructions via the .deb file. When it comes to section 6.2.2.3 (running deviceQuery) I get the message that no CUDA-capable device was found, although I'm pretty sure everything is set up correctly:
    $ ./bin/x86_64/linux/release/deviceQuery
    ./bin/x86_64/linux/release/deviceQuery Starting...
    CUDA Device Query (Runtime API) version (CUDART static linking)
    cudaGetDeviceCount returned 38
    -> no CUDA-capable device is detected
    Result = FAIL
System information
Here is some information about my system:
    $ uname -m && cat /etc/*release
    x86_64
    DISTRIB_RELEASE=16.04
    DISTRIB_DESCRIPTION="Ubuntu 16.04.2 LTS"
    VERSION="16.04.2 LTS (Xenial Xerus)"
    $ uname -r
    4.4.0-64-generic
    $ lspci | grep -i nvidia
    08:00.0 3D controller: NVIDIA Corporation GK208M [GeForce 920M] (rev a1)
    $ gcc --version
    gcc (Ubuntu 5.4.0-6ubuntu1~16.04.4) 5.4.0 20160609
I also verified the kernel headers are installed:
    $ sudo apt-get install linux-headers-$(uname -r)
    linux-headers-4.4.0-64-generic is already the newest version (4.4.0-64.85).
Installation of CUDA
So my system meets all the prerequisites. I then followed the instructions for the installation via apt-get (I installed cuda-repo-ubuntu1604-8-0-local-ga2_8.0.61-1_amd64.deb). PATH and LD_LIBRARY_PATH are set to point to the required locations:
    $ echo $PATH
    /usr/local/cuda-8.0/bin:[...]
    $ echo $LD_LIBRARY_PATH
    /usr/local/cuda-8.0/lib64
Note that I set up LD_LIBRARY_PATH manually, although this was mentioned to be necessary only for the runfile installation. However, the error persists when resetting LD_LIBRARY_PATH.
The NVIDIA drivers also seem to be up-to-date:
    $ cat /proc/driver/nvidia/version
    NVRM version: NVIDIA UNIX x86_64 Kernel Module 367.57 Mon Oct 3 20:37:01 PDT 2016
    GCC version: gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.4)
Information about the cuda compiler driver:
    $ nvcc -V
    nvcc: NVIDIA (R) Cuda compiler driver
    Copyright (c) 2005-2016 NVIDIA Corporation
    Built on Tue_Jan_10_13:22:03_CST_2017
    Cuda compilation tools, release 8.0, V8.0.61
The instructions mention that this could be a problem with file permissions:
If a CUDA-capable device and the CUDA Driver are installed but deviceQuery reports that no CUDA-capable devices are present, this likely means that the /dev/nvidia* files are missing or have the wrong permissions.
Those files didn't have the execute flag, which I then added:
    $ ls -al /dev/nvidia*
    crwxrwxrwx 1 root root 195,   0 Feb 27 13:17 /dev/nvidia0
    crwxrwxrwx 1 root root 195, 255 Feb 27 13:17 /dev/nvidiactl
    crwxrwxrwx 1 root root 195, 254 Feb 27 13:17 /dev/nvidia-modeset
    crwxrwxrwx 1 root root 243,   0 Feb 27 13:17 /dev/nvidia-uvm
    crwxrwxrwx 1 root root 243,   1 Feb 27 18:24 /dev/nvidia-uvm-tools
However, after running deviceQuery (which still fails) some of the permissions are reset:
    $ ls -al /dev/nvidia*
    crwxrwxrwx 1 root root 195,   0 Feb 27 13:17 /dev/nvidia0
    crw-rw-rw- 1 root root 195, 255 Feb 27 13:17 /dev/nvidiactl
    crwxrwxrwx 1 root root 195, 254 Feb 27 13:17 /dev/nvidia-modeset
    crw-rw-rw- 1 root root 243,   0 Feb 27 13:17 /dev/nvidia-uvm
    crw-rw-rw- 1 root root 243,   1 Feb 27 18:24 /dev/nvidia-uvm-tools
That's a bit puzzling, especially because I'm running deviceQuery without sudo.
Maybe related
Samples build fails
When I try to build the cuda samples via make, it fails for one of them with the message:
    /usr/bin/ld: cannot find -lnvcuvid
    collect2: error: ld returned 1 exit status
    Makefile:381: recipe for target 'cudaDecodeGL' failed
    make[1]: *** [cudaDecodeGL] Error 1
The library indeed seems to be missing:
    $ ls /usr/local/cuda-8.0/lib64/libnvcuvid
    ls: cannot access '/usr/local/cuda-8.0/lib64/libnvcuvid': No such file or directory
Although the corresponding header file is there:
    $ ls /usr/local/cuda-8.0/targets/x86_64-linux/include/nvcuvid.h
    /usr/local/cuda-8.0/targets/x86_64-linux/include/nvcuvid.h
Problem with static linking
The error raised by deviceQuery suggests a problem with static linking:
    CUDA Device Query (Runtime API) version (CUDART static linking)
AFAIK LD_LIBRARY_PATH is only responsible for dynamic linking. I found this question where a suggestion is to add /usr/lib/nvidia-current to the linker path. However, this directory doesn't exist in my installation:
    $ ls /usr/lib/nvidia-current
    ls: cannot access '/usr/lib/nvidia-current': No such file or directory
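As far as I know, the "CUDART static linking" banner only describes how the sample links the CUDA runtime; the driver library libcuda.so is still looked up at run time. A rough sketch for checking whether the dynamic linker knows about it, assuming ldconfig is available. The helper name has_libcuda is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: reads `ldconfig -p` style output on stdin and
# succeeds (exit 0) if a libcuda.so entry is present.
has_libcuda() {
    grep -q "libcuda\.so"
}

# Only run the check where ldconfig exists.
if command -v ldconfig >/dev/null 2>&1; then
    if ldconfig -p | has_libcuda; then
        echo "libcuda.so is known to the dynamic linker"
    else
        echo "libcuda.so not found in the linker cache"
    fi
fi
```

A hit here does not prove that the GPU is in use; it only rules out a missing driver library.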
- Artyom (over 7 years ago): Looks like you are on a laptop. Have you switched to nvidia using "prime-select nvidia"?
- Admin (over 7 years ago): As above... Nvidia drivers won't load if the card isn't being used. No Nvidia drivers, no CUDA.
- a_guest (over 7 years ago): Thanks guys, prime-select nvidia helped! I guess this means I was running on onboard graphics before?
- Artyom (over 7 years ago): @a_guest Yep, you were using your onboard graphics. To easily check which one you are using: after login > top right button > about this computer, you'll see your graphics there. Also, could you select my answer so I'll get internet points and this question will be marked as answered to help others?
- biocyberman (over 6 years ago): I installed 'nvidia-prime' and ran the command. This seems to have solved my problem with Ubuntu 16.04 on a Dell server. I say 'it seems' because I haven't restarted the server, and I don't know if I have to run prime-select again.
- Seth Bruder (over 2 years ago): A variant of this is if you have CUDA_VISIBLE_DEVICES set to the GUID of a card that you have replaced.
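One way to spot such a stale entry, as a sketch: compare the variable against the identifiers printed by nvidia-smi -L (NVIDIA calls them UUIDs, in the form GPU-...). The helper name stale_uuids is made up for illustration, and plain index entries such as 0 are ignored.

```shell
#!/bin/sh
# Hypothetical helper: print every GPU-... entry of a CUDA_VISIBLE_DEVICES
# value ($1) that does not occur in an `nvidia-smi -L` listing ($2).
stale_uuids() {
    # Split the comma-separated list into one entry per line.
    echo "$1" | tr ',' '\n' | while read -r entry; do
        case "$entry" in
            GPU-*)
                # -F: match the identifier literally, not as a regex.
                echo "$2" | grep -qF "$entry" || echo "$entry"
                ;;
        esac
    done
}

# Only run the check when the tool exists and the variable is non-empty.
if command -v nvidia-smi >/dev/null 2>&1 && [ -n "${CUDA_VISIBLE_DEVICES:-}" ]; then
    stale_uuids "$CUDA_VISIBLE_DEVICES" "$(nvidia-smi -L)"
fi
```

Any identifier it prints belongs to a card that is no longer installed; removing it from CUDA_VISIBLE_DEVICES (or unsetting the variable) should make the remaining GPUs visible again.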