Please help configuring NVIDIA-SMI Ubuntu 20.04 on WSL 2
Solution 1
If nbody works, then everything is configured correctly. The problem is a limitation of the NVIDIA drivers, listed at https://docs.nvidia.com/cuda/wsl-user-guide/index.html#known-limitations :
NVIDIA Management Library (NVML) APIs are not supported.
nvidia-smi is based on top of the NVIDIA Management Library (NVML).
Solution 2
An update to @onoma's answer. From https://docs.nvidia.com/cuda/wsl-user-guide/index.html#known-limitations :
6. nvidia-smi is not yet packaged for CUDA on WSL 2.
Hopefully NVIDIA will solve this in the future.
Lars Ericson
Updated on September 18, 2022
Comments
-
Lars Ericson over 1 year
Following this announcement and somewhat trying to follow this confusing thread, I
- installed Windows Version 10.0.20150 Build 20150
- installed NVidia Driver version 455.51
- installed Ubuntu 20.04 LTS from the Windows Store
I started Ubuntu and tried to run NVIDIA-SMI. It told me it wasn't there but that I could install it with one of these options:
Command 'nvidia-smi' not found, but can be installed with:
sudo apt install nvidia-340 # version 340.108-0ubuntu2, or
sudo apt install nvidia-utils-390 # version 390.132-0ubuntu2
sudo apt install nvidia-utils-435 # version 435.21-0ubuntu7
sudo apt install nvidia-utils-440 # version 440.82+really.440.64-0ubuntu6
Note that there is no nvidia-utils-450 option corresponding to my 455.51, which the NVidia thread above said somewhere is required to make things go. I then ran
sudo apt install nvidia-utils-440
nvidia-smi
and it said "No devices found".
Then I found this guide. I uninstalled Ubuntu 20.04, and then followed the guide. The guide asked me to
- install a vanilla Ubuntu (no release number), which I did instead of 20.04. (This turns out to give me 20.04).
- install Windows Terminal (I chose the Preview version)
- check to receive updates for related Windows programs
- update the kernel to 4.19.121
- install NVIDIA CUDA drivers on Windows 10 (I already did 455, have to check the CUDA release)
- install Docker
- install NVidia Container Toolkit
- test
The "install Docker" part of that guide seems to be buggy: I couldn't get the Docker service to start. So I uninstalled Ubuntu and repeated the steps up to that point without touching Docker. From the Docker step onward, my version of the steps is below (for the Docker part I am following these instructions to get Docker):
sudo apt-get update
sudo apt-get upgrade
sudo apt update
sudo apt install apt-transport-https ca-certificates curl software-properties-common
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
sudo add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu focal stable"
sudo apt update
apt-cache policy docker-ce
sudo apt install docker-ce
sudo systemctl status docker
The last step fails. I get this message:
$ sudo systemctl status docker
System has not been booted with systemd as init system (PID 1). Can't operate.
Failed to connect to bus: Host is down
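The error above means that PID 1 in this WSL distribution is not systemd, so `systemctl` has no service manager to talk to. A minimal sketch of checking this before choosing how to start Docker; `docker_start_hint` is a hypothetical helper name, and reading `/proc/1/comm` assumes a Linux procfs:

```shell
# Hypothetical helper: given the name of the PID 1 process, print the
# command that would start the Docker daemon on that system.
docker_start_hint() {
    case "$1" in
        systemd) echo "sudo systemctl start docker" ;;
        *)       echo "sudo dockerd &" ;;
    esac
}

# /proc/1/comm holds the command name of PID 1 (for example "init" on
# WSL 2 without systemd, or "systemd" on a regular Ubuntu install).
if [ -r /proc/1/comm ]; then
    docker_start_hint "$(cat /proc/1/comm)"
fi
```

The helper only prints a suggestion; it does not start anything itself.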
That led me here, where the fourth and almost lowest-scored answer seems to work, except that dockerd needs to be run in the background:
sudo dockerd &
sudo usermod -aG docker your-user
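Because `sudo dockerd &` returns immediately, a `docker` command issued right afterwards can fail while the daemon is still starting. A small sketch of a retry loop; `wait_until` is a hypothetical helper, and using `docker info` as the readiness probe is an assumption rather than something the linked answer prescribes:

```shell
# Hypothetical helper: retry a command once per second until it succeeds
# or the given number of attempts is used up.
# Usage: wait_until ATTEMPTS COMMAND [ARGS...]
wait_until() {
    attempts=$1
    shift
    while [ "$attempts" -gt 0 ]; do
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        attempts=$((attempts - 1))
        # Only sleep when another attempt is left.
        [ "$attempts" -gt 0 ] && sleep 1
    done
    return 1
}

# Example (commented out so the sketch has no side effects):
# sudo dockerd &
# wait_until 30 docker info && echo "Docker daemon is ready"
```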
Then I go back to the guide post-Docker install step and resume with
docker run --gpus all nvcr.io/nvidia/k8s/cuda-sample:nbody nbody -gpu -benchmark
and this fails with
ERRO[2020-06-23T07:28:28.582848400-04:00] 5cd9b9d7011ba20f72971dd27900b23b2c0f6be656b0bd53b9e178944fe4eba6 cleanup: failed to delete container from containerd: no such container
ERRO[2020-06-23T07:28:28.582946600-04:00] Handler for POST /v1.40/containers/5cd9b9d7011ba20f72971dd27900b23b2c0f6be656b0bd53b9e178944fe4eba6/start returned error: could not select device driver "" with capabilities: [[gpu]]
docker: Error response from daemon: could not select device driver "" with capabilities: [[gpu]].
ERRO[0018] error waiting for container: context canceled
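The "could not select device driver" message suggests that Docker did not find NVIDIA's container hook when handling `--gpus`. A rough sketch of checking for it before running the container; `has_command` is a hypothetical helper, and the binary name `nvidia-container-runtime-hook` is an assumption about what Docker looks up on the PATH, so the name may differ between toolkit versions:

```shell
# Hypothetical helper: succeed if the named command is found on the PATH.
has_command() {
    command -v "$1" >/dev/null 2>&1
}

# Assumed name of the hook binary installed by the NVIDIA container packages.
hook="nvidia-container-runtime-hook"

if has_command "$hook"; then
    echo "$hook found; 'docker run --gpus all ...' has a chance of working"
else
    echo "$hook not found; the NVIDIA container packages are probably not installed"
fi
```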
Finally I went back to the NVidia announcement and did these steps:
sudo apt-get update
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
curl -s -L https://nvidia.github.io/libnvidia-container/experimental/$distribution/libnvidia-container-experimental.list | sudo tee /etc/apt/sources.list.d/libnvidia-container-experimental.list
sudo apt-get update
sudo apt-get install -y nvidia-docker2
sudo dockerd &
docker run --gpus all nvcr.io/nvidia/k8s/cuda-sample:nbody nbody -gpu -benchmark
SUCCESS: and I got a happy result:
> Windowed mode
> Simulation data stored in video memory
> Single precision floating point simulation
> 1 Devices used for simulation
GPU Device 0: "Quadro M500M" with compute capability 5.0
> Compute 5.0 CUDA device: [Quadro M500M]
3072 bodies, total time for 10 iterations: 3.817 ms
= 24.724 billion interactions per second
= 494.487 single-precision GFLOP/s at 20 flops per interaction
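The `distribution=...` line in the NVIDIA steps above sources `/etc/os-release` in a subshell and concatenates its `ID` and `VERSION_ID` fields; the result selects which repository list the `curl` commands download. A sketch of what it evaluates to; `distribution_of` is a hypothetical wrapper that takes the file path as a parameter so it can be tried against a sample file:

```shell
# Hypothetical wrapper around the one-liner from the NVIDIA steps.
# Sourcing happens inside a subshell, so ID and VERSION_ID do not leak
# into the calling shell.
distribution_of() {
    ( . "$1" && echo "${ID}${VERSION_ID}" )
}

# Sample file with the two fields an Ubuntu 20.04 install is expected to have.
sample=$(mktemp)
printf 'ID=ubuntu\nVERSION_ID="20.04"\n' > "$sample"
distribution_of "$sample"   # prints ubuntu20.04
rm -f "$sample"
```

On a real system the same function can be pointed at `/etc/os-release` directly.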
HOWEVER, per answer below, there is no NVIDIA-SMI, per known NVIDIA limitations.
FURTHER NOTE: The Docker container test above works in the Ubuntu shell. It does not work in Windows PowerShell Preview with the Ubuntu tab.
-
Rémi about 3 years
Note that you can run the Windows version of nvidia-smi from inside WSL.
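As a sketch of the comment above: WSL can launch Windows executables, so the Windows copy of `nvidia-smi.exe` can be called from the Ubuntu shell. The two paths below are assumptions about where the Windows driver installer places the binary and may differ on a given machine; `first_executable` is a hypothetical helper.

```shell
# Hypothetical helper: print the first argument that is an executable file.
first_executable() {
    for candidate in "$@"; do
        if [ -x "$candidate" ] && [ ! -d "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

# Assumed install locations of the Windows nvidia-smi, seen through /mnt/c.
smi=$(first_executable \
    "/mnt/c/Windows/System32/nvidia-smi.exe" \
    "/mnt/c/Program Files/NVIDIA Corporation/NVSMI/nvidia-smi.exe") || smi=""

if [ -n "$smi" ]; then
    "$smi"
else
    echo "nvidia-smi.exe not found in the assumed locations" >&2
fi
```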