How to unload kernel module 'nvidia-drm'?
Solution 1
I imagine you want to stop the display manager, which is what I'd suspect is using the Nvidia drivers.
After changing to a text console (pressing Ctrl+Alt+F2) and logging in as root, use the following command to leave the graphical target, which is what keeps the display manager running:
# systemctl isolate multi-user.target
At this point, I'd expect you to be able to unload the Nvidia drivers using modprobe -r (or rmmod directly):
# modprobe -r nvidia-drm
Once you've managed to replace/upgrade it and you're ready to start the graphical environment again, you can use this command:
# systemctl start graphical.target
Solution 2
CUDA Installation
1) Download the latest CUDA Toolkit
2) Switch to tty3 by pressing Ctrl+Alt+F3
3) Unload nvidia-drm before proceeding.
3a) Isolate multi-user.target
sudo systemctl isolate multi-user.target
3b) Note that nvidia-drm is currently in use.
lsmod | grep nvidia.drm
3c) Unload nvidia-drm
sudo modprobe -r nvidia-drm
3d) Note that nvidia-drm is not in use anymore.
lsmod | grep nvidia.drm
5) Go to your download folder and run the cuda installation.
sudo sh cuda_10.1.168_418.67_linux.run
6) Answer any prompts during installation.
7) When installation has finished, confirm that the CUDA Version has been updated.
nvidia-smi
8) Start the GUI again.
sudo systemctl start graphical.target
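The unload and restart commands from steps 3a, 3c and 8 could be collected into a small script. This is only a sketch: the run helper, the DRY_RUN variable and the two function names are invented here for illustration and are not part of any NVIDIA or systemd tooling. With DRY_RUN=1 the commands are printed instead of executed, so the sequence can be reviewed before anything is stopped.

```shell
#!/bin/sh
# Hypothetical helper: print the command when DRY_RUN=1, otherwise run it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Steps 3a and 3c: leave the graphical target, then unload the module.
# The second command only runs if the first one succeeded.
unload_nvidia_drm() {
    run sudo systemctl isolate multi-user.target &&
    run sudo modprobe -r nvidia-drm
}

# Step 8: bring the GUI back.
start_gui() {
    run sudo systemctl start graphical.target
}
```

Setting DRY_RUN=1 and then calling unload_nvidia_drm prints the two commands without touching the running session; without DRY_RUN it should be run from a text console, since the graphical session goes away.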
Solution 3
lsof lists any files that are in use by userspace processes. But nvidia_drm is a kernel module, so lsof won't necessarily see whether or not it is actually in use. (The module file won't be open because the kernel has already completely loaded it into RAM. But the module might be providing services to the userspace or other kernel components, and that is what prevents the unloading of the module.)
Run lsmod | grep nvidia.drm and see the numbers to the right of the nvidia_drm module name. The first number is simply the size of the module; the second is the use count. In order to successfully remove the module, the use count must be 0 first.
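To read the use count without scanning the columns by eye, one option is to filter the lsmod output with awk. A minimal sketch: the function name module_use_count is made up for illustration, and it assumes the usual lsmod layout of module name, size, then use count.

```shell
#!/bin/sh
# Hypothetical helper: read lsmod-style text on stdin and print the
# use count (third column) of the module named in $1.
# Prints nothing if the module is not listed, i.e. not loaded.
module_use_count() {
    awk -v mod="$1" '$1 == mod { print $3 }'
}

# Typical use on a live system (lsmod shows the name with an underscore):
#   lsmod | module_use_count nvidia_drm
```

A result of 0 means modprobe -r should be able to remove the module; anything higher means something still holds it. The same number is usually also readable from /sys/module/nvidia_drm/refcnt.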
If the X11 server is running and using the nvidia driver, then the nvidia_drm kernel module will most assuredly be in use. So you'll need, at the very least, to switch to a text console and shut down the X11 server. Usually this can be done by stopping whichever X Display Manager service you're using (which depends on your desktop environment).
As the error message said, if you are running nvidia-persistenced, you'll need to stop that too before you can unload the nvidia_drm module.
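The error message also mentions kernels built without support for module unloading. One way to check this is to look for CONFIG_MODULE_UNLOAD=y in the kernel configuration. A minimal sketch, assuming the distribution installs the config file as /boot/config-<kernel release> (Debian and Ubuntu normally do); the function name is invented for illustration.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the kernel config file given as $1
# has module unloading enabled (CONFIG_MODULE_UNLOAD=y).
supports_module_unload() {
    grep -q '^CONFIG_MODULE_UNLOAD=y$' "$1"
}

# Typical use, assuming the config file lives under /boot:
#   supports_module_unload "/boot/config-$(uname -r)" && echo "module unloading is supported"
```

If the check succeeds and the module still cannot be removed, the remaining suspects are the ones listed above: the X server, a CUDA program, or nvidia-persistenced.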
Solution 4
I solved this problem by disabling the GUI, rebooting, logging in and installing the driver, enabling the GUI, and rebooting again.
Please make sure you know your username and password!!!
Open a terminal and write
sudo systemctl set-default multi-user.target
sudo reboot 0
Now log in and you'll get a text console directly. Install the driver. Note that I am installing version 440.44 here, so adjust the file name for your driver version.
sudo ./NVIDIA-Linux-x86_64-440.44.run
After installing the driver, enable the GUI and reboot:
sudo systemctl set-default graphical.target
sudo reboot 0
You should be done.
In my case, nvidia-smi reported the new version 440.44, while the Additional Drivers tab of the Ubuntu 18.04 Software & Updates utility shows 435!! Another NVIDIA mystery, but heck, my new docker works!!!
Solution 5
I had a similar problem.
Reason: the nvidia-drm module was in use.
I fixed it by purging all NVIDIA packages.
Remove all previous NVIDIA installations with these 2 commands:
$ sudo apt-get purge 'nvidia*'
$ sudo apt-get autoremove
Module should be removed.
Reboot and go forth.
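Before purging, it can help to see which installed packages the nvidia wildcard would hit. A sketch, assuming the usual dpkg -l table in which installed packages carry the status ii in the first column; the function name is invented for illustration.

```shell
#!/bin/sh
# Hypothetical helper: read `dpkg -l` output on stdin and print the names
# of installed packages (status "ii") whose name starts with "nvidia".
installed_nvidia_packages() {
    awk '$1 == "ii" && $2 ~ /^nvidia/ { print $2 }'
}

# Typical use on a Debian or Ubuntu system:
#   dpkg -l | installed_nvidia_packages
```

apt-get also has a simulation mode, so sudo apt-get -s purge 'nvidia*' should show what would be removed without changing anything.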
Rodrigo
Programmer and biologist, I'd love to leave the world a little better than I've found it.
Updated on September 18, 2022
Comments
-
Rodrigo over 1 year
I'm trying to install the most up-to-date NVIDIA driver in Debian Stretch. I've downloaded NVIDIA-Linux-x86_64-390.48.run from here, but when I try to do sudo sh ./NVIDIA-Linux-x86_64-390.48.run as suggested, an error message appears.
ERROR: An NVIDIA kernel module 'nvidia-drm' appears to already be loaded in your kernel. This may be because it is in use (for example, by an X server, a CUDA program, or the NVIDIA Persistence Daemon), but this may also happen if your kernel was configured without support for module unloading. Please be sure to exit any programs that may be using the GPU(s) before attempting to upgrade your driver. If no GPU-based programs are running, you know that your kernel supports module unloading, and you still receive this message, then an error may have occured that has corrupted an NVIDIA kernel module's usage count, for which the simplest remedy is to reboot your computer.
When I try to find out who is using nvidia-drm (or nvidia_drm), I see nothing.
~$ sudo lsof | grep nvidia-drm
lsof: WARNING: can't stat() fuse.gvfsd-fuse file system /run/user/1000/gvfs
      Output information may be incomplete.
~$ sudo lsof -e /run/user/1000/gvfs | grep nvidia-drm
~$
And when I try to remove it, it says it's being used.
~$ sudo modprobe -r nvidia-drm
modprobe: FATAL: Module nvidia_drm is in use.
~$
I have rebooted and started in text-only mode (by pressing Ctrl+Alt+F2 before giving username/password), but I got the same error.
Besides that, how do I "know that my kernel supports module unloading"?
I'm getting a few warnings on boot up related to nvidia, no idea if they're related, though:
Apr 30 00:46:15 debian-9 kernel: nvidia: loading out-of-tree module taints kernel.
Apr 30 00:46:15 debian-9 kernel: nvidia: module license 'NVIDIA' taints kernel.
Apr 30 00:46:15 debian-9 kernel: Disabling lock debugging due to kernel taint
Apr 30 00:46:15 debian-9 kernel: NVRM: loading NVIDIA UNIX x86_64 Kernel Module 375.82 Wed Jul 19 21:16:49 PDT 2017 (using threaded interrupts)
-
Admin about 6 years
Can you try to do it in rescue mode?
-
Admin about 6 years
See this issue on github: systemctl stop systemd-logind before unloading the modules.
-
Admin about 6 years
@GAD3R All I have is systemctl stop systemd-logind.service, but this closes the screen and takes me back to the graphic login, where I have to do Ctrl+Alt+F2 again.
-
Rodrigo about 6 years
After Ctrl+Alt+F2, lsmod is telling me there's 1 process using nvidia_drm. So I did sudo /etc/init.d/gdm3 stop, which went ok in stopping it. But still 1 process in lsmod. Now inside Gnome, ps aux | grep nvidia shows [irq/129-nvidia] and [nvidia] but no nvidia-persistenced. Also, here lsmod shows 2 processes using nvidia_drm. I'm stuck.
-
Rodrigo about 6 years
Guess I'll wait for a less clumsy solution, if any shows up.
-
Rodrigo about 6 years
I managed to uninstall it (using Filipe's answer), and install the new version to the point where there was no more working graphic mode. I had to format the PC and reinstall Debian. Now to a completely different set of bugs... All this just to see "GPU" as an option of rendering in Blender, and I still don't see it. Proprietary drivers suck!
-
Rodrigo about 6 years
I managed to uninstall it (using your answer), and install the new version to the point where there was no more working graphic mode. I had to format the PC and reinstall Debian. Now to a completely different set of bugs... All this just to see "GPU" as an option of rendering in Blender, and I still don't see it. Proprietary drivers suck!
-
John Bollinger about 6 years
@Rodrigo, I'm sorry you had such a poor experience. But that sort of problem is an example of why I recommend using packages instead of performing manual installations.
-
Rodrigo about 6 years
Yes, I prefer using packages. But I've read somewhere that the GPU option in Blender probably wasn't enabled because of an outdated driver...
-
Don Kirkby over 5 years
This worked for me without the modprobe step.
-
David Jung about 5 years
Yeah, I didn't need the modprobe step either.
-
Scott - Слава Україні about 5 years
So, what are you saying? That the only solution is to reinstall? That obviously isn’t the only solution; other answers have been posted.
-
addison over 4 years
I can't remove nvidia-drm even when in the text console. Any idea how I can forcibly remove it?
-
filbranden over 4 years
@addison Note that it's not enough to just be on a text console; you need to stop X11 or Wayland or whatever is using the nvidia driver from the kernel. The point of the systemctl isolate command is to do that. But it's possible that's not correctly configured in your system... Check ps -ef and see if you can spot what might be using the driver, then have that process stopped. That should allow you to unload the driver.
-
Yuri Feldman about 4 years
Disabling the GUI as above is the only thing that worked for me through ssh!
-
GG. almost 3 years
Same for me, I was unable to unload nvidia-drm manually ("module not found" or something, although it was loaded) but disabling the GUI and rebooting did it.
-
Mr.Robot almost 3 years
Great answer! I was struggling with the solution provided by the top answer for several hours (systemctl isolate multi-user.target only gives me a black screen). Your answer saved my day!
-
user1315621 over 2 years
sudo modprobe -r nvidia-drm raises the error modprobe: FATAL: Module nvidia_drm is in use.
-
diffracteD about 2 years
Works, but # modprobe -r nvidia-drm is not necessary.