Blacklist an Nvidia GPU for QEMU/KVM passthrough


Solution 1

I ran into a similar problem (Lubuntu 16.04). It happens because other drivers/modules bind to the devices before pci-stub gets the chance to. You have at least two options:

The first and easiest is to blacklist the module that claims the device. Run lspci -knn | grep VGA -A 5 to list your VGA PCI devices with their vendor:device IDs and their kernel modules.

01:00.0 VGA compatible controller [0300]: NVIDIA Corporation Device [10de:128b] (rev a1)
    Subsystem: Micro-Star International Co., Ltd. [MSI] Device [1462:8c93]
    Kernel driver in use: nouveau
    Kernel modules: nvidiafb, nouveau
01:00.1 Audio device [0403]: NVIDIA Corporation GK208 HDMI/DP Audio Controller [10de:0e0f] (rev a1)
    Subsystem: Micro-Star International Co., Ltd. [MSI] GK208 HDMI/DP Audio Controller [1462:8c93]
--
02:00.0 VGA compatible controller [0300]: NVIDIA Corporation GM204 [GeForce GTX 970] [10de:13c2] (rev a1)
    Subsystem: ZOTAC International (MCO) Ltd. GM204 [GeForce GTX 970] [19da:1366]
    Kernel driver in use: nouveau
    Kernel modules: nvidiafb, nouveau
02:00.1 Audio device [0403]: NVIDIA Corporation GM204 High Definition Audio Controller [10de:0fbb] (rev a1)
    Subsystem: ZOTAC International (MCO) Ltd. GM204 High Definition Audio Controller [19da:1366]
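
If the listing gets long, the driver bound to each device can be pulled out with a small filter. This is only a sketch: the function name drivers_in_use is made up for illustration, and it assumes the usual lspci -k layout where each device line starts at column 0 and its details are indented.

```shell
#!/bin/sh
# Print "<slot> <driver>" for every device in `lspci -k` style output on stdin.
# Devices with no "Kernel driver in use" line are printed with "-" as the driver.
drivers_in_use() {
    awk '
        /^[0-9a-f]/ {                       # a device line starts with its slot address
            if (slot != "") print slot, driver
            slot = $1; driver = "-"
        }
        /Kernel driver in use:/ { driver = $NF }
        END { if (slot != "") print slot, driver }
    '
}
```

Usage would be something like lspci -knn | drivers_in_use, which condenses the output above to one line per device.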

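Now check which driver is in use. In my case nouveau grabbed the VGA device 02:00.0, which I want to use for my VM, so I blacklist that module. Open the blacklist file:

sudo nano /etc/modprobe.d/blacklist.conf

and add this line:

blacklist nouveau

Reboot and you are done. If the module is part of your initramfs, run sudo update-initramfs -u before rebooting.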
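
The same edit can be scripted instead of done in an editor. The sketch below appends the line only if it is missing; the function name add_blacklist and the default file name blacklist-passthrough.conf are hypothetical, and writing under /etc needs root.

```shell
#!/bin/sh
# Append "blacklist <module>" to a modprobe.d file unless the line is already there.
# Usage: add_blacklist <module> [file]
add_blacklist() {
    module=$1
    file=${2:-/etc/modprobe.d/blacklist-passthrough.conf}   # hypothetical file name
    line="blacklist $module"
    # -x matches the whole line, -F treats the pattern as a fixed string.
    if ! grep -qxF "$line" "$file" 2>/dev/null; then
        printf '%s\n' "$line" >> "$file"
    fi
}
```

Calling add_blacklist nouveau twice leaves a single entry, so the script is safe to rerun.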

In my case this would cause a problem: I have two NVIDIA cards installed (01:00.0 and 02:00.0), both driven by the same module, so blacklisting it would take the driver away from the host card as well. I therefore do not blacklist the driver.

Instead I manually unbind nouveau from the 02:00.0 card, which I want to use for the VM guest, and leave the 01:00.0 card to the Linux host. This guide showed me how: https://lwn.net/Articles/143397/

Run sudo tree /sys/bus/pci/drivers/nouveau, replacing nouveau with whatever module grabbed your device.

You should receive a listing like this:

/sys/bus/pci/drivers/nouveau
├── 0000:01:00.0 -> ../../../../devices/pci0000:00/0000:00:03.0/0000:01:00.0
├── 0000:02:00.0 -> ../../../../devices/pci0000:00/0000:00:05.0/0000:02:00.0
├── bind
├── module -> ../../../../module/drm
├── new_id
├── remove_id
├── uevent
└── unbind

We see that the nouveau driver has two devices bound to it: 0000:01:00.0 and 0000:02:00.0.

To unbind and rebind my graphics card I first need to stop lightdm.service. To do that I switch to a console outside of the desktop environment, for example with CTRL+ALT+F2, log in as root and run systemctl stop lightdm.service.

Now I can unbind the module from the graphics-card:

echo -n "0000:02:00.0" > /sys/bus/pci/drivers/nouveau/unbind

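and bind it to whichever module I want (pci-stub or vfio-pci). I used vfio-pci. The target module has to be loaded, and it may refuse the bind with "No such device" until it has been told about the device ID, for example through its new_id file (echo "10de 13c2" > /sys/bus/pci/drivers/vfio-pci/new_id).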

echo -n "0000:02:00.0" > /sys/bus/pci/drivers/vfio-pci/bind

After that you can start your display manager again: systemctl start lightdm.service

If everything worked, the device is now bound to the module you specified. Check with lspci -knn | grep VGA -A 5 again:

01:00.0 VGA compatible controller [0300]: NVIDIA Corporation Device [10de:128b] (rev a1)
    Subsystem: Micro-Star International Co., Ltd. [MSI] Device [1462:8c93]
    Kernel driver in use: nouveau
    Kernel modules: nvidiafb, nouveau
01:00.1 Audio device [0403]: NVIDIA Corporation GK208 HDMI/DP Audio Controller [10de:0e0f] (rev a1)
    Subsystem: Micro-Star International Co., Ltd. [MSI] GK208 HDMI/DP Audio Controller [1462:8c93]
--
02:00.0 VGA compatible controller [0300]: NVIDIA Corporation GM204 [GeForce GTX 970] [10de:13c2] (rev a1)
    Subsystem: ZOTAC International (MCO) Ltd. GM204 [GeForce GTX 970] [19da:1366]
    Kernel driver in use: vfio-pci
    Kernel modules: nvidiafb, nouveau
02:00.1 Audio device [0403]: NVIDIA Corporation GM204 High Definition Audio Controller [10de:0fbb] (rev a1)
    Subsystem: ZOTAC International (MCO) Ltd. GM204 High Definition Audio Controller [19da:1366]
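
The unbind/bind steps can be wrapped in one function. The sketch below is not what I ran above: it uses the driver_override attribute that reasonably recent kernels expose for each PCI device, so it does not need to know which driver currently holds the card. The names rebind_pci and SYSFS_PCI are made up for illustration. It still has to run as root, with the display manager stopped and the target module loaded.

```shell
#!/bin/sh
# Rebind one PCI device to another driver through sysfs.
# SYSFS_PCI defaults to the real tree; point it elsewhere only for dry runs.
SYSFS_PCI=${SYSFS_PCI:-/sys/bus/pci}

rebind_pci() {
    dev=$1        # full address, e.g. 0000:02:00.0
    driver=$2     # target driver, e.g. vfio-pci
    devdir="$SYSFS_PCI/devices/$dev"

    [ -d "$devdir" ] || { echo "no such device: $dev" >&2; return 1; }

    # Release the device from whichever driver currently holds it, if any.
    if [ -e "$devdir/driver/unbind" ]; then
        printf '%s' "$dev" > "$devdir/driver/unbind"
    fi

    # Restrict the device to the target driver, then ask that driver to bind it.
    printf '%s' "$driver" > "$devdir/driver_override"
    printf '%s' "$dev" > "$SYSFS_PCI/drivers/$driver/bind"
}
```

For the card above the call would be rebind_pci 0000:02:00.0 vfio-pci. The audio function 0000:02:00.1 sits in the same slot and would normally be handed over the same way.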

Unfortunately this workaround does not survive a reboot, and I have not yet found out how to make it persistent. Maybe somebody else can give me a hint. Something like a startup script should be possible, I guess. But it would be better to be able to bind the device to a specific module without having to unbind it first. Imagine I want to use the nvidia driver one day: unbinding from nouveau would then be useless, since the graphics card would be bound to the nvidia module.
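
One commonly used way to get persistence, which I have not tested on this setup, is to let vfio-pci claim the card by vendor:device ID at boot and to ask modprobe to load it before the competing driver. That only helps when the two cards have different IDs, as they do here (10de:128b for the host card, 10de:13c2 for the guest card). The sketch below just prints the two modprobe.d lines; the function name vfio_conf and the file name vfio.conf are hypothetical.

```shell
#!/bin/sh
# Print modprobe.d lines that hand the given PCI IDs to vfio-pci at boot.
# Usage: vfio_conf <comma-separated vendor:device IDs> <driver to pre-empt>
vfio_conf() {
    ids=$1
    other=$2
    # vfio-pci claims every device whose vendor:device ID is listed here.
    printf 'options vfio-pci ids=%s\n' "$ids"
    # softdep asks modprobe to load vfio-pci before the other driver.
    printf 'softdep %s pre: vfio-pci\n' "$other"
}
```

Something like vfio_conf 10de:13c2,10de:0fbb nouveau | sudo tee /etc/modprobe.d/vfio.conf followed by sudo update-initramfs -u and a reboot would apply it. Depending on the distribution, vfio-pci may also have to be listed in /etc/initramfs-tools/modules so that it is available early enough.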

Solution 2

I'm setting up qemu-kvm passthrough as well, and I had the same problem as you. I use my integrated Intel graphics as the primary GPU, so I opened the NVIDIA settings and disabled hybrid graphics so that the NVIDIA card is not used.

After that I had no problem binding the card to vfio-pci.

It is possible that the nvidia modules will still cause trouble when you start qemu, or that you don't have the option to turn off hybrid graphics. In that case you can try what I also did: manually unload the nvidia modules with a script like this one, run from console mode (CTRL+ALT+F1):

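#!/bin/bash
# Run from a text console, not from inside the desktop session.
sudo service lightdm stop    # stop the display manager so the GPU is released
# Unload the nvidia modules, dependents first.
sudo rmmod nvidia_uvm
sudo rmmod nvidia_drm
sudo rmmod nvidia_modeset
sudo rmmod nvidia
sudo service lightdm start   # bring the display manager back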

This stops the display manager (lightdm in my case), unloads the nvidia modules in dependency order, and starts the display manager again. Make sure to launch it from console mode: running it from the desktop will most likely interrupt the script after the first line.

The nvidia modules will load again automatically when you reboot, but you can also load them manually. The -a flag is needed so that every name is treated as a module rather than as a parameter of the first one:

sudo modprobe -a nvidia nvidia_modeset nvidia_drm nvidia_uvm

Hope this helps.

Solution 3

Deactivate nvidia/nouveau using the GRUB config.

You can pass the module_blacklist=<module1>[,<module2>] directive (documentation) to the kernel on the GRUB 2 command line. I was able to deactivate the nouveau and nvidia drivers by adding the following to GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub (don't forget to run sudo update-grub afterwards):

module_blacklist=nvidia,nvidia_uvm,nvidia_drm,nvidia_modeset,nouveau
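
After the reboot it is worth checking that the parameter actually reached the kernel. A small sketch for that, with the hypothetical function name blacklisted_modules; it reads /proc/cmdline unless a string is passed in.

```shell
#!/bin/sh
# Print the modules named in a module_blacklist= kernel parameter, one per line.
# Reads /proc/cmdline by default; pass a string to parse that instead.
blacklisted_modules() {
    cmdline=${1:-$(cat /proc/cmdline)}
    printf '%s\n' "$cmdline" |
        tr ' ' '\n' |                          # one kernel argument per line
        sed -n 's/^module_blacklist=//p' |     # keep only the value of the parameter
        tr ',' '\n'                            # one module per line
}
```

With a line such as GRUB_CMDLINE_LINUX_DEFAULT="quiet splash module_blacklist=nvidia,nvidia_uvm,nvidia_drm,nvidia_modeset,nouveau" in /etc/default/grub, running blacklisted_modules after the reboot should list those five modules. Empty output means the parameter was not applied.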

It is also possible to generate GRUB entries with and without this option automatically for each kernel: https://unix.stackexchange.com/questions/24670/choose-at-grub-menu-whether-nvidia-driver-should-be-used (first answer). That turned out to be more cumbersome than expected, though, because the Ubuntu GRUB config is very complicated. Make a backup before tinkering with it.

This is especially helpful if you want to use a powerful NVIDIA card for gaming in a virtual machine via VGA passthrough, yet keep the option of using it on the host for deep learning with something like TensorFlow. Only a reboot is required to switch between the two.

Author by

Mathyn

Updated on September 18, 2022

Comments

  • Mathyn
    Mathyn over 1 year

    I am trying to run Windows in a virtual machine while giving the VM a direct passthrough to the GPU for better performance.

    I have an integrated intel GPU (I will use this one for the host) and a Nvidia GTX980 (I want this one for the VM). I use Elementary OS 0.3.2 Freya 64 bit.

    I have followed this guide but am now stuck at step 2. I cannot get the Nvidia gpu to be blacklisted.

    To start with I do lspci -nn | grep NVIDIA

    This results in the following output

    01:00.0 VGA compatible controller [0300]: NVIDIA Corporation Device [10de:13c0] (rev a1)
    01:00.1 Audio device [0403]: NVIDIA Corporation Device [10de:0fbb] (rev a1)
    

    Next I added this line to the /etc/initramfs-tools/modules file.

    pci_stub ids=10de:13c0,10de:0fbb
    

    And I then reloaded using update-initramfs -u and afterwards rebooted.

    After the reboot when I run dmesg | grep pci-stub I get the following output:

    [    2.029626] pci-stub: add 10DE:13C0 sub=FFFFFFFF:FFFFFFFF cls=00000000/00000000
    [    2.029630] pci-stub: add 10DE:0FBB sub=FFFFFFFF:FFFFFFFF cls=00000000/00000000
    [    2.029637] pci-stub 0000:01:00.1: claimed by stub
    

    As you can see, only the audio function (01:00.1) is claimed by the stub; the VGA function (01:00.0) is not.

    I have also tried adding this option directly to the grub file in /etc/default/grub so the GRUB_CMDLINE_LINUX_DEFAULT line looks like this:

    GRUB_CMDLINE_LINUX_DEFAULT="quiet splash intel_iommu=on pci-stub.ids=10de:13c0,10de:0fbb"
    

    But this also resulted in the Nvidia card not being blacklisted.

    Anyone got any idea what might be causing this?

  • Mike Ounsworth
    Mike Ounsworth almost 8 years
    I have spent a day now trying to decipher your post, and post #4 from the linked forum. Can you please be explicit about what commands you ran to blacklist your drivers, and choose new drivers?
  • Autumn
    Autumn about 7 years
    @Dan, your link is dead. Can you describe in general terms what you meant by "choose my driver to be nvidia"?
  • Dan
    Dan about 7 years
    I can't remember :), this was a year ago and I had just started learning Linux at that time. I edited my post to show the steps I currently take.
  • Mathyn
    Mathyn about 7 years
    Good one, so far I have had no success but maybe I will with this guide. Thanks!
  • guntbert
    guntbert about 6 years
    @spooky we just rejected your edit suggestion (askubuntu.com/questions/202404/…), because it changed too much. Please feel free to create your own answer though.
  • Guerlando OCs
    Guerlando OCs about 4 years
    This is what worked for me! But I'm having this issue, and I don't know if it's related: askubuntu.com/questions/1211676/…