KVM/QEMU hosts best-practice distro selection (and package)

Solution 1

I don’t think there’s a single “best practice” for a “modern and performant KVM host for production”. What’s best practice for you will depend on your environment: your hardware, your support contracts, and the workloads you need to support. However, it’s generally safe to say that “production” use involves released products or projects, unless you have the resources to provide your own support rather than relying on your provider.

As you say, one might expect Red Hat’s offerings to provide good support for virtualisation, since a lot of the upstream work is driven by Red Hat engineers. For example, virtio-fs was added in RHEL kernel 4.18.0-149, and all the surrounding infrastructure is available in the related packages. Figuring that out isn’t necessarily obvious though, unless you have a RHEL 8.2 setup to try it on; it isn’t mentioned in the release notes, as far as I can tell. Depending on the scale of your setup, you’d use RHEL, RHV, and/or RHOSP, and if you had any extra requirements, you’d discuss them with your account manager...
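For illustration, once the host stack supports it, a virtio-fs share is wired up in the libvirt domain XML roughly like this. This is only a sketch: the host directory `/srv/vm-share` and the tag `hostshare` are made-up examples, and virtio-fs requires shared memory backing for the guest:

```xml
<domain type='kvm'>
  <!-- virtio-fs needs memory shared between host and guest -->
  <memoryBacking>
    <source type='memfd'/>
    <access mode='shared'/>
  </memoryBacking>
  <devices>
    <!-- export a host directory; "hostshare" is an arbitrary tag
         the guest refers to at mount time -->
    <filesystem type='mount' accessmode='passthrough'>
      <driver type='virtiofs'/>
      <source dir='/srv/vm-share'/>
      <target dir='hostshare'/>
    </filesystem>
  </devices>
</domain>
```

Inside the guest the share is then mounted with `mount -t virtiofs hostshare /mnt`.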

Solution 2

I believe I can answer that. KVM evolves every second; you need to keep up with it, but sanely...

Short answer:

Proceed in three lanes, ensuring stability first and improving performance on top of it:

Ubuntu 18.04 LTS ---> Arch Linux (bleeding edge) 
X79 chipset ---> Today's consumer and server systems 
libvirt ---> QEMU command line 

First you should use a tested, stable (relatively old) version and figure out how it works. I prefer Ubuntu 18.04 LTS for that, and a stable, VFIO-friendly mainboard that has good IOMMU groupings and dual PCIe slots for dual GPUs; put the Linux card in the bottom slot if you want to go headless like me. The board should also have an option to disable the video OpROM, and detailed CSM options. I practiced on an X79 board with a Xeon 2680. Start with Windows and improve its performance first: BIOS, UEFI, OVMF, etc. Improvements: CPU and cache passthrough, virtio drivers, clock settings, instruction-set optimizations, feature flags, CPU pinning, iothread pinning, isolating CPUs, Hyper-V enlightenments, NVIDIA GTX/RTX and AMD RX passthrough, USB hotplug scripts, etc.
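Several of the tweaks above (CPU and cache passthrough, pinning, Hyper-V enlightenments) end up as libvirt domain XML along these lines. This is a sketch only: the vCPU count, host core numbers, and topology are examples for an 8-thread guest and must match your own hardware:

```xml
<vcpu placement='static'>8</vcpu>
<cputune>
  <!-- pin each guest vCPU to a dedicated host thread -->
  <vcpupin vcpu='0' cpuset='2'/>
  <vcpupin vcpu='1' cpuset='3'/>
  <vcpupin vcpu='2' cpuset='4'/>
  <vcpupin vcpu='3' cpuset='5'/>
  <vcpupin vcpu='4' cpuset='6'/>
  <vcpupin vcpu='5' cpuset='7'/>
  <vcpupin vcpu='6' cpuset='8'/>
  <vcpupin vcpu='7' cpuset='9'/>
  <!-- keep the QEMU emulator threads off the pinned cores -->
  <emulatorpin cpuset='0-1'/>
</cputune>
<cpu mode='host-passthrough' check='none'>
  <topology sockets='1' cores='4' threads='2'/>
  <cache mode='passthrough'/>
</cpu>
<features>
  <!-- Hyper-V enlightenments mainly benefit Windows guests -->
  <hyperv>
    <relaxed state='on'/>
    <vapic state='on'/>
    <spinlocks state='on' retries='8191'/>
  </hyperv>
</features>
```

Isolating the pinned host cores from the host scheduler (e.g. via `isolcpus` or cpusets) is a separate, host-side step.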

After that you should start working with new boards and CPUs. At the tip of the iceberg there are Ryzen consumer boards with crappy chipsets, lacking BIOS settings, and only a single GPU slot, and the newest server systems with some Grid GPUs...

To reach the tip of the iceberg you need to work with Arch Linux (bleeding edge), do some BIOS editing, etc., so that you can find out how to create optimally stable and performant KVM guests.

PS: I have reached 99.4% of bare-metal performance and play Apex Legends at 144 FPS at 2K, high detail, without shadows and dynamic lighting, on a Ryzen 9 3900XT with a 2070 Super. No Looking Glass, on Ubuntu 18.04 with qemu-kvm and libvirt. If you want that, you need to compile QEMU and libvirt from source. You'd better advance to Arch Linux or, optionally, Fedora.

Author: MrCalvin
Updated on September 18, 2022

Comments

  • MrCalvin
    MrCalvin almost 2 years

    If I want to run a modern production virtualizer I would expect to turn to RHEL/CentOS, as Red Hat is the main maintainer of most of the virtualizer "components".

    But the newest stable version of RHEL/CentOS comes with very old kernel and QEMU/libvirt versions, which presumably come nowhere near supporting newer features and performance enhancements, e.g. TRIM support in the virtio-blk driver and virtio-fs for host folder sharing, just to mention some that come to mind.

    What do people do out there?

    • Do RHEL/CentOS backport those features into their old versions anyway?
    • Do people install the new versions from "test" repositories? And is that advisable on production hosts?
    • Are people just advised to run those old versions and wait 3-6 years before they arrive in the stable release?

    But then again, I read everywhere that people run these new features on production hosts; e.g. the virtio-fs page says:

    Virtio-fs is used in production and has been available since Linux 5.4, QEMU 5.0, and libvirt 6.2.
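    A quick way to sanity-check a host against those minimums (Linux 5.4, QEMU 5.0, libvirt 6.2) is a version comparison with `sort -V` — with the big caveat, which is the whole point of this question, that a naive version check is exactly what backport-heavy distros like RHEL defeat: their 4.18 kernel can still carry the virtio-fs feature. The version strings below are illustrative; on a real host you would substitute the output of `uname -r`, `qemu-system-x86_64 --version`, and `libvirtd --version`.

```shell
# Succeeds if $1 (installed) >= $2 (required), comparing
# version-number fields with sort -V.
meets_minimum() {
    [ "$(printf '%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Example checks against the virtio-fs minimums quoted above;
# the installed versions here are made-up sample values.
meets_minimum "4.18.0" "5.4" && echo "kernel: new enough" \
    || echo "kernel: older than 5.4 (backports may still apply)"
meets_minimum "5.2.0" "5.0" && echo "qemu: new enough" \
    || echo "qemu: older than 5.0"
```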

    So my question in short: what is best practice for a modern and performant KVM host for production?

    I am aware of the balance between stable and testing, and that stable can never run just-released versions, the classic dilemma in the distro world.

    It seems Debian stable uses newer versions (but still old), openSUSE seems extremely old, and Ubuntu is the most modern.

    • Stephen Kitt
      Stephen Kitt about 4 years
      What version of RHEL are you looking at? RHEL 8 has QEMU 4.2.0 with a number of backported fixes (and Debian 10 has version 3.1). Not the latest as you say, but I just want to make sure we’re both looking at the same thing before writing an answer...
    • MrCalvin
      MrCalvin about 4 years
      The newest stable versions ;-) Specifying exact version numbers doesn't really make any difference to my question, in my view. Debian might have an older QEMU, but as I recall the kernel is newer. But you say that RHEL's virtualizer components (kernel, QEMU, libvirt) have new features backported, so they actually are more "modern" than they seem? Is there anywhere one can rather easily see those features? I'm sure I could find them in the source code, but that's not an easy path :-P
    • Stephen Kitt
      Stephen Kitt about 4 years
      Some people still consider that the latest stable RHEL is RHEL 7, I just wanted to make sure that wasn’t the case here.
    • dyasny
      dyasny about 4 years
      You should really read up on what backports are and how they work.
  • MrCalvin
    MrCalvin about 4 years
    Interesting that virtio-fs was backported into their 4.18 kernel. As far as I can tell, virtio-fs was introduced in kernel 5.4 (upstream, of course). Maybe most of the good new stuff is actually backported to their "old" kernel and packages.
  • Stephen Kitt
    Stephen Kitt about 4 years
    I can speak only for myself, and I’m not involved with virtualisation, but I don’t develop features in a vacuum, and it’s fair to imagine that applies to others...
  • MrCalvin
    MrCalvin about 4 years
    Just found this RHEL article on the subject: access.redhat.com. It seems they sure do a lot of backporting; now it makes more sense!
  • MrCalvin
    MrCalvin over 3 years
    Sure, Arch is bleeding edge. But in practice I don't have time to constantly update the servers; in reality 6-18 months pass between each update, unfortunately. And it seems it is advised to update Arch regularly to avoid breaking dependencies etc.