Qemu TRIM and discard on a physical SSD device

Solution 1

Research

Qemu treats discard=unmap and discard=on identically, as you can see in its source code:

block.c (L 1102): if (!strcmp(mode, "on") || !strcmp(mode, "unmap"))

It also appears to support several of the Linux block-level ioctls, as described here, for discarding ranges or writing zeros:

block/file-posix.c (L 744): if (ioctl(s->fd, BLKDISCARDZEROES, &arg) == 0 && arg)

block/file-posix.c (L 1621): if (ioctl(aiocb->aio_fildes, BLKZEROOUT, range) == 0)

block/file-posix.c (L 1788): if (ioctl(aiocb->aio_fildes, BLKDISCARD, range) == 0)

So, based on this, block passthrough with SCSI emulation and the options discard=unmap,detect-zeroes=unmap should work, unless you are using an old Qemu machine type or a buggy Qemu version.
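
Before involving Qemu at all, it can help to confirm that the host kernel reports discard support for the backing device. The sketch below reads the queue/discard_max_bytes attribute from sysfs, where a non-zero value means the block layer accepts discards for that device. The function name and the SYSFS_ROOT override are assumptions added here for illustration, not part of the original answer.

```shell
# Succeed if the named block device (e.g. "sda") advertises discard support.
# SYSFS_ROOT is a hypothetical override for testing; it defaults to /sys.
supports_discard() {
    sd_file="${SYSFS_ROOT:-/sys}/block/$1/queue/discard_max_bytes"
    # Return 2 when the device or the attribute does not exist.
    [ -r "$sd_file" ] || return 2
    # A non-zero maximum discard size means discards are accepted.
    [ "$(cat "$sd_file")" -gt 0 ]
}
```

For example, `supports_discard sda && echo "discard supported"`. Running `lsblk --discard` on the host shows similar information in its DISC-GRAN and DISC-MAX columns.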

Example

Found an excellent presentation here.

Lessons learned from the presentation:

  1. You must be running Qemu/KVM as root, or as a user with the CAP_SYS_RAWIO capability, for discard not to be ignored by Linux.
  2. If your passthrough device is truly a SCSI disk, it should honor the real SCSI UNMAP and WRITE SAME commands, and you can use scsi-block to pass it through.
  3. If not, you will have to emulate a SCSI disk with scsi-hd, which sends the discard commands through Qemu to the Linux block layer.
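
To check the first point for a running Qemu process, one option is to decode the CapEff mask in /proc/<pid>/status. This is a minimal sketch: CAP_SYS_RAWIO is capability number 17 in linux/capability.h, and the helper names are made up for this example.

```shell
# CAP_SYS_RAWIO is capability number 17 in linux/capability.h.
CAP_SYS_RAWIO_BIT=17

# Print the effective capability mask (hexadecimal) of the given PID.
capeff_of() {
    awk '/^CapEff:/ { print $2 }' "/proc/$1/status"
}

# Succeed if the hexadecimal mask in $1 has the CAP_SYS_RAWIO bit set.
has_cap_sys_rawio() {
    [ $(( (0x$1 >> $CAP_SYS_RAWIO_BIT) & 1 )) -eq 1 ]
}
```

Usage, assuming a single Qemu process whose PID is in $pid: `has_cap_sys_rawio "$(capeff_of "$pid")" && echo "CAP_SYS_RAWIO present"`.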

For me, although using scsi-block to passthrough allowed access to stats and SMART info for the real device, and regular IO worked fine, the discard command was not supported.

My backing device is really a SATA (ATA) disk, not a SCSI LUN, so I am guessing that is why discard is not supported this way.

If you switch from scsi-block to scsi-hd, you lose stats and SMART info but gain discard, so it is a trade-off.

Personally, I did not experience any noticeable performance drop going from 'true passthrough' to 'emulated with passthrough' for my needs.

Here is an example of Virtio SCSI with emulated SCSI and a backing block device:

    -device virtio-scsi-pci,id=scsi \
    -blockdev driver=raw,node-name=disk.0,cache.direct=on,discard=unmap,file.driver=host_device,file.aio=native,file.filename=/dev/disk/by-id/ata-Samsung_SSD_840_PRO_Series_S12PNEAD233247L \
    -device scsi-hd,drive=disk.0,bus=scsi.0

The one part you will not find in the Qemu documentation is file.driver=host_device. It is needed for scsi-block to work, and it does not seem to hurt scsi-hd either when the backing store is a real block device rather than a file on the host filesystem.
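
For setups that still use the older -drive syntax, as in the original question, a roughly equivalent set of arguments can be sketched as follows. The function name and the ids scsi and disk0 are illustrative, and I have not verified that this behaves identically to the -blockdev form above on every Qemu version.

```shell
# Print Qemu arguments (one per line) for an emulated SCSI disk backed by a
# host block device, using the legacy -drive syntax.
# The ids "scsi" and "disk0" are illustrative names, not required values.
scsi_hd_args() {
    # cache=none corresponds to cache.direct=on, which aio=native requires.
    printf '%s\n' \
        "-device" "virtio-scsi-pci,id=scsi" \
        "-drive" "file=$1,if=none,id=disk0,format=raw,cache=none,aio=native,discard=unmap,detect-zeroes=unmap" \
        "-device" "scsi-hd,drive=disk0,bus=scsi.0"
}
```

Calling `scsi_hd_args /dev/disk/by-id/<your-device>` prints the arguments so they can be reviewed before being appended to the rest of the Qemu command line.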

Test

The blktrace tool I used to test Linux block level function calls is documented here.

You can run the blktrace and blkparse programs together to intercept discard calls:

    blktrace -a discard -d /dev/disk/by-id/ata-Samsung_SSD_840_PRO_Series_S12PNEAD233247L -o - | blkparse -i -

Now when you run defrag /L c: or fstrim -v / in your VM, you will see many discards printed on the host. Example snippet from the output:

    8,0    1      493     0.641661863  3118  Q  DS 45458024 + 728 [qemu-system-x86]
    8,0    1      494     0.641664662  3118  G  DS 45458024 + 728 [qemu-system-x86]
    8,0    1      495     0.641665920  3118  I  DS 45458024 + 728 [qemu-system-x86]
    8,0    1      496     0.641669312  3118  D  DS 45458024 + 728 [qemu-system-x86]

So that is proof enough for me that discard is working.
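
If the raw trace is too noisy, the blkparse output can be summarized instead. In the default blkparse format shown above, field 6 is the action (Q queued, G get request, I inserted, D issued to the driver), field 7 is the RWBS flags (DS is a synchronous discard), and the number after the + is the length in 512-byte sectors. The sketch below counts only discards that were actually issued to the driver; the function name is an assumption for this example.

```shell
# Read default-format blkparse output on stdin and summarize the discard
# requests that reached the driver (action "D" with an RWBS field starting
# with "D"). Lengths are reported in 512-byte sectors and in bytes.
summarize_discards() {
    awk '$6 == "D" && $7 ~ /^D/ { n++; s += $10 }
         END { printf "requests=%d sectors=%d bytes=%d\n", n, s, s * 512 }'
}
```

It can be appended to the pipeline above, for example `... | blkparse -i - | summarize_discards`, and prints its totals once the trace is stopped.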

Solution 2

You didn't provide information on the host system, but as an example, if you are using ZFS to back your host storage, discard will not automatically trim your physical SSD.

You have two options.

  1. Enable autotrim on the ZFS pool; it is off by default.

    zpool set autotrim=on pool
    
  2. Or run a manual trim.

    zpool trim pool
    

Also note that even with autotrim enabled, very small free regions are skipped, so it is still good to run an occasional manual trim.
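
The two options can be combined in a small helper that reports the current autotrim setting and then starts a manual trim. This is only a sketch: the function names and the ZPOOL override (used so the command can be stubbed) are assumptions, and pool stands for your pool name as in the commands above.

```shell
# Print the autotrim property of a pool ("on" or "off").
# ZPOOL is a hypothetical override for testing; it defaults to zpool.
autotrim_state() {
    "${ZPOOL:-zpool}" get -H -o value autotrim "$1"
}

# Report the autotrim setting, then start a manual trim of the pool.
trim_pool() {
    echo "autotrim=$(autotrim_state "$1")"
    "${ZPOOL:-zpool}" trim "$1"
}
```

After `trim_pool pool`, running `zpool status -t pool` shows the trim state of each device.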

Author: nohupper

Updated on September 18, 2022

Comments

  • nohupper
    nohupper over 1 year

    I am running Windows 7 in Qemu/KVM with a passed-through GPU, which I use for work-related stuff. I recently got fed up with its unprecedented slowness due to it running off a mechanical drive, so I added an SSD to my box to 'give' to my Windows KVM guest. I'm using the following qemu command-line options for the 'passed through' disk:

        -drive file=/dev/disk/by-id/wwn-0x5002538d4002d61f,if=none,id=drive-scsi0-0-0-0,format=raw,discard=on \
        -device virtio-scsi-pci,id=scsi0 \
        -device scsi-hd,bus=scsi0.0,drive=drive-scsi0-0-0-0

    I was hoping that the guest-OS TRIM commands would actually be passed-through to the physical drive on the host, but this seems to not be the case.

    Does "discard=on" only affect drives backed by image files, and not by actual physical SSDs? If so, how can I get TRIM commands issued to the device in the guest OS passed on to the physical device on the host? Is using an image file on the host the only solution? I'm hoping for something better, since having a filesystem on that disk would only create overhead, and I don't need it for anything else.

    • Michael Hampton
      Michael Hampton over 6 years
      What is your qemu version? On what Linux distribution?
    • nohupper
      nohupper over 6 years
      @Michael Hampton I'm sorry. Should've mentioned that. I'm running QEMU emulator version 2.5.0 (Debian 1:2.5+dfsg-5ubuntu10.16) on Ubuntu 16.04.3 with kernel 4.8.0-39 X64.
  • nohupper
    nohupper over 6 years
    I actually tried both discard=on and discard=unmap. Both seem to not actually trim the SSD itself. Just to be sure I'll change it back to unmap after work and try it again.
  • nohupper
    nohupper over 6 years
    Are you actually using physical drives for your images, or just image-files?
  • Michael Hampton
    Michael Hampton over 6 years
    I'm using ZFS zvols. So I can easily see that the space used decreases when I run, e.g. "Optimize" drives in Windows.
  • MrCalvin
    MrCalvin almost 4 years
    When you say scsi-block or scsi-hd, does one of them equal virtio-scsi, and if so, which?
  • Marshall Porter
    Marshall Porter almost 4 years
    Actually virtio-scsi is an alias for virtio-scsi-pci, which is the PCI bus device. scsi-block and scsi-hd are both SCSI devices you can add/attach to your PCI bus device. You can see all possible devices with qemu-system-x86_64 -device help.
  • Marshall Porter
    Marshall Porter almost 4 years
    No, those are not even the same category of device.