Selecting a Linux I/O Scheduler

147,337

Solution 1

As documented in /usr/src/linux/Documentation/block/switching-sched.txt, the I/O scheduler on any particular block device can be changed at runtime. There may be some latency as the previous scheduler's requests are all flushed before bringing the new scheduler into use, but it can be changed without problems even while the device is under heavy use.

# cat /sys/block/hda/queue/scheduler
noop deadline [cfq]
# echo deadline > /sys/block/hda/queue/scheduler
# cat /sys/block/hda/queue/scheduler
noop [deadline] cfq
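
The entry in square brackets is the scheduler currently in use. For scripting, it can be pulled out of that line with a small helper. This is only a minimal sketch: the function name active_scheduler is made up for illustration and is not part of any standard tool.

```shell
# Print the scheduler shown in [brackets] on stdin, i.e. the active one.
# Hypothetical helper for illustration only.
active_scheduler() {
    sed -n 's/.*\[\([^]]*\)].*/\1/p'
}

# Usage on a real system (the device name is just an example):
#   active_scheduler < /sys/block/sda/queue/scheduler
```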

Ideally, there would be a single scheduler to satisfy all needs. It doesn't seem to exist yet. The kernel often doesn't have enough knowledge to choose the best scheduler for your workload:

  • noop is often the best choice for memory-backed block devices (e.g. ramdisks) and other non-rotational media (flash) where trying to reschedule I/O is a waste of resources
  • deadline is a lightweight scheduler which tries to put a hard limit on latency
  • cfq tries to maintain system-wide fairness of I/O bandwidth

The default was anticipatory for a long time, and it received a lot of tuning, but it was removed in 2.6.33 (early 2010). cfq became the default some while ago, as its performance is reasonable and fairness is a good goal for multi-user systems (and even single-user desktops). In some scenarios other schedulers do better. Databases are the usual example: they tend to have their own peculiar scheduling and access patterns, and they are often the most important service on the machine (so who cares about fairness?). For these workloads, anticipatory has a long history of being tunable for best performance, and deadline very quickly passes all requests through to the underlying device.

Solution 2

It's possible to use a udev rule to let the system choose the scheduler based on some characteristics of the hardware.
An example udev rule for SSDs and other non-rotational drives might look like

# set noop scheduler for non-rotating disks
ACTION=="add|change", KERNEL=="sd[a-z]", ATTR{queue/rotational}=="0", ATTR{queue/scheduler}="noop"

inside a new udev rules file (e.g., /etc/udev/rules.d/60-ssd-scheduler.rules). This answer is based on the Debian wiki.

To check whether the rule would apply to your SSDs, you can inspect the attribute it matches on in advance (0 means non-rotational):

for f in /sys/block/sd?/queue/rotational; do printf '%s ' "$f"; cat "$f"; done
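
If only the names of the matching disks are wanted, the same check can be wrapped in a function. This is a sketch under assumptions: the name nonrotational_disks and the optional directory argument are made up here, and the argument exists only so the function can be tried against a copy of the tree instead of the real /sys/block.

```shell
# List sd? devices whose queue/rotational flag is 0.
# $1 is the sysfs block directory; defaults to /sys/block.
# Hypothetical helper for illustration only.
nonrotational_disks() {
    root=${1:-/sys/block}
    for f in "$root"/sd?/queue/rotational; do
        [ -r "$f" ] || continue          # skip if the glob matched nothing
        if [ "$(cat "$f")" = "0" ]; then
            dev=${f#"$root"/}            # strip the leading directory
            echo "${dev%%/*}"            # keep only the device name
        fi
    done
}
```

Calling nonrotational_disks with no argument prints one device name per line for every disk the udev rule above would match on its rotational test.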

Solution 3

The aim of having the kernel support several I/O schedulers is that you can try them out without a reboot: run test workloads through the system, measure performance, and then make the best performer the standard scheduler for your application.
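
That try-and-measure loop can be roughly sketched as below. Everything here is an assumption for illustration: the function name bench_schedulers, the choice of wall-clock seconds as the measure, and the workload command, which you would replace with your own test. Writing to the real sysfs file needs root.

```shell
# Run one workload under each available scheduler and print the elapsed time.
# $1: scheduler file, e.g. /sys/block/sda/queue/scheduler
# remaining arguments: the workload command (placeholder for your own test)
# Hypothetical helper for illustration only.
bench_schedulers() {
    sched_file=$1; shift
    current=$(cat "$sched_file")
    # remember the active scheduler (the bracketed entry) to restore it later
    original=$(printf '%s\n' "$current" | sed -n 's/.*\[\([^]]*\)].*/\1/p')
    for s in $(printf '%s\n' "$current" | tr -d '[]'); do
        echo "$s" > "$sched_file" || continue   # skip schedulers that cannot be set
        start=$(date +%s)
        "$@" > /dev/null 2>&1
        end=$(date +%s)
        echo "$s: $((end - start))s"
    done
    if [ -n "$original" ]; then
        echo "$original" > "$sched_file"
    fi
}
```

Whole-second timing is coarse, so this only makes sense for workloads that run for a while; a real comparison would repeat each run several times and look at throughput and latency as well.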

On modern server-grade hardware, only noop appears to be at all useful; the others were slower in my tests.

Robert S. Barnes
Author by

Robert S. Barnes

I'm a self-taught programmer who's taken a break to go back to school and study Software Engineering. Currently, my main areas of interest are network programming and application-level protocols, object-oriented design, and methodologies like Agile and TDD. SOreadytohelp

Updated on August 16, 2021

Comments

  • Robert S. Barnes
    Robert S. Barnes almost 3 years

    I read that it's supposedly possible to change the I/O scheduler for a particular device on a running kernel by writing to /sys/block/[disk]/queue/scheduler. For example I can see on my system:

    anon@anon:~$ cat /sys/block/sda/queue/scheduler 
    noop anticipatory deadline [cfq] 
    

    that the default is the completely fair queuing scheduler. What I'm wondering is if there is any use in including all four schedulers in my custom kernel. It would seem that there's not much point in having more than one scheduler compiled in unless the kernel is smart enough to select the correct scheduler for the correct hardware, specifically the 'noop' scheduler for flash based drives and one of the others for a traditional hard drive.

    Is this the case?

  • Robert S. Barnes
    Robert S. Barnes almost 15 years
    Great info, thanks! But my basic question is still unanswered: if I plug in a flash drive, or my netbook runs off a flash disk as its main drive, is the kernel smart enough to pick noop instead of the default cfq? Or is it completely up to me to do it manually?
  • ephemient
    ephemient almost 15 years
    You can configure the kernel to use a different scheduler by default. It would be clever to automatically use noop on non-rotational media, but the kernel doesn't have that functionality. It kind of does have detection of non-rotational media, but it's not reliable as some disks misreport themselves, and it's not yet wired up to the I/O scheduler code anyhow.
  • Dani_l
    Dani_l about 9 years
    You can add udev rules to define the scheduler based on device characteristics, as in the Debian wiki (wiki.debian.org/SSDOptimization#Low-Latency_IO-Scheduler):
    # set deadline scheduler for non-rotating disks
    ACTION=="add|change", KERNEL=="sd[a-z]", ATTR{queue/rotational}=="0", ATTR{queue/scheduler}="deadline"
  • SkyRaT
    SkyRaT almost 8 years
    Is there a way to change it for all drives at once at runtime, similar to setting the default scheduler with the kernel command line parameter "elevator"? Thanks.
  • Tagar
    Tagar over 7 years
    Great answer on automating detection of non-rotational media and applying the I/O scheduler only to those. Deadline is recommended not only for non-spinning media: Oracle recommends the deadline I/O scheduler for database workloads. That recommendation probably comes from the fact that deadline may handle synchronous writes better than other I/O schedulers. Look, for example, at the /sys/block/sdX/queue/iosched/writes_starved tunable of the deadline scheduler (there is no such tunable for reads). A database may perform badly if its synchronous redo writes are not getting through quickly.
  • DepressedDaniel
    DepressedDaniel over 7 years
    I don't get why this answer deserves so many downvotes. It isn't actually incorrect.