Using and understanding systemd scheduling-related options in a desktop context

CPUScheduling{Policy|Priority}

The linked documentation tells you that CPUSchedulingPriority should only be set for fifo or rr ("real-time") tasks. You do not want to force real-time scheduling on services.

CPUSchedulingPolicy=other is the default.

That leaves batch and idle. The difference between them is only relevant if you have multiple idle-priority tasks consuming CPU at the same time. In theory batch gives higher throughput (in exchange for longer latencies). But it's not a big win, so it's not really relevant in this case.

idle literally starves if anything else wants the CPU. CPU priority also matters rather less than it did on old single-core UNIX systems. I would be happier starting with nice, e.g. nice level 10 or 14, before resorting to idle. See the next section.

However, most desktops are relatively idle most of the time. And when you do have a CPU hog that would pre-empt the background task, it is common for the hog to use only one of your CPUs. With that in mind, I would not consider it too risky to use idle on an average desktop or laptop. Unless it has an Atom / Celeron / ARM CPU rated at or below about 15 watts; then I would want to look at things a bit more carefully.
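
As a concrete sketch of both approaches, here is a drop-in for an imaginary backup.service (the unit name is made up; normally you would pick one option or the other, not both):

    # /etc/systemd/system/backup.service.d/scheduling.conf
    [Service]
    # Gentler option: keep the default policy, just lower the task's weight.
    Nice=14
    # More aggressive option: only run when a CPU would otherwise be idle.
    #CPUSchedulingPolicy=idle

Run systemctl daemon-reload (and restart the service) for the drop-in to take effect.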

Is the nice level 'subverted' by the kernel 'autogroup' feature?

Yeah.

Autogrouping is a little weird. The author of systemd didn't like the heuristic, even for desktops. If you want to test disabling autogrouping, you can set the sysctl kernel.sched_autogroup_enabled to 0. I guess it's best to test by setting the sysctl in permanent configuration and rebooting, to make sure you get rid of all the autogroups.
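
For example (the file name under /etc/sysctl.d/ is my own choice; the sysctl itself is the standard kernel knob):

    # Disable autogrouping right now (does not survive a reboot):
    sudo sysctl kernel.sched_autogroup_enabled=0

    # Make it permanent, then reboot so no stale autogroups remain:
    echo 'kernel.sched_autogroup_enabled = 0' | sudo tee /etc/sysctl.d/50-disable-autogroup.conf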

Then you should be able to use nice levels for your services without any problem. At least in current versions of systemd; see the next section.

E.g. nice level 10 reduces the weight each thread has in the Linux CPU scheduler to about 10% of a normal (nice 0) thread. Nice level 14 is under 5%. (Link: full formula)
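
As a rough sketch of where those numbers come from (each nice step scales the weight by a factor of about 1.25; the exact integers are in the kernel's sched_prio_to_weight table):

    weight(nice) ≈ 1024 / 1.25^nice
    weight(10)   ≈ 1024 / 9.3   ≈ 110   →  110 / 1024 ≈ 10.7%
    weight(14)   ≈ 1024 / 22.7  ≈  45   →   45 / 1024 ≈  4.4%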

Appendix: is the nice level 'subverted' by systemd cgroups?

The current DefaultCPUAccounting= setting defaults to off, unless accounting can be enabled without also enabling CPU control on a per-service basis. So nice levels should work as expected. You can check this in the documentation for your installed version: man systemd-system.conf

Be aware that per-service CPU control will also be enabled if any service sets CPUAccounting=, CPUWeight=, StartupCPUWeight=, CPUShares= or StartupCPUShares=.
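
If you want to verify what is in effect on your own machine, something like this should work (assuming the cgroup v2 "unified" hierarchy; the interpretation in the comments is mine):

    # Manager-wide default, as systemd sees it:
    systemctl show --property=DefaultCPUAccounting

    # If "cpu" is listed here, the CPU controller is enabled per-service
    # under system.slice, i.e. each service gets its own CPU weighting:
    cat /sys/fs/cgroup/system.slice/cgroup.subtree_control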

The following blog extract is out of date (but still online). The default behaviour has since changed, and the reference documentation has been updated accordingly.

As a nice default, if the cpu controller is enabled in the kernel, systemd will create a cgroup for each service when starting it. Without any further configuration this already has one nice effect: on a systemd system every system service will get an even amount of CPU, regardless of how many processes it consists of. Or in other words: on your web server MySQL will get roughly the same amount of CPU as Apache, even if the latter consists of a 1000 CGI script processes, but the former only of a few worker tasks. (This behavior can be turned off, see DefaultControllers= in /etc/systemd/system.conf.)

On top of this default, it is possible to explicitly configure the CPU shares a service gets with the CPUShares= setting. The default value is 1024; if you increase this number you'll assign more CPU to a service than an unaltered one at 1024, and if you decrease it, less.

http://0pointer.de/blog/projects/resources.html
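
For comparison with the extract above: on current systemd with cgroup v2, the corresponding knob is CPUWeight= (default 100, range 1..10000) rather than CPUShares=. A minimal sketch, again for a made-up unit name:

    # /etc/systemd/system/backup.service.d/cpuweight.conf
    [Service]
    # Half the default weight. Note that setting this enables per-service
    # CPU control, which is exactly what the appendix above warns about.
    CPUWeight=50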


Comments

  • equaeghe
    equaeghe almost 2 years

    In systemd service files, one can set the following scheduling-related options (from the systemd.exec man page, correct me if I'm wrong):

    Nice Sets the default nice level (scheduling priority) for executed processes. Takes an integer between -20 (highest priority) and 19 (lowest priority). See setpriority(2) for details.

    Which is the familiar nice level. It seems its effect is ‘subverted’ somewhat due to the ‘autogroup’ feature of recent Linux kernels. So the options below may be what I'd really want to set to keep processes behaving nicely for my desktop experience.

    CPUSchedulingPolicy Sets the CPU scheduling policy for executed processes. Takes one of other, batch, idle, fifo or rr. See sched_setscheduler(2) for details.

    CPUSchedulingPriority Sets the CPU scheduling priority for executed processes. The available priority range depends on the selected CPU scheduling policy (see above). For real-time scheduling policies an integer between 1 (lowest priority) and 99 (highest priority) can be used. See sched_setscheduler(2) for details.

    CPUSchedulingResetOnFork Takes a boolean argument. If true, elevated CPU scheduling priorities and policies will be reset when the executed processes fork, and can hence not leak into child processes. See sched_setscheduler(2) for details. Defaults to false.

    I understand the last option. I gather from the explanation of the first two that I can choose a scheduling policy and then, given that policy, a priority. It is not entirely clear to me what I should choose for which kind of tasks. For example, is it safe to choose ‘idle’ for backup tasks (relatively CPU intensive, because deduplicating), or is another one better suited?

    In general, getting an understandable overview of each policy, with each of its priorities and suitability for specific purposes is what I am looking for. Also the interaction with the nice level is of interest.

    Besides CPU scheduling, there is also I/O scheduling. I guess this corresponds to ionice (correct me if I'm wrong).

    IOSchedulingClass Sets the I/O scheduling class for executed processes. Takes an integer between 0 and 3 or one of the strings none, realtime, best-effort or idle. See ioprio_set(2) for details.

    IOSchedulingPriority Sets the I/O scheduling priority for executed processes. Takes an integer between 0 (highest priority) and 7 (lowest priority). The available priorities depend on the selected I/O scheduling class (see above). See ioprio_set(2) for details.

    Here we see the same structure as with the CPU scheduling options. I'm looking for the same kind of information about these as well.

    For all the ‘Scheduling’ options, the man pages referred to are not clear enough for me, mostly when it comes to translating things to a somewhat technically inclined desktop user's point of view.

    • Mark Stosberg
      Mark Stosberg over 7 years
      What particular performance problem are you trying to solve? Is something running too slowly or with too much lag for you? I use Ubuntu 16.04 with systemd as a desktop, with backups running in the background and have no performance problems related to relative priority.
    • equaeghe
      equaeghe over 7 years
      @MarkStosberg The question was prompted by background backup jobs running while foreground computational tasks on a dual-core system made the interface unresponsive. I then added a Nice option to the backup script and at the same time saw the other options. They may be relevant to that backup-caused issue or other issues I encounter in the future. So I would like to learn more about those other options, without it being related to a concrete issue.
    • Mark Stosberg
      Mark Stosberg over 7 years
      Each bit of documentation references a man page where you can find more documentation if you are interested. Does reading the referenced man pages for ioprio_set(2) and sched_setscheduler(2) help answer your questions?
    • equaeghe
      equaeghe over 7 years
      @MarkStosberg No, as indicated in my question. The man pages are not bad, but don't necessarily help me decide what to do (is adjusting nice values still the safe/right thing to do nowadays?). Replies from experienced users of these options who can give concrete examples and who know the impact of autogroup in such concrete cases are what I am hoping for.
    • Wisperwind
      Wisperwind almost 7 years
      I just stumbled upon these directives in pretty much the same context (background backups causing lag). I found that man sched(7) has a much more comprehensive description than the other two man pages. (It also mentions when the nice value applies.)
  • equaeghe
    equaeghe about 6 years
    Sorry, but this is not an answer to the question. The whole point is that trying things out is difficult if one does not really understand the effects. And this question was prompted by (but is not about!) performance issues on a modern desktop.
  • Mark Stosberg
    Mark Stosberg about 6 years
    It only takes a few minutes to try any of the options. If you have a baseline benchmark and a performance goal, you can easily check whether the option you tried moves you in the direction you want.
  • vonbrand
    vonbrand almost 5 years
    @equaeghe, frobbing random knobs without knowing what they do won't solve the problem either.
  • 0xC0000022L
    0xC0000022L almost 5 years
    Ample advice, but not an answer. This should rather be a comment. Sometimes the gut feeling of "this is waaay too slow" is enough of a benchmark to prompt action. For example, using systemd-analyze with blame and plot I was able to cut down the boot time on an older RPi by over a minute. Sure, when it comes to analysing the issue, you do need to measure instead of prematurely starting to "optimize".