How does preemption work on Linux when a program has a timer shorter than 4 ms?

linux scheduling sleep


Solution 1

See time(7), and the manpages it references. An excerpt:

High-Resolution Timers
   Before Linux 2.6.21, the accuracy of timer and sleep system calls  (see
   below) was also limited by the size of the jiffy.

   Since  Linux  2.6.21,  Linux  supports  high-resolution  timers (HRTs),
   optionally configurable via CONFIG_HIGH_RES_TIMERS.  On a  system  that
   supports  HRTs,  the  accuracy  of  sleep  and timer system calls is no
   longer constrained by the jiffy, but instead can be as accurate as  the
   hardware  allows  (microsecond accuracy is typical of modern hardware).
   You can determine  whether  high-resolution  timers  are  supported  by
   checking  the resolution returned by a call to clock_getres(2) or look‐
   ing at the "resolution" entries in /proc/timer_list.

   HRTs are not supported on all hardware architectures.  (Support is pro‐
   vided on x86, arm, and powerpc, among others.)

A comment suggests that you can't sleep for less than a jiffy. That is incorrect; with HRTs, you can. Try this program:

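/* test_hrt.c */
#define _POSIX_C_SOURCE 200112L  /* expose clock_nanosleep() */
#include <time.h>

int main(void)
{
        struct timespec ts;
        int i;

        ts.tv_sec = 0;
        ts.tv_nsec = 500000;  /* 0.5 milliseconds */
        /* 1000 relative sleeps of 0.5 ms each: about 0.5 s in total */
        for (i = 0; i < 1000; i++) {
                clock_nanosleep(CLOCK_MONOTONIC, 0, &ts, NULL);
        }
        return 0;
}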

Compile it:

$ gcc -o test_hrt test_hrt.c -lrt

Run it:

$ time ./test_hrt

real    0m0.598s
user    0m0.008s
sys     0m0.016s

As you can see, 1000 iterations of a 0.5 millisecond delay took just a little over 0.5 seconds, as expected. If clock_nanosleep were truly waiting until the next jiffy before returning, it would have taken at least 4 seconds.

Now, the original question was what happens if your program is scheduled out during that time. The answer is that it depends on priority. Even if another program gets the CPU while yours is sleeping, your program starts executing again as soon as the clock_nanosleep timeout expires, provided it has the higher priority or the scheduler decides that it is its turn to run. It does not need to wait until the next jiffy for that to happen. You can try running the test program above while other software is keeping the CPU busy, and you'll see that it still finishes in about the same amount of time, especially if you raise its priority with e.g.

$ time sudo schedtool -R -p 99 -e ./test_hrt

Solution 2

Unless you're running a realtime kernel, I wouldn't use sleep times < 10ms anyway tbh. Even if the scheduler is willing to pre-empt another process for your timeout, jitter will probably dominate your actual sleep times.

Summary: avoid such small intervals unless you have a realtime kernel. If you can't change the kernel, your best bet may be to pin your process to a dedicated CPU with SCHED_FIFO and busy-wait (or do other useful work) for anything less than about two jiffies.

Somehow the summary ended up longer than the original ... oh well.


roncsak


Updated on September 18, 2022

Comments

roncsak almost 2 years

On most Linux systems HZ defaults to 250, so a jiffy is 4 ms. The question is: what happens when a program calls usleep() with less than 4 ms? Of course it works as it should while the program is scheduled. But what happens when the Linux scheduler takes this program off the CPU because another program has to run? How does preemption work in this case?

Should I avoid custom programs with such short waits? They couldn't be accurate, could they?
Nils over 11 years

True, but the scheduler still runs on jiffies, while a (hardware) alarm can now appear between two jiffies...
cheshirecatalyst over 11 years

... and trigger an immediate wakeup for your process. What's your point? Sure, if you have a higher priority process also running then the scheduler might decide that the other process gets to run instead of you, but that's always a problem on a multitasking operating system, and can be solved with appropriate application of SCHED_RR, SCHED_FIFO, priorities, etc.
Nils over 11 years

The point is that the scheduler will not react "immediately" - it takes at least a full jiffy to react. That is the core problem of the question. So these high-precision timers help to MEASURE time more accurately - but they do not help with finer-grained scheduling.
cheshirecatalyst over 11 years

That's simply not true. You can sleep for less than a jiffy. See my edit.
Nils over 11 years

Great edit, +1 for your answer now. I did not know that a wakeup can happen between jiffies.