Precise thread sleep needed. Max 1ms error

28,248

Solution 1

From the question tags I suppose you are on Windows. Take a look at Multimedia Timers; they advertise precision under 1 ms. Another option is to use a spin lock, but that basically keeps a CPU core at maximum usage.

Solution 2

I was looking for a lightweight, cross-platform sleep function suitable for real-time applications (i.e., high resolution/high precision with reliability). Here are my findings:

Scheduling Fundamentals

Giving up the CPU and then getting it back is expensive. According to this article, scheduler latency can be anywhere between 10 and 30 ms on Linux. So if you need to sleep for less than 10 ms with high precision, you need special OS-specific APIs. The usual C++11 std::this_thread::sleep_for is not a high-resolution sleep; for example, on my machine, quick tests show that it often sleeps for at least 3 ms when asked to sleep for just 1 ms.

Linux

The most popular solution seems to be the nanosleep() API. However, if you want a < 2 ms sleep with high resolution, you also need to call sched_setscheduler to put the thread/process into real-time scheduling. If you don't, nanosleep() acts just like the obsolete usleep, which had a resolution of ~10 ms. Another possibility is to use alarms.

Windows

The solution here is to use multimedia timers, as others have suggested. If you want to emulate Linux's nanosleep() on Windows, below is how (original ref). Again, note that you don't need to call CreateWaitableTimer() over and over if you are calling sleep() in a loop.

#include <windows.h>    /* WinAPI */

/* Windows sleep in 100ns units */
BOOLEAN nanosleep(LONGLONG ns){
    /* Declarations */
    HANDLE timer;   /* Timer handle */
    LARGE_INTEGER li;   /* Time definition */
    /* Create timer */
    if(!(timer = CreateWaitableTimer(NULL, TRUE, NULL)))
        return FALSE;
    /* Set timer properties */
    li.QuadPart = -ns;
    if(!SetWaitableTimer(timer, &li, 0, NULL, NULL, FALSE)){
        CloseHandle(timer);
        return FALSE;
    }
    /* Start & wait for timer */
    WaitForSingleObject(timer, INFINITE);
    /* Clean resources */
    CloseHandle(timer);
    /* Slept without problems */
    return TRUE;
}

Cross Platform Code

Here's time_util.cc, which implements sleep for Linux, Windows, and Apple platforms. Notice, however, that it doesn't enable real-time scheduling via sched_setscheduler as mentioned above, so if you want to use it for < 2 ms sleeps, that's something you need to add. One other improvement you can make is to avoid calling CreateWaitableTimer in the Windows version over and over if you are calling sleep in a loop; for how to do this, see the example here.

#include "time_util.h"

#ifdef _WIN32
#  define WIN32_LEAN_AND_MEAN
#  include <windows.h>

#else
#  include <time.h>
#  include <errno.h>

#  ifdef __APPLE__
#    include <mach/clock.h>
#    include <mach/mach.h>
#  endif
#endif // _WIN32

/**********************************=> unix ************************************/
#ifndef _WIN32
void SleepInMs(uint32 ms) {
    struct timespec ts;
    ts.tv_sec = ms / 1000;
    ts.tv_nsec = ms % 1000 * 1000000;

    while (nanosleep(&ts, &ts) == -1 && errno == EINTR);
}

void SleepInUs(uint32 us) {
    struct timespec ts;
    ts.tv_sec = us / 1000000;
    ts.tv_nsec = us % 1000000 * 1000;

    while (nanosleep(&ts, &ts) == -1 && errno == EINTR);
}

#ifndef __APPLE__
uint64 NowInUs() {
    struct timespec now;
    clock_gettime(CLOCK_MONOTONIC, &now);
    return static_cast<uint64>(now.tv_sec) * 1000000 + now.tv_nsec / 1000;
}

#else // mac
uint64 NowInUs() {
    clock_serv_t cs;
    mach_timespec_t ts;

    host_get_clock_service(mach_host_self(), SYSTEM_CLOCK, &cs);
    clock_get_time(cs, &ts);
    mach_port_deallocate(mach_task_self(), cs);

    return static_cast<uint64>(ts.tv_sec) * 1000000 + ts.tv_nsec / 1000;
}
#endif // __APPLE__
#endif // _WIN32
/************************************ unix <=**********************************/

/**********************************=> win *************************************/
#ifdef _WIN32
void SleepInMs(uint32 ms) {
    ::Sleep(ms);
}

void SleepInUs(uint32 us) {
    ::LARGE_INTEGER ft;
    ft.QuadPart = -static_cast<int64>(us) * 10;  // '-' = relative time; cast before multiplying to avoid uint32 overflow

    ::HANDLE timer = ::CreateWaitableTimer(NULL, TRUE, NULL);
    ::SetWaitableTimer(timer, &ft, 0, NULL, NULL, 0);
    ::WaitForSingleObject(timer, INFINITE);
    ::CloseHandle(timer);
}

static inline uint64 GetPerfFrequency() {
    ::LARGE_INTEGER freq;
    ::QueryPerformanceFrequency(&freq);
    return freq.QuadPart;
}

static inline uint64 PerfFrequency() {
    static uint64 xFreq = GetPerfFrequency();
    return xFreq;
}

static inline uint64 PerfCounter() {
    ::LARGE_INTEGER counter;
    ::QueryPerformanceCounter(&counter);
    return counter.QuadPart;
}

uint64 NowInUs() {
    return static_cast<uint64>(
        static_cast<double>(PerfCounter()) * 1000000 / PerfFrequency());
}
#endif // _WIN32
/************************************* win <=**********************************/

Yet another, more complete cross-platform implementation can be found here.

Another Quick Solution

As you might have noticed, the code above is no longer very lightweight: it needs to include the Windows header, among other things, which may not be desirable if you are developing a header-only library. If you need to sleep for less than 2 ms and are not keen on calling OS code, you can use the following simple solution, which is cross-platform and works very well in my tests. Just remember that you are no longer using heavily optimized OS code, which might be much better at saving power and managing CPU resources.

#include <chrono>
#include <thread>

// Alias rather than 'typedef ... clock' to avoid clashing with ::clock from <ctime>.
using hr_clock = std::chrono::high_resolution_clock;
template <typename T>
using duration = std::chrono::duration<T>;

// Spin until dt seconds have elapsed, yielding the time slice each iteration.
static void sleep_for(double dt)
{
    static constexpr duration<double> MinSleepDuration(0);
    hr_clock::time_point start = hr_clock::now();
    while (duration<double>(hr_clock::now() - start).count() < dt) {
        std::this_thread::sleep_for(MinSleepDuration);
    }
}

Solution 3

Don't use spinning here. The requested resolution and accuracy can be reached with standard methods.

You may use Sleep() down to periods of about 1 ms when the system's interrupt period is set to operate at that high a frequency. Look at the description of Sleep() for the details, in particular the multimedia timers, and Obtaining and Setting Timer Resolution for how to set the system's interrupt period. The obtainable accuracy with such an approach is in the few-microseconds range when implemented properly.

I suspect your loop is doing something else too; thus I suspect you want a total period of 5 ms, which would then be the sum of the Sleep() and the time you spend on other things in the loop.

For this scenario I'd suggest Waitable Timer Objects; however, these timers also rely on the setting of the multimedia timer API. I've given an overview of the relevant functions for higher-precision timing here. Much deeper insight into high-precision timing can be found here.

For even more accurate and reliable timing you may have to look into process priority classes and thread priorities. Another answer about Sleep() accuracy is this.

However, whether it is possible to obtain a Sleep() delay of precisely 5 ms depends on the system's hardware. Some systems allow you to operate at 1024 interrupts per second (set by the multimedia timer API). This corresponds to a period of 0.9765625 ms; the nearest you can get is thus 4.8828125 ms. Others let you get closer; in particular, since Windows 7 timing has improved significantly when operated on hardware providing high-resolution event timers. See About Timers at MSDN and High Precision Event Timer.

Summary: set the multimedia timer to operate at maximum frequency and use a waitable timer.

Solution 4

Rather than using sleep, you could try a loop that checks the time interval and returns when the difference reaches 5 ms. The loop should be more accurate than sleep.

However, be aware that such precision is not always possible: the CPU could be tied up with another operation for such a small interval and miss the 5 ms mark.

Solution 5

These functions (the waitable timer APIs) let you create a waitable timer with a 100-nanosecond resolution, wait for it, and have the calling thread execute a specific function at trigger time.

Here's an example of use of said timer.

Note that WaitForSingleObject has a timeout measured in milliseconds, which could perhaps work as a crude replacement for the wait, but I wouldn't trust it. See this SO question for details.

Author: Hooch

Updated on January 26, 2020

Comments

  • Hooch
    Hooch over 4 years

    I have a thread that runs a loop. I need that loop to run once every 5 ms (1 ms error). I know that the Sleep() function is not precise.

    Do you have any suggestions?

    Update: I can't do it another way. At the end of the loop I need some kind of Sleep. I don't want to have the CPU 100% loaded either.

    • David Schwartz
      David Schwartz over 11 years
      This is an XY problem. Whatever you actually need to do, there's probably a way to do it. But this is not the way. (Otherwise, if this really is what you need to do, dedicate a core to that thread, and spin for 5ms. The system can't usefully do other work for that small a period of time.)
    • Rook
      Rook over 11 years
      Does the threading system actually give you any sort of time guarantees? In fact, does the OS you're running on even offer that? Unless you're running an operating system that actually gives you any sort of real time guarantees (WinCE presumably does, don't know about other versions of Windows) then you might not ever be able to guarantee that the thread or process scheduler won't just interrupt your task and use the CPU time for something else instead.
    • John Dibling
      John Dibling over 11 years
      "Precise around 1ms" is a bit of an oxymoron.
    • Arno
      Arno over 11 years
      @JohnDibling: They were asking for an error of 1 ms for the Sleep() delay. That's not too difficult to obtain. And they also don't use the word around together with the error specification. What's contradictory here?
    • John Dibling
      John Dibling over 11 years
      @Arno: The title specifies the error of 1ms, and the question specifies the duration of 5ms. That's an error of 20%. In my book, that's not very precise.
    • Arno
      Arno over 11 years
      @JohnDibling: Yes, agree, title is misleading, I suspect they should have written 5 ms there, the desired delay. Relatively precise though when compared to the typical slices discussed here all over the place.
    • Arno
      Arno over 11 years
      @DavidSchwartz: Context switches are typically done in a few ten microseconds. System can't usefully do other work in 5 ms?
    • David Schwartz
      David Schwartz over 11 years
      @Arno: Context switches are not the issue, it's blowing out the code and data caches that's the issue.
    • Arno
      Arno over 11 years
      @DavidSchwartz: Well, to keep going for the sake of the cache and remaining in control of the time slice is a good idea, I agree. But when time matters, it eventually also matters to other threads. So it is at least not clear whether holding the thread running by spinning is any better than relinquishing the remainder of the thread's time slice. Caches are huge these days, and time-critical applications typically don't take a lot of memory, particularly when repeating things at a 5 ms period. I even suggest using Sleep(0) to improve timing. And spinning only works reliably at high priority.
  • SingerOfTheFall
    SingerOfTheFall over 11 years
    5ms is not a very small interval, though xD
  • Kami
    Kami over 11 years
    Yea, maybe I am old school, but it can happen, the processor does something else and misses the 1ms check. It should be tested under load etc if the 1ms requirement is critical.
  • Rook
    Rook over 11 years
    Indeed; a few threads can be switched in an out in that length of time. blog.tsunanet.net/2010/11/…
  • Hooch
    Hooch over 11 years
    It is an option. But I would like to give a CPU a rest for 5ms.
  • Hooch
    Hooch over 11 years
    I will look into that. Thanks.
  • Adrian McCarthy
    Adrian McCarthy over 6 years
    Actually, they do not advertise a precision under 1 ms. You have to query for the supported range of periods, and then use timeBeginPeriod with something in that range. Since timeBeginPeriod takes a value in milliseconds, it seems unlikely that you could do better than 1 ms. Oh, and speeding up the system clock with timeBeginPeriod has a negative effect on system performance and power usage, so be certain to call timeEndPeriod as soon as you no longer need this precision.
  • ShadowRanger
    ShadowRanger over 5 years
    @AdrianMcCarthy: Except their own docs on "Wait Functions and Time-out Intervals" state that "If you call timeBeginPeriod, call it one time early in the application and be sure to call the timeEndPeriod function at the very end of the application" because "frequent calls can significantly affect the system clock, system power usage, and the scheduler". So if you're depending on this precision for many calls, you shouldn't adjust before and after each call.
  • ShadowRanger
    ShadowRanger over 5 years
    And given that the timeBeginPeriod and timeEndPeriod functions appear to modify OS global state (not just for your own process), and the docs seem to imply that a timeBeginPeriod that isn't matched by a timeEndPeriod isn't fixed even by process death, it seems really easy (e.g. segfaulting or otherwise hard-killing the process while the clock is adjusted) to accidentally end up with the system clock in a suboptimal state permanently (or at least until you reboot). Really bad for anything running on a battery, where the increased power usage hurts. Doesn't seem a good idea in general.
  • Adrian McCarthy
    Adrian McCarthy over 5 years
    @ShadowRanger: I'm confused. You seem to be agreeing with what I wrote but writing it as though it's a rebuttal.
  • ShadowRanger
    ShadowRanger over 5 years
    @AdrianMcCarthy: I was disagreeing only with "be certain to call timeEndPeriod as soon as you no longer need this precision", because that implies you might use it for fine-grained purposes (speed up clock before sleep, slow it down after), which is explicitly warned against. I'll admit, your phrasing was a little ambiguous (you could mean "when the program will never need that precision again"), so I might have jumped the gun.
  • Adrian McCarthy
    Adrian McCarthy over 5 years
    @ShadowRanger: I'm still not sure I see the point of contention. Consider an application that needs high precision while running a short animation but only occasionally needs to show such an animation. I would hope that it doesn't increase the clock precision until an animation begins and that it restores the old precision as soon as the animation ends. Keeping the precision high because it might need to show another animation later isn't very friendly.
  • John Zwinck
    John Zwinck almost 4 years
    You might want std::chrono::steady_clock instead of high_resolution_clock if you care about the sleep duration being at all accurate when the system clock is changed (by a human or by NTP). Otherwise your sleep_for() might sleep a very different amount of time than expected.