How to know during system start when system time becomes correct from NTP

Solution 1

What I wanted in my solution was a function that returns a promise that resolves when the system time is accurate. I can then call that function at startup and use its result to know when to start recording temperatures again. To do that, I use an ntpClient to fetch the accurate time myself and compare it to the local system time. When the two are within the desired precision, I resolve the promise. When they are not, I set a timer and recheck, continuing until the local system time eventually becomes accurate enough. So far, this has worked fine through several power outages (which is where the problem with inaccurate time was initially discovered).

Here's the code I used:

const Promise = require('bluebird');
const ntpClient = Promise.promisifyAll(require('ntp-client'));
const log = require('./log');

function Decay(startT, maxT, decayAmount, decayTimes) {
    // startT is initial delay (e.g. 5 seconds)
    // maxT is the max delay this ever returns (e.g. 5 minutes)
    // decayAmount is how much to decay when a threshold is crossed (e.g. increase by 0.5)
    // decayTimes is how many invocations should trigger a decayAmount (e.g. every 5 times)

    // example: var d = new Decay(5000, 5*60*1000, .5, 5);
    // each 5 seconds, to a max of 5 minutes, getting 50% longer every 5 invocations

    // make sure decayTimes is at least 1 and not negative
    decayTimes = Math.max(decayTimes, 1);
    var num = 0;
    var currentDelay = startT;

    this.val = function() {
        // if evenly divisible by decayTimes, then bump the increment
        if (num !== 0 && num % decayTimes === 0) {
            currentDelay = Math.min(Math.round((1 + decayAmount) * currentDelay), maxT);
        }
        ++num;
        return currentDelay;
    };
}

function checkSystemTime(precision) {
    precision = precision || 5000;
    return ntpClient.getNetworkTimeAsync("pool.ntp.org", 123).then(function(ntpTime) {
        return Math.abs(ntpTime.getTime() - Date.now()) <= precision;
    });
}

function waitForAccurateSystemTime(precision, howLong) {
    var start = Date.now();
    // retry starts every 5 seconds, repeats 5 times, then increases by 50% 
    //   up until longest retry time of once every 15 minutes
    var decay = new Decay(5000, 15*60*1000, .5, 5);
    var errCntr = 0;
    var inaccurateCntr = 0;

    function logRetries() {
        // only log if there were more than five retries (errCntr counts every retry,
        // whether caused by an error or an inaccurate reading) or any inaccurate reading
        if (errCntr > 5 || inaccurateCntr > 0) {
            log(7, "Time synchronization issue, errCntr = " + errCntr + ", inaccurateCntr = " + inaccurateCntr);
        }
    }

    return new Promise(function(resolve, reject) {

        function check() {
            checkSystemTime(precision).then(function(accurate) {
                if (accurate) {
                    resolve(true);
                } else {
                    ++inaccurateCntr;
                    again();
                }
            }, again);
        }

        function again() {
            ++errCntr;
            if (errCntr == 10) {
                // only log once here that we're in a retry loop on 10th retry
                // final logging will be done later
                log(7, "In retry loop waiting for system time to agree with ntp server time");
            }
            // if we're only supposed to go for a certain amount of time, then check to see
            // if we exceeded that amount of time.  If not, set timer for next decay() value.
            if (!howLong || Date.now() - start <= howLong) {
                setTimeout(check, decay.val());
            } else {
                var err = "timeout waiting for accurate system time";
                log(7, err);
                reject(err);
            }
        }

        check();
    }).then(function(result) {
        logRetries();
        return result;
    }).catch(function(err) {
        logRetries();
        throw err;
    });
}

module.exports = {
    checkSystemTime: checkSystemTime,
    waitForAccurateSystemTime: waitForAccurateSystemTime,
    Decay: Decay
};
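
To make the retry schedule concrete, here is a small standalone sketch that reproduces the delays Decay hands out. The helper name decaySchedule is my own and is not part of the module above; it only mirrors the arithmetic in val().

```javascript
// Hypothetical helper (not part of the module above): lists the first
// `count` delays that Decay(startT, maxT, decayAmount, decayTimes) would return.
function decaySchedule(startT, maxT, decayAmount, decayTimes, count) {
    decayTimes = Math.max(decayTimes, 1);
    var delays = [];
    var currentDelay = startT;
    for (var num = 0; num < count; num++) {
        // same rule as Decay.val(): grow the delay every decayTimes calls
        if (num !== 0 && num % decayTimes === 0) {
            currentDelay = Math.min(Math.round((1 + decayAmount) * currentDelay), maxT);
        }
        delays.push(currentDelay);
    }
    return delays;
}

// five 5-second waits, five 7.5-second waits, then 11.25 seconds
console.log(decaySchedule(5000, 15 * 60 * 1000, 0.5, 5, 11));
```

With the settings used in waitForAccurateSystemTime, the delay keeps growing by 50% every five checks until it reaches the 15-minute cap.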

I use it like this:

const validTime = require("./valid-time");
// precision of 2 minutes; howLong of 0 means keep waiting with no time limit
validTime.waitForAccurateSystemTime(2 * 60 * 1000, 0).then(function() {
    // start operation here that requires accurate system time
}).catch(function(err) {
    // abort process, no accurate system time could be found
});

Solution 2

Provided you are using ntpd (from the ntp package) to keep your clock in sync, it can be configured to step the clock on boot regardless of the time offset. If you're using some other package, please say so in your question.

By default ntpd only steps the clock if the offset is less than 1000 seconds, but the -g flag, which Debian's default configuration sets, overrides that limit and allows a one-time step from any offset. (This is good.)

Also, the -x flag forces slewing of the time rather than stepping for offsets of up to 600 seconds; you do not want this set. (The Debian default does not set this flag, which is good.)

Check /etc/default/ntp, which should have just this line setting the flags:

NTPD_OPTS='-g'

Your question has been updated to explain that the logging process starts before time has synchronised, so I would suggest you use ntpstat to identify when the synchronisation is complete.

Unsynchronised

ntpstat; printf "\nexit status %s\n" $?
unsynchronised
   polling server every 8 s

exit status 1

Synchronised

ntpstat; printf "\nexit status %s\n" $?
synchronised to NTP server (203.0.113.22) at stratum 3
   time correct to within 93 ms
   polling server every 1024 s

exit status 0

Busy loop example

until ntpstat; do echo waiting; sleep 30; done; date
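
Since the application in the question is a Node.js program, the same polling idea can be sketched from inside the app. This is a minimal sketch, assuming ntpstat is installed and on the PATH; the function names isSynchronised and waitForSync are hypothetical, and the command and args parameters exist only so a different command can be substituted.

```javascript
const { spawnSync } = require('child_process');

// Returns true when the command exits with status 0.
// Defaults to `ntpstat`, which exits 0 once the clock is synchronised.
function isSynchronised(command, args) {
    const result = spawnSync(command || 'ntpstat', args || [], { stdio: 'ignore' });
    // status is null if the command could not be started at all
    return result.status === 0;
}

// Polls every intervalMs until the check passes, then resolves.
function waitForSync(intervalMs, check) {
    check = check || isSynchronised;
    return new Promise(function (resolve) {
        (function poll() {
            if (check()) {
                resolve();
            } else {
                setTimeout(poll, intervalMs);
            }
        })();
    });
}

// Usage (same 30-second interval as the shell loop):
// waitForSync(30 * 1000).then(function () { /* start logging */ });
```

A missing ntpstat binary is treated the same as "not synchronised", so the loop would wait forever in that case; a real deployment would probably want to detect and log that separately.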

If you don't have ntpstat and can't install it, you could probably get some information from ntpq -c sysinfo

ntpq -c sysinfo
associd=0 status=0615 leap_none, sync_ntp, 1 event, clock_sync,
system peer:        server2.contoso.com:123
system peer mode:   client
leap indicator:     00
stratum:            3
log2 precision:     -20
root delay:         26.453
root dispersion:    28.756
reference ID:       203.0.113.22
reference time:     e08863a7.fb7d83bf  Thu, May 16 2019 23:33:11.982
system jitter:      3.227792
clock jitter:       1.178
clock wander:       0.012
broadcast delay:    -50.000
symm. auth. delay:  0.000
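
If you go the ntpq route, the stratum field is probably the simplest thing to check, since ntpd reports stratum 16 while it is unsynchronised. The following is a rough sketch that parses text in the format shown above; parseStratum and looksSynchronised are hypothetical names, and the sample lines are copied from the output above rather than captured from a live system.

```javascript
// Extracts the stratum from `ntpq -c sysinfo` text; returns null if absent.
function parseStratum(sysinfoText) {
    const match = /^stratum:[ \t]*(\d+)[ \t]*$/m.exec(sysinfoText);
    return match ? parseInt(match[1], 10) : null;
}

// Treats stratum 16 (or a missing field) as "not yet synchronised".
function looksSynchronised(sysinfoText) {
    const stratum = parseStratum(sysinfoText);
    return stratum !== null && stratum < 16;
}

const sample = [
    'system peer mode:   client',
    'leap indicator:     00',
    'stratum:            3',
    'log2 precision:     -20'
].join('\n');

console.log(parseStratum(sample), looksSynchronised(sample));
```

The text would come from running ntpq -c sysinfo as a child process; that part is left out here so the parsing can be checked on its own.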

Solution 3

Another option may be to parse the output of ntpq -c peers to watch for the stratum to move away from 16.

Solution 4

Every time you reboot your Pi (which takes more than a few seconds), your clock is going to be off by more than ntp can compensate for by stretching or shortening time (i.e. slewing, which is only good for correcting a clock that is slightly off, such as a real-time clock that runs slow or fast by a second or so a day). In that case ntp has to set the clock.

So the easiest approach might be to have the script that starts your temperature measuring program first call ntpdate or an equivalent, which either sets the date or slews, depending on how far off the retrieved value is. ntpdate therefore doesn't disrupt things if the clock is already close to correct thanks to ntp, e.g. if you restart via this script without having had a reboot.

Solution 5

Check out the ntp-wait program that comes with NTP. You run it, it waits until your clock is synchronized, and exits. (Or it eventually gives up, and exits with an error.) You can use it to prevent your script from starting until the clock is synchronized.

You can also run something like ntpq -p or ntpq -c rv and parse the output to check your clock's status. Indeed, ntp-wait is a short Perl script doing exactly that.


Author: jfriend00

Updated on September 18, 2022

Comments

  • jfriend00
    jfriend00 almost 2 years

    I have a Raspberry Pi running Raspbian (Debian derivative) that is recording real-time temperatures. As part of this, I need a clock that is accurate to within a few seconds. In normal operation when the server is up, I understand that the Raspberry Pi regularly connects over the network to an NTP server to make sure its local clock is reasonably close to the correct time. That level of accuracy is fine for my application. The Raspberry Pi does not have a battery-powered clock to keep time when the server is shut down or depowered, so when it first boots up the system time is not correct until it establishes an internet connection and gets the correct time via NTP.

    Where I have found a problem is when there is a power outage and the Raspberry Pi is off for some period of time without power, then powers back up sometime later (say 30 minutes). The Pi boots, my app starts up and starts recording temperatures again, but the clock on the Pi is not correct so it records temperatures with the wrong timestamp. It appears to somehow have saved the last known time and picks up from there (it does not reset to epoch time when it restarts). Eventually, when the LAN that the Pi is on recovers and regains internet connectivity, the Pi will correct its time (via NTP), but before that happens, I have inaccurate timestamps that get recorded.

    I'm trying to figure out the best course of action to solve this issue.

    For maintenance reasons, I'd rather not add a battery backed add-on clock (don't want anyone to have to replace a battery as this is essentially an embedded device, not easily user accessible).

    I'm willing to have my app postpone recording temperatures until I know the time has been accurately retrieved from the network, but I don't even know how to detect that state. Anyone know how to know when the time has now been updated from an NTP server and is now correct?

    My app is started by running a script at startup.

    Any other ideas for how to solve this issue?

  • jfriend00
    jfriend00 about 9 years
    I don't understand how this solves my problem. My app is likely starting before internet connectivity has even been established yet.
  • jfriend00
    jfriend00 about 9 years
    The challenge here is that when the Pi reboots after power has been restored, the internet connectivity has likely not yet been established as the internet router takes longer to establish the internet connection than it takes for the Pi to boot. So, inserting ntpdate into the startup sequence before starting my app probably won't have a working internet connection for it to contact an NTP server so it can't do its job yet.
  • Anthon
    Anthon about 9 years
    @jfriend00 ntpdate times out, I am not sure what the default timeout period is, but if you wait for ntpdate to finish and play around with its -t option to vary the timeout, you should be able to get it to work, or alternatively have the startup script retry if ntpdate exits with a non-zero status.
  • jfriend00
    jfriend00 about 9 years
    I can't just guess on timing as it is a complete unknown how long it will be until internet connectivity is restored (it could have its own issues). So, are you saying I could repeatedly call ntpdate until it succeeds and only then would I start my app?
  • Anthon
    Anthon about 9 years
    Yes. ntpdate's exit value will be 0 (zero) if it succeeds and non-zero otherwise. You can check every X seconds, up to a maximum of N times, until it succeeds, and if it hasn't by then, take some other action (blink an LED or show something on a display if you have one).
  • jfriend00
    jfriend00 about 9 years
    It's an unattended, embedded device that has to "do the right thing" without human intervention so whatever solution will have to persistently wait until the correct time is set.
  • Anthon
    Anthon about 9 years
    @jfriend00 In that case don't check for a maximum of N times, but reboot if the connection is still not back online after 5 minutes (in case something else went wrong during the boot process). I would normally increase the check interval, but if there is no internet connection that would suffer from repeated checks, there is little reason to make things more complicated than necessary.
  • roaima
    roaima about 9 years
    don't mix ntpdate with a running ntp. You'll upset the slew algorithm and end up with a wildly inaccurate clock
  • roaima
    roaima about 9 years
    @jfriend00 it wasn't clear to me at the time I read your question that your application starts before the network. Please advise (in your Question) how you start it and I'll recommend a way of deferring it until after the network has started.
  • Anthon
    Anthon about 9 years
    @roaima Yes, you should not run it alongside ntp, but during startup you can set the date with ntpdate. I am not sure about the recent setup on my Linux box, but it used to be that ntpdate was called to set the date-time, which ntp then fine-tuned. ntpdate will also use slewing when the time is close. And if this should be a problem, the solution is to start the ntp daemon after ntpdate successfully exits and then start the temperature application.
  • roaima
    roaima about 9 years
    ntpdate is (annoyingly) deprecated in favour of ntpd -g, which in this kind of situation fortunately does much the same thing.
  • jfriend00
    jfriend00 about 9 years
    It looks like ntpdc -c sysinfo will work on my Raspberry Pi to get similar info without even installing anything else.
  • goldilocks
    goldilocks about 9 years
    "started by running a startup script named run.sh from /etc/profile" -> That is an absurd way to start a process that is supposed to be singular. That script will be run by every login; the fact that there's generally only one does not make it a sane practice. Put even more bluntly: cargo cult strikes again. You need to use the init system, which on Raspbian is either SysV or systemd, but the sane short-cut would be via /etc/rc.local.
  • jfriend00
    jfriend00 about 9 years
    @goldilocks - if you care to explain a better place to put the auto-startup, I'm all ears. As I've said before, I barely know enough Linux to get my server configured and running so I'm happy to learn new stuff about it. My startup script checks to see if the process is already running and doesn't attempt to start it if already running. I do not want it to run as root. This is an embedded device with a single purpose so it's not like there's lots of other things going on. The ONLY other thing that ever happens on this device is an occasional remote login from me to do maintenance.
  • jfriend00
    jfriend00 about 9 years
    I don't see how you could possibly auto-correct prior wrong log times. How would you know how much to correct them or which times needed correcting? And, isn't that the hard way to solve the problem, even if you could find a guesstimate algorithm? I'm fine with just not starting logging until the clock is correct. Yes, an RTC would be just fine until the bloody battery needed replacing in an embedded device. As I said in my question, that isn't a practical option for this type of device. This needs to run unattended for decades (part of a home automation system).
  • jfriend00
    jfriend00 about 9 years
    What is "stratum"?
  • Milliways
    Milliways about 9 years
    @jfriend00 if you log e.g. every 3 minutes the offset from the corrected time to the previous can be calculated (with ±90 sec precision), but as you stated, if you are happy to discard, this still provides an indication when the discontinuity occurred.
  • Matt Nordhoff
    Matt Nordhoff about 9 years
    @jfriend00 NTP terminology, measuring how many hops you are from a reference clock. E.g. a server plugged in to a GPS device is stratum 1, a client of that server is stratum 2, and so forth. ntpd defaults to calling itself stratum 16 before it's synchronized. In practice synchronized clients are typically around stratum 4.
  • roaima
    roaima about 9 years
    Another solution could be to record timestamps relative from boot, and then write relative to absolute markers once you were sure you had correct time. The relative times could then be post-processed into absolute time.
  • roaima
    roaima about 9 years
    @jfriend00 the stratum is a measure of distance (ie number of hops) from a definitive time source such as an atomic clock or GPS receiver. The worst case (unsynchronised) is 16; the source itself is zero. Typically you would expect a value in the range 2-5 from an Internet time source
  • jfriend00
    jfriend00 about 9 years
    The temperatures are checked every 10 seconds, but logging only occurs when a moving average of the temperature reading exceeds some threshold from the previously logged value.
  • Alexis Wilke
    Alexis Wilke about 3 years
    This is certainly the best answer. You don't reinvent the wheel and if something changes in NTP, the script will be updated and continue to work...
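
roaima's suggestion in the comments above (record timestamps relative to boot, then convert them once the clock is trusted) could look roughly like the sketch below. It assumes readings are buffered in memory within a single run of the process and stamped with Node's monotonic clock, which is not affected when the system time is stepped. The names monotonicNowMs and toAbsolute and the sample readings are made up for illustration.

```javascript
// Monotonic time in ms since an arbitrary point in the past; unlike Date.now(),
// it does not jump when NTP steps the system clock.
function monotonicNowMs() {
    return Number(process.hrtime.bigint() / 1000000n);
}

// Converts buffered readings to wall-clock time once the clock is trusted.
// wallNowMs and monoNowMs must be sampled together, after synchronisation.
function toAbsolute(readings, wallNowMs, monoNowMs) {
    const offset = wallNowMs - monoNowMs;
    return readings.map(function (r) {
        return { value: r.value, time: new Date(r.monoMs + offset) };
    });
}

// Illustrative data: two readings taken at monotonic times 10 s and 20 s
const buffered = [
    { monoMs: 10000, value: 20.5 },
    { monoMs: 20000, value: 20.7 }
];
// Pretend the clock became accurate at monotonic time 60000 ms
const fixed = toAbsolute(buffered, Date.UTC(2019, 4, 16, 23, 33, 0), 60000);
console.log(fixed.map(function (r) { return r.time.toISOString(); }));
```

In the app itself, each reading would be stamped with monotonicNowMs() when taken, and toAbsolute would be called with Date.now() and monotonicNowMs() right after the time check succeeds. The monotonic clock starts over when the process restarts, so this only covers readings taken since the current process started.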