Risk of starting NTP on database server?

debian postgresql time ntp

5,062

Solution 1

Databases don't like backward steps in time, so you don't want to start with the default behavior of jumping (stepping) the time. Adding the -x option to the ntpd command line will slew the time instead if the offset is less than 600 seconds (10 minutes). At the maximum slew rate it will take about a day and a half to adjust the clock by a minute. This is a slow but safe way to adjust the time.
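The "day and a half per minute" figure can be checked with a little arithmetic. This is only a sketch: it assumes a maximum slew rate of 500 parts per million, which is the usual limit for ntpd, and the function name is made up for illustration.

```python
MAX_SLEW_PPM = 500  # assumed maximum slew rate, in parts per million


def slew_duration_seconds(offset_seconds, slew_ppm=MAX_SLEW_PPM):
    """Seconds of real time needed to slew away offset_seconds."""
    if slew_ppm <= 0:
        raise ValueError("slew_ppm must be positive")
    # At slew_ppm parts per million the clock is corrected by
    # slew_ppm microseconds for every second of real time.
    return abs(offset_seconds) * 1_000_000 / slew_ppm


for offset in (60, 367):
    hours = slew_duration_seconds(offset) / 3600
    print(f"{offset} s offset: about {hours:.1f} hours ({hours / 24:.1f} days)")
```

Under that assumption a 60 second offset takes roughly 33 hours, and the 367 second offset from the question would take about eight and a half days.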

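Before running ntp to adjust the time, you may want to start it with a small panic threshold to verify how large an offset it is detecting. With the reference ntpd this is a line such as "tinker panic 2" in ntp.conf, started without -g (the -g flag takes no argument and lets the first correction exceed the panic threshold). This will set the panic offset to 2 seconds, which should be relatively safe.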

An alternative I used before this option was available was to write a loop that set the clock back a fraction of a second every minute or so. If you check that the reset won't change the current second, this is likely safe. If you use timestamps heavily, you may still end up with out-of-sequence records.
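The "won't change the second" check from that loop can be sketched as below. It only does the planning arithmetic and never touches the clock (actually setting it needs root and a platform-specific call). The function names and the 0.5 second step size are assumptions for illustration.

```python
import math


def keeps_same_second(now, step):
    """True if moving the clock back by `step` seconds stays within the current second."""
    return math.floor(now - step) == math.floor(now)


def steps_needed(offset_seconds, step):
    """Number of backward steps of size `step` needed to remove `offset_seconds`."""
    return math.ceil(offset_seconds / step)


# A clock reading of xxx.75 can safely go back half a second; xxx.25 cannot.
print(keeps_same_second(100.75, 0.5))  # True
print(keeps_same_second(100.25, 0.5))  # False

# Removing 367 seconds in 0.5 second steps, one step per minute.
print(steps_needed(367, 0.5), "steps, one per minute")
```

With those numbers the loop would need 734 steps, or a little over twelve hours at one step per minute.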

A common option is to shut the server down long enough that there is no backward movement of the clock. ntp or ntpdate can be configured to jump the clock to the correct time at startup. This should be done before the database is started.

Solution 2

Databases can be especially vulnerable to system time changes if they are very active and have timestamps on internal records. In general, if your time is behind, you'll have far fewer problems jumping suddenly forward than if you're ahead and jump suddenly backwards.

As Joffrey points out - it's much more often the application that has issues with sudden time jumps than the database itself. The safest way to correct the time is to shut down the application for N+1 minutes (where N is the number of minutes your system clock is ahead) and then sync time, start NTP, and restart the application. If you can't take that much downtime in the application, I can only suggest you take a backup of the database before syncing time, then offer up a dead squirrel to the gods of computerdom and just pull the trigger. OK, I'm being a bit facetious, but I can't think of any other "safe" way than taking an application outage.
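As a quick worked example of the N+1 rule, here is a small sketch. Rounding a partial minute up to the next whole minute is my assumption, and the function name is hypothetical.

```python
import math


def downtime_minutes(seconds_ahead):
    """Minutes to keep the application stopped: N whole minutes ahead, plus one."""
    n = math.ceil(seconds_ahead / 60)  # assumption: round partial minutes up
    return n + 1


print(downtime_minutes(367))
```

For a clock that is 367 seconds ahead this gives 8 minutes of application downtime before syncing and restarting.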

Solution 3

It is usually not the database server that is vulnerable to errors when an instant time leap occurs: it's the applications that use the time that are.

There are generally two ways of tracking time: own time tracking or comparing system time. Both have trade-offs.

Own time tracking

I see this used in some embedded programming and in systems where exact timing is not that critical. The main application loop keeps track of a 'tick'. This could be an alarm delivered by the kernel, or a sleep or select call whose timeout indicates how much time has passed. Once you know how much time has passed, you can add it to (or subtract it from) a counter. This counter is what drives the timing in your application. For example, once the counter is higher than 10 seconds you can discard something or perform some action.

If the application does not keep track of time, the counter will not change. This could be desired, depending on the design of your application. For example, keeping track of how long a long-running process is taking is easier with a counter than with a list of start/stop timestamps.

Pro:

Not dependent on system clock
Will not break on a big time skew
No costly system call
Small counters will cost less memory than a full timestamp

Con:

Time is not very accurate
Change in system time could make it even more inaccurate
Timing is relative to running the application, does not persist
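
The tick counter described in this section might look roughly like the sketch below. The class and method names are made up for illustration, and the one-second tick stands in for whatever interval the main loop's sleep or select call reports.

```python
class TickCounter:
    """Accumulates elapsed time reported by the main loop; never reads the system clock."""

    def __init__(self, limit_seconds):
        self.limit_seconds = limit_seconds
        self.elapsed = 0.0

    def tick(self, seconds):
        # Called once per loop iteration with the time that iteration took.
        self.elapsed += seconds

    def expired(self):
        return self.elapsed > self.limit_seconds


counter = TickCounter(limit_seconds=10)
for _ in range(11):
    # In a real loop each tick would follow a sleep() or select() timeout.
    counter.tick(1.0)
print(counter.expired())
```

Because the counter only ever sees the intervals passed to tick(), stepping the system clock has no effect on when it expires.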

Comparing system time

This is the approach used more often: store a timestamp and later compare it with a fresh timestamp obtained from a system time call. Huge skews in the system time could threaten the integrity of your application: a task of a few seconds could take hours or end immediately, depending on the direction of the clock change.

Pro:

Accurate time comparison
Persists over restarts and long outages

Con:

Takes a system call to get a fresh timestamp to compare with other timestamps
Application needs to be aware of skews or can break
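
To make the failure mode of timestamp comparison concrete, here is a small sketch. The function name and the numbers are hypothetical, and the current time is passed in as a parameter so the effect of a clock step can be shown without touching a real clock. A real application would pass something like time.time().

```python
def is_expired(started_at, timeout_seconds, now):
    """Timestamp comparison: expired once `now` is at least start + timeout."""
    return now - started_at >= timeout_seconds


started = 1_000_000.0  # hypothetical wall-clock reading when the task began
timeout = 5.0

# Normal case: six seconds later, the task has timed out.
print(is_expired(started, timeout, now=started + 6))          # True

# Clock stepped back 367 s right after the start: six real seconds later
# the wall clock reads started - 361, so the task looks far from expired.
print(is_expired(started, timeout, now=started + 6 - 367))    # False

# Clock stepped forward 367 s: the task appears to time out almost at once.
print(is_expired(started, timeout, now=started + 0.1 + 367))  # True
```

In the backward case the five second task would not expire until the wall clock had caught up again, a little over six minutes later.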

Affected systems

Most applications will use timestamp comparison to schedule tasks. For database systems, that could be cache cleanups.

All applications that use a database and call time functions in the query language will be affected by skews if the application does not detect them and handle them accordingly. Depending on their purpose, applications could never stop running or could allow indefinite login periods.

Mail systems will use timestamps and/or timeouts for handling stale or undelivered mail. A clock skew could affect that, but with a much smaller impact. Back-off timers for reconnecting to servers could be missed, resulting in penalties on the connecting server.

I do not think (though I have not researched it) that kernel alarms will go off when the system time is changed. Systems that use these could be safe.

Solutions

Move the time gently (slew it rather than step it). How to do this can be found in the documentation of your favorite time solution.

vastlysuperiorman

Updated on September 18, 2022

Comments

vastlysuperiorman over 1 year

I've heard rumors of bad things happening to database and mail servers if you change the system time while they are running. However, I'm having a hard time finding any concrete information on actual risks.

I have a production Postgres 9.3 server running on a Debian Wheezy host and the time is off by 367 seconds. Can I just run ntpdate or start openntp while Postgres is running, or is that likely to cause an issue? If so, what is a safer method of correcting the time?

Are there other services that are more sensitive to a change in system time? Maybe mail servers (exim, sendmail, etc) or message queues (activemq, rabbitmq, zeromq, etc)?
vastlysuperiorman about 9 years

I'm ahead and needing to jump backwards by about 6 minutes. I have many, many internal records that were set with now(). Can you add any safe method of changing the time to your answer?
Jonathan J about 9 years

If ntpd is installed and configured correctly, it should be able to gradually correct the system time by slowing down the clock. Once the correct time is achieved, drift is adjusted to maintain time. You may need to specify a maximum correction in excess of your error. At least that's the way I understand it, but I'm not an NTP expert.
John about 9 years

@JonathanJ - NTP has difficulty correcting time skews greater than 5 minutes, and when set up per "standard" documentation (of which there are several sets, admittedly) first syncs the time in one jump then maintains sync by adjusting drift.
Joffrey about 9 years

@John I ran out of squirrels years ago ;)
vastlysuperiorman about 9 years

This is a great response, and I appreciate learning more about time keeping. I did not select it because it did not provide a clear solution to my present concern of adjusting time on my production database server. +1 for teaching me things.