How to set up a local NTP server without internet access on Ubuntu?


Don't do this. Seriously. Just don't. People keep coming up with the idea that NTP is designed to allow a bunch of machines all to have the same time. It isn't. It's designed, quite carefully, to allow many machines to all have the closest thing they can to the correct time, which is not the same thing.

If you have access to a window, you can build a half-decent stratum-1 server for about £50, or a good one for £100. You would do much better to build something like that, then point the other clients at it. Correct timestamps are much better than merely self-consistent ones, not least for forensics.

But if you absolutely must do what you're doing, then you need to realise that you're perverting ntpd, and this will mean understanding what you're doing.

On the server

server 127.127.1.0 prefer
fudge  127.127.1.0 stratum 10

means "use the local undisciplined clock as if it were authoritative", which is what you want. I'm not sure why you're forcing it to stratum 10, though; consider dropping the stratum 10, and let the driver supply its default stratum of 5.

On the clients

server 192.168.178.24 iburst
fudge 192.168.178.24  stratum 10

makes no sense at all. fudge applies only to 127.127.x.y pseudo-addresses, which select the various kinds of local clock drivers; it makes no sense to give it any other address. Drop the fudge line from the clients, and just point them at the server. You're also using a closed network, so drop all the security stuff until you get this working:

restrict default

If that still doesn't seem to work, we'll need to see the output of ntpq -c as and ntpq -c pe on both the server, and on a badly-behaving client, after at least ten minutes of uninterrupted running.
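When reading that output, the columns that matter most are the tally character in front of the remote name, the reach register, and the offset and jitter values (in milliseconds). The following is a minimal Python sketch for pulling those out of `ntpq -c pe` lines. It assumes the ten-column layout shown in the question's output; the function and dictionary names are hypothetical, and the two sample lines are copied from the question.

```python
# Illustrative parser for `ntpq -c pe` peer lines; names are hypothetical.

# Common tally codes printed in the first column of each peer line.
TALLY = {"*": "sys.peer", "+": "candidate", "-": "outlier",
         "x": "falseticker", " ": "reject"}


def parse_peer(line: str) -> dict:
    """Split one peer line into its tally code and named columns."""
    cols = line[1:].split()  # assumes all ten columns are present
    return {
        "condition": TALLY.get(line[0], "other"),
        "remote": cols[0],
        "stratum": int(cols[2]),
        # reach is an octal shift register: 377 means the last 8 polls all succeeded
        "polls_ok": bin(int(cols[6], 8)).count("1"),
        "delay_ms": float(cols[7]),
        "offset_ms": float(cols[8]),
        "jitter_ms": float(cols[9]),
    }


sample = [
    " hadoop20.xx LOCAL(0)         6 u   27   64  377    0.151  63591.8 33407.0",
    "*hadoop20.xx LOCAL(0)         6 u   18   64  377    0.183    7.883   3.015",
]
peers = [parse_peer(line) for line in sample]
for p in peers:
    print(p["condition"], p["polls_ok"], p["offset_ms"], p["jitter_ms"])
```

A peer that is reachable on every poll (reach 377) with a sub-millisecond delay but an offset of tens of seconds points at a clock problem rather than a network or firewall problem.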

Edit: you write in a comment below that "I think the offset/jitter is really high because the failing clients drift in time".

I think you may be right. This chap's blog suggests he had the same experience: that the client clock was so bad that it fooled the local ntpd into thinking that the server was unreliable. He wrote

the reason for the huge jitter finally seems clear! Our clock drifts so fast that the offset will go up by several seconds through our few measurements

Given that the clients which fail to sync (marking the server as "reject") are the ones whose clocks drift fastest, I think you're seeing the same effect. His solution was to use adjtimex to manually tune the kernel clock (adjusting the tick value) until the system clock was less wayward, at which point ntpd had a chance to recognise the server as being OK, and sync to it. You should probably give that a try on the worst client first, and see if it helps.
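To get a feel for the numbers involved, drift can be expressed in parts per million (ppm) and compared with the roughly 500 ppm of frequency error that ntpd is able to correct by itself. The sketch below is only rough arithmetic in Python. The helper names are hypothetical; the 100 ppm per tick unit figure assumes a kernel with USER_HZ=100 and a nominal tick of 10000 microseconds; the two inputs are taken from this thread (the correction reported when the adjtimex package was installed, and the roughly 80 seconds of disagreement after about 10 minutes in the question's date listing), and the second is an approximation.

```python
# Rough drift arithmetic; all names are illustrative, not part of ntpd or adjtimex.

NTPD_MAX_SLEW_PPM = 500.0   # frequency error ntpd can correct on its own
PPM_PER_TICK_UNIT = 100.0   # assumption: USER_HZ=100, nominal tick of 10000 us


def drift_ppm(gained_seconds: float, elapsed_seconds: float) -> float:
    """Clock error rate in parts per million (positive means the clock runs fast)."""
    return gained_seconds / elapsed_seconds * 1_000_000


def suggested_tick(ppm: float, nominal_tick: int = 10000) -> int:
    """Coarse tick value cancelling most of the drift; a fast clock needs a smaller tick."""
    return nominal_tick - round(ppm / PPM_PER_TICK_UNIT)


# 14.5741 s/day: the correction reported by the adjtimex package installation.
per_day = drift_ppm(14.5741, 86400)
# 80 s in about 10 minutes: approximate, read off the date listing in the question.
worst = drift_ppm(80, 600)

for label, ppm in (("adjtimex report", per_day), ("date listing", worst)):
    verdict = "within" if abs(ppm) <= NTPD_MAX_SLEW_PPM else "beyond"
    print(f"{label}: {ppm:.1f} ppm, {verdict} what ntpd can correct")

print("coarse tick for the adjtimex report:", suggested_tick(per_day))
```

The tick value only gives a coarse correction in 100 ppm steps; whatever remains after that is small enough for ntpd's own frequency discipline to absorb.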


Author: j9dy

Updated on September 18, 2022

Comments

  • j9dy
    j9dy over 1 year

    I have tried several guides on how to set up a local ntp server on ubuntu but none seem to work correctly. My servers are drifting heavily in time for some reason and I have to keep their time close together because I run databases that require this.

    • I have 8 ubuntu 14.04 LTS servers, none of them has internet access
    • I want to run a ntp server on one (or more if that is better) of the servers and have all other servers connect to the ntp server(s) to set the time

    Currently, my server (ip .24) runs this /etc/ntp.conf:

    server 127.127.1.0 prefer
    fudge  127.127.1.0 stratum 10
    driftfile /var/lib/ntp/drift
    broadcastdelay 0.008
    
    # Give localhost full access rights
    restrict 127.0.0.1
    
    # Give machines on our network access to query us
    restrict 192.168.178.0 mask 255.255.255.0 nomodify notrap
    
    broadcast 192.168.178.0
    

    And on the "clients":

    # Point to our network's master time server
    server 192.168.178.24 iburst
    fudge 192.168.178.24  stratum 10
    
    restrict default ignore
    restrict ::1
    restrict 127.0.0.1
    restrict 192.168.178.24 mask 255.255.255.255 nomodify notrap noquery
    
    driftfile /var/lib/ntp/drift
    
    minpoll 4
    maxpoll 5
    

    Note: I have used Multi-Tabbed Putty to send the following commands to all ntp clients at the same time. I have stopped the ntp services for all except the server, used sudo ntpdate 192.168.178.24 to let them fetch the date and restarted the ntp services afterwards. This succeeded. All servers showed the same date straight after the command finished. After about 10 minutes however, my servers show the following time:

    Fr 30. Sep 11:16:53 CEST 2016
    Fr 30. Sep 11:15:33 CEST 2016 (server .24) 
    Fr 30. Sep 11:16:50 CEST 2016
    Fr 30. Sep 11:15:33 CEST 2016
    Fr 30. Sep 11:17:05 CEST 2016
    Fr 30. Sep 11:15:33 CEST 2016
    Fr 30. Sep 11:15:33 CEST 2016
    Fr 30. Sep 11:15:33 CEST 2016
    

    How to have them properly sync to the ntp server? And how can I lower the polling time? It looks like my servers are running out of sync fast so I need them to retrieve the "correct" time again...

    With "correct" time I mean a time that is the same for all servers. It does not necessarily need to be the exact correct world time (if you call it like that).


    Edit: I have tried the suggested configuration settings. As far as I understand, this is what my server/client configs should look like. In the meantime, I have seen that my .24 server is actually drifting to a worse time. The .20 server is the most accurate one, so I am now using the .20 server to host the ntp server. Sorry for the confusion.

    Server config:

    # Use the local clock
    server 127.127.1.0 prefer
    fudge  127.127.1.0
    driftfile /var/lib/ntp/drift
    broadcastdelay 0.008
    
    # Give localhost full access rights
    restrict default
    
    # Give machines on our network access to query us
    restrict 192.168.178.0 mask 255.255.255.0 nomodify notrap
    
    broadcast 192.168.178.0
    

    For the clients:

    # Point to our network's master time server
    server 192.168.178.20 iburst
    
    restrict default
    
    driftfile /var/lib/ntp/drift
    
    minpoll 4
    maxpoll 5
    

    ntpq -c as and ntpq -c pe on the server:

    ntpq -c as
    ind assid status  conf reach auth condition  last_event cnt
    ===========================================================
      1 41906  963a   yes   yes  none  sys.peer    sys_peer  3
      2 41907  8811   yes  none  none    reject    mobilize  1
    
    ntpq -c pe
         remote           refid      st t when poll reach   delay   offset  jitter
    ==============================================================================
    *LOCAL(0)        .LOCL.           5 l   60   64  377    0.000    0.000   0.000
     192.168.178.0   .BCST.          16 u    -   64    0    0.000    0.000   0.000
    

    Five clients show output similar to this (these servers drift in time):

    ntpq -c as
    ind assid status  conf reach auth condition  last_event cnt
    ===========================================================
      1 62104  9024   yes   yes  none    reject   reachable  2
    
    
    ntpq -c pe
         remote           refid      st t when poll reach   delay   offset  jitter
    ==============================================================================
     hadoop20.xx LOCAL(0)         6 u   27   64  377    0.151  63591.8 33407.0
    

    For two (most likely?) working clients:

    ntpq -c as
    ind assid status  conf reach auth condition  last_event cnt
    ===========================================================
      1  7757  963a   yes   yes  none  sys.peer    sys_peer  3
    
    ntpq -c pe
         remote           refid      st t when poll reach   delay   offset  jitter
    ==============================================================================
    *hadoop20.xx LOCAL(0)         6 u   18   64  377    0.183    7.883   3.015
    

    edit 2:

    I have used sudo service ntp stop, sudo ntpdate 192.168.178.20, wait for ntpdate to finish, sudo service ntp start on all clients. There are still only 2 succeeding clients and 5 rejecting clients.

    The rejecting clients show this output. The offset and jitter values look high because the failing clients drift in time. Maybe they are not trusting the server to update the time because the offset/jitter is so high?

    ntpq -c as
    ind assid status  conf reach auth condition  last_event cnt
    ===========================================================
      1 20981  905a   yes   yes  none    reject    sys_peer  5
    
    ntpq -c pe
         remote           refid      st t when poll reach   delay   offset  jitter
    ==============================================================================
     hadoop20.xx LOCAL(0)         6 u   34   64    3    0.166  18665.9 16201.3
    

    I have also tried the answer at https://askubuntu.com/a/256004; it works for about 30 seconds, then the state changes to "reject" again! The same happens with ntpdate -s 192.168.178.20. It is most likely related to the ntp clients rejecting the time of the server. Is there a way to FORCE them to change the time?

  • j9dy
    j9dy over 7 years
    I've edited my original question. It looks like two clients were able to connect to the server now, but 5 could not. At least that's what I can tell from the output of ntpq -c as and ntpq -c pe
  • MadHatter
    MadHatter over 7 years
    Looks like it's not a firewall problem, as even the refusing clients can see that the server's at stratum 6. Does ntpd syslog anything useful on a refusing client, so we can get some idea of why they're rejecting the server? Also, did you do the ntpdate first? Plus, the cnt=2 on the refusing output above is worrying; you did wait ten minutes as asked, yes?
  • j9dy
    j9dy over 7 years
    I have just used the commands again - I think it was 10 minutes already the last time but now it is for sure. For the failing servers, the cnt=2 remains, it has not changed. I have not restarted meanwhile. I will stop all client ntp services now, use ntpdate 192.168.178.20 on the clients and then restart the ntp service on the clients. cat /var/log/syslog | grep ntp has not given any output for the last hour on the failing clients. Any idea? What about the minpoll and maxpoll in the client config?
  • j9dy
    j9dy over 7 years
    I have added more output to the original question. I think the offset/jitter is really high because the failing clients drift in time. Maybe they do not trust the time of the server?
  • MadHatter
    MadHatter over 7 years
    My feeling is that you really need to get ntpd on the client to tell you what's going on. You'll need to check your (r)syslog config, find out where ntpd is logging and why (on my system (CentOS6) it uses facility daemon and severities 5, 6, and 7). Also, see my edit above.
  • j9dy
    j9dy over 7 years
    Following up on your edit: Installing package adjtimex solved the problem on its own! The installation printed stuff like: Comparing clocks (this will take 70 sec)...done. Adjusting system time by -14.5741 sec/day to agree with CMOS clock...done.. After it finished, sudo service ntp stop, sudo ntpdate 192.168.178.20, sudo service ntp start has solved it!