How to set up local ntp server without internet access on ubuntu?
Don't do this. Seriously. Just don't. People keep coming up with the idea that NTP is designed to allow a bunch of machines all to have the same time. It isn't. It's designed, quite carefully, to allow many machines to all have the closest thing they can to the correct time, which is not the same thing.
If you have access to a window, you can build a half-decent stratum-1 server for about £50, or a good one for £100. You would do much better to build something like that, then point the other clients at it. Correct timestamps are much better than merely self-consistent ones, not least for forensics.
But if you absolutely must do what you're doing, then you need to realise that you're perverting ntpd, and this will mean understanding what you're doing.
On the server
server 127.127.1.0 prefer
fudge 127.127.1.0 stratum 10
means "use the local undisciplined clock as if it were authoritative", which is what you want. I'm not sure why you're forcing it to stratum 10, though; consider dropping the stratum 10, and let the driver supply its default stratum (5 for the local clock driver here).
On the clients
server 192.168.178.24 iburst
fudge 192.168.178.24 stratum 10
makes no sense at all. fudge 127.127.x.y is reserved for forcing the use of various kinds of local clock drivers. It makes no sense to give it any other address. Drop the fudge line from the clients, and just point them at the server. You're also using a closed network, so drop all the security stuff until you get this working:
restrict default
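Putting the pieces above together, the two files might end up looking something like the sketch below. It writes them into a scratch directory rather than touching /etc/ntp.conf, so you can review them first. The function name, the file names and the scratch directory are my own inventions for illustration; the driftfile path is simply the one from your existing config, and the server address is assumed to be 192.168.178.24.

```shell
# Write a minimal server and client ntp.conf into a scratch directory.
# write_ntp_sketch and the output file names are hypothetical, for illustration.
write_ntp_sketch() {
    out_dir=$1
    server_ip=$2

    # Server: serve time from the local, undisciplined clock (driver 127.127.1.0).
    cat > "$out_dir/ntp.conf.server" <<EOF
server 127.127.1.0 prefer
driftfile /var/lib/ntp/drift
restrict default
EOF

    # Client: follow the local server only; no fudge line, no restrictions yet.
    cat > "$out_dir/ntp.conf.client" <<EOF
server $server_ip iburst
driftfile /var/lib/ntp/drift
restrict default
EOF
}

scratch=$(mktemp -d)
write_ntp_sketch "$scratch" 192.168.178.24
cat "$scratch/ntp.conf.client"
```

Once syncing works, the restrict lines can be tightened again one at a time, so that any breakage is easy to attribute.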
If that still doesn't seem to work, we'll need to see the output of ntpq -c as and ntpq -c pe on both the server, and on a badly-behaving client, after at least ten minutes of uninterrupted running.
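When reading the pe output, the things I'd look at first are the tally character in front of the remote name (a * marks the peer ntpd has selected) and the offset and jitter columns, which ntpq prints in milliseconds. As a rough aid, here is a small sketch that condenses that output. The function name is hypothetical, and using 128 ms as the warning limit is my assumption: it is ntpd's default step threshold, not a rule about when a peer gets rejected.

```shell
# Summarise `ntpq -c pe` output read from stdin, one line per peer.
# ntp_peer_report is a hypothetical helper; the optional first argument is
# the offset limit in milliseconds (default 128, ntpd's default step threshold).
ntp_peer_report() {
    awk -v limit_ms="${1:-128}" '
        /^ *remote / || /^=+$/ { next }    # skip the header line and the rule
        NF < 10 { next }                   # ignore blank or truncated lines
        {
            tally  = substr($0, 1, 1)      # first column is the tally character
            remote = $1
            if (tally != " ") remote = substr(remote, 2)
            offset = $(NF - 1)             # milliseconds
            jitter = $NF                   # milliseconds
            mag    = (offset < 0) ? -offset : offset
            state  = (tally == "*") ? "selected" : "not-selected"
            flag   = (mag > limit_ms) ? "OFFSET-TOO-LARGE" : "ok"
            printf "%s %s offset_ms=%s jitter_ms=%s %s\n", remote, state, offset, jitter, flag
        }'
}
```

Typical use would be ntpq -c pe | ntp_peer_report on each client, which makes it quick to spot the machines whose offset has run away.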
Edit: you write in a comment below that "I think the offset/jitter is really high because the failing clients drift in time".
I think you may be right. This chap's blog suggests he had the same experience: that the client clock was so bad that it fooled the local ntpd into thinking that the server was unreliable. He wrote:
the reason for the huge jitter finally seems clear! Our clock drifts so fast that the offset will go up by several seconds through our few measurements
Given that it's the clients whose clocks drift fastest that are failing to sync (marking the server as "reject"), I think you're seeing the same effect. His solution was to use adjtimex to manually tune the kernel clock (adjusting the tick value) until the system clock was less wayward, at which point ntpd had a chance to recognise the server as being OK, and sync to it. You should probably give that a try on the worst client first, and see if it helps.
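To give a feel for the numbers involved: going by the adjtimex man page, one unit of tick changes the clock rate by about 100 ppm (8.64 sec/day) on a kernel with USER_HZ=100, and the frequency value is scaled so that 65536 corresponds to 1 ppm. The sketch below turns a desired rate correction in seconds per day into a tick value plus a residual frequency. The function name is hypothetical, and the nominal tick of 10000 is an assumption that only holds for USER_HZ=100.

```shell
# Convert a desired rate correction (seconds per day, signed) into
# adjtimex-style tick and frequency values. Assumes USER_HZ=100, i.e. a
# nominal tick of 10000 microseconds; suggest_tick is a hypothetical helper.
suggest_tick() {
    awk -v adj="$1" 'BEGIN {
        ppm   = adj / 0.0864                 # 1 ppm is 0.0864 sec/day
        steps = ppm / 100                    # one tick unit is about 100 ppm
        delta = (steps >= 0) ? int(steps + 0.5) : -int(-steps + 0.5)
        resid = ppm - 100 * delta            # what tick cannot express, in ppm
        freq  = resid * 65536                # frequency units: 65536 per ppm
        freq  = (freq >= 0) ? int(freq + 0.5) : -int(-freq + 0.5)
        printf "tick=%d frequency=%d\n", 10000 + delta, freq
    }'
}

suggest_tick -14.5741
```

For a correction of -14.5741 sec/day (about -169 ppm), this suggests a tick of 9998 with the remaining +31 ppm or so carried by the frequency value. Treat the result as a starting point to check against the measured drift, not as a final setting.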
j9dy
Updated on September 18, 2022
Comments
-
j9dy over 1 year
I have tried several guides on how to set up a local ntp server on ubuntu but none seem to work correctly. My servers are drifting heavily in time for some reason and I have to keep their time close together because I run databases that require this.
- I have 8 ubuntu 14.04 LTS servers, none of them has internet access
- I want to run a ntp server on one (or more if that is better) of the servers and have all other servers connect to the ntp server(s) to set the time
Currently, my server (ip .24) runs this /etc/ntp.conf:
server 127.127.1.0 prefer
fudge 127.127.1.0 stratum 10
driftfile /var/lib/ntp/drift
broadcastdelay 0.008
# Give localhost full access rights
restrict 127.0.0.1
# Give machines on our network access to query us
restrict 192.168.178.0 mask 255.255.255.0 nomodify notrap
broadcast 192.168.178.0
And on the "clients":
# Point to our network's master time server
server 192.168.178.24 iburst
fudge 192.168.178.24 stratum 10
restrict default ignore
restrict ::1
restrict 127.0.0.1
restrict 192.168.178.24 mask 255.255.255.255 nomodify notrap noquery
driftfile /var/lib/ntp/drift
minpoll 4 maxpoll 5
Note: I have used Multi-Tabbed Putty to send the following commands to all ntp clients at the same time. I have stopped the ntp services for all except the server, used sudo ntpdate 192.168.178.24 to let them fetch the date and restarted the ntp services afterwards. This succeeded. All servers showed the same date straight after the command finished. After about 10 minutes however, my servers show the following time:
Fr 30. Sep 11:16:53 CEST 2016
Fr 30. Sep 11:15:33 CEST 2016 (server .24)
Fr 30. Sep 11:16:50 CEST 2016
Fr 30. Sep 11:15:33 CEST 2016
Fr 30. Sep 11:17:05 CEST 2016
Fr 30. Sep 11:15:33 CEST 2016
Fr 30. Sep 11:15:33 CEST 2016
Fr 30. Sep 11:15:33 CEST 2016
How to have them properly sync to the ntp server? And how can I lower the polling time? It looks like my servers are running out of sync fast so I need them to retrieve the "correct" time again...
By "correct" time I mean a time that is the same for all servers. It does not necessarily need to be the exact correct world time (if you can call it that).
Edit: I have tried the suggested configuration settings. As far as I understand, this is how my server/client configs should look. In the meantime, I have seen that my .24 server is actually drifting to a worse time. The .20 server is the most accurate one, so I am now using the .20 server to host the ntp server. Sorry for the confusion.
Server config:
# Use the local clock
server 127.127.1.0 prefer
fudge 127.127.1.0
driftfile /var/lib/ntp/drift
broadcastdelay 0.008
# Give localhost full access rights
restrict default
# Give machines on our network access to query us
restrict 192.168.178.0 mask 255.255.255.0 nomodify notrap
broadcast 192.168.178.0
For the clients:
# Point to our network's master time server
server 192.168.178.20 iburst
restrict default
driftfile /var/lib/ntp/drift
minpoll 4 maxpoll 5
ntpq -c as and ntpq -c pe on the server:
ntpq -c as
ind assid status  conf reach auth condition  last_event cnt
===========================================================
  1 41906  963a   yes   yes  none  sys.peer    sys_peer  3
  2 41907  8811   yes  none  none    reject    mobilize  1
ntpq -c pe
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*LOCAL(0)        .LOCL.           5 l   60   64  377    0.000    0.000   0.000
 192.168.178.0   .BCST.          16 u    -   64    0    0.000    0.000   0.000
Five clients show output similar to this (these servers drift in time):
ntpq -c as
ind assid status  conf reach auth condition  last_event cnt
===========================================================
  1 62104  9024   yes   yes  none    reject   reachable  2
ntpq -c pe
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 hadoop20.xx     LOCAL(0)         6 u   27   64  377    0.151  63591.8 33407.0
For two (most likely?) working clients:
ntpq -c as
ind assid status  conf reach auth condition  last_event cnt
===========================================================
  1  7757  963a   yes   yes  none  sys.peer    sys_peer  3
ntpq -c pe
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*hadoop20.xx     LOCAL(0)         6 u   18   64  377    0.183    7.883   3.015
edit 2:
I have used sudo service ntp stop, sudo ntpdate 192.168.178.20, wait for ntpdate to finish, sudo service ntp start on all clients. There are still only 2 succeeding clients and 5 rejecting clients. The rejecting clients show this output. The delay + offset values look high because the failing clients drift in time. Maybe they are not trusting the server to update the time because the delay/offset is so high?
ntpq -c as
ind assid status  conf reach auth condition  last_event cnt
===========================================================
  1 20981  905a   yes   yes  none    reject    sys_peer  5
ntpq -c pe
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 hadoop20.xx     LOCAL(0)         6 u   34   64    3    0.166  18665.9 16201.3
I have also tried using this https://askubuntu.com/a/256004 answer, it works for about 30 seconds then the state changes to "reject" again! Same for ntpdate -s 192.168.178.20. It is most likely related to the ntp clients rejecting the time of the server. Is there a way to FORCE them to change the time?
-
j9dy over 7 years
I've edited my original question. It looks like two clients were able to connect to the server now, but 5 could not. At least that's what I can tell from the output of ntpq -c as and ntpq -c pe
-
MadHatter over 7 years
Looks like it's not a firewall problem, as even the refusing clients can see that the server's at stratum 6. Does ntpd syslog anything useful on a refusing client, so we can get some idea of why they're rejecting the server? Also, did you do the ntpdate first? Plus, the cnt=2 on the refusing output above is worrying; you did wait ten minutes as asked, yes?
-
j9dy over 7 years
I have just used the commands again - I think it was 10 minutes already the last time but now it is for sure. For the failing servers, the cnt=2 remains, it has not changed. I have not restarted meanwhile. I will stop all client ntp services now, use ntpdate 192.168.178.20 on the clients and then restart the ntp service on the clients. cat /var/log/syslog | grep ntp has not given any output for the last hour on the failing clients. Any idea? What about the minpoll and maxpoll in the client config?
-
j9dy over 7 years
I have added more output to the original question. I think the offset/jitter is really high because the failing clients drift in time. Maybe they do not trust the time of the server?
-
MadHatter over 7 years
My feeling is that you really need to get ntpd on the client to tell you what's going on. You'll need to check your (r)syslog config, find out where ntpd is logging and why (on my system (CentOS6) it uses facility daemon and severities 5, 6, and 7). Also, see my edit above.
-
j9dy over 7 years
Following up on your edit: Installing package adjtimex solved the problem on its own! The installation printed stuff like:
Comparing clocks (this will take 70 sec)...done.
Adjusting system time by -14.5741 sec/day to agree with CMOS clock...done.
After it finished, sudo service ntp stop, sudo ntpdate 192.168.178.20, sudo service ntp start has solved it!