How to keep time on resumed KVM guest with libvirt?


Solution 1

The Problem

I've got the same problem and I haven't found a good solution. Here's what I found:

The problem is that after resume, the system and hardware clock times on the guest are different:

root@guest:~# date; hwclock
Sat Oct 11 13:09:38 UTC 2014
Sat Oct 11 13:10:42 2014  -0.454380 seconds

On the host, they agree:

root@four:~# date; hwclock
Sat Oct 11 13:11:35 UTC 2014
Sat Oct 11 13:11:36 2014  -1.000372 seconds

The solution would be to run hwclock --hctosys on the guest after it's been resumed. However, I haven't found a way to do this with changes on the guest system only, as the guest doesn't notice that it is suspended and resumed.

QEmu Guest Agent

You can run the QEMU Guest Agent on the guest and have the host tell it to update the guest system clock from the guest hardware clock. However, the page mentions that the guest agent makes the host and guest vulnerable to attacks from each other because of issues with a JSON parser (I believe the affected code also runs on the host, but I'm not sure about that). Anyway, here's how to set that up:

  1. Set up a virtio serial channel for the agent as mentioned in the libvirt wiki (see also libvirt domain format documentation).

  2. After the serial channel is available, install and start the QEMU Guest Agent on the guest. (Debian: apt-get install --no-install-recommends qemu-guest-agent.)

  3. Trigger the clock offset by suspending, waiting and resuming. Then run the following command on the host to correct it (backup is the name of the domain): virsh qemu-agent-command backup '{"execute":"guest-set-time"}' The wiki page mentions that using virsh qemu-agent-command is unsupported, but I haven't found any other command that does the job.
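For step 1, the channel element usually looks like the one written by the sketch below. The helper name write_agent_channel and the file path in the usage comment are made up for illustration; org.qemu.guest_agent.0 is the channel name the agent looks for by default. I'm assuming a reasonably recent libvirt that fills in the socket path itself; older versions may need an explicit <source> element.

```shell
#!/bin/sh
# Write the virtio channel definition for the QEMU guest agent to the file
# given as the first argument. The element belongs inside <devices> in the
# domain XML.
write_agent_channel() {
    cat > "$1" <<'EOF'
<channel type='unix'>
  <target type='virtio' name='org.qemu.guest_agent.0'/>
</channel>
EOF
}

# Usage on the host (domain name "backup" as in the command above):
#   write_agent_channel /tmp/agent-channel.xml
#   virsh edit backup    # paste the element inside <devices>
# Then shut the guest down and start it again so the new device appears.
```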

I found two discussions about having libvirt call guest-set-time automatically on resume from suspend. However, nothing has been implemented yet as far as I could see.
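Until libvirt does this by itself, the resume and the time sync can be wrapped in one host-side function. This is only a sketch: the function name resume_and_sync, the retry count and the one-second delay are my own choices, and it assumes the guest was paused with virsh suspend (a domain saved to disk would be brought back with virsh start or virsh restore instead). Called without a time argument, guest-set-time asks the agent to set the guest system clock from the guest RTC.

```shell
#!/bin/sh
# Resume a paused domain, then ask the guest agent to correct the guest clock.
# $1 is the libvirt domain name.
resume_and_sync() {
    dom=$1
    virsh resume "$dom" || return 1

    # The agent may not answer immediately after resume, so retry a few times.
    tries=0
    while [ "$tries" -lt 5 ]; do
        if virsh qemu-agent-command "$dom" '{"execute":"guest-set-time"}'; then
            return 0
        fi
        tries=$(( tries + 1 ))
        sleep 1
    done
    return 1
}

# Usage on the host:
#   resume_and_sync backup
```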

I found information on how to submit commands to the guest agent on the wiki of stoney-cloud.org.

I've also tried setting tickpolicy="catchup" in the libvirt timer configuration but this didn't solve the problem.

NTP

An alternative to using the agent would be to use an ntp daemon or to call ntpdate periodically from a cron job. I wouldn't recommend the latter, as it can cause the time to go backwards, which can confuse programs (for example, the Dovecot IMAP server doesn't try to handle time going backwards and can terminate).

I tried the following ntp daemons:

  • openntpd: Corrects time very slowly at a rate of about 2 seconds per 60 minutes in my test. The time offset was 120 seconds. Also, openntpd throws an error if the time offset is too large and, in my test, completely fails to correct time in that case. Advantages of openntpd: Can run as regular user in chroot.

  • chrony: Corrects a time offset of 120 seconds in 30 minutes in my test. chrony can be configured to run as regular user. chroot support is not implemented. NTP server polling interval can be configured for each NTP server.

  • systemd-timesyncd: Corrects a time offset of 120 seconds in 30 seconds in my test. Runs as regular user by default. However, the polling interval of NTP servers increases up to 2048 seconds, so that a suspend/resume wouldn't be detected until 34 minutes after the resume in the worst case. This does not seem to be configurable. Also, I've observed timesyncd step the time backwards, which causes the same problems as calling ntpdate from a cron job (see above).

chrony solves the problem. Openntpd isn't suitable because its correction rate is too low and doesn't seem to be configurable. systemd-timesyncd doesn't entirely solve the problem either, because its polling interval is not configurable.

I tested the following Debian versions of the NTP daemons: openntpd 20080406p-10, chrony 1.30-1 and systemd 215-5+b1.

Solution 2

Many virtualization host operations on the guest can result in a pause and resume, which negatively affects the guest's system clock. For example, cloning a VM pauses it while the clone is made, and the guest clock is behind afterwards. To get NTP to sync the clock you then need to restart the guest, which is not a good solution in all cases. Alternatively, you could just restart ntpd in the guest, but that is not optimal either. Ideally there would be an event (VM resumed) available that you could optionally use for this type of correction to the guest.

After spending some time researching this I decided to use the host clock directly as a reference for the CentOS 7 guest OS system clock.

Instead of running ntpd in the guest, I decided that every 15 minutes I would set, via crontab, the guest system clock from the guest's hardware clock. The guest's hardware clock reflects the time at the virtualization host, which is controlled via ntpd running on the virtualization host. This gives me reliable time in the guest OS. Worst case, the clock may be off for up to 15 minutes before it gets synchronized to the proper time after resuming the guest.

# crontab -e

0,15,30,45 * * * * /sbin/hwclock --hctosys

It would be much better to have an event available at the guest that would initiate a time synchronization when the guest was resumed, but apparently that is not available. The crontab approach is a workaround in that it makes a hwclock call every 15 minutes. It gets the job done, but not as elegantly as I would like.
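A possible refinement of the cron approach is to check more often but only touch the system clock when it has actually drifted from the hardware clock. The following is a sketch under assumptions, not something from the setup above: the function names, the 2-second default threshold and the rtc0 device name are mine, and reading /sys/class/rtc/rtc0/since_epoch assumes a Linux guest whose RTC runs in UTC.

```shell
#!/bin/sh
# Succeed (exit status 0) when two epoch timestamps differ by more than
# the threshold. $1 = system time, $2 = RTC time, $3 = threshold in seconds.
drifted() {
    diff=$(( $1 - $2 ))
    if [ "$diff" -lt 0 ]; then
        diff=$(( 0 - diff ))
    fi
    [ "$diff" -gt "$3" ]
}

# Compare the system clock with the RTC and step the system clock only
# when the difference exceeds the threshold (default: 2 seconds).
sync_if_drifted() {
    threshold=${1:-2}
    sys=$(date +%s)
    rtc=$(cat /sys/class/rtc/rtc0/since_epoch)
    if drifted "$sys" "$rtc" "$threshold"; then
        /sbin/hwclock --hctosys
    fi
}

# Usage from a script run by cron, e.g. once a minute:
#   sync_if_drifted 2
```

Run once a minute, this would shrink the worst-case window from 15 minutes to about one minute while leaving the clock alone when nothing has happened. It still steps the clock, so the caveats about time jumps apply.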

Solution 3

libvirt has supported guest time sync since 2015. On Debian Stretch and later, look for the option SYNC_TIME in /etc/default/libvirt-guests:

# If non-zero, try to sync guest time on domain resume. Be aware, that
# this requires guest agent with support for time synchronization
# running in the guest. For instance, qemu-ga doesn't support guest time
# synchronization on Windows guests, but Linux ones. By default, this
# functionality is turned off.
#SYNC_TIME=1

You can test time sync from the host system with:

virsh qemu-agent-command INSERT_YOUR_DOMAIN_HERE '{"execute":"guest-set-time"}'

This command should return {"return":{}} on success.
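Turning the option on amounts to uncommenting that one line. A small sketch of doing it from a script follows; the function name enable_sync_time is made up, and the default path is the Debian one mentioned above. I would not restart the libvirt-guests service just to apply the change, since stopping that service can suspend or shut down running guests depending on its configuration.

```shell
#!/bin/sh
# Set SYNC_TIME=1 in the libvirt-guests defaults file.
# $1 = path to the file (defaults to the Debian location).
enable_sync_time() {
    conf=${1:-/etc/default/libvirt-guests}
    if grep -Eq '^#?SYNC_TIME=' "$conf"; then
        # Rewrite the existing (possibly commented-out) line in place.
        tmp=$(mktemp)
        sed -E 's/^#?SYNC_TIME=.*/SYNC_TIME=1/' "$conf" > "$tmp" && cat "$tmp" > "$conf"
        rm -f "$tmp"
    else
        # No such line yet: append one.
        echo 'SYNC_TIME=1' >> "$conf"
    fi
}

# Usage on the host, as root:
#   enable_sync_time
```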

Solution 4

kvm-clock syncs the guest time to host time on guest startup. You should use an NTP client in the guest, and shutdown/startup instead of using suspend/resume.

Solution 5

Since Linux kernel 4.11 there is PTP-KVM (Precision Time Protocol - Kernel Virtual Machine): PTP is comparable to NTP, but with higher precision and meant for local networks. The Linux host system exports its current time to any guest system for efficient reading as /dev/ptp_kvm (or /dev/ptp0). chrony is capable of using that inside the VM:

modprobe ptp_kvm
echo ptp_kvm >/etc/modules-load.d/ptp_kvm.conf
apt-get install chrony
echo "refclock PHC /dev/ptp0 poll 2" >/etc/chrony/conf.d/ptp.conf
systemctl restart chronyd

As a prerequisite, the time on the host must be synchronized too.
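If the guest has more than one PTP clock, the KVM one is not necessarily /dev/ptp0. A sketch for finding it through sysfs follows. The function name find_kvm_ptp is mine, and I'm assuming the driver reports a clock name containing "KVM" in /sys/class/ptp/ptpN/clock_name; check the output on your own system before putting it into the refclock line.

```shell
#!/bin/sh
# Print the /dev path of the first PTP clock whose name mentions KVM.
# $1 = sysfs directory to scan (defaults to /sys/class/ptp).
find_kvm_ptp() {
    sysfs=${1:-/sys/class/ptp}
    for d in "$sysfs"/ptp*; do
        # Skip entries without a readable clock_name (or an unmatched glob).
        [ -r "$d/clock_name" ] || continue
        case $(cat "$d/clock_name") in
            *KVM*)
                echo "/dev/$(basename "$d")"
                return 0
                ;;
        esac
    done
    return 1
}

# Usage in the guest, after modprobe ptp_kvm:
#   find_kvm_ptp
```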

You might also want to configure makestep 10, depending on the difference you are willing to tolerate: in this example a difference below 10 s would be corrected by speeding up the VM clock, while larger differences would be stepped, with all the (bad) consequences.
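Note that chrony's makestep directive takes two values, a threshold in seconds and a limit on the number of clock updates during which stepping is allowed. A resume can happen long after chronyd started, so the sketch below uses a limit of -1 (no limit). That limit, the function name write_makestep_conf and the file name makestep.conf are my assumptions; the conf.d directory is the one used for ptp.conf above.

```shell
#!/bin/sh
# Write a makestep directive into a chrony conf.d directory.
# $1 = directory (defaults to /etc/chrony/conf.d)
# $2 = step threshold in seconds (defaults to 10)
write_makestep_conf() {
    dir=${1:-/etc/chrony/conf.d}
    threshold=${2:-10}
    # "-1" disables the update-count limit, so chronyd may step the clock
    # at any time, not only during its first few updates after start.
    printf 'makestep %s -1\n' "$threshold" > "$dir/makestep.conf"
}

# Usage in the guest, as root:
#   write_makestep_conf
#   systemctl restart chronyd
```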

I found this in the Red Hat documentation, so kudos to them.


user41854
Author by

user41854

Updated on September 18, 2022

Comments

  • Hristo Hristov
    Hristo Hristov over 12 years
    Yes, I can confirm it syncs on startup, because when I do shutdown/startup of the guest everything is fine. Using ntp is not a solution for many reasons (it is a workaround, it panics when the time difference is huge, it requires access to time server). I am searching for a way to solve the problem with suspend/resume, because this is an interesting, nice and default option in libvirt.
  • David Corsalini
    David Corsalini over 12 years
    suspend is 1) migrate VM state to file and 2) destroy. When you resume from suspend, the VM state is restored (migrated from file back to VM memory). This state will include the current timestamp. So yes, it is default, but no, timing still matters, and the time has to come from somewhere, and this is where NTP should come in. I doubt another clock source will help but you can try with acpi_pm.
  • Brian Cain
    Brian Cain over 12 years
  • David Corsalini
    David Corsalini over 12 years
    @Brian Cain this is highly arguable, especially with no explanation or reasoning behind the statement. To provide a proof link: docs.redhat.com/docs/en-US/…
  • Naitsirk
    Naitsirk over 3 years
    Since systemd release 236 (from December 2017), the timesyncd polling interval can be configured using PollIntervalMinSec and PollIntervalMaxSec. I can confirm that with PollIntervalMaxSec=60, resumed VMs pick up the current time pretty quickly.
  • sampi
    sampi over 3 years
    On at least Debian Bullseye this is not sufficient. A channel also needs to be configured for each guest, as described in the answer by Milan.