OpenVZ containers will not start after a reboot
After a hardware reboot, the steps below brought the failed containers back up, one at a time:
# ploop check -F /vz/private/139/root.hdd/root.hdd
# ploop mount /vz/private/139/root.hdd/DiskDescriptor.xml
# fdisk -l /dev/ploop56824
# e2fsck /dev/ploop56824p1
# vzctl start 139
Dump file /vz/dump/Dump.131 exists, trying to restore from it
Restoring container ...
Unmounting device /dev/ploop56824
Opening delta /vz/private/131/root.hdd/root.hdd
Adding delta dev=/dev/ploop56824 img=/vz/private/131/root.hdd/root.hdd (rw)
Mounting /dev/ploop56824p1 at /vz/root/131 fstype=ext4 data='balloon_ino=12,'
Container is mounted
undump...
Adding IP address(es): 78.129.146.84
Setting CPU limit: 100
Setting CPU units: 1000
Setting CPUs: 1
Setting iolimit: 67108864 bytes/sec
resume...
Container start in progress...
Restoring completed successfully
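The sequence above can be wrapped in a small helper for repeated use. This is only a sketch: the function name `recovery_commands` is hypothetical, and it assumes the standard `/vz/private/<CTID>/root.hdd` layout seen in the commands above. It prints the commands for review instead of running them, because the ploop device name is only known after `ploop mount` has reported it.

```shell
#!/bin/sh
# Hypothetical helper: print the recovery commands from the answer for one
# container so they can be reviewed before being run by hand.
#   $1 = container ID (CTID)
#   $2 = ploop device, as reported after "ploop mount" (e.g. /dev/ploop56824)
recovery_commands() {
    ctid=$1
    device=$2
    # Assumed layout: /vz/private/<CTID>/root.hdd, as in the answer above.
    hdd="/vz/private/${ctid}/root.hdd"
    echo "ploop check -F ${hdd}/root.hdd"
    echo "ploop mount ${hdd}/DiskDescriptor.xml"
    echo "fdisk -l ${device}"
    echo "e2fsck ${device}p1"
    echo "vzctl start ${ctid}"
}
```

Calling `recovery_commands 139 /dev/ploop56824` prints the same five commands listed in the answer, in the same order.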
Thank you, helped a lot!
Author by
Jason
Updated on September 18, 2022
Comments
-
Jason over 1 year
vzctl start 192
Dump file /vz/dump/Dump.192 exists, trying to restore from it
Restoring container ...
Opening delta /vz/private/192/root.hdd/root.hdd
Data cluster 1112 beyond EOF, vsec=47137... FATAL
Error in ploop_check (check.c:547): Fatal errors were found, image /vz/private/192/root.hdd/root.hdd is not repaired
Error in check_deltas (check.c:631): /vz/private/192/root.hdd/root.hdd : irrecoverable errors
Failed to mount image: Error in check_deltas (check.c:631): /vz/private/192/root.hdd/root.hdd : irrecoverable errors [11]
Starting container...
Opening delta /vz/private/192/root.hdd/root.hdd
Data cluster 1112 beyond EOF, vsec=47137... FATAL
Error in ploop_check (check.c:547): Fatal errors were found, image /vz/private/192/root.hdd/root.hdd is not repaired
Error in check_deltas (check.c:631): /vz/private/192/root.hdd/root.hdd : irrecoverable errors
Failed to mount image: Error in check_deltas (check.c:631): /vz/private/192/root.hdd/root.hdd : irrecoverable errors [11]
I ran a ploop check:
# ploop check -f /vz/private/192/root.hdd/root.hdd
Reopen rw /vz/private/192/root.hdd/root.hdd
Data cluster 4704 beyond EOF, vsec=6177... FATAL
Error in ploop_check (check.c:547): Fatal errors were found, image /vz/private/192/root.hdd/root.hdd is not repaired
# ploop check -F /vz/private/192/root.hdd/root.hdd
Reopen rw /vz/private/192/root.hdd/root.hdd
Data cluster 4704 beyond EOF, vsec=6177... Fixed
Error in ploop_check (check.c:572): Dirty flag is set
# vzctl start 192
Dump file /vz/dump/Dump.192 exists, trying to restore from it
Restoring container ...
Opening delta /vz/private/192/root.hdd/root.hdd
Error in ploop_check (check.c:572): Dirty flag is set
Adding delta dev=/dev/ploop46026 img=/vz/private/192/root.hdd/root.hdd (rw)
/dev/ploop46026p1: UNEXPECTED INCONSISTENCY; RUN fsck MANUALLY.
(i.e., without -a or -p options)
Error in e2fsck (fsutils.c:288): e2fsck failed (exit code 4)
Failed to mount image: Error in e2fsck (fsutils.c:288): e2fsck failed (exit code 4) [41]
Starting container...
Opening delta /vz/private/192/root.hdd/root.192
Adding delta dev=/dev/ploop46026 img=/vz/private/200/root.hdd/root.hdd (rw)
/dev/ploop46026p1: UNEXPECTED INCONSISTENCY; RUN fsck MANUALLY.
(i.e., without -a or -p options)
Error in e2fsck (fsutils.c:288): e2fsck failed (exit code 4)
Failed to mount image: Error in e2fsck (fsutils.c:288): e2fsck failed (exit code 4) [41]
EDIT 2:
This fixed one of them:
ploop check -F /vz/private/183/root.hdd/root.hdd
ploop mount /vz/private/183/root.hdd/DiskDescriptor.xml
fdisk -l /dev/ploop32942
e2fsck /dev/ploop32942p1
However, for one of them, I then get this on startup:
# vzctl start 183
Dump file /vz/dump/Dump.183 exists, trying to restore from it
Restoring container ...
Opening delta /vz/private/183/root.hdd/root.hdd
Adding delta dev=/dev/ploop32942 img=/vz/private/183/root.hdd/root.hdd (rw)
Mounting /dev/ploop32942p1 at /vz/root/183 fstype=ext4 data='balloon_ino=12,usrjquota=aquota.user,grpjquota=aquota.group,jqfmt=vfsv0,'
Container is mounted
undump...
Restore error, undump failed
Container restore failed (try to check kernel messages, e.g. "dmesg | tail") : Invalid argument
Starting container...
Warning: rmdir //.cpt_hardlink_dir_a920e4ddc233afddc9fb53d26c392319 failed: Read-only file system
Adding IP address(es):
mkdir: cannot create directory `/etc/sysconfig/network-scripts/bak': Read-only file system
/bin/cp: cannot create regular file `/etc/sysconfig/network-scripts/bak/': Is a directory
ERROR: Unable to backup interface config files
Setting CPU limit: 400
Setting CPU units: 1000
Setting CPUs: 16
Container start failed (try to check kernel messages, e.g. "dmesg | tail")
Killing container ...
Container was stopped
Unmounting file system at /vz/root/183
Unmounting device /dev/ploop32942
Container is unmounted
After that failure, I have to run e2fsck again before I can retry, but starting the container then fails in the same way.
EDIT 3:
Could it be that fsck does not work correctly because the disk is GPT?
-
Devon about 9 years
GPT would not prevent fsck. The fsck should be run on the partition, not the whole device. With ploop, the partition is almost always p1. Ploop has a high risk of irrecoverable errors during hard reboots, which is one reason why I moved away from it.
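To illustrate the point about running fsck on the partition, here is a minimal sketch. The function name `fsck_target` is hypothetical, and it assumes the partition is `p1`, which the comment above says is almost always the case. Given a ploop device, it prints the path that e2fsck should be pointed at.

```shell
#!/bin/sh
# Hypothetical helper: map a ploop device to the partition e2fsck should check.
# Assumes the filesystem lives on partition p1.
fsck_target() {
    case "$1" in
        /dev/ploop[0-9]*p[0-9]*)
            # Already a partition path such as /dev/ploop32942p1: use it as is.
            printf '%s\n' "$1" ;;
        /dev/ploop[0-9]*)
            # Whole ploop device: append the assumed partition suffix.
            printf '%sp1\n' "$1" ;;
        *)
            echo "not a ploop device: $1" >&2
            return 1 ;;
    esac
}
```

For example, `e2fsck "$(fsck_target /dev/ploop32942)"` would check `/dev/ploop32942p1`, matching the command used in EDIT 2.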
-
Jason about 9 years
How should one reboot the node correctly to try to avoid errors?