KVM virtual machine won't start and gives the following error

centos kvm-virtualization centos7 lvm

6,538

Solution 1

Your VM failed to mount its root filesystem before the pivot from the initramfs to the real system stored on "/root". This is very likely a problem with your initial ramdisk being unable to find the root filesystem: either the filesystem isn't there, the initramfs doesn't have the tools required to mount it, or you have some kind of volume or boot configuration error.

Given the evidence provided, I would say this is likely a problem at the guest OS level, or you're missing a disk that used to be attached to this VM and held part of the root filesystem, for example if the guest itself uses LVM across multiple virtual disks.

It's also possible that an update affected your boot-time arguments or the packages installed in your initramfs. Usually that only happens when you upgrade the kernel. Other possible causes are changes to your bootloader configuration or, sometimes, changing the hostname or volume names on an LVM-backed machine.

In addition, it's entirely possible that a genuine filesystem failure has caused this, either on the host or the guest. If your filesystems are not configured to be checked regularly, consider checking the underlying disks, the host filesystems, and the guest filesystems as an easy troubleshooting step.

Solution 2

You need to understand that there is some kind of damage to the virtual disk image, not the physical disk. I would try the following two things:

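Run qemu-img check on the image to verify that it has no image-level damage. Note that the check only works for image formats that support it, such as qcow2; for a raw volume like the one in the question's libvirt log (format=raw), qemu-img reports that checks are not supported.
Boot the VM with a live CD ISO attached and examine your virtual disks. At the very least, this might help with recovering the data in case you need to rebuild the machine.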

Solution 3

The line "List of all partitions:" is supposed to be followed by a list of all partitions (obviously). See for example Linux 3.10 init/do_mounts.c:418, part of mount_block_root():

      printk("List of all partitions:\n");
      printk_all_partitions();
      printk("No filesystem could mount root, tried: ");
      for (p = fs_names; *p; p += strlen(p)+1)
              printk(" %s", p);
      printk("\n");
#ifdef CONFIG_BLOCK
      __bdevname(ROOT_DEV, b);
#endif
      panic("VFS: Unable to mount root fs on %s", b);

(The for (p = fs_names; ...) printk(...); loop is simply a C-fancy way to print a zero-terminated list of zero-terminated strings. I could explain how it works, but that would take several paragraphs and is unrelated to your problem.)

Also notice "Unable to mount root fs on unknown-block(0,0)" in the kernel panic message. The unknown-block(0,0) part is certainly a red flag.

printk_all_partitions() is defined in Linux 3.10 block/genhd.c:738 and basically just prints a list of all disk partitions known to the kernel at that point.

We can conclude that something has caused your KVM VM to lose its disk(s). (That kind of thing can also cause a system to experience all kinds of incorrect behavior, including the freezing that you experienced.) Investigate and fix that and your VM should come back to life, assuming that the data is still fine.


Sergiu Mihuleac

Updated on September 18, 2022

Comments

Sergiu Mihuleac over 1 year

I have a KVM virtual machine running a web server. Yesterday, without any modification to the server or the host, this virtual machine stopped responding. As far as I can see, it freezes at boot time while permanently using 100% of one CPU core. The VM has 2 CPUs allocated, but it only uses one, as shown in htop on the host.

The only way to shut it down is with virsh destroy. As far as I can see, the VM gets stuck at boot time with the following error:

[    1.201865] List of all partitions:
[    1.202927] No filesystem could mount root, tried: 
[    1.204415] Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0)
[    1.206842] CPU: 2 PID: 1 Comm: swapper/0 Not tainted 3.10.0-514.2.2.el7.x86_64 #1
[    1.209032] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
[    1.210684]  ffffffff818b4340 00000000de564bbf ffff880139abfd60 ffffffff816861cc
[    1.212973]  ffff880139abfde0 ffffffff8167f5d3 ffffffff00000010 ffff880139abfdf0
[    1.215219]  ffff880139abfd90 00000000de564bbf 00000000de564bbf ffff880139abfe00
[    1.217494] Call Trace:
[    1.218221]  [<ffffffff816861cc>] dump_stack+0x19/0x1b
[    1.219712]  [<ffffffff8167f5d3>] panic+0xe3/0x1f2
[    1.221149]  [<ffffffff81b0a602>] mount_block_root+0x2a1/0x2b0
[    1.222849]  [<ffffffff81b0a664>] mount_root+0x53/0x56
[    1.224358]  [<ffffffff81b0a7a3>] prepare_namespace+0x13c/0x174
[    1.226092]  [<ffffffff81b0a270>] kernel_init_freeable+0x1f5/0x21c
[    1.227882]  [<ffffffff81b099db>] ? initcall_blacklist+0xb0/0xb0
[    1.229656]  [<ffffffff81674630>] ? rest_init+0x80/0x80
[    1.231220]  [<ffffffff8167463e>] kernel_init+0xe/0xf0
[    1.232716]  [<ffffffff81696718>] ret_from_fork+0x58/0x90
[    1.234304]  [<ffffffff81674630>] ? rest_init+0x80/0x80

As far as I understand, there is a problem with mounting the root filesystem. I don't know what I can do in this situation or what caused the problem. If you have any information that may be relevant to my situation, please tell me.

I'm using lvm storage on CentOS 7: Linux srv1.host.ro 3.10.0-514.6.2.el7.x86_64 #1 SMP Thu Feb 23 03:04:39 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

I also checked the SMART data on the primary HDD where the virtual machine is stored, and it looks good:

ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   175   174   021    Pre-fail  Always       -       2250
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       66
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   078   078   000    Old_age   Always       -       16090
 10 Spin_Retry_Count        0x0032   100   253   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       66
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       56
193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       517
194 Temperature_Celsius     0x0022   108   097   000    Old_age   Always       -       35
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       0

The last libvirt log entries are:

qemu: terminating on signal 15 from pid 1138
2017-02-23 09:15:02.440+0000: shutting down
2017-02-23 09:15:21.080+0000: starting up libvirt version: 2.0.0, package: 10.el7_3.4 (CentOS BuildSystem <http://bugs.centos.org>, 2017-01-17-23:37:48, c1bm.rdu2.centos.org), qemu version: 1.5.3 (qemu-kvm-1.5.3-126.el7_3.3), hostname: srv1.host.ro
LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin QEMU_AUDIO_DRV=none /usr/libexec/qemu-kvm -name Nginx -S -machine pc-i440fx-rhel7.0.0,accel=kvm,usb=off -cpu Penryn -m 4096 -realtime mlock=off -smp 4,sockets=4,cores=1,threads=1 -uuid be91646a-f5d3-494d-8a51-e9e598bfdf52 -nographic -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/domain-2-Nginx/monitor.sock,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=discard -no-hpet -no-shutdown -global PIIX4_PM.disable_s3=1 -global PIIX4_PM.disable_s4=1 -boot strict=on -device ich9-usb-ehci1,id=usb,bus=pci.0,addr=0x5.0x7 -device ich9-usb-uhci1,masterbus=usb.0,firstport=0,bus=pci.0,multifunction=on,addr=0x5 -device ich9-usb-uhci2,masterbus=usb.0,firstport=2,bus=pci.0,addr=0x5.0x1 -device ich9-usb-uhci3,masterbus=usb.0,firstport=4,bus=pci.0,addr=0x5.0x2 -drive file=/dev/Storage/Nginx,format=raw,if=none,id=drive-ide0-0-0,cache=none,aio=native -device ide-hd,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0,bootindex=1 -drive if=none,id=drive-ide0-0-1,readonly=on -device ide-cd,bus=ide.0,unit=1,drive=drive-ide0-0-1,id=ide0-0-1 -netdev tap,fd=26,id=hostnet0 -device rtl8139,netdev=hostnet0,id=net0,mac=52:54:00:2c:e1:cd,bus=pci.0,addr=0x3 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x4 -msg timestamp=on
char device redirected to /dev/pts/0 (label charserial0)

Sergiu Mihuleac about 7 years

Could a failure to mount an NFS path listed in fstab cause this behavior, or is it strictly tied to the system partitions?