LVM not coming up after reboot, couldn't find device with uuid


If I understood correctly, you have already fixed the volume, even though all you're left with is a lost+found directory which may or may not contain critical files.

What is going on now that's blocking the VM from booting? It still can't find the boot device?

Your fdisk -l output seems a bit off to me. Have you considered the possibility that only the partition table was damaged? In that scenario your snapshot may be helpful, and in the best case you won't even need a(nother) fsck. But you'll need a tool to find the partition offsets - I've used testdisk successfully more than once.
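
If you do go down that route, a minimal testdisk session looks roughly like this - the device name is an assumption, and since testdisk is interactive the menu choices are shown as comments:

    # run against the snapshot/copy of the disk, never your only copy
    testdisk /dev/sda
    # in the menus: select the disk -> partition table type (Intel/PC here)
    # -> Analyse -> Quick Search (then Deeper Search if nothing turns up)
    # review the partitions it proposes before choosing Write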

In the worst case scenario, if you need to scrape anything from the volume, forensic tools like PhotoRec or Autopsy/The Sleuth Kit may prove useful.
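
For what it's worth, a rough sketch of how those would be pointed at an image of the snapshot (the image path, offset and inode number are just placeholders):

    # carve recoverable files by signature with PhotoRec (interactive)
    photorec /path/to/sda.img

    # or walk whatever filesystem structures remain with The Sleuth Kit
    # (-o is the partition's start offset in sectors when using a whole-disk image)
    fls -r -p -o <offset> /path/to/sda.img      # list file entries recursively, with full paths
    icat -o <offset> /path/to/sda.img <inode>   # dump one file's contents by inode number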

If none of this works, post the output of lsblk -o NAME,RM,SIZE,RO,TYPE,MAJ:MIN -fat as well (those flags just show as much information as possible), along with any relevant dmesg output.
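
That is, something along these lines (the dmesg filter is only a suggestion):

    lsblk -o NAME,RM,SIZE,RO,TYPE,MAJ:MIN -fat
    dmesg | grep -iE 'sda|dm-|lvm|ext'    # anything disk-, LVM- or filesystem-related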



Updated on September 18, 2022

Comments

  • Smudge
    Smudge over 1 year

    Had a VM that was, up until recently, working without issue, but it needed to be rebooted after some configuration changes. However, after rebooting, the VM didn't come back up, saying it couldn't find the root device (which was an LVM volume under /dev/mapper).

    Booting into recovery mode, I saw that the filesystems under /dev/mapper and /dev/dm-* did indeed not exist.

    The disk should be laid out as follows (a quick sanity-check sketch of the healthy stack follows the list):

    • /dev/sda1 as the boot partition
    • /dev/sda2 as an extended partition containing
    • /dev/sda5 and /dev/sda6 as LVM partitions
    • /dev/sda{5,6} are both PVs in a single VG
    • with 2 LVs for the root FS and swap
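
    When the stack is healthy, a quick check of that layout looks something like this (names as per the mapper devices further down):

      pvs      # /dev/sda5 and /dev/sda6 both listed as PVs
      vgs      # a single VG spanning both PVs
      lvs      # two LVs: root (~119 GiB) and swap_1 (~888 MiB)
      lsblk    # sda5/sda6 with of1--dev--server-root and -swap_1 stacked on top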

    Doing an lvm pvdisplay gives me:

      Couldn't find device with uuid '8x38hf-mzd7-xTes-y6IV-xRMr-qrNP-0dNnLi'.
      Couldn't find device with uuid '8x38hf-mzd7-xTes-y6IV-xRMr-qrNP-0dNnLi'.
      Couldn't find device with uuid '8x38hf-mzd7-xTes-y6IV-xRMr-qrNP-0dNnLi'.
      --- Physical volume ---
      PV Name               unknown device
      VG Name               of1-server-lucid
      PV Size               19.76 GiB / not usable 2.00 MiB
      Allocatable           yes (but full)
      PE Size               4.00 MiB
      Total PE              5058
      Free PE               0
      Allocated PE          5058
      PV UUID               8x38hf-mzd7-xTes-y6IV-xRMr-qrNP-0dNnLi
    
      --- Physical volume ---
      PV Name               /dev/sda6
      VG Name               of1-server-lucid
      PV Size               100.00 GiB / not usable 2.66 MiB
      Allocatable           yes (but full)
      PE Size               4.00 MiB
      Total PE              25599
      Free PE               0
      Allocated PE          25599
      PV UUID               cuhP6R-QbiO-U7ye-WvXN-ZNq5-cqUs-VVZpux
    

    So it appears that /dev/sda5 is no longer recognized as a PV, which is what's causing the errors.
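
    A read-only way to confirm that it's just the LVM label/metadata on /dev/sda5 that has gone missing (rather than the device itself) is something like:

      # check the PV metadata directly
      pvck /dev/sda5
      # the LVM2 label ("LABELONE") normally lives in one of the first four sectors
      dd if=/dev/sda5 bs=512 count=4 2>/dev/null | strings | grep LABELONE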

    fdisk -l:

    Disk /dev/sda: 128.8 GB, 128849018880 bytes
    255 heads, 63 sectors/track, 15665 cylinders
    Units = cylinders of 16065 * 512 = 8225280 bytes
    Sector size (logical/physical): 512 bytes / 512 bytes
    I/O size (minimum/optimal): 512 bytes / 512 bytes
    Disk identifier: 0x00044a6c
    
       Device Boot      Start         End      Blocks   Id  System
    /dev/sda1   *           1          32      248832   83  Linux
    Partition 1 does not end on cylinder boundary.
    /dev/sda2              32       15665   125579256+   5  Extended
    /dev/sda5              32        2611    20722970   8e  Linux LVM
    /dev/sda6            2612       15665   104856223+  8e  Linux LVM
    

    So I can see the /dev/sda5 device exists, but blkid isn't reporting anything for it:

    ~ # blkid
    /dev/sda1: UUID="d997d281-2909-41d3-a835-dba400e7ceec" TYPE="ext2" 
    /dev/sda6: UUID="cuhP6R-QbiO-U7ye-WvXN-ZNq5-cqUs-VVZpux" TYPE="LVM2_member" 
    

    After taking a snapshot of the disks, I tried recovering the PV from the archive config:

    ~ # pvremove -ff /dev/sda5
    Labels on physical volume "/dev/sda5" successfully wiped
    ~ # pvcreate --uuid=8x38hf-mzd7-xTes-y6IV-xRMr-qrNP-0dNnLi /dev/sda5 --restorefile=/etc/lvm/archive/of1-dev-server_00000.vg
    Couldn't find device with uuid '8x38hf-mzd7-xTes-y6IV-xRMr-qrNP-0dNnLi'.
      Physical volume "/dev/sda5" successfully created
    ~ # vgchange -a y
    2 logical volume(s) in volume group "of1-dev-server" now active
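
    For reference, the lvm man pages pair pvcreate --restorefile with a vgcfgrestore of the same archive file before re-activating the VG, so the fully documented sequence would look roughly like this (names copied from the commands above):

      pvcreate --uuid 8x38hf-mzd7-xTes-y6IV-xRMr-qrNP-0dNnLi \
               --restorefile /etc/lvm/archive/of1-dev-server_00000.vg /dev/sda5
      vgcfgrestore -f /etc/lvm/archive/of1-dev-server_00000.vg of1-dev-server
      vgchange -ay of1-dev-server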
    

    So at least now the device shows up in blkid:

    /dev/sda1: UUID="d997d281-2909-41d3-a835-dba400e7ceec" TYPE="ext2" 
    /dev/sda6: UUID="cuhP6R-QbiO-U7ye-WvXN-ZNq5-cqUs-VVZpux" TYPE="LVM2_member" 
    /dev/sda5: UUID="8x38hf-mzd7-xTes-y6IV-xRMr-qrNP-0dNnLi" TYPE="LVM2_member" 
    

    Doing a pvdisplay now also shows the correct device:

      --- Physical volume ---
      PV Name               /dev/sda5
      VG Name               of1-dev-danr-lucid
      PV Size               19.76 GiB / not usable 2.00 MiB
      Allocatable           yes (but full)
      PE Size               4.00 MiB
      Total PE              5058
      Free PE               0
      Allocated PE          5058
      PV UUID               8x38hf-mzd7-xTes-y6IV-xRMr-qrNP-0dNnLi
    
      --- Physical volume ---
      PV Name               /dev/sda6
      VG Name               of1-dev-danr-lucid
      PV Size               100.00 GiB / not usable 2.66 MiB
      Allocatable           yes (but full)
      PE Size               4.00 MiB
      Total PE              25599
      Free PE               0
      Allocated PE          25599
      PV UUID               cuhP6R-QbiO-U7ye-WvXN-ZNq5-cqUs-VVZpux
    

    And the mapper devices exist:

    crw-rw----    1 root     root      10,  59 Jul 10 10:47 control
    brw-rw----    1 root     root     252,   0 Jul 10 11:21 of1--dev--server-root
    brw-rw----    1 root     root     252,   1 Jul 10 11:21 of1--dev--server-swap_1
    

    Also the LVs seem to be listed correctly:

    ~ # lvdisplay
      --- Logical volume ---
      LV Name                /dev/of1-dev-danr-lucid/root
      VG Name                of1-dev-danr-lucid
      LV UUID                pioKjE-iJEp-Uf9S-0MxQ-UR0H-cG9m-5mLJm7
      LV Write Access        read/write
      LV Status              available
      # open                 0
      LV Size                118.89 GiB
      Current LE             30435
      Segments               2
      Allocation             inherit
      Read ahead sectors     auto
      - currently set to     256
      Block device           252:0
    
      --- Logical volume ---
      LV Name                /dev/of1-dev-danr-lucid/swap_1
      VG Name                of1-dev-danr-lucid
      LV UUID                mIq22L-RHi4-tudV-G6nP-T1e6-UQcS-B9hYUF
      LV Write Access        read/write
      LV Status              available
      # open                 0
      LV Size                888.00 MiB
      Current LE             222
      Segments               1
      Allocation             inherit
      Read ahead sectors     auto
      - currently set to     256
      Block device           252:1
    

    But trying to mount the root device gives me an error:

    ~ # mount /dev/mapper/of1--dev--server-root /mnt2
    mount: mounting /dev/mapper/of1--dev--server-root on /mnt2 failed: Invalid argument
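
    A couple of read-only checks can narrow down what mount is objecting to, for example:

      dmesg | tail -n 20                           # the kernel usually logs why a mount failed
      file -s /dev/mapper/of1--dev--server-root    # what signature, if any, sits at the start of the LV
      blkid /dev/mapper/of1--dev--server-root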
    

    So I tried a disk consistency check:

    ~ # fsck.ext4 -f /dev/mapper/of1--dev--server-root
    e2fsck: Superblock invalid, trying backup blocks...
    e2fsck: Bad magic number in super-block while trying to open /dev/mapper/of1--dev--server-root
    [...]
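
    As an aside, on a readable ext2/3/4 filesystem the real backup-superblock locations can be listed read-only with dumpe2fs; with the primary superblock unreadable, the usual fallback is the mke2fs -n dry run shown next (which assumes current mke2fs defaults, so it may not exactly match how the filesystem was originally created):

      dumpe2fs /dev/mapper/of1--dev--server-root | grep -i superblock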
    

    So I tried to locate and use a backup superblock:

    ~ # mke2fs -n /dev/mapper/of1--dev--server-root
    Filesystem label=
    OS type: Linux
    Block size=4096 (log=2)
    Fragment size=4096 (log=2)
    Stride=0 blocks, Stripe width=0 blocks
    7798784 inodes, 31165440 blocks
    1558272 blocks (5.00%) reserved for the super user
    First data block=0
    Maximum filesystem blocks=4294967296
    952 block groups
    32768 blocks per group, 32768 fragments per group
    8192 inodes per group
    Superblock backups stored on blocks: 
            32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208, 
            4096000, 7962624, 11239424, 20480000, 23887872
    ~ # fsck.ext4 -y -b 23887872 /dev/mapper/of1--dev--server-root
    

    Upon which I received a ridiculous number of errors; the main ones I saw were:

    • Superblock has an invalid journal
    • One or more block group descriptor checksums are invalid.
    • Truncating orphaned inode ()
    • Already cleared block #0 () found in orphaned inode
    • /dev/mapper/of1--dev--server-root contains a filesystem with errors, check forced
    • Resize inode not valid. Recreate
    • Root inode is not a directory.
    • Reserved inode 3 () has invalid mode
    • HTREE directory inode has invalid root node
    • Inode , i_blocks is , should be 0.
    • Unconnected directory inode

    After a lot of messages, it says it's done. Mounting the device as above now works, but the filesystem is empty apart from a lost+found directory full of files, most named only with numbers, some with filenames vaguely relating to files that once existed.
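
    Triaging whatever did survive is easier by content than by name, along the lines of:

      # identify the numbered lost+found entries by what they contain
      find /mnt2/lost+found -type f -exec file {} + | less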

    So, how do I bring the VM back up?

    Whenever I see disk errors, my first instinct is to snapshot so things don't get worse, so I have a snapshot from just after the reboot, when I first saw the error.
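
    A safe way to experiment from here is against a copy of that snapshot attached read-only, roughly like this (source device and image path are placeholders):

      # image the snapshot disk, then attach the copy read-only
      dd if=/dev/sdX of=/backup/sda.img bs=4M conv=noerror,sync
      losetup --find --show --read-only /backup/sda.img   # prints the loop device, e.g. /dev/loop0
      kpartx -av /dev/loop0                                # creates /dev/mapper/loop0p1, p5, p6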

    I know the data is there somewhere, as the VM worked without issue until I rebooted. The user can't remember changing anything on the filesystem recently, but it had almost a year of uptime when I rebooted it, so all sorts of things could have happened in that time.

    We also, unfortunately, don't have backups as Puppet had been disabled on this node.

    The original OS was Ubuntu Lucid, running on VMware.

    • Rabin
      Rabin almost 10 years
      I had a similar problem once, and I was able to fix it by booting into a rescue/live CD and re-assigning the UUID with tune2fs /dev/sda5 -U 8x38hf-mzd7-xTes-y6IV-xRMr-qrNP-0dNnLi