LVM logical volumes fail to activate on boot after 11.04 to 11.10 live upgrade

5,755

Solution 1

The machine gradually became worse and worse as updates were installed. It was rebooted this morning and didn't boot back up - /home wouldn't activate properly at all. And then dbus errors, etc. etc.

9 hours later, I've done a full reinstall of 11.10 onto those same partitions and they now work fine. Very odd, but this appears to have been the solution.

Thanks ppetraki, I agree with your points on the basis of performance will look at this when the machine is replaced.

Solution 2

Nothing error/warning wise in the logs is standing out. However, I did notice that you've created 3 MDs, all using the same backing store, with essentially 3 different filesystems (ext3,ext4,reiserfs (maybe raw VMware too?)). I wonder if all the competing I/O writeback strategies, combined with multiple locks/queuing to the same backing store is crowding out your attempts to use the adjacent MDs under certain conditions?

This would blowback on the filesystem as a failed journal or writeback and manifest in LVM as a failure to assemble or activate the mapping.

Ideally, you're setup should look like this:

sda [sda1 {SPAN entire disk - 1-2%}]

sdb [sdb1 {SPAN entire disk - 1-2%}]

md1 (RAID1) [sda1 sdb1]

vg [md1]

You can boot directly from a root lvm backed by an md, I do this myself. The reason you don't span the entire disk is if you ever get bad blocks and use the vendor low level repair tool, you can sometimes run out of free blocks and the tool commender's blocks that are in use, which changes the net size of the disk, and breaking your partition table and certainly MD in the process. See MDADM Superblock Recovery

If you still have problems with one MD, instead of 3, then either one of your filesystems or VMware is being naughty and someone's starving out your siblings, or there's a real problem with your backing store which makes everyone else a victim.

Share:
5,755

Related videos on Youtube

nOw2
Author by

nOw2

Updated on September 18, 2022

Comments

  • nOw2
    nOw2 almost 2 years

    I have upgraded my server from 11.04 to 11.10 (64 bit) using sudo do-release-upgrade

    The machine now halts during boot because it can't find certain logical volumes in /mnt. When this happens, I hit "m" to drop down to a root shell, and I see the following (forgive me for inaccuracies, I'm recreating this):

    $ lvs
      LV       VG   Attr   LSize   Origin Snap%  Move Log Copy%  Convert
      audio    vg   -wi--- 372.53g                                      
      home     vg   -wi-ao 186.26g                                      
      swap     vg   -wi-ao   3.72g                                      
    

    The corresponding block devices in /dev are missing for "audio".

    If I run:

    $ vgchange -a y
    $ lvs
      LV       VG   Attr   LSize   Origin Snap%  Move Log Copy%  Convert
      audio    vg   -wi-ao 372.53g                                      
      home     vg   -wi-ao 186.26g                                      
      swap     vg   -wi-ao   3.72g                                      
    

    Then all LVs are activated and the system continues to boot perfectly after exiting from the root maintenance shell.

    What is going on, and how would I set the LVs to always be active on boot up?


    Update to answer questions raised:

    There is one volume group:

    # vgs
      VG   #PV #LV #SN Attr   VSize VFree
      vg     1   6   0 wz--n- 1.68t    0 
    # pvs
      PV         VG   Fmt  Attr PSize PFree
      /dev/md2   vg   lvm2 a-   1.68t    0 
    

    Over a RAID1 MD array consisting of a matched pair of SATA hard disks:

     # cat /proc/mdstat 
     Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10] 
     md2 : active raid1 sda3[0] sdb3[1]
           1806932928 blocks [2/2] [UU]
    
     md1 : active raid1 sda2[0] sdb2[1]
           146484160 blocks [2/2] [UU]
    
     md3 : active raid1 sda4[0] sdb4[1]
           95168 blocks [2/2] [UU]
    
     unused devices: <none>
    

    So:

     # mount
     /dev/md1 on / type ext4 (rw,errors=remount-ro)
     proc on /proc type proc (rw,noexec,nosuid,nodev)
     sysfs on /sys type sysfs (rw,noexec,nosuid,nodev)
     fusectl on /sys/fs/fuse/connections type fusectl (rw)
     none on /sys/kernel/debug type debugfs (rw)
     none on /sys/kernel/security type securityfs (rw)
     udev on /dev type devtmpfs (rw,mode=0755)
     devpts on /dev/pts type devpts (rw,noexec,nosuid,gid=5,mode=0620)
     tmpfs on /run type tmpfs (rw,noexec,nosuid,size=10%,mode=0755)
     none on /run/lock type tmpfs (rw,noexec,nosuid,nodev,size=5242880)
     none on /run/shm type tmpfs (rw,nosuid,nodev)
     /dev/md3 on /boot type ext3 (rw)
     /dev/mapper/vg-home on /home type reiserfs (rw)
     /dev/mapper/vg-audio on /mnt/audio type reiserfs (rw)
     rpc_pipefs on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)
     nfsd on /proc/fs/nfsd type nfsd (rw)
    

    Update with full lvdisplay - as suspected the partitions that work are first in the list. Can't see anything odd myself. I've included the full list here - this is all my LVM partitions.

    This is from the running machine, if output from the broken state is useful that'll need some time to obtain.

    # lvdisplay 
      --- Logical volume ---
      LV Name                /dev/vg/swap
      VG Name                vg
      LV UUID                INuOTR-gwB8-Z0RW-lGHM-qtRF-Xc7D-Bv43ah
      LV Write Access        read/write
      LV Status              available
      # open                 2
      LV Size                3.72 GiB
      Current LE             953
      Segments               1
      Allocation             inherit
      Read ahead sectors     auto
      - currently set to     256
      Block device           252:0
    
      --- Logical volume ---
      LV Name                /dev/vg/home
      VG Name                vg
      LV UUID                7L34YS-Neh0-V5OL-bFfd-TmO4-8CkV-GwXuRL
      LV Write Access        read/write
      LV Status              available
      # open                 2
      LV Size                186.26 GiB
      Current LE             47683
      Segments               1
      Allocation             inherit
      Read ahead sectors     auto
      - currently set to     256
      Block device           252:1
    
      --- Logical volume ---
      LV Name                /dev/vg/audio
      VG Name                vg
      LV UUID                AX1ZG5-vwyk-mYVl-DBHt-Rgp2-DSwg-oDZlbS
      LV Write Access        read/write
      LV Status              available
      # open                 2
      LV Size                372.53 GiB
      Current LE             95367
      Segments               1
      Allocation             inherit
      Read ahead sectors     auto
      - currently set to     256
      Block device           252:2
    
      --- Logical volume ---
      LV Name                /dev/vg/vmware
      VG Name                vg
      LV UUID                bj0m1h-jndV-GWU8-aePm-gaoo-Q0pE-cWhWj2
      LV Write Access        read/write
      LV Status              available
      # open                 2
      LV Size                372.53 GiB
      Current LE             95367
      Segments               1
      Allocation             inherit
      Read ahead sectors     auto
      - currently set to     256
      Block device           252:3
    
      --- Logical volume ---
      LV Name                /dev/vg/backup
      VG Name                vg
      LV UUID                PHDnjD-8uT8-yHB2-8SBW-d7E1-1Zws-Qx0Tp8
      LV Write Access        read/write
      LV Status              available
      # open                 1
      LV Size                93.13 GiB
      Current LE             23841
      Segments               1
      Allocation             inherit
      Read ahead sectors     auto
      - currently set to     256
      Block device           252:4
    
      --- Logical volume ---
      LV Name                /dev/vg/download
      VG Name                vg
      LV UUID                64Your-pvNG-7EvG-exns-eK9A-vMDD-eozIBM
      LV Write Access        read/write
      LV Status              available
      # open                 2
      LV Size                695.05 GiB
      Current LE             177934
      Segments               2
      Allocation             inherit
      Read ahead sectors     auto
      - currently set to     256
      Block device           252:5
    

    Update 2012/01/28: a reboot has given an opportunity to look at the machine in its broken state.

    I don't know if relevant, but the machine was shut down cleanly but the file systems were not clean on starting up again.

    # lvs
      LV       VG   Attr   LSize   Origin Snap%  Move Log Copy%  Convert
      audio    vg   -wi--- 372.53g                                      
      backup   vg   -wi---  93.13g                                      
      download vg   -wi--- 695.05g                                      
      home     vg   -wi-ao 186.26g                                      
      swap     vg   -wi-ao   3.72g                                      
      vmware   vg   -wi--- 372.53g
    

    Although perhaps of interest is (note download):

    # lvs --segments
      LV       VG   Attr   #Str Type   SSize  
      audio    vg   -wi---    1 linear 372.53g
      backup   vg   -wi---    1 linear  93.13g
      download vg   -wi---    1 linear 508.79g
      download vg   -wi---    1 linear 186.26g
      home     vg   -wi-ao    1 linear 186.26g
      swap     vg   -wi-ao    1 linear   3.72g
      vmware   vg   -wi---    1 linear 372.53g
    

    More:

    # lvdisplay 
      --- Logical volume ---
      LV Name                /dev/vg/swap
      VG Name                vg
      LV UUID                INuOTR-gwB8-Z0RW-lGHM-qtRF-Xc7D-Bv43ah
      LV Write Access        read/write
      LV Status              available
      # open                 2
      LV Size                3.72 GiB
      Current LE             953
      Segments               1
      Allocation             inherit
      Read ahead sectors     auto
      - currently set to     256
      Block device           252:0
    
      --- Logical volume ---
      LV Name                /dev/vg/home
      VG Name                vg
      LV UUID                7L34YS-Neh0-V5OL-bFfd-TmO4-8CkV-GwXuRL
      LV Write Access        read/write
      LV Status              available
      # open                 2
      LV Size                186.26 GiB
      Current LE             47683
      Segments               1
      Allocation             inherit
      Read ahead sectors     auto
      - currently set to     256
      Block device           252:1
    
      --- Logical volume ---
      LV Name                /dev/vg/audio
      VG Name                vg
      LV UUID                AX1ZG5-vwyk-mYVl-DBHt-Rgp2-DSwg-oDZlbS
      LV Write Access        read/write
      LV Status              NOT available
      LV Size                372.53 GiB
      Current LE             95367
      Segments               1
      Allocation             inherit
      Read ahead sectors     auto
    
      --- Logical volume ---
      LV Name                /dev/vg/vmware
      VG Name                vg
      LV UUID                bj0m1h-jndV-GWU8-aePm-gaoo-Q0pE-cWhWj2
      LV Write Access        read/write
      LV Status              NOT available
      LV Size                372.53 GiB
      Current LE             95367
      Segments               1
      Allocation             inherit
      Read ahead sectors     auto
    
      --- Logical volume ---
      LV Name                /dev/vg/backup
      VG Name                vg
      LV UUID                PHDnjD-8uT8-yHB2-8SBW-d7E1-1Zws-Qx0Tp8
      LV Write Access        read/write
      LV Status              NOT available
      LV Size                93.13 GiB
      Current LE             23841
      Segments               1
      Allocation             inherit
      Read ahead sectors     auto
    
      --- Logical volume ---
      LV Name                /dev/vg/download
      VG Name                vg
      LV UUID                64Your-pvNG-7EvG-exns-eK9A-vMDD-eozIBM
      LV Write Access        read/write
      LV Status              NOT available
      LV Size                695.05 GiB
      Current LE             177934
      Segments               2
      Allocation             inherit
      Read ahead sectors     auto
    
    
    # vgs
      VG   #PV #LV #SN Attr   VSize VFree
      vg     1   6   0 wz--n- 1.68t    0 
    
    # pvs
      PV         VG   Fmt  Attr PSize PFree
      /dev/md2   vg   lvm2 a-   1.68t    0 
    

    From dmesg:

    [    0.908322] ata3.00: ATA-8: ST32000542AS, CC34, max UDMA/133
    [    0.908325] ata3.00: 3907029168 sectors, multi 16: LBA48 NCQ (depth 0/32)
    [    0.908536] ata3.01: ATA-8: ST32000542AS, CC34, max UDMA/133
    [    0.908538] ata3.01: 3907029168 sectors, multi 16: LBA48 NCQ (depth 0/32)
    [    0.924307] ata3.00: configured for UDMA/133
    [    0.940315] ata3.01: configured for UDMA/133
    [    0.940408] scsi 2:0:0:0: Direct-Access     ATA      ST32000542AS     CC34 PQ: 0 ANSI: 5
    [    0.940503] sd 2:0:0:0: [sda] 3907029168 512-byte logical blocks: (2.00 TB/1.81 TiB)
    [    0.940541] sd 2:0:0:0: Attached scsi generic sg0 type 0
    [    0.940544] sd 2:0:0:0: [sda] Write Protect is off
    [    0.940546] sd 2:0:0:0: [sda] Mode Sense: 00 3a 00 00
    [    0.940564] sd 2:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
    [    0.940611] scsi 2:0:1:0: Direct-Access     ATA      ST32000542AS     CC34 PQ: 0 ANSI: 5
    [    0.940699] sd 2:0:1:0: Attached scsi generic sg1 type 0
    [    0.940728] sd 2:0:1:0: [sdb] 3907029168 512-byte logical blocks: (2.00 TB/1.81 TiB)
    [    0.945319] sd 2:0:1:0: [sdb] Write Protect is off
    [    0.945322] sd 2:0:1:0: [sdb] Mode Sense: 00 3a 00 00
    [    0.945660] sd 2:0:1:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
    [    0.993794]  sda: sda1 sda2 sda3 sda4
    [    1.023974] sd 2:0:0:0: [sda] Attached SCSI disk
    [    1.024277]  sdb: sdb1 sdb2 sdb3 sdb4
    [    1.024529] sd 2:0:1:0: [sdb] Attached SCSI disk
    
    
    [    1.537688] md: bind<sdb3>
    [    1.538922] bio: create slab <bio-1> at 1
    [    1.538983] md/raid1:md2: active with 2 out of 2 mirrors
    [    1.539005] md2: detected capacity change from 0 to 1850299318272
    [    1.540678]  md2: unknown partition table
    [    1.540851] md: bind<sdb4>
    [    1.542231] md/raid1:md3: active with 2 out of 2 mirrors
    [    1.542245] md3: detected capacity change from 0 to 97452032
    [    1.543867] md: bind<sdb2>
    [    1.544680]  md3: unknown partition table
    [    1.545627] md/raid1:md1: active with 2 out of 2 mirrors
    [    1.545642] md1: detected capacity change from 0 to 149999779840
    [    1.556008]    generic_sse:  9824.000 MB/sec
    [    1.556010] xor: using function: generic_sse (9824.000 MB/sec)
    [    1.556721] md: raid6 personality registered for level 6
    [    1.556723] md: raid5 personality registered for level 5
    [    1.556724] md: raid4 personality registered for level 4
    [    1.560491] md: raid10 personality registered for level 10
    [    1.571416]  md1: unknown partition table
    
    
    [    1.935835] EXT4-fs (md1): INFO: recovery required on readonly filesystem
    [    1.935838] EXT4-fs (md1): write access will be enabled during recovery
    [    2.901833] EXT4-fs (md1): orphan cleanup on readonly fs
    [    2.901840] EXT4-fs (md1): ext4_orphan_cleanup: deleting unreferenced inode 4981215
    [    2.901904] EXT4-fs (md1): ext4_orphan_cleanup: deleting unreferenced inode 8127848
    [    2.901944] EXT4-fs (md1): 2 orphan inodes deleted
    [    2.901946] EXT4-fs (md1): recovery complete
    [    3.343830] EXT4-fs (md1): mounted filesystem with ordered data mode. Opts: (null)
    [   64.851211] Adding 3903484k swap on /dev/mapper/vg-swap.  Priority:-1 extents:1 across:3903484k 
    
    [   67.600045] EXT4-fs (md1): re-mounted. Opts: errors=remount-ro
    [   68.459775] EXT3-fs: barriers not enabled
    [   68.460520] kjournald starting.  Commit interval 5 seconds
    [   68.461183] EXT3-fs (md3): using internal journal
    [   68.461187] EXT3-fs (md3): mounted filesystem with ordered data mode
    [  130.280048] REISERFS (device dm-1): found reiserfs format "3.6" with standard journal
    [  130.280060] REISERFS (device dm-1): using ordered data mode
    [  130.284596] REISERFS (device dm-1): journal params: device dm-1, size 8192, journal first block 18, max trans len 1024, max batch 900, max commit age 30, max trans age 30
    [  130.284918] REISERFS (device dm-1): checking transaction log (dm-1)
    [  130.450867] REISERFS (device dm-1): Using r5 hash to sort names
    
    • ppetraki
      ppetraki over 12 years
      We need to know more about the storage that's actually part of the LVM. Are you using MD as well? Is the VG a pool of disks or just one basic hard disk? Thanks.
    • nOw2
      nOw2 over 12 years
      I've added some additional information. It seems odd to me that the /home and swap LVs are set to active automatically, but the rest are not (there are three others in addition to the example 'audio'). I'm sure there is something simple I've missed. I wonder how many others have seen the same issue after an upgrade? as my install isn't anything special.
    • nOw2
      nOw2 over 12 years
      I've added information from the broken state in case anything stands out.
  • ppetraki
    ppetraki over 12 years
    Very well, could you please vote and checkbox the appropriate responses so we can close out this issue? Thanks.
  • nOw2
    nOw2 over 12 years
    My reputation here isn't high enough to give you a vote up :-) but I've clicked the accept my solution as the answer. Just happy that the machine is working now, though I'll certainly think twice about using do-release-upgrade again!