Linux Software RAID1: How to boot after (physically) removing /dev/sda? (LVM, mdadm, Grub2)
Solution 1
You need to install GRUB to the MBR of both drives, and you need to do it in such a way that GRUB considers each disk to be the first disk in the system.
GRUB uses its own enumeration for disks, which is abstracted from what the Linux kernel presents. You can change which device it thinks is the first disk (hd0) by using a "device" line in the grub shell, like so:
device (hd0) /dev/sdb
This tells GRUB to treat /dev/sdb as the disk hd0 for all subsequent commands. From here you can complete the installation manually:
device (hd0) /dev/sdb
root (hd0,0)
setup (hd0)
This installs GRUB into the MBR of the disk it considers to be hd0 (which you've just set as /dev/sdb), using the GRUB files found on the first partition of that disk.
I do the same for both /dev/sda and /dev/sdb, just to be sure.
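The same steps can be scripted rather than typed into the grub shell. The following is a minimal sketch, assuming GRUB Legacy (the setup command does not exist in GRUB2) and assuming the GRUB files live on the first partition of each disk. The function name grub_legacy_commands is made up for illustration; it only prints the commands, so nothing is written until you pipe its output into grub --batch yourself.

```shell
#!/bin/sh
# Print the GRUB Legacy shell commands that install GRUB to the MBR of one
# disk while that disk is mapped to (hd0). Hypothetical helper name.
grub_legacy_commands() {
    disk=$1
    printf 'device (hd0) %s\n' "$disk"   # make GRUB see this disk as the first disk
    printf 'root (hd0,0)\n'              # GRUB files are on its first partition (assumption)
    printf 'setup (hd0)\n'               # write the boot code to the MBR of hd0
    printf 'quit\n'
}

# To apply for both RAID members (as root, GRUB Legacy only):
#   for disk in /dev/sda /dev/sdb; do
#       grub_legacy_commands "$disk" | grub --batch
#   done
```

Printing the commands first makes it easy to check what will be run against each disk before anything touches an MBR.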
Edited to add: I always found the Gentoo Wiki handy, until I did this often enough to commit it to memory.
Solution 2
Have you considered installing a third drive to serve as just the boot drive? I have also seen problems with RAID 1 + LVM setups (on CentOS) not being able to boot from the second drive. I think the problem stems from GRUB not being able to handle native LVM partitions, although I'm not entirely sure.
Anyway, that's my answer: install a third small drive solely for the purpose of booting the system. Heck, I bet you could even get clever and do that with some sort of little flash or ssd device.
Solution 3
Grub should be able to recognize RAID1 setups and install to all slave disks when told to install to the MD device.
flight
Updated on September 17, 2022
Comments
-
flight over 1 year
I have a server set up with Debian 6.0/squeeze. During the squeeze installation, I configured the two 500 GB SATA disks (/dev/sda and /dev/sdb) as a RAID1 (managed with mdadm). The RAID holds a 500 GB LVM volume group (vg0). In the volume group, there's a single logical volume (lv0). vg0-lv0 is formatted with ext3 and mounted as the root filesystem (no dedicated /boot partition). The system boots using GRUB2.
In normal use, the system boots fine.
Also, when I removed the second SATA drive (/dev/sdb) after a shutdown as a test, the system came up without problems, and after reconnecting the drive, I was able to --re-add /dev/sdb1 to the RAID array.
But: After removing the first SATA drive (/dev/sda), the system won't boot any more! A GRUB welcome message shows up for a second, then the system reboots.
I tried to install GRUB2 manually on /dev/sdb ("grub-install /dev/sdb"), but that doesn't help.
Apparently squeeze fails to set up GRUB2 to launch from the second disk when the first disk is removed, which seems to be quite an essential feature when running this kind of software RAID1.
At the moment, I can't tell whether this is a problem with GRUB2, with LVM or with the RAID setup. Any hints?
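For anyone repeating the pull-a-disk test described above, it helps to confirm that the array is fully in sync before removing the other drive. A minimal sketch, assuming the array is /dev/md0 (the question does not name the md device) and using a made-up helper name, mdstat_degraded, that looks for a member-status field such as [U_] in /proc/mdstat:

```shell
#!/bin/sh
# Return 0 (true) if the mdstat text on stdin shows a degraded array,
# i.e. a member-status field such as [U_] or [_U] containing an underscore.
# Hypothetical helper name.
mdstat_degraded() {
    grep -Eq '\[[U_]*_[U_]*\]'
}

# Typical use (as root); /dev/md0 is an assumed array name:
#   if mdstat_degraded < /proc/mdstat; then
#       mdadm /dev/md0 --re-add /dev/sdb1
#   fi
#   cat /proc/mdstat    # wait for the resync to finish before pulling a disk
```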
-
Cedric Knight over 6 years
Instead of grub-install, you might want to try dpkg-reconfigure grub-pc. wiki.debian.org/DebianInstaller/SoftwareRaidRoot says of installing to multiple drives 'your system will still boot correctly even if you reorder your drives'. In theory. I also want to cross-reference to my question: serverfault.com/questions/869559/…
-
-
flight about 13 years
You're talking about GRUB1. GRUB2 doesn't have a setup command in the shell.
-
flight about 13 years
That's what I thought as well ;-), and yes, the Debconf frontend to grub-pc suggested an install in /dev/sda as well as /dev/sdb (and /dev/dm-0, where it failed to install subsequently). Still, it wouldn't boot with the second disk only.
-
flight about 13 years
My current solution is to boot from a USB stick with GRUB2 on it (and a /boot filesystem, which is not exactly necessary, I think).
-
flight about 13 years
Still, I refrain from accepting this answer, since this ought to work without a third drive. From what I can tell, this has to be a bug in GRUB2 (in Debian Squeeze).
-
Phil Hollenback about 13 years
Sure, that's a reasonable assumption. I just wanted to point out that I've seen weird LVM/RAID/GRUB issues before, and solved them with a third drive after beating my head against weird, annoying boot-time bugs.
-
ddm-j about 13 years
I dimly remember that one had to point it at the MD device rather than at the components, but I may be confusing that with LILO here.
-
Daniel Lawson about 13 years
You're probably right there. Carry on :)
-
Cedric Knight over 6 years
This appears to be how grub-install works with GRUB 1.99 and 2.02. In whatever way the sda+sdb RAID1 holds your boot partition, the core is likely to be referenced by UUID (check my linked question to see if it is). So if you grub-install /dev/sda; grub-install /dev/sdb, it doesn't matter if you remove one of those drives: so long as the BIOS can load the MBR from one of them, it will find the RAID UUID and LV by searching.
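The two grub-install calls from the comment above can be wrapped in a small loop. This is only a sketch under assumptions: the helper name install_grub_on_all and the DRY_RUN switch are made up for illustration, and by default the function just prints what it would run, so nothing is written unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Install GRUB2 to the MBR of every disk given as an argument.
# With DRY_RUN=1 (the default in this sketch) the commands are only printed.
# Hypothetical helper name and switch.
install_grub_on_all() {
    for disk in "$@"; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "grub-install $disk"
        else
            grub-install "$disk" || return 1   # stop at the first failure
        fi
    done
}

# Example (as root): DRY_RUN=0 install_grub_on_all /dev/sda /dev/sdb
```

Whether the machine then boots from the remaining disk still has to be checked by actually removing one drive, as the question describes.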