How can I boot Linux from a software RAID 1 array?

Solution 1

For anybody else who ends up suffering the error 15 grief that I did: it turns out that grub's device naming scheme (hd0, hd1, hd2...) was different at boot time than in the grub shell on the running system. I spent a week with root (hd2,0) because that's what grub told me the drive I wanted was called. But when I dropped to the grub shell at bootup, I was surprised to find that what is hd2 when the machine is up is hd1 at boot. So I changed menu.lst to use root (hd1,0) and it started working. I hope this saves somebody else a lot of hair pulling.
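
One way to see the name grub uses on the running system is to look the disk up in the device map. The following is a minimal sketch, assuming the install has a /boot/grub/device.map in the usual "(hdN) /dev/sdX" layout (grub-install normally writes one); the function name grub_name_for is made up for illustration.

```shell
# Print the GRUB name that a device.map-style file assigns to a Linux disk.
# grub_name_for is a hypothetical helper, not part of GRUB.
grub_name_for() {
  # $1: Linux disk such as /dev/sdc
  # $2: map file (assumed default: /boot/grub/device.map)
  awk -v disk="$1" '$2 == disk { print $1 }' "${2:-/boot/grub/device.map}"
}

# Example (on the running system):
#   grub_name_for /dev/sdc
```

Whatever this prints is only the running system's view. To check the boot-time view, press c at the grub menu and run find /boot/grub/menu.lst; the (hdN,M) it reports there is the one to put in menu.lst.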

Solution 2

The thing about Grub is that it is invoked before the rest of the Linux system is (obviously), so it doesn't know anything about your software RAID. It only sees the bare hard drives.

So, it is very important to install grub on both drives of your RAID1 array. The BIOS will pick one to boot from, and if grub is not installed on that drive, it will not boot. (I found this out the hard way when one of the drives in my software RAID1 configuration failed: the system refused to boot, saying it had no boot partition. The drive that had grub installed had failed, and I was left with a non-bootable HDD. Installing grub on it fixed it.)

So open a shell prompt (you can do this with Linux running) and type:

grub

to get the grub prompt.

root (hd0,0)
setup (hd0)

root (hd1,0)
setup (hd1)

The root command points grub at the first partition on each drive (if your boot partition is elsewhere on the drive, change that 0 to the correct partition; grub counts partitions from 0), and setup then installs the grub boot files on that drive.
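
If you would rather not type these by hand, the same root/setup pairs can be generated and fed to grub in batch mode. This is only a sketch under the same assumptions as the commands above (GRUB legacy, boot files on the same partition number of every disk); grub_setup_cmds is a made-up helper name.

```shell
# Emit the root/setup pair shown above for each GRUB disk number given.
# grub_setup_cmds is a hypothetical helper, not part of GRUB.
grub_setup_cmds() {
  # $1: GRUB partition index holding the boot files (0 = first partition)
  # remaining arguments: GRUB disk numbers (0 for hd0, 1 for hd1, ...)
  part="$1"
  shift
  for n in "$@"; do
    printf 'root (hd%s,%s)\nsetup (hd%s)\n' "$n" "$part" "$n"
  done
  printf 'quit\n'
}

# Review the generated commands first:
grub_setup_cmds 0 0 1

# Then, as root and only once the output looks right:
#   grub_setup_cmds 0 0 1 | grub --batch
```

Printing the commands before piping them into grub gives you a chance to confirm the disk numbers, which matters because grub's numbering may not match what you expect.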

That should be everything you need to do. If it isn't working correctly, are you sure you have the right boot partition, and that your drives are laid out identically?

Solution 3

Grub doesn't know about your RAID device; it just reads direct from the drive, which (in a RAID-1 setup) is still fine, because an entire copy of the drive is right there (not chopped up into bits as it would be on a RAID-5 or RAID-10 configuration).

You haven't really provided enough info to determine what's going on, though. What would be handy:

  • Partition tables for all your drives;
  • RAID configuration details (output of /proc/mdstat, mdadm -E, etc)
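
Note that mdadm -E needs a member device as an argument; run with none, it only reports "No devices to examine". A rough sketch for listing the member partitions named in /proc/mdstat so they can be passed to it (md_members is a made-up helper name, and the parsing assumes the usual "md0 : active raid1 sdc1[1]" line layout):

```shell
# Print the member partitions of every md array listed in an mdstat-style file.
# md_members is a hypothetical helper, not part of mdadm.
md_members() {
  # $1: path to the mdstat file (defaults to /proc/mdstat)
  awk '/^md[0-9]+ :/ {
         for (i = 1; i <= NF; i++)
           if ($i ~ /^[a-z][a-z0-9]*\[[0-9]+\]/) {
             sub(/\[.*/, "", $i)      # strip the "[1]" slot suffix
             print "/dev/" $i
           }
       }' "${1:-/proc/mdstat}"
}

# Example (as root):
#   fdisk -l
#   cat /proc/mdstat
#   for dev in $(md_members); do mdadm -E "$dev"; done
```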

Author: Stu

Updated on September 17, 2022

Comments

  • Stu
    Stu almost 2 years

    I'm trying to make a raid array on an existing linux ubuntu install.

    I'm following this tutorial... http://howtoforge.org/software-raid1-grub-boot-fedora-8

    After going through the list of things a million times I finally understand what's going on. You make the raid device on your new blank drive, copy your old / drive to it, set up the grub menu.lst, fstab, mtab, initrd and grub MBR to all point to the raid device (which I have defined and is working), and then you reboot. Once you've booted, you're running on the raid device (/dev/md0). Then you merely hook your original drive up to the raid array, it syncs, and voila, you're done.

    So I set up my menu.lst to primarily load the kernel and initrd from the raid device, and failover to my original (still intact) old disk. And it always fails over when I reboot. I boot the machine, run my new grub entry and it says "error 15 file not found." Lots of stuff on the web about it, none seem to help.

    The only thing that's weird is that when I go to set up the MBR with grub, you say "root (hd0,0)" (which I finally understand the meaning of), and it's supposed to say Filesystem type is ext2fs, partition type 0xfd or something like that. Mine says nothing. But when I run setup (hd0) and setup (hd2) it says it's doing the right thing to the right drive. So I assume it's working, but it can't load initrd/the kernel from the md0 device.

    The only other thing I'm thinking, is how on earth does grub know what a raid device is. The kernel hasn't loaded, the software raid modules haven't loaded, how can stupid little grub have any idea at all where to load initrd from? So I'm thinking, okay there's a mapping somewhere from /dev/md0 to /dev/sdc1 (the new raid drive) but I don't see where that could be happening. And for kicks, (I did this SO many times in various combinations) I tried setting the grub menu.lst to try and load the initrd and kernel from root=/dev/sdc1 (my new drive) and it still says file not found. So either the grub mbr setup isn't working, or I'm missing something really simple.

    Any ideas?

    Here's some more info...
    
    root@io:~# cat /proc/mdstat
    Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
    md0 : active raid1 sdc1[1]
          18771840 blocks [2/1] [_U]
    
    
    
    root@io:~# fdisk -l
    
    Disk /dev/sda: 20.8 GB, 20847697920 bytes
    255 heads, 63 sectors/track, 2534 cylinders
    Units = cylinders of 16065 * 512 = 8225280 bytes
    Disk identifier: 0x9d949d94
    
       Device Boot      Start         End      Blocks   Id  System
    /dev/sda1   *           1        2337    18771921   83  Linux
    /dev/sda2            2338        2434      779152+   5  Extended
    /dev/sda5            2338        2434      779121   82  Linux swap / Solaris
    
    Disk /dev/sdb: 320.0 GB, 320072933376 bytes
    16 heads, 63 sectors/track, 620181 cylinders
    Units = cylinders of 1008 * 512 = 516096 bytes
    Disk identifier: 0x00000000
    
       Device Boot      Start         End      Blocks   Id  System
    /dev/sdb1   *           1        4064     2048224+  83  Linux
    /dev/sdb2            4065      620181   310522968   83  Linux
    
    Disk /dev/sdc: 20.0 GB, 20020396032 bytes
    255 heads, 63 sectors/track, 2434 cylinders
    Units = cylinders of 16065 * 512 = 8225280 bytes
    Disk identifier: 0x00000080
    
       Device Boot      Start         End      Blocks   Id  System
    /dev/sdc1   *           1        2337    18771921   fd  Linux raid autodetect
    /dev/sdc2            2338        2434      779152+   5  Extended
    /dev/sdc5            2338        2434      779121   82  Linux swap / Solaris
    
    Disk /dev/md0: 19.2 GB, 19222364160 bytes
    2 heads, 4 sectors/track, 4692960 cylinders
    Units = cylinders of 8 * 512 = 4096 bytes
    Disk identifier: 0x00000000
    
    Disk /dev/md0 doesn't contain a valid partition table
    
    
    
    root@io:~# mdadm -E
    mdadm: No devices to examine
    
    
    
    root@io:~# cat /etc/mdadm.conf
    ARRAY /dev/md0 level=raid1 num-devices=2 UUID=5248ed76:cba39cc2:3082255a:649c0d18
    root@io:~#
    
    
    
    root@io:~# cat /boot/grub/menu.lst
    
    default         0
    # 8/14/09 added this
    fallback        1
    
    ## timeout sec
    # Set a timeout, in SEC seconds, before automatically booting the default entry
    # (normally the first entry defined).
    timeout         3
    
    ## hiddenmenu
    # Hides the menu by default (press ESC to see the menu)
    hiddenmenu
    
    # added this 8/14/09 for raid boot, note this will get blown away on next kernel update
    # if it's after the magic marker
    # this means we will have to manually update this when there's a kernel upgrade :-(
    # in grub land hd0 = /dev/sda and hd1 = /dev/sdb and hd2 = /dev/sdc I hope
    # we're putting sdc first for now
    title           Ubuntu 8.04.3 LTS, kernel 2.6.24-24-generic (raid)
    root            (hd2,0)
    #kernel         /boot/vmlinuz-2.6.24-24-generic root=UUID=b11d6b08-fdfe-4b0d-adec-4e263455be23 ro
    kernel          /boot/vmlinuz-2.6.24-24-generic root=/dev/md0 ro
    initrd          /boot/initrd.img-2.6.24-24-generic
    quiet
    
    
    
    
    title           Ubuntu 8.04.3 LTS, kernel 2.6.24-24-generic
    root            (hd0,0)
    kernel          /boot/vmlinuz-2.6.24-24-generic root=UUID=d8c402cc-7445-4878-b3aa-c9568b740b51 ro
    initrd          /boot/initrd.img-2.6.24-24-generic
    quiet
    
    
    title           Ubuntu 8.04.3 LTS, kernel 2.6.24-24-generic (recovery mode)
    root            (hd0,0)
    kernel          /boot/vmlinuz-2.6.24-24-generic root=UUID=d8c402cc-7445-4878-b3aa-c9568b740b51 ro single
    initrd          /boot/initrd.img-2.6.24-24-generic
    
    
    
    root@io:~# blkid
    /dev/sda1: UUID="d8c402cc-7445-4878-b3aa-c9568b740b51" SEC_TYPE="ext2" TYPE="ext3"
    /dev/sda5: TYPE="swap" UUID="e0509276-30eb-4dcb-8e17-20f8244f5403"
    /dev/sdb1: LABEL="alt" UUID="ea1789eb-9d6f-47a9-a074-18121792b30a" SEC_TYPE="ext2" TYPE="ext3"
    /dev/sdb2: LABEL="sp" UUID="3b6d1173-f9fd-4a3e-8e5d-249fc682355b" SEC_TYPE="ext2" TYPE="ext3"
    /dev/sdc1: UUID="76ed4852-c29c-a3cb-5a25-8230180d9c64" TYPE="mdraid"
    /dev/md0: UUID="b11d6b08-fdfe-4b0d-adec-4e263455be23" SEC_TYPE="ext2" TYPE="ext3"
    
    
    
    • Stu
      Stu almost 15 years
      I think I see the problem. When I say root (hd0,0) in grub, it isn't able to mount the partition's filesystem. But it doesn't say why.
  • David Pashley
    David Pashley almost 15 years
    The solution for your root partition to be RAID 5 or 10 is to make a small /boot partition and make that raid 1, as grub just needs to load your kernel and initrd.
  • Stu
    Stu almost 15 years
    Yeah, I did that on both drives and it says it's working when I do setup, but no love. There's only one partition (and one swap) so it's not like I can screw that up. I copied the partition table from one drive to the other so they're identical. It's the file not found that I don't get; it's obviously looking at the non-original drive, but I can't see why it can't find the initrd.
  • Decebal
    Decebal almost 15 years
    Then it's more likely to be something simple. Try booting off each drive without the other present. If one works and the other doesn't, then maybe the MBR is screwed on one. I assume the partition has synced fully and isn't still in progress, and that the files are present on both drives. Don't look at md0, look at sda1 and sdc1 instead. Lastly, add 'debug' to the menu.lst to start it up in debug mode. gnu.org/software/grub/manual/html_node/…
  • Stu
    Stu almost 15 years
    More progress. As somebody pointed out, grub knows not of raid, but since a raid 1 drive is a drive, it should be able to mount it and find /boot/initrd... So I tried it by hand. I ran grub interactively and said root (hd2), which is /dev/sdc, which is now the working half of the raid array (that won't boot). Then I figured out this neat trick: I type root and hit tab for the completion, without the open paren, and it tries to mount the filesystem so it can display the file options, and it says "error 17". When I try to mount it myself, "mount /dev/sdc1 sdc1" gives "mount: unknown filesystem type 'linux_raid_member'".
  • Stu
    Stu almost 15 years
    So although I can mount it as md0, I can no longer mount it as a plain drive, and neither can grub. So that's what's wrong. The question is why? Is there a newer version of grub that is raid 1 drive aware? It says Grub 0.97
  • Decebal
    Decebal almost 15 years
    Grub is not RAID aware. The filesystem layout is the same, but grub sees it as 2 separate drives. If you have your RAID "working" but the underlying drive is not, then the raid system has not finished syncing the files. Look at /proc/mdstat to see the progress. If in doubt, reformat the drive and start again: set up as RAID1, run the resync, wait for it to finish, then install grub on the new drive.
  • Stu
    Stu almost 15 years
    Thanks. I'll try making it again from scratch, but the drive I'm trying to mount is the good one. The one that it will be syncing FROM. I'm sooo confused.