RAIDing with LVM vs MDRAID - pros and cons?


Solution 1

How mature and featureful is LVM RAID?

LVM-RAID is actually mdraid under the covers. It basically works by creating two logical volumes per RAID device (one for data, called "rimage"; one for metadata, called "rmeta"). It then passes those off to the existing mdraid drivers. So things like handling disk read errors, I/O load balancing, etc. should be fairly mature.
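If you want to see this plumbing for yourself, a quick peek (assuming a hypothetical VG named vg0 with a RAID LV named root) looks something like:

    # List hidden sub-LVs: expect root_rimage_N (data) and root_rmeta_N (metadata)
    # alongside the visible "root" LV
    lvs -a -o name,segtype,devices vg0

    # The top-level LV's device-mapper table uses the "raid" target, which is
    # the md RAID code exposed through device mapper
    dmsetup table vg0-root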

That's the good news.

Tools

You can't use mdadm on it (at least, not in any easy way¹), and the LVM RAID tools are nowhere near as mature. For example, in Debian Wheezy, lvs can't tell you RAID5 sync status. I very much doubt repair and recovery (especially from "that should never happen!" situations) is anywhere near as good as with mdadm. I accidentally ran into one of those situations in my testing and finally just gave up on recovering it; recovery with mdadm would have been easy.

Especially if you're not using the newest versions of all the tools, it gets worse.
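For what it's worth, recent lvm2 releases do expose resync status through extra lvs reporting fields; something along these lines should show it (the field names are from newer versions and won't work on Wheezy's lvm):

    # Sync progress, current sync action, and mismatch count for RAID LVs
    lvs -a -o name,segtype,sync_percent,raid_sync_action,raid_mismatch_count vg0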

Missing Features

Current versions of LVM-RAID do not support shrinking (lvreduce) a RAID logical volume. Nor do they support changing the number of disks or the RAID level (lvconvert gives an error message saying it's not supported yet). lvextend does work, and can even grow RAID levels, such as RAID10, that mdraid only recently gained the ability to grow. In my experience, extending LVs is much more common than reducing them, so that's actually reasonable.
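In practice that means growing is an ordinary lvextend, while shrink and reshape attempts are simply refused. A sketch (hypothetical VG/LV names, ext4 assumed for the filesystem resize, and the lvconvert takeover syntax is the one newer lvm2 documents):

    lvextend -L +10G vg0/root        # growing the RAID LV works
    resize2fs /dev/vg0/root          # then grow the filesystem on top

    lvreduce -L -10G vg0/root        # shrinking a RAID LV: refused
    lvconvert --type raid6 vg0/root  # changing RAID level: "not supported yet"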

Some other mdraid features aren't present; in particular, you can't customize all the options you can with mdadm.

On older versions (as found in, for example, Debian Wheezy), LVM RAID does not support growing, either. For example, on Wheezy:

    root@LVM-RAID:~# lvextend -L+1g vg0/root
    Extending logical volume root to 11.00 GiB
    Internal error: _alloc_init called for non-virtual segment with no disk space.

In general, you don't want to run the Wheezy versions.

All of the above applies once you get it installed, and that is not a trivial process either.

Tool problems

Playing with my Jessie VM, I disconnected (virtually) one disk. That worked; the machine stayed running. lvs, though, gave no indication the arrays were degraded. I re-attached the disk, and removed a second. It stayed running (this is raid6). Re-attached, still no indication from lvs. I ran lvconvert --repair on the volume; it told me it was OK. Then I pulled a third disk... and the machine died. Re-inserted it, rebooted, and was then unsure how to fix it. mdadm --force --assemble would fix this; neither vgchange nor lvchange appears to have that option (lvchange accepts --force, but it doesn't seem to do anything). Even trying dmsetup to feed the mapping table directly to the kernel, I could not figure out how to recover it.
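For reference, the LVM-side toolkit for a degraded RAID LV boils down to roughly the following (a sketch with hypothetical names, using reporting fields from more recent lvm2; note there is still nothing equivalent to mdadm --assemble --force):

    # Check whether any RAID LV is degraded (health attribute and sync column)
    lvs -a -o name,lv_health_status,sync_percent vg0

    # If a PV disappeared transiently and came back, refresh the LV
    lvchange --refresh vg0/root

    # If a PV really failed, repair rebuilds onto another PV in the VG
    lvconvert --repair vg0/root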

Also, mdadm is a dedicated tool just for managing RAID. LVM does a lot more, but it feels (and I admit this is pretty subjective) like the RAID functionality has sort of been shoved in there; it doesn't quite fit.

How do you actually install a system with LVM RAID?

Here is a brief outline of getting it installed on Debian Jessie or Wheezy. Jessie is far easier; note if you're going to try this on Wheezy, read the whole thing first…

  1. Use a full CD image to install, not a netinst image.

  2. Proceed as normal, get to disk partitioning, set up your LVM physical volumes. You can put /boot on LVM-RAID (on Jessie, and on Wheezy with some work detailed below).

  3. Create your volume group(s), then stay in the LVM menu.

  4. First bit of fun—the installer doesn't have the dm-raid.ko module loaded, or even available! So you get to grab it from the linux-image package that will be installed. Switch to a console (e.g., Alt-F2) and:

    cd /tmp
    # unpack the kernel package from the CD pool into /tmp
    dpkg-deb --fsys-tarfile /cdrom/pool/main/l/linux/linux-image-*.deb | tar x
    # build module dependency data under /tmp, then load dm-raid from there
    depmod -a -b /tmp
    modprobe -d /tmp dm-raid
    
  5. The installer doesn't know how to create LVM-RAID LVs, so you have to use the command line to do it. Note I didn't do any benchmarking; the stripe size (-I) below is entirely a guess for my VM setup:

    lvcreate --type raid5 -i 4 -I 256 -L 10G -n root vg0
    
  6. On Jessie, you can use RAID10 for swap. On Wheezy, RAID10 isn't supported. So instead you can use two swap partitions, each RAID1. But you must tell it exactly which physical volumes to put them on or it puts both halves of the mirror on the same disk. Yes. Seriously. Anyway, that looks like:

    lvcreate --type raid1 -m1 -L 1G -n swap0 vg0 /dev/vda1 /dev/vdb1
    lvcreate --type raid1 -m1 -L 1G -n swap1 vg0 /dev/vdc1 /dev/vdd1
    
  7. Finally, switch back to the installer, and hit 'Finish' in the LVM menu. You'll now be presented with a lot of logical volumes showing. That's the installer not understanding what's going on; ignore everything with rimage or rmeta in their name (see the first paragraph way above for an explanation of what those are).

  8. Go ahead and create filesystems, swap partitions, etc. as normal. Install the base system, etc., until you get to the grub prompt.

  9. On Jessie, grub2 will work if installed to the MBR (or probably with EFI, but I haven't tested that). On Wheezy, the install will fail, and the only solution is to backport Jessie's grub2. That is actually fairly easy; it compiles cleanly on Wheezy. Somehow get your backported grub packages into /target (or do it in a moment, after the chroot), then:

    chroot /target /bin/bash
    mount /sys
    dpkg -i grub-pc_*.deb grub-pc-bin_*.deb grub-common_*.deb grub2-common_*.deb 
    grub-install /dev/vda … grub-install /dev/vdd # for each disk
    echo 'dm_raid' >> /etc/initramfs-tools/modules
    update-initramfs -kall -u
    update-grub # should work, technically not quite tested²
    umount /sys
    exit
    
  10. Actually, on my most recent Jessie VM grub-install hung. Switching to F2 and doing while kill $(pidof vgs); do sleep 0.25; done, followed by the same for lvs, got it through grub-install. It appeared to generate a valid config despite that, but just in case I did a chroot /target /bin/bash, made sure /proc and /sys were mounted, and did an update-grub. That time, it completed. I then did a dpkg-reconfigure grub-pc to select installing grub on all the virtual disks' MBRs.

  11. On Wheezy, after doing the above, select 'continue without a bootloader'.

  12. Finish the install. It'll boot. Probably.
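Once it does boot, a quick sanity check (again with a hypothetical VG name) would look something like:

    # RAID LVs plus their rimage/rmeta sub-LVs, with sync progress
    lvs -a -o name,segtype,sync_percent,devices vg0

    # dm-raid arrays do not show up in /proc/mdstat; per-leg health is
    # reported by the device-mapper "raid" target instead
    dmsetup status vg0-root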

Community Knowledge

There are a fair number of people who know about mdadm, and have a lot of deployment experience with it. Google is likely to answer most questions about it you have. You can generally expect a question about it here to get answers, probably within a day.

The same can't be said for LVM RAID. It's hard to find guides. Most Google searches I've run instead find me stuff on using mdadm arrays as PVs. To be honest, this is probably largely because it's newer and less commonly used. It feels somewhat unfair to hold this against it, but if something goes wrong, the much larger existing community around mdadm makes recovering my data more likely.

Conclusion

LVM-RAID is advancing fairly rapidly. On Wheezy, it isn't really usable (at least, without doing backports of LVM and the kernel). Earlier, in 2014, on Debian testing, it felt like an interesting, but unfinished idea. Current testing, basically what will become Jessie, feels like something that you might actually use, if you frequently need to create small slices with different RAID configurations (something that is an administrative nightmare with mdadm).

If your needs are adequately served by a few large mdadm RAID arrays, sliced into partitions using LVM, I'd suggest continuing to use that. If instead you wind up having to create many arrays (or even arrays of logical volumes), consider switching to LVM-RAID instead. But keep good backups.
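As a rough sketch of the two layouts (device names and sizes hypothetical):

    # Conventional layering: one md array used as a single LVM PV, then sliced
    mdadm --create /dev/md0 --level=5 --raid-devices=4 /dev/sd[abcd]1
    pvcreate /dev/md0
    vgcreate vg0 /dev/md0
    lvcreate -L 10G -n root vg0

    # LVM-RAID: RAID is per-LV, so each slice can use a different level
    lvcreate --type raid5 -i 3 -L 10G -n root vg0
    lvcreate --type raid1 -m 1 -L 2G -n homes vg0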

A lot of the uses of LVM RAID (and even mdadm RAID) are being taken over by things like cluster storage/object systems, ZFS, and btrfs. I recommend investigating those as well; they may better meet your needs.


Thank yous

I'd like to thank psusi for getting me to revisit the state of LVM-RAID and update this post.

Footnotes

  1. I suspect you could use device mapper to glue the metadata and data together in such a way that mdadm --assemble will take it. Of course, you could simply run mdadm on top of logical volumes... and that'd be saner.

  2. When doing the Wheezy install, I failed to do this first time, and wound up with no grub config. I had to boot the system by entering all the info at the grub prompt. Once booted, that worked, so I think it'll work just fine from the installer. If you wind up at the grub prompt, here are the magic lines to type:

    linux /boot/vmlinuz-3.2.0-4-amd64 root=/dev/mapper/vg0-root
    initrd /boot/initrd.img-3.2.0-4-amd64
    boot
    

PS: It's been a while since I actually did the original experiments. I have made my original notes available. Note that I have now done more recent ones, covered in this answer, and not in those notes.

Solution 2

I didn't know LVM could do RAID either. Personally, I would stick with mdadm, since it's much more mature software that does the same thing. If something breaks with LVM RAID, you're probably not going to get as much support as you would if you had gone with mdadm. Additionally, I wouldn't trust LVM RAID, since LVM has historically not proven to be the most robust software.

ZFS and BTRFS are the future. The benefits they give go beyond what's possible at the block layer. Unless I'm aiming for compatibility I won't be using LVM/mdadm anymore. ZFS and BTRFS have a lot of features like compression, deduplication, and copy-on-write, but I won't go into that here, as it would be a bit out of scope.

In the end, do your research and use whatever suits your needs/wants.


Comments

  • Alen Milakovic (almost 2 years ago)

    In his answer to the question "mixed raid types", HBruijn suggests using LVM to implement RAID vs the more standard MDRAID.

    After a little investigation, it seems LVM also supports RAID functionality. In the past, I have used LVM on top of MDRAID, and was not aware until now that LVM had RAID support of its own. This seems to be a relatively recent development, but I have not found out exactly when it was implemented.

    So, these are alternative ways to implement software RAID on Linux. What are the pros and cons of these two different approaches? I'm looking for feature comparisons between the two approaches so people can decide which is better for them. Conclusions based on experimentation (as in, this feature doesn't work as well as that feature, and here is why) are also OK, provided you include your data in the answer.

    Some specific issues to address:

    1. Suppose I want to do sw RAID + LVM (a common scenario). Should I use LVM's support for sw RAID and thus use one utility instead of two? Does this more integrated approach have any advantages?
    2. Does LVM's support for sw RAID have significant deficiencies compared to the more mature MDADM? Specifically, how stable/bug-free is the LVM support for sw RAID? It seems this support only goes back to 2011 (see below), while MDADM is much older. Also, how does it compare in terms of feature set? Does it have significant feature deficiencies compared to MDADM? Conversely, does it have support for any sw RAID features that MDADM does not have?

    NOTES:

    1. There is a detailed discussion at http://www.olearycomputers.com/ll/linux_mirrors.html but I could not find out what date it was written on.

      Similar question on Serverfault: linux LVM mirror vs. MD mirror. However, this question was asked in 2010, and the answers may be out of date.

    2. The changelog entry for version 2.02.87 - 12th August 2011 has

      Add configure --with-raid for new segtype 'raid' for MD RAID 1/4/5/6 support

      So, it looks like RAID support in LVM is about 3 years old.

    • Bratchley (almost 10 years ago)
      One advantage I can think of is using HA-related functions that are available for LVM.
    • Alen Milakovic (almost 10 years ago)
      @JoelDavis Can you elaborate? I don't know what HA-related functions means.
    • Bratchley (almost 10 years ago)
      Also, there's the usual advantage that you're working with logical volumes instead of md volumes. So you have lvextend and pvmove available for moving between devices, whereas with md the process is a lot more manual, without clear benefit.
    • Bratchley (almost 10 years ago)
      "HA" means "High Availability" with LVM you have things like clvm and HA-LVM
    • Alen Milakovic (almost 10 years ago)
      @JoelDavis Ok, I see.
    • Bratchley (almost 10 years ago)
      I don't know enough on the topic to write an authoritative answer, I just know enough to contribute to the discussion.
    • Alen Milakovic (almost 10 years ago)
      @JoelDavis if answers needed to be authoritative, there would be a lot fewer answers on SE. Having said that, if you are not comfortable answering, I won't press you.
    • frostschutz (almost 10 years ago)
      LVM tends to be kinda iffy when it comes to missing disks; so while LVM RAID is pretty much the same as mdadm RAID, I still feel that mdadm is better suited for the job. But that'd be an answer based on opinions only...
    • Bratchley (almost 10 years ago)
      @frostschutz How do you mean about the missing disks?
    • psusi (over 9 years ago)
      LVM has always supported raid1 and raid0. It was more recently that they dropped their own implementation and instead internally use md's raid personality code, opening up the other raid levels.
  • Alen Milakovic (over 9 years ago)
    Thanks for the answer. Consider expanding it a bit about why you don't trust LVM.
  • psusi (over 9 years ago)
    You are, in fact, getting that error due to the version of lvm in wheezy being extremely old and buggy; it works just fine for me here on Ubuntu 14.04 with lvm version 2.02.98 (I believe jessie is at least that new). Secondly, the command you showed failing is a resize -- reshape is something entirely different. Reshape means changing from raid1 to raid5, or from a 3-disk raid5 to a 4-disk raid5. That is still unsupported, but simply changing the size works fine.
  • derobert (over 9 years ago)
    @psusi Ah, I thought they considered a resize to be a reshape too (since it changes the RAID geometry, though nowhere near as much as changing the number of disks does). I thought I got an error trying it on Jessie before as well; I will re-test that. (I have to build a new Jessie LVM-RAID VM, so it will take a bit.) Thank you.
  • Bratchley (over 9 years ago)
    "ZFS and BTRFS are the future" Not necessarily. To my knowledge ZFS on Linux is still FUSE based so it's mostly used to get ZFS features without using Solaris or FreeBSD. BTRFS is awesome but it's not necessarily the future. For example, Red Hat seems to be moving more into the direction of LVM+XFS rather than BTRFS. They support BTRFS but that's more of an Oracle/SuSE thing.
  • Bratchley (over 9 years ago)
    Not to say I don't like btrfs for some reason (I actually like it a lot). It's just that it's not the direction that a major distro like Red Hat is going into, and I can't think of anything BTRFS can do that LVM/XFS can't at least approximate fairly well.
  • Bratchley (over 9 years ago)
    I'd also say that BTRFS will be nice, but LVM can do a lot of stuff BTRFS just can't do (yet). For example, you can do hybrid volumes and thin-provisioned snapshots, neither of which (AFAIK) BTRFS can do.
  • Jim Salter (almost 9 years ago)
    "To my knowledge ZFS on Linux is still FUSE based" this has been incorrect for well over five years. ZoL is a kernel module and fully production ready. I've been using it extensively since early 2010.
  • muru (over 8 years ago)
    dpkg-deb provides a cleaner way to extract files from deb packages; no need to hop around in stages.
  • derobert (over 8 years ago)
    @muru is dpkg-deb available in the installer environment? It didn't used to be... Make sure to check Wheezy too.
  • Alen Milakovic (over 8 years ago)
    @derobert ar -x should work too to unpack deb archives, though it's clumsier. Also, how about updating your answer with the current status?
  • derobert (over 8 years ago)
    @FaheemMitha I'll update again (and probably drop all the Wheezy stuff) when the first Stretch beta comes out. I'm sure you'll remind me then :-P
  • Alen Milakovic (over 8 years ago)
    @derobert Sounds like a plan.
  • ndemou (over 7 years ago)
    What an excellent answer! Thank you @derobert for the time and effort.
  • ThorSummoner (about 7 years ago)
    is this post still accurate in 2017?
  • derobert (about 7 years ago)
    @ThorSummoner I'm sure it needs updating. I need to find the time to do so.
  • A. Binzxxxxxx (about 5 years ago)
    BTRFS is quite dead as far as I know, since Red Hat decided not to go that road. ZFS features might make sense for big storage clusters in companies, but for a personal NAS you don't need copy-on-write, de-dup and such. Besides that, its costs are very high: CPU, RAM, disk space, checking the disks regularly. Linux support is still not on BSD level as far as I know, and I heard it's quite buggy on Linux when it comes to the advanced features. Ceph might be the future, if you ask me. Till then MD/LVM + XFS is fine.
  • Tom Hale (over 4 years ago)
    This article says that extending an LVM mirror is possible, and gives some caveats.
  • tlwhitec (over 3 years ago)
    This awesome answer would be so much worth updating! I'm sure some of the limitations are gone in 2021.
  • Hi-Angel (about 3 years ago)
    Do you have a reference that LVM is mdraid under the covers? The reason I'm asking is that there is also dmraid, and from what I've read so far it is more likely to be what LVM is using, because LVM is a wrapper around device mapper and the dmsetup utility, for example, is part of the LVM project.
  • derobert (about 3 years ago)
    @Hi-Angel LVM is a front-end to device mapper, and you can easily see it's using the device mapper mdraid targets with dmsetup.
  • Marcus Müller (about 3 years ago)
    @A.Binzxxxxxx hm. btrfs is the default FS on newer Fedoras. Redhat/IBM very much seem to also be exploring that space.