Can a failed Btrfs drive in RAID-1 be replaced live?

Solution 1

The btrfs replace mountpoint old_disk new_disk command was added in Linux 3.8. If you're running a recent kernel, it provides the functionality you are looking for.

Solution 2

Small correction: the current syntax is

btrfs replace start OLDDEV NEWDEV MOUNTPOINT

which then runs in the background.

You can check the status with

btrfs replace status MOUNTPOINT

which will show you a continuously updated status of the replace operation.
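
The two commands above can be wrapped in a small script. This is only a sketch: the device names /dev/sdc and /dev/sdf and the mountpoint /mnt are borrowed from the question's example, and the helper names run, replace_device and replace_status as well as the DRY_RUN variable are made up for illustration. By default it only prints the commands, so it is safe to try without root or a Btrfs filesystem.

```shell
#!/bin/sh
# Sketch of a live device replacement using the syntax from this answer.
# DRY_RUN=1 (the default, a hypothetical safety switch) only prints the
# commands; set DRY_RUN=0 and run as root to actually execute them.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# replace_device OLDDEV NEWDEV MOUNTPOINT
# Starts the replace operation; btrfs runs it in the background.
replace_device() {
    if [ "$#" -ne 3 ]; then
        echo "usage: replace_device OLDDEV NEWDEV MOUNTPOINT" >&2
        return 2
    fi
    run btrfs replace start "$1" "$2" "$3"
}

# replace_status MOUNTPOINT
# Shows the progress of a running replace operation.
replace_status() {
    if [ "$#" -ne 1 ]; then
        echo "usage: replace_status MOUNTPOINT" >&2
        return 2
    fi
    run btrfs replace status "$1"
}

# Example values taken from the question; adjust to your system.
replace_device /dev/sdc /dev/sdf /mnt
replace_status /mnt
```

Check man btrfs-replace on your system before setting DRY_RUN=0, since the syntax has already changed once since the command was introduced.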

Author: NothingsImpossible

Updated on September 18, 2022

Comments

  • NothingsImpossible
    NothingsImpossible almost 2 years

    I am trying to decide on a filesystem and would like to know if it is possible to replace a failed drive in btrfs RAID without downtime.

    1. Suppose I create a new btrfs filesystem using the command

      mkfs.btrfs -d raid1 /dev/sdb /dev/sdc
      
    2. Now suppose one day /dev/sdc fails. There are two possibilities: it can fail gradually, showing S.M.A.R.T. errors - in this situation I can add a new device with btrfs device add /dev/sde /mnt; btrfs filesystem balance /mnt and then remove the old one with btrfs device delete /dev/sdc /mnt.

    3. But if it suddenly fails, becoming unreadable... A web search says in this situation I must first unmount the filesystem, mount in degraded mode, add a new device, then remove the missing device.

      umount /mnt
      mount -o degraded /dev/sdb /mnt
      btrfs device add /dev/sdf /mnt 
      btrfs device delete missing /mnt
      

    An unmount is obviously a disruptive operation, so there would be downtime: any application using the filesystem would get an I/O error. But these kinds of "tutorials" on btrfs look outdated, considering btrfs is under heavy development.

    The question is: considering the current state of btrfs, is it possible to do this online, i.e. without unmounting?

    If not, is there a software-only solution that can fulfill this need?

    • Thalys
      Thalys over 10 years
      If one drive catches fire, the rest of your system is probably on fire too
    • NothingsImpossible
      NothingsImpossible over 10 years
      @JourneymanGeek Funny you.. :) I just wanted to make it very clear that I meant a catastrophic, sudden and unpredictable failure - the drive simply stops working. This is rather uncommon; hard disks usually fail gradually, and with effective monitoring I can replace them before that happens, but what if...
    • Brian
      Brian over 10 years
      In Linux 3.8 btrfs replace mountpoint old_disk new_disk was added.
    • NothingsImpossible
      NothingsImpossible over 10 years
      @Brian woow... That is the answer. I googled for "btrfs replace" and this showed up lwn.net/Articles/524589 . It is _exactly_ what I was looking for. Please post it as an answer so I can accept it.
  • DavidPostill
    DavidPostill over 9 years
    This is not an answer to the original question. To critique or request clarification from an author, leave a comment below their post - you can always comment on your own posts, and once you have sufficient reputation you will be able to comment on any post.
  • basic6
    basic6 over 8 years
    This would now be btrfs replace start /dev/old /dev/new /mountpoint (start has been added). Also see man btrfs-replace.