ZFS - Is RAIDZ-1 really that bad?

Solution 1

Before we go into specifics, consider your use case. Are you storing photos, MP3s, and DVD rips? If so, you might not care whether you permanently lose a single block from the array. On the other hand, if it's important data, this might be a disaster.

The claim that RAIDZ-1 is "not good enough for real-world failures" stems from the likelihood that a latent media error exists on one of your surviving disks when reconstruction time comes. The same logic applies to RAID5.

ZFS mitigates this failure to some extent. If a RAID5 device can't be reconstructed, you are pretty much out of luck; copy your (remaining) data off and rebuild from scratch. With ZFS, on the other hand, it will reconstruct all but the bad chunk, and let the administrator "clear" the errors. You'll lose a file/portion of a file, but you won't lose the entire array. And, of course, ZFS's parity checking means that you will be reliably informed that there's an error. Otherwise, I believe it's possible (although unlikely) that multiple errors will result in a rebuild apparently succeeding, but giving you back bad data.
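As a sketch of what that recovery looks like in practice (the pool name `tank` is a placeholder), the commands involved are roughly:

```shell
# After a rebuild hits unrecoverable blocks, list the affected files.
# "-v" prints the paths of files with permanent errors.
zpool status -v tank

# Restore or delete the damaged files, then acknowledge the errors
# so the pool returns to a healthy state.
zpool clear tank
```

You lose only the files `zpool status -v` names; the rest of the pool remains usable.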

Since ZFS is a "Rampant Layering Violation," it also knows which areas of the disks don't hold data, and can skip them during a rebuild. So if your array is half empty, you're roughly half as likely to hit a rebuild error.

You can reduce the likelihood of these kinds of rebuild errors on any RAID level by doing regular "zpool scrubs" or "mdadm checks" of your array. There are similar commands/processes for other RAID implementations; e.g., LSI/Dell PERC RAID cards call this "patrol read." These read everything, which may help the disk drives find failing sectors and reassign them before they become permanent. If they are permanent, the RAID system (ZFS/md/RAID card/whatever) can rebuild the data from parity.
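For example (assuming a ZFS pool named `tank` and a Linux md array `md0`), the periodic checks look like:

```shell
# ZFS: read and verify every allocated block against its checksum.
zpool scrub tank

# Linux md: trigger a consistency check of the whole array.
echo check > /sys/block/md0/md/sync_action

# Watch progress of each.
zpool status tank
cat /proc/mdstat
```

Running these on a schedule (cron or a systemd timer, every week or two) is the usual approach.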

Even if you use RAIDZ2 or RAID6, regular scrubs are important.

One final note - RAID of any sort is not a substitute for backups - it won't protect you against accidental deletion, ransomware, etc. That said, regular ZFS snapshots can be part of a backup strategy.

Solution 2

There is a little bit of a misconception at work here. A lot of the advice you're seeing is based on an assumption that may not hold: specifically, the unrecoverable bit error rate of your drives.

A cheap 'home user' disk has an unrecoverable error rate of 1 bit per 10^14 bits read.

http://www.seagate.com/gb/en/internal-hard-drives/desktop-hard-drives/desktop-hdd/#specs

This is at a level where you're talking about a significant likelihood of an unrecoverable error during a RAID rebuild, and so you shouldn't do it. (A quick-and-dirty calculation suggests that a 5x 2TB RAID-5 set will actually have around a 60% chance of this.)

However, this isn't true for more expensive drives: http://www.seagate.com/gb/en/internal-hard-drives/enterprise-hard-drives/hdd/enterprise-performance-15k-hdd/#specs

1 per 10^16 is 100x better - meaning the same 5x 2TB set has a <1% chance of a failed rebuild. (Probably less, because for enterprise usage, 600GB spindles are generally more useful.)
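A rough version of that back-of-the-envelope calculation, assuming a rebuild must read the four surviving 2 TB disks in full and that bit errors are independent (both simplifying assumptions):

```python
import math

def rebuild_failure_probability(surviving_disks, disk_bytes, ure_per_bit):
    """P(at least one unrecoverable read error while reading every
    surviving disk in full during a rebuild)."""
    bits_read = surviving_disks * disk_bytes * 8
    # P(no error) = (1 - ure)^bits; log1p/expm1 keep this numerically stable.
    return -math.expm1(bits_read * math.log1p(-ure_per_bit))

two_tb = 2 * 10**12  # 2 TB in bytes

consumer = rebuild_failure_probability(4, two_tb, 1e-14)
enterprise = rebuild_failure_probability(4, two_tb, 1e-16)
print(f"consumer: {consumer:.0%}, enterprise: {enterprise:.2%}")
```

This simpler model lands a bit under the 60% quoted above (around 47%); the exact figure depends on what you assume gets read during the rebuild, but the two orders of magnitude between the drive classes dominate either way.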

So personally - I think both RAID-5 and RAID-4 are still eminently usable, for all the reasons RAID-0 is still fairly common. Don't forget - the problem with RAID-6 is its hefty write penalty. You can partially mitigate this with lots of caching, but you've still got some pain built in, especially when you're working with slow drives in the first place.

And more fundamentally - NEVER EVER trust your RAID to give you full resilience. You'll lose data more often to an 'oops' than to a drive failure, so you NEED a decent backup strategy if you care about your data anyway.

Solution 3

Hmmm, some bad information here. For 4 disks, there's really nothing wrong with XFS. I tend to avoid ZFS RAIDZ for performance and expandability reasons (lower read/write performance, and a RAIDZ vdev can't be expanded by adding disks). Use ZFS mirrors if you can. However, with 4 disks and nowhere to place your OS, you'll either lose a lot of capacity or have to go through odd partitioning games to fit your OS and data onto the same four disks.

I'd probably not recommend ZFS for your use case. There's nothing wrong with XFS here.
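For reference, a sketch of the two-mirrored-vdevs layout suggested above for four disks (pool name and device names are placeholders; adjust for your platform):

```shell
# Two mirror vdevs striped together -- the ZFS equivalent of RAID10.
zpool create tank mirror /dev/ada0 /dev/ada1 mirror /dev/ada2 /dev/ada3

# Enable inline compression; lz4 is cheap on CPU and often a net win.
zfs set compression=lz4 tank
```

This gives the same 4 TB usable as the existing RAID10, with ZFS checksumming on top.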


Author: Andrew Ensley

Updated on September 18, 2022

Comments

  • Andrew Ensley
    Andrew Ensley almost 2 years

    I have a NAS server with 4x 2TB WD RE4-GP drives in a RAID10 configuration (4TB usable). I'm running out of space (<1TB usable space left). I have $0 to spend on bigger/more drives/enclosures.

    I like what I've read about the data-integrity features of ZFS, which - on their own - are enough for me to switch from my existing XFS (software) RAID10. Then I read about ZFS's superior implementation of RAID5, so I thought I might even get up to 2TB more usable space in the bargain using RAIDZ-1.

    However, I keep reading more and more posts saying pretty much to just never use RAIDZ-1. Only RAIDZ-2+ is reliable enough to handle "real world" drive failures. Of course, in my case, RAIDZ-2 doesn't make any sense. It'd be much better to use two mirrored vdevs in a single pool (RAID10).

    Am I crazy wanting to use RAIDZ-1 for 4x 2TB drives?

    Should I just use a pool of two mirrored vdevs (essentially RAID10) and hope the compression gives me enough extra space?

    Either way, I plan on using compression. I only have 8GB of RAM (maxed), so dedup isn't an option.

    This will be on a FreeNAS server (about to replace the current Ubuntu OS) to avoid the stability issues of ZFS-on-Linux.

    • Andrew Ensley
      Andrew Ensley over 9 years
      Not sure how this is off-topic. I'm asking for advice about the proper file system configuration for a server.
    • Dan Pritts
      Dan Pritts over 9 years
      If you have enough CPU to calculate parity without slowdown, RAIDZ should be as fast or faster than RAID10 for most writes. RAIDZ writes everything in a full RAID stripe, there is no read-modify-write cycle like with RAID5. So you'll get more disk bandwidth (more data, less overhead), and the writes should be faster than RAID10. However, this has the disadvantage that reads often end up slower. "Write a full stripe every time" leads to fragmentation, and doesn't give you the benefit of reading only a subset of the disks for many small reads. This was a conscious design decision.
    • Klaws
      Klaws over 5 years
      RAIDZ2 is more reliable than RAID10. With RAIDZ2, any two disks can fail and you will still have your data. With RAID10, two failed disks (in a four disk array) may cause data loss.
  • Andrew Ensley
    Andrew Ensley over 9 years
    Forgot to mention that the OS lives on a separate drive. Sorry. What I'm wanting from ZFS that the XFS RAID10 doesn't have is checksum data verification that can detect (and transparently fix) silent data errors (a bit flipped on the drive, and the HDD has no idea). I don't believe XFS is able to do this.
  • Andrew Ensley
    Andrew Ensley over 9 years
    Why avoid FreeNAS? The reason I intend to switch is because ZFS on Linux uses the Solaris Emulation Layer, which can break with a simple Linux kernel update and potentially nuke the zpool. ZFS runs natively on Unix-/BSD-based OSes and doesn't have that problem. confessionsofalinuxpenguin.blogspot.com/2012/09/…
  • ewwhite
    ewwhite over 9 years
    DKMS takes care of the kernel updates and ZFS package changes in Linux. I've been using ZFS on Linux in production since 2012, though. FreeNAS does some quirky things to the pool disks, and we've had a ton of misconfiguration and questions about weird FreeNAS failure modes. I don't think it's worth using just to get a GUI. Just an opinion, though. ZFS on Linux works well.
  • Andrew Ensley
    Andrew Ensley over 9 years
    I'm a terminal guy myself, so I'm definitely not switching for the GUI. Mostly, I just need a stable file system that (as much as is possible) guarantees the integrity of the files stored on it. And I was hoping to gain some space in the process. I've seen a lot of issues reported for ZoL, many of them relating to Ubuntu OS upgrades. groups.google.com/a/zfsonlinux.org/forum/#!searchin/zfs-discuss/… Not trying to be a pain. Just explaining why I think what I think. I'm certainly open to correction.
  • ewwhite
    ewwhite over 9 years
    That's fine. I've seen far more issues with FreeNAS (not FreeBSD), so it goes both ways. There's info out there. I don't use Ubuntu, but I do know ZFS. My ZFS on Linux is usually with RHEL or CentOS. Here's a sample workflow.
  • Andrew Ensley
    Andrew Ensley over 9 years
  • Dan Pritts
    Dan Pritts over 9 years
    I have a raidz1 running on 7 consumer drives of various ages. I have it scrub every 2 weeks. It often finds an error and corrects it. I recently lost a drive and lost a file which had a latent error. Luckily, it was a media file that I can easily replace. For my important data I still, of course, have backups.
  • Dan Pritts
    Dan Pritts over 9 years
    I use ZFS on Linux on CentOS 6. I don't allow automatic updates of the kernel or of ZFS. I've had issues with ZFS/SPL borking, but I have never had data loss. For the record, btw, FreeBSD has a similar set of Solaris compatibility routines, but they and ZFS are fully integrated into the distribution, which makes it a lot simpler to make things all work together. If I only wanted ZFS and file service, I'd probably run FreeBSD. In fact, that's what I used to do, but I use the box for other random stuff, which made ZoL more appealing.
  • Sobrique
    Sobrique over 9 years
    I will point out - 'home' drives have 2 orders of magnitude worse unrecoverable bit error rate when squared off against 'enterprise' grade. I'm still quite happy that the compound failure rate on RAID-5 is acceptable on decent FC/SAS drives. Wouldn't do it on SATA though.
  • Dan Pritts
    Dan Pritts over 9 years
    Only one order of magnitude comparing two Seagate drives: Seagate ST2000DM001: 1 in 10^14. ST2000NM0033: 1 in 10^15. Really, though, tough to say for sure whether the drive mechanisms are any different. I've heard credible sources give opposing answers.
  • Dan Pritts
    Dan Pritts almost 9 years
    I discovered a bad SATA cable on my system - since it was replaced, my scrubs have found zero errors.
  • Dan Pritts
    Dan Pritts over 5 years
    The RAID6 write penalty is very real. However, RAID-Z2 does not suffer from it; zfs makes all writes full-stripe. This has other negative effects, though - it tends to reduce read performance for several reasons.
  • Tmanok
    Tmanok over 3 years
    Wow, exceptional reminder about unrecoverable error rates. Thinking about that would normally make my head spin, but this is a very important warning!
  • ghostly_s
    ghostly_s over 2 years
    I'd like to challenge the assumptions of your "might not care" use case here. Sure, mp3s and dvd rips might be easily-replaceable (assuming you're saving the original media), but most people consider photos to be irreplaceable -- and here's an example of how a single bit flip can destroy a photo irreparably: upload.wikimedia.org/wikipedia/commons/thumb/1/1a/…
  • Dan Pritts
    Dan Pritts over 2 years
    I'll stand by "MIGHT not care."