ZFS vs XFS

Solution 1

I've found XFS better suited to extremely large filesystems that may hold many large files. I've had a functioning 3.6TB XFS filesystem for over 2 years now with no problems. Definitely works better than ext3, etc. at that size (especially when dealing with many large files and lots of I/O).
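
For scale, getting a big XFS filesystem going is a short affair; a minimal sketch, assuming a hypothetical /dev/sdb1 device and /srv/storage mountpoint (mkfs.xfs picks sensible defaults from the device geometry):

# mkfs.xfs /dev/sdb1
# mount -t xfs /dev/sdb1 /srv/storage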

What you get with ZFS is device pooling, striping and other advanced features built into the filesystem itself. I can't speak to specifics (I'll let others comment), but from what I can tell, you'd want to use Solaris to get the most benefit here. It's also unclear to me how much ZFS helps if you're already using hardware RAID (as I am).
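
For a taste of that pooling, here's a minimal sketch (device names are hypothetical) that stripes across two mirrored pairs and carves a filesystem out of the pool:

# zpool create tank mirror disk1 disk2 mirror disk3 disk4
# zfs create tank/home
# zpool status tank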

Solution 2

ZFS will give you advantages beyond software RAID. The command structure is thoughtfully laid out and intuitive. It also has compression, snapshots, cloning, filesystem send/receive, and cache devices (those fancy new SSD drives) to speed up metadata indexing.

Compression:

# zfs set compression=on filesystem/home

It supports simple-to-create copy-on-write snapshots that can be live-mounted:

# zfs snapshot filesystem/home/user@tuesday
# cd /filesystem/home/user/.zfs/snapshot/tuesday

Filesystem cloning:

# zfs clone filesystem/home/user@tuesday filesystem/home/user2

Filesystem send/receive:

# zfs send filesystem/home/user@tuesday | ssh otherserver "zfs receive -v filesystem/home/user"

Incremental send/receive (-i takes the older snapshot followed by the newer one; @monday here stands in for an earlier snapshot):

# zfs send -i filesystem/home/user@monday filesystem/home/user@tuesday | ssh otherserver "zfs receive -v filesystem/home/user"

Caching devices:

# zpool add filesystem cache ssddev

This is all just the tip of the iceberg; I would highly recommend getting your hands on an install of OpenSolaris and trying this out.

http://www.opensolaris.org/os/TryOpenSolaris/

Edit: This is very old; OpenSolaris has been discontinued. These days the best way to use ZFS is probably on Linux or FreeBSD.


Full disclosure: I used to be a Sun storage architect, but I haven't worked for them in over a year; I'm just excited about this product.

Solution 3

Using LVM snapshots and XFS on live filesystems is a recipe for disaster, especially on very large filesystems.

I've been running exclusively on LVM2 and XFS for the last 6 years on my servers (even at home, since zfs-fuse is just plain too slow)...

However, I can no longer count the different failure modes I encountered when using snapshots. I've stopped using them altogether - it's just too dangerous.

The only exception I'll make now is my own personal mailserver/webserver backup, where I do overnight backups using an ephemeral snapshot that is always equal in size to the source fs, and gets deleted right afterwards.
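
For reference, that overnight routine amounts to something like the following sketch, with hypothetical names (vg0/mail as a 50G XFS origin LV): the snapshot is created at the full size of the origin so it can't fill up mid-backup, and XFS needs the nouuid mount option for a snapshot because it carries the origin's UUID:

# lvcreate -s -n mail_snap -L 50G /dev/vg0/mail
# mount -o ro,nouuid /dev/vg0/mail_snap /mnt/snap
# rsync -a /mnt/snap/ /backup/mail/
# umount /mnt/snap
# lvremove -f /dev/vg0/mail_snap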

Most important aspects to keep in mind:

  1. If you have a big(ish) filesystem that has a snapshot, write performance is horribly degraded.
  2. If you have a big(ish) filesystem that has a snapshot, boot time will be delayed by literally tens of minutes while the disk churns and churns during import of the volume group, with no messages displayed. This effect is especially horrid if root is on LVM2 (because waiting for the root device times out and the system doesn't boot).
  3. If you have a snapshot, it is very easy to run out of space. Once you run out of space, the snapshot is corrupt and cannot be repaired (a monitoring sketch follows this list).
  4. Snapshots cannot be rolled back/merged at the moment (see http://kerneltrap.org/Linux/LVM_Snapshot_Merging). This means the only way to restore data from a snapshot is to actually copy (rsync?) it over. DANGER DANGER: you do not want to do this if the snapshot capacity is not at least the size of the source fs; if it isn't, you'll soon hit the brick wall and end up with both the source fs and the snapshot corrupted. (I've been there!)
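
On the monitoring point in item 3: snapshot fill can at least be watched, even if that only softens the blow. With a hypothetical volume group vg0, a plain lvs shows how full each snapshot is in its Snap%/Data% column:

# lvs vg0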

Solution 4

A couple of additional things to think about.

  • If a drive dies in a hardware RAID array, regardless of the filesystem on top of it, all the blocks on the device have to be rebuilt - even the ones that didn't hold any data. ZFS, on the other hand, is the volume manager and the filesystem, and manages data redundancy and striping itself, so it can intelligently rebuild only the blocks that contained data. This results in faster rebuild times unless the volume is 100% full.

  • ZFS has background scrubbing, which makes sure that your data stays consistent on disk and repairs any issues it finds before they result in data loss (see the sketch after this list).

  • ZFS file systems are always in a consistent state, so there is no need for fsck.

  • ZFS also offers more flexibility and features with its snapshots and clones compared to the snapshots offered by LVM.
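
To make the scrub and snapshot points concrete, a short hedged sketch against the hypothetical pool "tank" from earlier: a scrub is a single command, zpool status reports its progress and any repaired errors, and a snapshot can be rolled back in place (exactly what LVM snapshots can't do, per Solution 3):

# zpool scrub tank
# zpool status tank
# zfs rollback tank/home@tuesday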

Having run large storage pools for large-format video production on a Linux/LVM/XFS stack, my experience has been that it's easy to fall into micro-managing your storage. This can result in large amounts of unused allocated space, and time/issues with managing your Logical Volumes. This may not be a big deal if you have a full-time storage administrator whose job is to micro-manage the storage, but I've found that ZFS's pooled storage approach removes these management issues.
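
To illustrate, with the same hypothetical "tank" pool: datasets all draw from one shared pool of free space, so instead of pre-sizing Logical Volumes you just set quotas or reservations where you need them:

# zfs create tank/video
# zfs create tank/renders
# zfs set quota=4T tank/video
# zfs set reservation=500G tank/renders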

Solution 5

ZFS is absolutely amazing. I am using it at home on a 5 x 1 TB file server, and am also using it in production with almost 32 TB of hard drive space. It is fast, easy to use, and contains some of the best protection against data corruption.
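
As an aside, a 5 x 1 TB pool like that is commonly built as a single raidz vdev (one disk's worth of parity); a minimal sketch with hypothetical device names:

# zpool create tank raidz disk1 disk2 disk3 disk4 disk5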

We are using OpenSolaris on this server in particular because we wanted to have access to newer features and because it provided the new package management system and way of upgrading.

Comments

  • Tamas Czinege
    Tamas Czinege almost 2 years

    We're considering building a ~16TB storage server. At the moment, we're considering both ZFS and XFS as the filesystem. What are the advantages and disadvantages? What do we have to look for? Is there a third, better option?

    • Sam Go
      Sam Go over 13 years
      Don't even compare them. ZFS is a modern enterprise-level file system like JFS2 or WAFL. XFS was good 10 years ago, but today it's just a stone-age fs.
    • Mei
      Mei over 12 years
      In some ways, you can't compare them: XFS is a filesystem; ZFS is a filesystem and so much more: it replaces the filesystem, the volume manager (like LVM), and RAID besides. However, JFS, if memory serves, is no longer maintained; XFS is active, maintained, and robust. Either way - ZFS or XFS - you can't go wrong in my opinion.
    • SvennD
      SvennD over 7 years
      I still think this question is relevant, so I'll write up our experience here: XFS is simple - you install it, you run it, it's quick, it works (HW RAID below it). ZFS is safe and has compression, but it is a lot of work to tune it to run as fast as XFS. So it also depends on the situation you expect the server to handle (cluster back end, user storage, archive, ...).
    • skan
      skan about 7 years
      There is also Hammer2: dragonflybsd.org/hammer
  • Brian Gianforcaro
    Brian Gianforcaro about 15 years
    FreeBSD has a mature native port of ZFS
  • Kjetil Limkjær
    Kjetil Limkjær about 15 years
    wiki.freebsd.org/ZFSKnownProblems I think your definition of mature might be different from mine :-) Maybe I'd consider it after 8.0 is released.
  • Avery Payne
    Avery Payne about 15 years
    The key feature of ZFS that you (usually) don't get elsewhere is block-level CRC, which is supposed to detect (and hopefully prevent) silent data corruption. Most filesystems assume that if a write completed successfully, then the data was indeed written to disk. That isn't always the case, especially if a sector is starting to go "marginal". ZFS detects this by checking the CRC against the resulting write.
  • Avery Payne
    Avery Payne about 15 years
    And yes, I do like XFS a lot. :) The only gotcha you have to keep in mind is the propensity to zero out sectors that were "bad" during a journal recovery. In some (rare) cases, you can end up with some data loss... Found this paper with the Google search term "xfs zeros out sectors upon recovery": pages.cs.wisc.edu/~vshree/xfs.pdf
  • iSee
    iSee about 15 years
    One of the things I like about XFS is the xfs_fsr "defragmentation" program.
  • Angiosperm
    Angiosperm almost 15 years
    There is also the option of using Nexenta: a Linux (Ubuntu)-based distribution which uses the OpenSolaris kernel. It was created for (file) servers.
  • Walter
    Walter over 14 years
    FreeBSD 7.2 builds after 2009-06-01 have rendered most of the ZFSKnownProblems moot. If you are running the AMD64 version of the OS, it is now stable. In 8.0, FreeBSD has marked ZFS as stable enough for production.
  • sehe
    sehe over 14 years
    As it happens, just today someone confirmed that the VG-with-snapshot unable-to-boot-Linux issue is still current: bugs.launchpad.net/lvm2/+bug/360237
  • sehe
    sehe over 12 years
    Revisiting this bug, they still think that the abysmal boot problems with snapshots are "normal behaviour for lvm": bugs.launchpad.net/lvm2/+bug/360237/comments/7 (on 2012-01-07)
  • James Moore
    James Moore almost 11 years
    ZFS on Linux is available now (zfsonlinux.org)
  • aggregate1166877
    aggregate1166877 about 8 years
    That link didn't work for me with www. Use http://opensolaris.org/os/TryOpenSolaris/
  • Fox
    Fox about 8 years
    I'd actually say that the best bet for ZFS is still FreeBSD. It's been a part of the system for quite a few years, so my guess is there's the least possibility of nasty surprises. Though it's just my $0.02.
  • sehe
    sehe almost 8 years
    Update: Same state. Only now it's been 7 more years.
  • Jody Bruchon
    Jody Bruchon over 7 years
    The utility of ZFS block-level CRCs is questionable. Hard drives and SSDs use Hamming code ECC to correct single-bit errors and report two-bit errors. If the ECC can't transparently correct the physical read error, the data is lost anyway and a read failure will be reported to the OS. CRCs don't correct errors. This feature is pushed as a major benefit of ZFS but the truth is it's redundant and has no value. As for the XFS zero-after-power-fail bug, that was corrected a long time ago and isn't relevant today.
  • shodanshok
    shodanshok almost 5 years
    @JodyLeeBruchon What you wrote is incorrect: while it is true that storage devices already have parity codes attached to data, that does not mean they are capable of end-to-end data protection. To achieve this goal without a checksumming filesystem, you need either a) a SAS T10 DIF/DIX storage stack or b) the device-mapper dm-integrity target.
  • Jody Bruchon
    Jody Bruchon almost 5 years
    @shodanshok No, what I wrote is not incorrect. What you are saying is different from what I am saying. If you are going to "correct" me, at least read what I wrote and understand what it says first.
  • shodanshok
    shodanshok almost 5 years
    @JodyLeeBruchon You are free to think what you want, but a CRC/ECC which lives near the original data is not the same as an end-to-end data checksum. If it were, both the DIF/DIX specs and the dm-integrity target would be wasted work. I recommend reading the original CERN research paper about data corruption and how end-to-end data checksums can be used to avoid these problems.
  • Jody Bruchon
    Jody Bruchon almost 5 years
    @shodanshok Again, you have failed to read and comprehend what I said. You are reading what you want to read, not what I actually said.
  • Admin
    Admin about 2 years
    It would be interesting to compare using ZFS in the same scenarios (snapshotting a live system running the same software).
  • Admin
    Admin about 2 years
    @saulius2 I think there's no comparison, certainly not since ZFS on Linux matured and gained default or root-filesystem support in some Linux distros. By which I mean snapshots in LVM2 have simply been superseded by other volume management, as in btrfs/ZFS.
  • Admin
    Admin about 2 years
    I am not sure LVM2 has been superseded yet, but yes, I would like that and can see it coming (albeit slowly). What I am not sure about is whether ZFS snapshots give fewer failures than LVM snapshots. My guess is that they don't: live snapshots of any FS should be quite an unreliable thing.
  • Admin
    Admin about 2 years
    @saulius2 Have you ever tried it? I've been using ZFS for 10 years, and I have automatic live snapshotting running in the background without even noticing. The point is that ZFS/btrfs do snapshotting at the dataset level, not just the block level.