ZFS Sync over unreliable, slow WAN. ZFS replication, or rsync?

Solution 1

  1. If you can transfer a maximum of 6 GB per day (assuming zero overhead and zero competing traffic) and you need to move "15-60 gigs" at a frequency of "once or twice per week," that works out to 15-120 GB per week, or anywhere from 2-17 GB per day (see the back-of-the-envelope arithmetic after this list). Because you have to plan for peak demand, and 17 GB is far in excess of even your theoretical maximum of 6 GB, you likely have a very serious bandwidth problem. What will it take to upgrade the connection? If upgrading the connection is impossible, consider mailing physical media on a scheduled basis (e.g. weekly).

  2. Assuming that you can get the bandwidth math to make a little bit more sense, rsync is likely to be the best option. Deduplication awareness would be hugely valuable when replicating highly redundant data (e.g. virtual machine images), but it should have little or no benefit when it comes to unique digital content (audio, video, photos)... unless, of course, users are inadvertently storing duplicate copies of identical files.
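For reference, here is one way the 6 GB/day ceiling can be sanity-checked against the ~700 Kb/s upload figure from the question (the overhead allowance is a rough assumption, not a measurement):

    700 kbit/s ÷ 8 bits/byte   ≈ 87.5 kB/s
    87.5 kB/s × 86,400 s/day   ≈ 7.5 GB/day with zero overhead
    less protocol overhead and competing traffic → roughly 6 GB/day in practice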

Solution 2

After doing some research, I believe you are right about sending snapshots. The output of ZFS SEND can be piped into bzip2, the resulting file can be rsynced to the other machine, and ZFS RECEIVE can replay it there.
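A minimal sketch of that pipeline (the pool, dataset, snapshot, and host names here are invented for illustration):

    # On the office NAS: take a snapshot, then serialize and compress it to a file
    zfs snapshot tank/photos@weekly1
    zfs send tank/photos@weekly1 | bzip2 > /var/tmp/photos-weekly1.zfs.bz2

    # Ship it with rsync; --partial keeps partial transfers so an interrupted copy can resume
    rsync --partial --progress /var/tmp/photos-weekly1.zfs.bz2 backup-host:/var/tmp/

    # On the receiving NAS: decompress and replay the stream into a dataset
    bunzip2 -c /var/tmp/photos-weekly1.zfs.bz2 | zfs receive tank/photos-backup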

Here are some sources I used:

I didn't find any posts with replication scripts, but I did find someone who posted their backup script. That said, I didn't understand it, so it may be junk.

Many of the websites talked about setting up a cron job to run this frequently. If you do that, you can replicate/back up with less impact on bandwidth and users, and it makes for a good disaster recovery feature because the offsite data is more up to date. (That is, after the initial chunk of data when getting started.)
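As a rough illustration of the cron approach (the schedule, script path, dataset, and host names are all invented; a real script would want more error handling):

    # crontab entry: run the replication script every night at 01:00
    0 1 * * * /root/zfs-offsite-send.sh

    #!/bin/sh
    # /root/zfs-offsite-send.sh -- sketch of a nightly incremental send
    PREV=$(cat /root/last-snap)            # last snapshot that was sent successfully
    CURR=$(date +%Y%m%d)
    zfs snapshot tank/photos@$CURR
    zfs send -i tank/photos@$PREV tank/photos@$CURR | bzip2 | \
        ssh backup-host "bunzip2 | zfs receive tank/photos-backup" || exit 1
    echo $CURR > /root/last-snap           # only record the snapshot if the send succeeded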

Again, I think you had the right idea with sending snapshots; there seem to be a lot of advantages to using SEND/RECEIVE.

EDIT: Just watched a video (video1, video2) that may help support the use of SEND/RECEIVE, and that talks about rsync (starts at 3m49s). Ben Rockwood was the speaker, and here is a link to his blog.

Solution 3

ZFS should receive the 'resumable send' feature, which will allow continuing an interrupted replication, some time around March of this year. The feature has been completed by Matt Ahrens and some other people and should be upstreamed soon.

Solution 4

What is the purpose of the backups and how will they need to be accessed?

If your backups are mainly for disaster recovery then ZFS snapshots might be preferable as you'll be able to get a filesystem back to the exact state it was in at the time of the last incremental.

However, if your backups are also supposed to provide users access to files that might have been accidentally deleted, corrupted, etc. then rsync could be a better option. End users may not understand the concept of snapshots or perhaps your NAS doesn't provide end users access to previous snapshots. In either case you can use rsync to provide a backup that is easily accessible to the user via the filesystem.

With rsync you can use the --backup flag to preserve backups of files that have changed, and with the --suffix flag you can control how old versions of files are renamed (a sketch of the invocation follows the listing below). This makes it easy to create a backup where you might have dated old versions of files like

file_1.jpg
file_1.jpg.20101012
file_1.jpg.20101008
etc.
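A minimal invocation along those lines (paths and the host name are placeholders; -a bundles --perms, --owner, and the other metadata-preserving flags):

    # Mirror the photo share, renaming displaced versions with a dated suffix
    rsync -a --backup --suffix=".$(date +%Y%m%d)" /data/photos/ backup-host:/backups/photos/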

You can easily combine this with a cronjob containing a find command to purge any old files as needed.
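For example, a daily cronjob could run something like this (the 30-day retention window and the suffix pattern are arbitrary choices for the sketch):

    # Purge dated backup versions (e.g. file_1.jpg.20101008) older than 30 days
    find /backups/photos/ -name '*.20??????' -mtime +30 -delete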

Both solutions should be able to preserve enough metadata about files to work as a backup (rsync provides --perms, --owner, etc. flags). I use rsync to back up large amounts of data between datacenters and am very happy with the setup.

Comments

  • Paul McMillan almost 2 years

    I've been tasked with making an off-site backup work over the WAN. Both storage boxes are FreeBSD-based NAS boxes running ZFS.

    Once or twice a week, 15-60 gigs of photography data gets dumped to the office NAS. My job is to figure out how to get this data off-site as reliably as possible using the VERY SLOW DSL connection (~700Kb/s upload). The receiving box is in much better shape, at 30Mb/s down, 5Mb/s up.

    I know, carrying a hard drive off-site would move data much more quickly, but it's not an option in this case.

    My options seem to be either:

    • ZFS incremental send over ssh
    • Rsync

    rsync is a time-honored solution, and has the all-important ability to resume a send if something gets interrupted. It has the disadvantage of iterating over many files and not knowing about dedup.

    ZFS snapshot sending might transfer a bit less data (it knows a lot more about the file system, can do dedup, can package up the metadata changes more efficiently than rsync) and has the advantage of properly duplicating the filesystem state, rather than simply copying files individually (which is more disk intensive).
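    As a sketch, an incremental send over ssh would look something like this (snapshot, dataset, and host names invented for the example):

        # One-time full send of a baseline snapshot, then periodic incrementals
        zfs send tank/photos@base | ssh backup-host zfs receive tank/photos-backup
        zfs send -i tank/photos@base tank/photos@week2 | ssh backup-host zfs receive tank/photos-backup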

    I'm concerned about ZFS replication performance[1] (though that article is a year old). I'm also concerned about being able to re-start the transfer if something goes down - the snapshot capability doesn't seem to include that. The whole system needs to be completely hands-off.

    [1] http://wikitech-static.wikimedia.org/articles/z/f/s/Zfs_replication.html

    Using either option, I should be able to de-prioritize the traffic by routing it through a specified port, then using the QoS on the routers. I need to avoid a major negative impact on users at both sites during each transfer, since it will take several days.

    So... that's my thinking on the issue. Have I missed any good options? Has anyone else set something similar up?

  • Paul McMillan over 13 years
    I figure I can use the available bandwidth, and most of the data dumps tend towards the smaller end of the range. Practically, it's gonna be around 2-3 gigs a day average, judging from a past month of data. I don't need the replication immediately.
  • Paul McMillan over 13 years
    And yeah, mailing physical media is far better... I wish it were an option.
  • Paul McMillan over 13 years
    Good point about dedup. Most of what gets copied won't be duplicated - the users aren't quite that dense.
  • Paul McMillan over 13 years
    Hmm... I've read that with proper configuration, rsync can be made to go relatively quickly. How much optimization did you attempt?
  • Paul McMillan over 13 years
    I guess the use of rsync there is limited to the pause/resume functionality, rather than the actual file diffing. This makes sense, since the file system itself (and the change files it generates) knows better than rsync what's going on.
  • Allan Jude almost 6 years
    Just a note that 'resumable send' has been in OpenZFS (on FreeBSD, Linux, MacOS, etc) for quite some time now. There is also a 'compressed send' feature now, where data will stay compressed as it is on disk, as part of the replication stream.
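    A hedged sketch of the resumable flow (dataset and host names invented; flags per the OpenZFS man pages):

        # Receive with -s so an interrupted stream leaves a resume token behind
        zfs send tank/photos@snap | ssh backup-host zfs receive -s tank/photos-backup

        # After an interruption, read the token on the receiving side...
        zfs get -H -o value receive_resume_token tank/photos-backup

        # ...and restart the send from where it left off
        zfs send -t <token> | ssh backup-host zfs receive -s tank/photos-backup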
  • Allan Jude almost 6 years
    As an additional note: ZSTD, a modern, faster replacement for gzip and bzip2, supports multiple threads and more than 20 compression levels. It also has a contributed optional feature called 'adaptive compression'. With this mode, the compression level is automatically tuned up and down as needed to keep the network pipe full, while doing as much compression as possible to save time. This prevents you from doing so much compression that it becomes a bottleneck, or missing out on compression you could be doing because the network is too slow.
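    Such a pipeline might look like this (dataset and host names invented; -T0 uses all cores and --adapt enables the adaptive mode):

        # Incremental send with multi-threaded, adaptively tuned compression
        zfs send -i tank/photos@prev tank/photos@curr | zstd -T0 --adapt | \
            ssh backup-host "zstd -dc | zfs receive tank/photos-backup"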