ZFS - Recover or repair a corrupted file in a snapshot from backup?
Solution 1
Here's my slightly generalized solution:
sudo cp /tank2/test-text-file /tank1/test-text-file
sudo zfs snapshot tank1@snapshot3
sudo sh -c 'zfs send -i tank1@snapshot2 tank1@snapshot3 | zfs receive -F tank2'
sudo zfs rollback -r tank1@snapshot1
sudo sh -c 'zfs send -i tank2@snapshot1 tank2@snapshot3 | zfs receive -F tank1'
sudo zpool scrub tank1; sudo zpool status -v tank1
And assuming there are no other errors reported:
sudo zpool clear tank1
I created snapshot3 not because my (extremely contrived) example needed it, but because it's probably a good habit to develop (and I originally wanted to test that it would work as I'd hoped). If there were any other changes on tank1 since snapshot2, I'd ideally like not to lose them while recovering test-text-file.
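The steps above can be sketched as a dry-run helper that only prints the commands for a given pair of pools, so they can be reviewed before anything is run. The function name and its parameters are hypothetical, and the sketch assumes both pools are mounted at their default mountpoints (/tank1 and /tank2 in the example).

```shell
#!/usr/bin/env bash
# Dry-run sketch: print (do NOT execute) the recovery steps from the answer above.
# print_recovery_plan and its argument order are hypothetical, for illustration only.
# Assumes each pool is mounted at /<poolname>.
print_recovery_plan() {
  local pool=$1 backup=$2 file=$3 base=$4 last=$5 new=$6
  cat <<EOF
cp /$backup/$file /$pool/$file
zfs snapshot $pool@$new
zfs send -i $pool@$last $pool@$new | zfs receive -F $backup
zfs rollback -r $pool@$base
zfs send -i $backup@$base $backup@$new | zfs receive -F $pool
zpool scrub $pool
zpool status -v $pool
EOF
}

# The values used in the example from the question:
print_recovery_plan tank1 tank2 test-text-file snapshot1 snapshot2 snapshot3
```

Note that `zfs send -i` sends only the difference between the two named snapshots. If intermediate snapshots (snapshot2 here) should also be recreated on the receiving pool, `zfs send -I` is the option that includes them in the stream.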
Solution 2
It's always better to use redundant pools than non-redundant ones (though that's not always possible). The issue above is unlikely to happen on a redundant pool. It's also faster to clone a snapshot (to get a file from it) than to recreate it somewhere (assuming, of course, that the hardware itself isn't faulty).
Kenny Evitt
Updated on September 18, 2022
Comments
-
Kenny Evitt almost 2 years
A pool has suffered permanent data corruption to file data that's part of a snapshot. If the file data was part of the filesystem (and not part of any snapshot), I could simply recover the file from a suitable backup copy. How can I recover or repair (and clear errors reported by ZFS for) a file in a snapshot from a copy of the snapshot or a (partial [1]) copy of the pool?
[1] Where the partial copy contains at least the affected snapshot and the previous snapshot also on the affected pool.
Example
Here's an easy-to-reproduce, though extremely contrived, example:
From a (bash) shell prompt:
cd
mkdir zfs-test
for i in {1..2}; do dd if=/dev/zero of=zfs-test/tank-file$i bs=1G count=1 &> /dev/null; done
sudo zpool create tank1 ~/zfs-test/tank-file1
sudo zpool create tank2 ~/zfs-test/tank-file2
sudo zfs snapshot tank1@snapshot1
sudo sh -c 'zfs send tank1@snapshot1 | zfs receive -F tank2'
Create a text file /tank1/test-text-file with content that you can easily find in a hex editor. Here's what I used:
Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Aenean commodo ligula eget dolor. Aenean massa. Cum sociis natoque penatibus et magnis dis parturient montes, nascetur ridiculus mus. Donec quam felis, ultricies nec, pellentesque eu, pretium quis, sem. Nulla consequat massa quis enim. Donec pede justo, fringilla vel, aliquet nec, vulputate eget, arcu. In enim justo, rhoncus ut, imperdiet a, venenatis vitae, justo. Nullam dictum felis eu pede mollis pretium. Integer tincidunt. Cras dapibus. Vivamus elementum semper nisi. Aenean vulputate eleifend tellus. Aenean leo ligula, porttitor eu, consequat vitae, eleifend ac, enim. Aliquam lorem ante, dapibus in, viverra quis, feugiat a, tellus. Phasellus viverra nulla ut metus varius laoreet. Quisque rutrum. Aenean imperdiet. Etiam ultricies nisi vel augue. Curabitur ullamcorper ultricies nisi. Nam eget dui.
Again from a shell prompt:
sudo zfs snapshot tank1@snapshot2
sudo sh -c 'zfs send -i tank1@snapshot1 tank1@snapshot2 | zfs receive -F tank2'
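Before corrupting anything, it may be worth confirming that the copy received on tank2 really matches the original. A minimal sketch, assuming both pools are mounted at their default mountpoints; the helper name is hypothetical:

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds only if both files are byte-for-byte identical.
# cmp -s stays silent and reports the result through its exit status.
same_content() {
  cmp -s -- "$1" "$2"
}

# Paths assume the default mountpoints /tank1 and /tank2.
if same_content /tank1/test-text-file /tank2/test-text-file; then
  echo "backup copy matches"
else
  echo "copies differ (or one is unreadable)"
fi
```

The same check fails once the file on tank1 can no longer be read back correctly, which makes it a cheap sanity test before copying the file back from the backup pool.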
Now you need to corrupt the file data. I used ht and I searched for "dui" and changed it to "duh".
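For anyone without a hex editor at hand, the same kind of in-place change can be sketched with grep and dd. This is not what was used above (that was ht), and it rests on several assumptions: the helper name is hypothetical, the grep in use supports -a, -b and -o (GNU grep does), the search string is plain ASCII, the data is stored uncompressed in the backing file, and the pool is not actively writing to that file at the time.

```shell
#!/usr/bin/env bash
# Hypothetical helper: overwrite the first occurrence of OLD with NEW inside FILE,
# in place, without changing the file's length. OLD and NEW must be the same length.
corrupt_first_match() {
  local file=$1 old=$2 new=$3 offset
  # ${#var} counts characters; for ASCII strings that equals the byte count.
  if [ "${#old}" -ne "${#new}" ]; then
    echo "replacement must have the same length as the original" >&2
    return 1
  fi
  # grep -a: treat binary as text, -b: print byte offset, -o: only the match,
  # -F: fixed string. Output looks like "OFFSET:match"; keep the first offset.
  offset=$(grep -a -b -o -F -- "$old" "$file" | head -n 1 | cut -d: -f1)
  if [ -z "$offset" ]; then
    echo "pattern not found" >&2
    return 1
  fi
  # conv=notrunc keeps the rest of the file intact; seek counts bs=1 byte blocks.
  printf '%s' "$new" | dd of="$file" bs=1 seek="$offset" conv=notrunc 2>/dev/null
}

# Example (against the pool's backing file from the setup above):
#   corrupt_first_match ~/zfs-test/tank-file1 dui duh
```

On a 1 GB file that is mostly zeros, grep may be slow or memory-hungry, so a hex editor can still be the more comfortable tool.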
You can confirm that the data is corrupted:
sudo zpool scrub tank1; sudo zpool status -v tank1
  pool: tank1
 state: ONLINE
status: One or more devices has experienced an error resulting in data corruption. Applications may be affected.
action: Restore the file in question if possible. Otherwise restore the entire pool from backup.
   see: http://zfsonlinux.org/msg/ZFS-8000-8A
  scan: scrub repaired 0 in 0h0m with 1 errors on Sun Jan 11 20:16:30 2015
config:

        NAME                               STATE     READ WRITE CKSUM
        tank1                              ONLINE       0     0     1
          /home/kenny/zfs-test/tank-file1  ONLINE       0     0     2

errors: Permanent errors have been detected in the following files:

        tank1@snapshot2:/test-text-file
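The affected entries can be pulled out of that status output with a small filter. A minimal sketch, assuming the errors section has the layout shown above (a header line starting with "errors: Permanent errors", followed by indented entries); the function name is hypothetical:

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `zpool status -v` output on stdin and print one
# affected entry per line (every non-empty line after the "Permanent errors" header).
list_permanent_errors() {
  awk '
    /^errors: Permanent errors/ { in_list = 1; next }
    in_list && NF > 0 { sub(/^[ \t]+/, ""); print }
  '
}

# Example using the tail of the output shown above:
list_permanent_errors <<'EOF'
errors: Permanent errors have been detected in the following files:

        tank1@snapshot2:/test-text-file
EOF
# prints: tank1@snapshot2:/test-text-file
```

On a live system this would be fed from `sudo zpool status -v tank1 | list_permanent_errors`. In each entry, the part before the first colon names the dataset and snapshot, and the part after it is the path within that snapshot.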
-
Kenny Evitt over 9 years
I agree that, all else equal, it's better to use redundant pools. But data corruption is still possible even with redundant pools (surprisingly likely, based on posts on the ZFS on Linux user discussion mailing group). And I understand how to retrieve a file from a snapshot; that's straightforward. But how can I repair bad data in a snapshot? Most of the advice I've seen is to delete the snapshot to clear the error, but I'm curious about alternatives that preserve the snapshot.
-
ewwhite over 9 years
@KennyEvitt I can't think of a situation where I'd want to repair the data inside of a snapshot. It sounds like a narrow use case.
-
Kenny Evitt over 9 years
@ewwhite maybe I'm confused, but, based on my example, file data may be corrupted and, if it hasn't been changed since the last snapshot, the file in the snapshot will show as corrupted. Why wouldn't you want to repair that? Or why would it be a narrow use case to repair it?
-
Chris Seufert almost 6 years
So I have a USB drive that I periodically zfs send-recv our fileserver file system onto, and it's developed a single corrupt file. However, I can't change the file on the backup drive, as the next zfs send-recv will remove the non-corrupt file. So how do I fix the data error? (I realise I can just drop the zfs filesystem and mirror it again, but I'm hoping not to have to.)