Backup - rsync or tar
Solution 1
Definitely rsync.
The advantage of rsync is that it will copy only the files which have changed.
If you have 100GB+ of relatively small files, you don't want to copy them all each time.
Note: the first backup with rsync will be slow because all files are copied. Subsequent runs copy only the changed files, and the data can be compressed during the copy.
Be sure to familiarise yourself with all the options of rsync... there are many.
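A minimal sketch of that workflow (the temporary directories here are stand-ins for your real source and backup-disk paths, e.g. `/data` and `/mnt/backup`):

```shell
command -v rsync >/dev/null 2>&1 || exit 0  # skip on systems without rsync
SRC=$(mktemp -d)   # stand-in for the data directory
DST=$(mktemp -d)   # stand-in for the backup drive
echo "hello" > "$SRC/file1.txt"

# -a        archive mode (recursion, permissions, times, symlinks)
# -z        compress file data in transit (mainly useful over a network)
# --delete  remove destination files that no longer exist in the source
rsync -az --delete "$SRC/" "$DST/"

# A second run transfers only the files changed since the first pass.
echo "changed" > "$SRC/file1.txt"
rsync -az --delete "$SRC/" "$DST/"
```

The trailing slash on the source matters: `"$SRC/"` copies the directory's contents rather than the directory itself.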
Tar is an archive utility. You could conceivably create a tar file for the entire 100GB+, but you don't want to transfer it all, each time.
Solution 2
I would like to add that, although in general I agree with pavium's reply and would choose rsync, there are options in tar for incremental backups. From the man page:
```
-g, --listed-incremental F   create/list/extract new GNU-format incremental backup
-G, --incremental            create/list/extract old GNU-format incremental backup
```
EDIT: Following a recent comment, I will further expand on how both backups work:
tar initially creates one large archive, possibly compressed (with the -z gzip flag), containing all backed-up files. Each subsequent incremental backup then creates a new file holding only the modified files, and it also records which files have been deleted.
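A sketch of that workflow with GNU tar's --listed-incremental (temporary directories stand in for real paths):

```shell
SRC=$(mktemp -d)
OUT=$(mktemp -d)
echo one > "$SRC/a.txt"

# Level-0 (full) backup; snapshot.snar records the state of every file.
tar --create --gzip --listed-incremental="$OUT/snapshot.snar" \
    --file="$OUT/full.tar.gz" -C "$SRC" .

# Later runs against the same snapshot file archive only what changed.
echo two > "$SRC/b.txt"
tar --create --gzip --listed-incremental="$OUT/snapshot.snar" \
    --file="$OUT/incr1.tar.gz" -C "$SRC" .
```

Restoring means extracting the full archive and then each incremental in order, with --listed-incremental=/dev/null.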
rsync, on the other hand, initially keeps a second mirror directory, uncompressed, with the exact tree and files of the source. Then with every incremental run (the -b/--backup flag, combined with --backup-dir), it continues to maintain a mirror copy of the source while keeping all changed files (both modified and deleted) in another directory, organised by date.
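A sketch of that rotation pattern (directory names are illustrative; --backup-dir takes a path relative to the destination, so the dated directory lands next to the mirror):

```shell
command -v rsync >/dev/null 2>&1 || exit 0  # skip on systems without rsync
SRC=$(mktemp -d)      # stand-in for the source tree
MIRROR=$(mktemp -d)   # stand-in for the backup disk
echo "first" > "$SRC/report.txt"
rsync -a "$SRC/" "$MIRROR/current/"

# Update the mirror, but move the previous version of every changed or
# deleted file into a dated sibling directory instead of discarding it.
echo "second version" > "$SRC/report.txt"
STAMP=$(date +%F)
rsync -a --delete --backup --backup-dir="../changes-$STAMP" \
      "$SRC/" "$MIRROR/current/"
```

After the second run, `current/` mirrors the source and `changes-$STAMP/` holds the superseded copy of report.txt.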
Each method therefore has its pros and cons. A tar backup is harder to maintain on a medium with limited capacity, as is the case with any classic incremental scheme. rsync is not considered a classic backup solution: it requires more disk space for the mirror, since it is uncompressed, and more time to reconstruct a full backup of a previous date.
UPDATE: Since March 2016 a newer alternative is available: Borg Backup. I very strongly recommend it. It uses deduplication; see the Borg Backup documentation for more information.
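For illustration, a minimal Borg session might look like this (borg 1.x command names; encryption is disabled only to keep the sketch short, and real backups should enable it):

```shell
command -v borg >/dev/null 2>&1 || exit 0  # skip if borg is not installed
REPO=$(mktemp -d)/repo
DATA=$(mktemp -d)
echo sample > "$DATA/file.txt"

# One-time repository initialisation (use real encryption in practice).
borg init --encryption=none "$REPO"

# Each archive stores only data chunks the repository has not seen before,
# so repeated backups of mostly-unchanged data stay small.
borg create "$REPO::backup-$(date +%F)" "$DATA"
borg list "$REPO"
```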
Solution 3
rsync can be somewhat painful if you have a very large number of files, especially if your rsync version is lower than 3. On the other hand, if you use tar, you would generate one very big tar file (unless the data compresses well). Personally, I would look at rdiff-backup, but make sure that you test your restore procedure: rdiff-backup can be very memory-demanding when restoring.
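A quick rdiff-backup sketch (classic command syntax; temporary directories stand in for real paths):

```shell
command -v rdiff-backup >/dev/null 2>&1 || exit 0  # skip if not installed
SRC=$(mktemp -d)
DST=$(mktemp -d)/backup
echo notes-v1 > "$SRC/notes.txt"

# Each run keeps a current mirror in DST plus reverse diffs for history.
rdiff-backup "$SRC" "$DST"

# Restore the most recent state into a fresh directory.
OUT=$(mktemp -d)/restored
rdiff-backup --restore-as-of now "$DST" "$OUT"
```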
Solution 4
If your files do not change much, I would vote for rsync.
Solution 5
Do you need history (multiple backups), or just a plain copy of your data on another disk? Backing up 100 GB of 10 KB files will take ages unless you use a block-level backup. If you really need a fast solution, think about block-level snapshots or some other block-level approach.
Updated on September 17, 2022

Comments
- Admin almost 2 years: We're looking to back up about 100 GB+ of data consisting of small files (10 KB+ each). The backup needs to be done to another hard drive weekly, as fast as possible. Which is the better way (especially speed-wise) to back up in such a scenario: rsync or tar?
- dasdasd over 11 years: Information about the files would be interesting. Are the existing ones static, with only new ones added, or are all files prone to change?
- Christopher Batey over 14 years: We do not need history, just a plain copy of the data to a secondary hard disk mounted on the server. Any suggestion for a faster solution?
- pfo over 14 years: `dd if=/dev/sdX of=/dev/sdY` in a cron job should be the fastest solution, since it is a block-level copy of sdX to sdY. Benchmark that against a tar'ed or rsync'd copy.
- Svish over 14 years: Can you take a block-level backup of just certain folders?
- daff over 14 years: Block-level backups by design only work for entire filesystems, not single directories. This is especially true for simple solutions like `dd if=/dev/foo of=/dev/bar`, but AFAIK also for the more advanced snapshot-based products from NetApp, EMC and the like.
- Tonny over 11 years: For the millionth time: RAID IS NOT BACKUP
- Tonny over 11 years: I did read and comprehend your answer. I am fully familiar with the ins and outs of BTRFS. (Hell, I've written parts of BTRFS.) You can use BTRFS with a combination of snapshots and RAID to achieve something that would act as a backup, but it is finicky, uses experimental features, and in general it is not a solution for a generic backup problem. I stand by my statement: RAID by itself is not backup. RAID plus snapshots can be a backup, but it is very hard to do right. A snapshot plus a block-based image copy of that snapshot is a much simpler approach to the poster's question.
- Yuan about 11 years: If your computer is struck by lightning or fire, will your RAID 1 backup survive?
- dasdasd about 11 years: @Tonny: Since btrfs was only one of the two suggested options, I still do not understand the yelling and the downvote. Swapping out images from a soft RAID takes a handful of lines of script.
- Nicolas Schmidt about 5 years: Why would you choose rsync over tar if both have incremental backups?
- Wtower about 5 years: I hope the recent edit covers your question.
- asa about 3 years: Why would you "definitely" use rsync if you could use tar.gz combined with find to create incremental compressed backups?
- TheTechRobo Stands for Ukraine almost 3 years: @asa More work, I assume.