rsync an already compressed file
Solution 1
Compressing an already-compressed file in transit is usually not worth the CPU time. There is one caveat: when rsync compares two versions of a file, running it with compression may speed up the exchange of the data hashes.
If you only want to sync compressed versions of large files on more than one system, one place to look would be certain builds of gzip. On an Ubuntu system, I get:
$ gzip -h
Usage: gzip [OPTION]... [FILE]...
Compress or uncompress FILEs (by default, compress FILES in-place).

Mandatory arguments to long options are mandatory for short options too.

  -c, --stdout      write on standard output, keep original files unchanged
  -d, --decompress  decompress
  -f, --force       force overwrite of output file and compress links
  -h, --help        give this help
  -l, --list        list compressed file contents
  -L, --license     display software license
  -n, --no-name     do not save or restore the original name and time stamp
  -N, --name        save or restore the original name and time stamp
  -q, --quiet       suppress all warnings
  -r, --recursive   operate recursively on directories
  -S, --suffix=SUF  use suffix SUF on compressed files
  -t, --test        test compressed file integrity
  -v, --verbose     verbose mode
  -V, --version     display version number
  -1, --fast        compress faster
  -9, --best        compress better
      --rsyncable   Make rsync-friendly archive

With no FILE, or when FILE is -, read standard input.

Report bugs to .
Notice the --rsyncable option? It periodically resets the compressor's internal state, so a small change to the source file alters only small pieces of the compressed file instead of everything that follows the change. The remainder of the binary data is unchanged, so rsync won't need to retransmit the whole thing. The man page indicates that this option shouldn't increase the size of the compressed file by more than around 1% compared to compressing without it, and that gunzip won't know the difference.
I have a 468MB SQL file that I compressed to 57MB with the --rsyncable option. I transferred this file to my local system. Then I added a one-line comment to the original SQL file on the remote system and recompressed it with the --rsyncable option.
$ rsync -avvz --progress -h fooboo:foo.sql.gz .
opening connection using ssh fooboo rsync --server --sender -vvlogDtprz . foo.sql.gz
receiving file list ...
1 file to consider
delta-transmission enabled
foo.sql.gz
      59.64M 100%   43.22MB/s    0:00:01 (xfer#1, to-check=0/1)
total: matches=7723  hash_hits=9468  false_alarms=0 data=22366

sent 54.12K bytes  received 22.58K bytes  17.05K bytes/sec
total size is 59.64M  speedup is 777.59
Not bad. rsync received only 22.58K bytes for a 59.64M file, so it had to transfer just a small part of the newer compressed file.
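The preparation step for this workflow can be sketched as below. This is only an illustration: the file is a small generated stand-in rather than the real SQL dump, the temporary paths are hypothetical, and it assumes the local gzip build accepts --rsyncable (as noted above, not every build does).

```shell
#!/bin/sh
# Sketch: compress with --rsyncable and confirm that plain decompression
# restores the original. Assumes a gzip build that supports --rsyncable.
set -eu

work=$(mktemp -d)

# Hypothetical stand-in for the SQL dump.
seq 1 200000 > "$work/foo.sql"

# Compress to a separate file so the original stays around for comparison.
gzip --rsyncable -c "$work/foo.sql" > "$work/foo.sql.gz"

# No special flag is needed to read the archive back.
if gzip -dc "$work/foo.sql.gz" | cmp -s - "$work/foo.sql"; then
    roundtrip=ok
else
    roundtrip=failed
fi
echo "round trip: $roundtrip"

# The archive is then synced as usual, for example:
#   rsync -av --progress fooboo:foo.sql.gz .

rm -rf "$work"
```

Every later recompression has to use --rsyncable as well, otherwise the new archive will share little with the copy already on the other side.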
Solution 2
rsync will not make an already compressed file significantly smaller during transit.
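A quick way to check this claim locally is to compress a file twice and compare sizes. The sketch below uses gzip as a stand-in for rsync's in-transit compression (an assumption: the exact algorithm rsync uses depends on the build and version), and the test data is generated base64 text, not a real dump.

```shell
#!/bin/sh
# Sketch: see how much a second compression pass saves on gzip output.
set -eu

work=$(mktemp -d)

# About 1 MB of base64 text: compressible, but not trivially so.
head -c 750000 /dev/urandom | base64 > "$work/data.txt"

gzip -9 -c "$work/data.txt"    > "$work/data.txt.gz"     # first pass
gzip -9 -c "$work/data.txt.gz" > "$work/data.txt.gz.gz"  # second pass

# $(( ... )) strips any padding that wc adds on some systems.
orig=$(( $(wc -c < "$work/data.txt") ))
once=$(( $(wc -c < "$work/data.txt.gz") ))
twice=$(( $(wc -c < "$work/data.txt.gz.gz") ))

echo "original:         $orig bytes"
echo "compressed once:  $once bytes"
echo "compressed twice: $twice bytes"

rm -rf "$work"
```

The first pass should shrink the file noticeably, while the second pass should leave the size roughly where it was, which is the CPU time -z would spend for nothing on a .gz file.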
It is unlikely that your failed transfers will be fixed by adding the -z flag. I would suggest trying to rsync the file(s) uncompressed and letting rsync -z compress them on the fly. You then have the advantage that, should the source file change and you need to rsync again, only the changed bytes will be transferred. If you change a compressed file, rsync will most likely have to retransmit it in its entirety. See here for more details:
http://beeznest.wordpress.com/2005/02/03/rsyncable-gzip/
Solution 3
Using rsync -z will not have any advantage over just rsync when dealing with a file that has already been compressed using a good compression format. However, you might consider splitting your compressed file into smaller pieces, so you are able to transmit it using rsync.
Here is a guide for Linux: http://www.techiecorner.com/107/how-to-split-large-file-into-several-smaller-files-linux/
And for Windows: http://www.online-tech-tips.com/computer-tips/how-to-split-a-large-file-into-multiple-smaller-pieces/
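On Linux the splitting approach might look like the sketch below. The file name, part prefix, and sizes are hypothetical, and random bytes stand in for the real archive; the point is only that split followed by cat reproduces the original byte for byte.

```shell
#!/bin/sh
# Sketch: split a large file into fixed-size parts, rejoin, and verify.
set -eu

work=$(mktemp -d)

# Hypothetical stand-in for the big compressed file (5,000,000 bytes).
head -c 5000000 /dev/urandom > "$work/big.gz"

# Cut into 1,000,000-byte parts named big.gz.part.aa, big.gz.part.ab, ...
( cd "$work" && split -b 1000000 big.gz big.gz.part. )
parts=$(( $(ls "$work"/big.gz.part.* | wc -l) ))
echo "parts created: $parts"

# Each part can now be sent on its own, for example with rsync.
# On the receiving side, glob order matches the suffix order:
cat "$work"/big.gz.part.* > "$work/rejoined.gz"

if cmp -s "$work/big.gz" "$work/rejoined.gz"; then
    rejoined=identical
else
    rejoined=different
fi
echo "rejoined file is $rejoined"

rm -rf "$work"
```

A failed transfer then costs at most one part instead of the whole file.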
ben
Updated on September 18, 2022
Comments
- ben, almost 2 years: Will rsync -z have any compression advantage if the input file is already gzipped? I have a large 100GB compressed file to send over the network across servers, and it consistently failed (broken pipe) after varying amounts of time. Wondering if I should try the -z flag.
- FauxFaux, over 10 years: I suspect you were looking for the --partial option, which allows resumption of the transfer, regardless of what went wrong.
- Raza, almost 11 years: It would be nice to compare what the transfer would use without the --rsyncable option.
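For the broken-pipe failures described in the question, the --partial suggestion from the comments could be combined with a simple retry loop. The sketch below is one possible shape, not a tested recipe for the original setup: retry_transfer and RETRY_DELAY are hypothetical names introduced here, and the host and paths in the usage comment are placeholders.

```shell
#!/bin/sh
# Sketch: rerun a transfer command until it succeeds or attempts run out.
# retry_transfer and RETRY_DELAY are illustrative names, not part of rsync.

retry_transfer() {
    max_attempts=$1
    shift
    attempt=1
    # "$@" is the full transfer command; a non-zero exit triggers a retry.
    until "$@"; do
        if [ "$attempt" -ge "$max_attempts" ]; then
            echo "giving up after $attempt attempts" >&2
            return 1
        fi
        attempt=$((attempt + 1))
        sleep "${RETRY_DELAY:-10}"
    done
}

# Possible usage (placeholder host and paths). With --partial, rsync keeps
# the partially transferred file so the next attempt can build on it:
#   retry_transfer 20 rsync -av --partial --progress big.gz remotehost:/data/
```

The loop only helps because --partial keeps the interrupted file around; without it, each retry would start over from the beginning.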