scp and compress at the same time, no intermediate save

Solution 1

There are many ways to do what you want. The simplest is to use a pipe:

tar zcvf -  MyBackups | ssh user@server "cat > /path/to/backup/foo.tgz"

Here, the compression is handled by tar, which calls gzip (the z flag). You can also use compress (Z) or bzip2 (j). The 7z container itself cannot be written to a stream, but 7za can stream the underlying xz format instead:

tar cf - MyBackups | 7za a -an -txz -si -so | 
   ssh user@server "cat > /path/to/backup/foo.tar.xz"
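The streaming pattern above can be sanity-checked locally by replacing the ssh stage with a plain redirection. This is a minimal sketch using hypothetical scratch paths under /tmp; locally, `cat > foo.tgz` plays the role of `ssh user@server "cat > ..."`:

```shell
#!/bin/sh
set -e
# Scratch directory standing in for MyBackups (hypothetical data).
mkdir -p /tmp/pipe-demo/MyBackups /tmp/pipe-demo/restore
echo "hello" > /tmp/pipe-demo/MyBackups/file.txt

cd /tmp/pipe-demo
# tar compresses to stdout; cat stands in for the remote "cat > foo.tgz".
tar zcf - MyBackups | cat > foo.tgz

# Verify the stream landed intact by extracting it elsewhere and comparing.
tar zxf foo.tgz -C restore
cmp MyBackups/file.txt restore/MyBackups/file.txt && echo OK
```

No intermediate file is written on the sending side; the archive only materializes where the final redirection (or the remote `cat`) points.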

The best way, however, is probably rsync.

   Rsync is a fast and extraordinarily versatile file copying tool.  It can copy
   locally, to/from another host over any remote shell, or to/from a remote
   rsync daemon.  It offers a large number of options that control every aspect
   of its behavior and permit very flexible specification of the set of files
   to be copied.  It is famous for its delta-transfer algorithm, which reduces
   the amount of data sent over the network by sending only the differences
   between the source files and the existing files in the destination.  Rsync
   is widely used for backups and mirroring and as an improved copy command
   for everyday use.

rsync has a daunting number of options. They really are worth reading through, even though they are intimidating at first sight. The ones you care about in this context are:

    -z, --compress              compress file data during the transfer
        --compress-level=NUM    explicitly set compression level

   -z, --compress
          With this option, rsync compresses the file data as it is sent to
          the destination machine, which reduces the amount of data being
          transmitted -- something that is useful over a slow connection.

          Note that this option typically achieves better compression ratios
          than can be achieved by using a compressing remote shell or a
          compressing transport because it takes advantage of the implicit
          information in the matching data blocks that are not explicitly
          sent over the connection.

So, in your case, you would want something like this:

rsync -z MyBackups user@server:/path/to/backup/

The files would be compressed while in transit and arrive decompressed at the destination.


Some more choices:

  • scp itself can compress the data

     -C      Compression enable.  Passes the -C flag to ssh(1) to
             enable compression.
    
    $ scp -C source user@server:/path/to/backup
    
  • There may be a way to get rsync and 7za to play nicely together, but there is little point in doing so. The benefit of rsync is that it copies only the bits that have changed between the local and remote files; a small local change can produce a very different compressed file, so rsync's delta-transfer buys you nothing on 7z archives and only complicates matters. Just use direct ssh as shown above. If you really want to try, you can pass a subshell as an argument to rsync. On my system I could not get this to work with 7za because it refuses to write compressed data to a terminal; perhaps your implementation is different. Try something like this (it does not work for me):

    rsync $(tar cf - MyBackups | 7za a -an -txz -si -so) \
      user@server:/path/to/backup
    
  • Another point is that 7z should not be used for backups on Linux. As stated on the 7z man page:

    DO NOT USE the 7-zip format for backup purpose on Linux/Unix because:
    - 7-zip does not store the owner/group of the file.

Solution 2

I think this command will do the trick

ssh user@host "cd /path/to/data/;tar zc directory_name" | tar zx 

Note that you run this command from the destination host. The details:

  1. ssh user@host opens a connection to the host machine from which the data is to be transferred.
  2. cd /path/to/data/ changes to the directory where the required data is stored.
  3. tar zc directory_name compresses the directory and writes the archive to STDOUT.
  4. The pipe (|) connects the STDOUT of the source to the STDIN of the destination, where tar zx continuously decompresses the incoming data stream.

As you can see, this command compresses on the fly and saves bandwidth. You can use other compression programs for better ratios, but remember that compression and decompression both cost CPU cycles.
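The pull pattern can be exercised on one machine by running both halves of the pipe locally (hypothetical scratch paths; a subshell stands in for the ssh stage). Note the original command omits `-f`, which relies on GNU tar defaulting to stdout/stdin; `-f -` makes that explicit and portable:

```shell
#!/bin/sh
set -e
# Source side: the "remote" data we want to pull (hypothetical layout).
mkdir -p /tmp/pull-demo/remote/data/directory_name /tmp/pull-demo/local
echo "payload" > /tmp/pull-demo/remote/data/directory_name/f.txt

# Stand-in for:
#   ssh user@host "cd /path/to/data/; tar zc directory_name" | tar zx
cd /tmp/pull-demo/local
( cd /tmp/pull-demo/remote/data && tar zcf - directory_name ) | tar zxf -

cmp /tmp/pull-demo/remote/data/directory_name/f.txt directory_name/f.txt && echo OK
```

The compressed stream exists only in the pipe; nothing is written on the source side.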


Solution 3

A small improvement on dkbhadeshiya's answer: you don't have to cd into the directory; just pass the working directory to tar instead:

ssh user@host "tar -C /path/to/data/ -zc directory_name" | tar zx 

You can also upload directory the same way:

tar zc directory_name/ | ssh user@host "tar zx -C /new/path/to/data/"
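The `-C` form in the upload direction can also be checked locally; this sketch (hypothetical scratch paths) runs both tars on one machine, with the receiving tar standing in for the remote side:

```shell
#!/bin/sh
set -e
# Hypothetical source tree and destination directory.
mkdir -p /tmp/tarc-demo/src/directory_name /tmp/tarc-demo/dst
echo "x" > /tmp/tarc-demo/src/directory_name/a.txt

# Stand-in for:
#   tar zc directory_name/ | ssh user@host "tar zx -C /new/path/to/data/"
# -C on each side selects the working directory without any cd.
tar -C /tmp/tarc-demo/src -zcf - directory_name | tar -zxf - -C /tmp/tarc-demo/dst

cmp /tmp/tarc-demo/src/directory_name/a.txt \
    /tmp/tarc-demo/dst/directory_name/a.txt && echo OK
```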
Author: JohnyMoraes

Updated on September 18, 2022

Comments

  • JohnyMoraes
    JohnyMoraes over 1 year

    What is the canonical way to:

    • scp a file to a remote location
    • compress the file in transit (tar or not, single file or whole folder, 7za or something else even more efficient)
    • do the above without saving intermediate files

    I am familiar with shell pipes like this:

    tar cf - MyBackups | 7za a -si -mx=9 -ms=on MyBackups.tar.7z
    

    essentially:

    • rolling a whole folder into a single tar
    • pass data through stdout to stdin of the compressing program
    • apply aggressive compression

    What's the best way to do this over an ssh link, with the file landing on the remote filesystem?


    I prefer not to sshfs mount.


    This does not work:

    scp <(tar cvf - MyBackups | 7za a -si -mx=9 -so) localhost:/tmp/tmp.tar.7z
    

    because:

    /dev/fd/63: not a regular file
    
  • JohnyMoraes
    JohnyMoraes about 11 years
    Thanks! I am going to accept this great answer but please, add a full, stand-alone command line that uses both rsync and 7za, with final output to the remote filesystem. I liked -z but I would like to decouple the compression stage so.. how would I use rsync in that case, please?
  • terdon
    terdon about 11 years
    @Robottinosino see updated answer. There is no point in using rsync with 7z. It should work with rsync and a subshell as shown, but I could not get it working anyway.
  • JohnyMoraes
    JohnyMoraes about 11 years
    I think that 7z and tar are a very powerful combination, I get with them better savings than with other compression algorithms. I am sure new and better ones will replace 7z.. but I don't understand why you would be against using the pair "in principle".. am I missing something?
  • user37931
    user37931 over 9 years
    +1 for scp -C. There wasn't enough room on the remote disk to hold the compressed file so I couldn't compress before transfer. One little command line option made my problem go away.
  • knutole
    knutole over 9 years
    How can I rsync a file, first zipping it, but then LEAVING it zipped on the other side? thanks
  • terdon
    terdon over 9 years
    @knutole just zip the file first, then rsync it. Please ask a new question if you need more details.
  • Gyscos
    Gyscos over 9 years
    As for 7zip and tar, 7zip is just an implementation of the lzma (and lzma2) compression. xz is another one, more convenient on linux. If xz is installed, you can actually call it directly from tar : tar cJf archive.tar.xz files
  • Eric Johnson
    Eric Johnson over 8 years
    How does scp -C compare with gzip?
  • Garren
    Garren over 7 years
    Transferring a large number of rather small files over a relatively slow network resulted in well over a 10x speed boost using rsync: scp -r took 44 sec, scp -r -C took 39 sec, rsync -r -d took 7 sec, and rsync -r -d -z took 3.3 sec.