rsync is very slow (factor 8 to 10) compared to cp on copying files from nfs-share to local dir

Solution 1

I think these differences between cp and rsync are fairly well established. For reference, see the article titled "A look at rsync performance".

excerpt:
The four commands tested were:

    rsync $SRC $DEST
    echo $SRC | cpio -p $DEST
    cp  $SRC $DEST
    cat $SRC > $DEST/$SRC

The results for rsync, cpio, cp, and cat were:

    user    sys     elapsed  hog   MiB/s   test
    5.24    77.92   101.86   81%   100.53  cpio
    0.85    53.77   101.12   54%   101.27  cp
    1.73    59.47   100.84   60%   101.55  cat
    139.69  93.50   280.40   83%   36.52   rsync

I use rsync on a daily basis. There are things you can do to improve the situation.

For example you can try using the -W switch:

-W, --whole-file            copy files whole (w/o delta-xfer algorithm)

Also I would suggest making sure you have the 3.x versions of rsync. There were noticeable improvements when we moved up to the newer versions.
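
As a rough sketch of the -W suggestion above, the helper below copies a directory tree with rsync while skipping the delta-transfer algorithm. The function name copy_whole and the fallback to cp -a when rsync is not installed are assumptions made for illustration, not something from the original answer.

```shell
#!/bin/sh
# Sketch only: copy_whole is a hypothetical helper name, not part of rsync.
# Usage: copy_whole SRC_DIR DEST_DIR   (pass SRC_DIR without a trailing slash)
copy_whole() {
    src=$1
    dest=$2
    # Create the destination first so rsync and cp behave the same way:
    # both end up writing to DEST_DIR/<basename of SRC_DIR>.
    mkdir -p "$dest" || return 1
    if command -v rsync >/dev/null 2>&1; then
        # -a: archive mode (recursive, keeps times, permissions, links)
        # -W: --whole-file, copy files whole without the delta-xfer algorithm
        rsync -a -W "$src" "$dest/"
    else
        # Assumed fallback when rsync is not installed.
        cp -a "$src" "$dest/"
    fi
}
```

With the paths from the question, this would be called as copy_whole /mnt/nfs-share /backup-storage. The installed version can be checked with rsync --version. Since rsync already defaults to whole-file mode when both paths are local, -W may make little difference on an NFS mount.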

Solution 2

The way to make rsync have the same performance as cp is to spell it "cp".

The difference between the two commands is significant even though the net effect may be the same. In particular, rsync does a bunch of reading to see whether or not some file or part of a file should be copied.

Is there some reason that you want to use rsync? Because cp copies "blindly", you will see higher raw performance. If, under some set of triggering conditions, rsync's "delta-transfer" mechanism is used, you'll see transfer rates drop and CPU use rise pretty much in the manner you report.

Author: soulpath

Updated on September 18, 2022

Comments

  • soulpath
    soulpath almost 2 years

    I have a freshly installed Ubuntu server which is meant to be the new backup server for our VM storage. The server has 4 NICs, 2 of them 10 Gbit (in fact an Intel X540-T2 with the newest driver available), which are used to connect to the SAN. I have the NFS share mounted locally and compared the speeds while copying a directory with ~30 files: around 15 VM images and the corresponding log files. The images are between 8 GB and 600 GB in size.

    Using:

    cp -rf /mnt/nfs-share /backup-storage/
    

    bmon consistently shows around 600 MiB/s.

    Using

    rsync -av /mnt/nfs-share /backup-storage/
    

    bmon shows some packets in the first seconds, halts for about 30 seconds, and then builds up to about 60-75 MiB/s. CPU is around 60%.

    What should/could I change to use rsync with the same performance as cp?

  • soulpath
    soulpath almost 11 years
    I'm aware of the behaviour, but didn't expect such an effect. I thought that, given the CPU power and IOPS, rsync should perform at least at 300 MiB/s, especially if the file to copy doesn't exist yet. I've not finished testing yet. The backup with rsync would be more convenient, but I can also write a script using cp, dd or whatever comes to mind. Now I want to test various possibilities on different filesystems to evaluate what suits best.
  • soulpath
    soulpath almost 11 years
    I wasn't in doubt about reality, just about rsync, but due to these differences I'll go with writing a script using cp and some checksums. Thanks for your advice!
  • Giacomo Catenazzi
    Giacomo Catenazzi over 8 years
    No, just don't use rsync on a networked file system. Your computer needs to download the entire file, so you lose all the advantages of rsync.
  • roaima
    roaima over 6 years
    Sadly this answer is wrong in its detail. When copying between "local" filesystems (and yes, an NFS mount is a local filesystem in this context), rsync does not read the target file when copying unless you explicitly enable this counterproductive operation with --no-whole-file. In this situation it's just like a very slow cp.
  • AdminBee
    AdminBee over 4 years
    Welcome to the site, and thank you for your contribution. As for your comments on rsync, please note that unless the -c option is used, rsync will also include or skip files purely based on modification time and size, so this should not be the reason for the performance overhead. It does, however, verify the written files against the source based on checksums to detect corruption on transfer; this imposes only a small performance penalty, as the checksum calculation is performed en passant while reading the files for data transfer.
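
The plan soulpath mentions in the comments, a script using cp and some checksums, could look roughly like the sketch below. The function name copy_and_verify and the choice of sha256sum (GNU coreutils) are assumptions for illustration; the comments do not say which checksum tool was actually used.

```shell
#!/bin/sh
# Sketch only: copy_and_verify is a hypothetical helper name.
# Usage: copy_and_verify SRC_DIR DEST_DIR   (pass SRC_DIR without a trailing slash)
# Returns 0 if every regular file in the copy matches the source checksum.
copy_and_verify() {
    src=$1
    dest=$2
    name=$(basename "$src")

    mkdir -p "$dest" || return 1
    # Plain copy; the tree ends up in DEST_DIR/<basename of SRC_DIR>.
    cp -a "$src" "$dest/" || return 1

    # Checksum every regular file in the source, using paths relative
    # to the source directory so they can be re-checked in the copy.
    sums=$(mktemp) || return 1
    (cd "$src" && find . -type f -exec sha256sum {} +) > "$sums" \
        || { rm -f "$sums"; return 1; }

    # No regular files means there is nothing to verify.
    if [ ! -s "$sums" ]; then
        rm -f "$sums"
        return 0
    fi

    # Verify the same relative paths inside the copy.
    (cd "$dest/$name" && sha256sum --quiet -c "$sums")
    status=$?
    rm -f "$sums"
    return "$status"
}
```

With the paths from the question, this would be called as copy_and_verify /mnt/nfs-share /backup-storage. Note that this reads the source twice (once for cp, once for the checksums), which is a real cost for images of several hundred GB, and that re-reading a freshly written copy may be served from the page cache rather than from disk.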