scp files from a particular folder in parallel

Solution 1

Why do you think parallel connections would make it faster? scp is a very simple tool for transferring a few small files; it is not designed for throughput or performance. Running several scp processes in parallel could make the copy a little faster, but not significantly. What you can do instead is:

  • Use sftp for better throughput and smarter copying (this alone should be enough), for example with -r
  • Use ControlMaster to reuse a single connection and avoid the overhead of setting up parallel TCP connections
  • Pass the correct arguments to parallel

I would start with sftp:

sftp -r trinity@machineA:/data01/primary/ /data01/primary/
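
The ControlMaster suggestion can be sketched as follows. This is a minimal sketch rather than a tuned setup: ControlMaster, ControlPath and ControlPersist are standard OpenSSH client options, while the socket location under ~/.ssh, the 60-second persist time, the file name somefile, and the names MUX_OPTS and mux_scp are assumptions chosen for illustration.

```shell
# OpenSSH connection multiplexing: the first ssh/scp to a host opens a master
# connection, and later ones reuse it instead of performing a new TCP and
# authentication handshake each time.
# Assumptions: socket path under ~/.ssh and a 60-second idle lifetime.
MUX_OPTS='-o ControlMaster=auto -o ControlPath=~/.ssh/cm-%r@%h:%p -o ControlPersist=60'

# Hypothetical helper: scp with the multiplexing options applied.
# MUX_OPTS is deliberately unquoted so that it splits into separate arguments.
mux_scp() {
    scp $MUX_OPTS "$@"
}

# Example (not run here; "somefile" is a placeholder name):
# mux_scp trinity@machineA:/data01/primary/somefile /data01/primary/
```

Multiplexing removes the per-file connection setup cost; it does not add bandwidth, since all sessions share one TCP connection.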

Solution 2

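The * has to be expanded on the remote side, where the files are, and parallel needs the resulting file list after :::. Running parallel on machineA does both: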

ssh machineA 'parallel -j 5 scp {} machineB:/data01/primary/ ::: /data01/primary/*'
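
The command above runs scp on machineA and pushes to machineB, so machineA must be able to authenticate to machineB. If only machineB can log in to machineA, a pull variant run on machineB may be an alternative. The following is a sketch under assumptions, not part of the original answer: pull_parallel is a hypothetical helper name, find -maxdepth assumes GNU or BSD find on machineA, and file names are assumed to contain no whitespace or newlines.

```shell
# Hypothetical helper, meant to be run on machineB.
# $1 = user@host to pull from, $2 = remote directory, $3 = local directory.
pull_parallel() {
    # List the regular files directly inside the remote directory, one path
    # per line, then let GNU Parallel run up to five scp processes at a time,
    # one per path read from stdin.
    ssh "$1" "find '$2' -maxdepth 1 -type f" |
        parallel -j 5 scp "$1":{} "$3"/
}

# Example (not run here):
# pull_parallel trinity@machineA /data01/primary /data01/primary
# pull_parallel trinity@machineA /data02/secondary /data02/secondary
```

Here the file list is produced on machineA and fed to parallel on standard input, so no ::: argument list is needed.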
Author: david

Updated on September 18, 2022

Comments

  • david
    david over 1 year

    I want to scp files from machineA to machineB, and this is how I am doing it. I am copying a bunch of files one by one from the primary folder of machineA to the primary folder of machineB, and from the secondary folder of machineA to the secondary folder of machineB.

    trinity@machineB:~$ scp trinity@machineA:/data01/primary/* /data01/primary/
    trinity@machineB:~$ scp trinity@machineA:/data02/secondary/* /data02/secondary/
    

    Is there any way to copy multiple files in parallel, say five files at a time from a folder? Instead of copying one file at a time, I want to copy five files at a time from the primary and secondary folders respectively.

    Basically, I want to copy whatever is in the primary and secondary folders of machineA to machineB in parallel.

    I also have GNU Parallel installed on my box, if I can use that. I tried the command below, but it doesn't work. I expected it to copy five files at a time until everything in that folder had been copied.

    parallel -j 5 scp trinity@machineA:/data01/primary/* /data01/primary/
    

    Is anything wrong with my parallel syntax? What is the best way to copy five files in parallel from a remote folder until everything has been copied?

  • david
    david almost 7 years
    When I run the above command on machineB, I see this error message on the console multiple times: Host key verification failed. lost connection. Eventually nothing gets copied. Any thoughts why? I can ssh perfectly fine from machineB to machineA.
  • Ole Tange
    Ole Tange almost 7 years
    Are you running as the same user? It sounds as if you have not accepted the host key. It might help to use scp -o StrictHostKeyChecking=no ....
  • david
    david almost 7 years
    I tried that and I am still getting this error: Permission denied, please try again. Permission denied, please try again. Permission denied (publickey,password). lost connection. And I can ssh just fine from machineB to machineA.
  • Ole Tange
    Ole Tange almost 7 years
    Do you have to enter a passphrase when you ssh? If so, please use ssh-agent so that you do not need to enter it each time. GNU Parallel assumes you do not need to enter a passphrase.