AWS S3 copy files and folders between two buckets

Solution 1

Copy between S3 Buckets

AWS has released a command line interface (the AWS CLI) that can copy directly between buckets.

http://aws.amazon.com/cli/

$ aws s3 sync s3://mybucket-src s3://mybucket-target --exclude "*.tmp"

This will sync from the source bucket to the target bucket, copying only objects that are new or have changed.

See the documentation here: S3 CLI Documentation

Solution 2

A simplified example using the aws-sdk gem (the v1 API):

AWS.config(:access_key_id => '...', :secret_access_key => '...')
s3 = AWS::S3.new
s3.buckets['bucket-name'].objects['source-key'].copy_to('target-key')

If you want to perform the copy between different buckets, then specify the target bucket name:

s3.buckets['bucket-name'].objects['source-key'].copy_to('target-key', :bucket_name => 'target-bucket')
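
Since S3 has no real folders, copying a "folder" means copying every object under a key prefix. A minimal sketch with the same v1 API; the bucket names and the photos/ prefix are placeholders:

s3 = AWS::S3.new
# Iterate over every object under the prefix and issue a server-side copy;
# nothing is downloaded to the local machine.
s3.buckets['source-bucket'].objects.with_prefix('photos/').each do |object|
  object.copy_to(object.key, :bucket_name => 'target-bucket')
end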

Solution 3

You can now do it from the S3 admin interface. Go into the source bucket, select all your folders, and choose Actions -> Copy. Then open your new bucket and choose Actions -> Paste.

Solution 4

Copy between buckets in different regions

$ aws s3 cp s3://src_bucket/file s3://dst_bucket/file --source-region eu-west-1 --region ap-northeast-1

The above command copies a file from a bucket in Europe (eu-west-1) to a bucket in Japan (ap-northeast-1). You can get the code name for your bucket's region with this command:

$ aws s3api get-bucket-location --bucket my_bucket
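
If you are scripting in Ruby like the SDK answers in this thread, the modern aws-sdk-s3 (v3) gem exposes the same lookup. A sketch, assuming a placeholder bucket name my_bucket:

require 'aws-sdk-s3'

# Returns the region code, e.g. "eu-west-1"; us-east-1 comes back as an empty value.
resp = Aws::S3::Client.new(region: 'us-east-1').get_bucket_location(bucket: 'my_bucket')
puts resp.location_constraint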

By the way, using Copy and Paste in the S3 web console is easy, but it seems to download from the source bucket into the browser, and then upload to the destination bucket. Using "aws s3" was much faster for me.

Solution 5

It's possible with a recent aws-sdk gem; see the code sample:

require 'aws-sdk'   # the v1 SDK: this AWS.config / AWS::S3 API is v1-only

AWS.config(
  :access_key_id     => '***',
  :secret_access_key => '***',
  :max_retries       => 10
)

file     = 'test_file.rb'
bucket_0 = {:name => 'bucket_from', :endpoint => 's3-eu-west-1.amazonaws.com'}
bucket_1 = {:name => 'bucket_to',   :endpoint => 's3.amazonaws.com'}

# Upload a local test file into the source bucket.
s3_interface_from = AWS::S3.new(:s3_endpoint => bucket_0[:endpoint])
bucket_from       = s3_interface_from.buckets[bucket_0[:name]]
bucket_from.objects[file].write(open(file))

# Copy it into the target bucket; the copy happens server-side inside S3,
# even though the buckets sit behind different regional endpoints.
s3_interface_to   = AWS::S3.new(:s3_endpoint => bucket_1[:endpoint])
bucket_to         = s3_interface_to.buckets[bucket_1[:name]]
bucket_to.objects[file].copy_from(file, {:bucket => bucket_from})

More details: How to copy file across buckets using aws-s3 gem
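
Note that AWS.config / AWS::S3 is the old v1 API. For reference, a minimal sketch of the same cross-bucket copy with the current aws-sdk-s3 (v3) gem, reusing the names above as placeholders:

require 'aws-sdk-s3'

# Create the client in the destination bucket's region; the copy itself
# happens server-side inside S3, so nothing passes through your machine.
s3 = Aws::S3::Client.new(region: 'us-east-1')
s3.copy_object(
  bucket:      'bucket_to',                  # destination bucket
  copy_source: 'bucket_from/test_file.rb',   # "source-bucket/source-key"
  key:         'test_file.rb'                # destination key
)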


Comments

  • cnikolaou
    cnikolaou over 3 years

    I have been on the lookout for a tool to help me copy content of an AWS S3 bucket into a second AWS S3 bucket without downloading the content first to the local file system.

    I have tried to use the AWS S3 console copy option but that resulted in some nested files being missing.

I have tried to use the Transmit app (by Panic). The duplicate command downloads the files first to the local system and then uploads them back to the second bucket, which is quite inefficient.

  • Arcolye
    Arcolye about 11 years
Thanks for showing how to copy across servers. I'm trying to copy from a US server to a Singapore server.
  • Anatoly
    Anatoly about 11 years
    @Arcolye how is latency in AWS Singapore now? It was slow and inconsistent a year ago.
  • Micah
    Micah almost 11 years
I had a great experience with s3s3mirror. I was able to set it up on an m1.small EC2 node and copy 1.5 million objects in about 2 hours. Setup was a little tough, due to my unfamiliarity with Maven and Java, but it only took a few apt-get commands on Ubuntu to get everything installed. One last note: If (like me) you're worried about running an unknown script on a big, important S3 bucket, create a special user with read-only access on the copy-from bucket and use those credentials. Zero chance of accidental deletion.
  • Stew-au
    Stew-au over 10 years
    Ran it from EC2 and got 80MB copied across in about 5s.
  • odigity
    odigity about 10 years
    Exactly what I needed, since aws-sdk gem has no feature for copying or syncing a whole bucket at once. Thanks!
  • Giovanni Bitliner
    Giovanni Bitliner about 10 years
It throws the following error: A client error (PermanentRedirect) occurred when calling the ListObjects operation: The bucket you are attempting to access must be addressed using the specified endpoint. Please send all future requests to this endpoint.
  • Layke
    Layke about 10 years
    @GiovanniBitliner The bucket name you are using is incorrect. You are either using the wrong prefix, or using the old way of referring to the bucket. Check your bucket name exactly in your admin console.
  • Jacob Foshee
    Jacob Foshee over 9 years
    Awesome! He is referring to the web interface. Unlike most of the others, I could do this from an iPad.
  • Adam Gawne-Cain
    Adam Gawne-Cain over 8 years
    The buckets could be in different S3 regions. I will add an answer showing how to copy between S3 buckets in different regions.
  • S..
    S.. about 8 years
Note: if this is your first time using the CLI tool, you need to run 'aws configure' and enter your credentials.
  • QuangDT
    QuangDT almost 8 years
    This randomly leaves out nested objects in subfolders - 3 years later and AWS still cannot fix such a basic bug!
  • hakkikonu
    hakkikonu about 7 years
Is it for the same region only, or all regions?
  • paul
    paul about 7 years
    Another downside is it also limits the number of objects you can copy to 100. If you try to use pagination and copy more, it removes the original set of objects from its "clipboard".
  • Taylor D. Edmiston
    Taylor D. Edmiston almost 7 years
I'm also getting silently missing objects on a large nested paste. No errors, warnings, or in-progress operations are shown in the S3 dashboard.
  • fishjd
    fishjd over 6 years
Note AWS offers two CLI-type tools: the 'AWS CLI' and 'AWS Tools for PowerShell'. This answer uses the 'AWS CLI'. Don't be like me and install the wrong one.
  • Vishal
    Vishal over 6 years
I have created a new bucket, and in its Actions menu the "Paste" option remains disabled even though I selected "Copy" from the Actions menu of the previous bucket. Could you please help me here?
  • MetalElf0
    MetalElf0 almost 6 years
I can confirm this is not reliable for big copies. I tried copying a folder with ~1k subfolders; only 5 subfolders were actually copied, and the operation didn't show any error or warning.
  • davetapley
    davetapley almost 6 years
    Are these issues documented anywhere by Amazon? @RunLoop
  • QuangDT
    QuangDT almost 6 years
    @dukedave I don't know and have not tested again in quite a while as I resorted to doing the copying via the command line as that worked perfectly.
  • Djonatan
    Djonatan about 5 years
The biggest problem relates to this ticket: github.com/aws/aws-cli/issues/901. aws s3 sync will not manage to copy each object's ACL config. You have to find a way to make the -C option of github.com/cobbzilla/s3s3mirror work, and/or set a bucket policy that would mirror the ACLs of the source bucket's objects. You can also try specifying tags for each object together with a bucket policy that has an explicit condition matching the tag for granting read access, for example.
  • Victor Schröder
    Victor Schröder over 4 years
This is not JavaScript, sorry... (yes, I'm aware of CoffeeScript and that you can use it; still, not JavaScript)
  • Tapan Banker
    Tapan Banker over 4 years
Amazon provides the AWS CLI, a command line tool for interacting with AWS. With the AWS CLI, that entire process took less than three seconds: $ aws s3 sync s3://<bucket>/<path> </local/path>. For example: aws s3 sync s3://s3.aws-cli.demo/photos/office ~/Pictures/work
  • Anthony Kong
    Anthony Kong about 4 years
What is HDFS?