How to delete files recursively from an S3 bucket


Solution 1

With the latest aws-cli Python command-line tools, recursively deleting all the files under a folder in a bucket is just:

aws s3 rm --recursive s3://your_bucket_name/foo/

Or to delete everything in the bucket:

aws s3 rm --recursive s3://your_bucket_name

If what you want is to actually delete the bucket, there is a one-step shortcut:

aws s3 rb --force s3://your_bucket_name

which will recursively remove the contents of the bucket and then delete the bucket itself.

Note: the s3:// protocol prefix is required for these commands to work
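
If you are scripting this from Python rather than the shell, a rough boto3 equivalent of rb --force might look like the following (an untested sketch; the bucket name is a placeholder):

import boto3

s3 = boto3.resource("s3")
bucket = s3.Bucket("your_bucket_name")  # placeholder name

# delete every object (boto3 batches these into Multi-Object Delete
# requests), then delete the now-empty bucket itself
bucket.objects.all().delete()
bucket.delete()

# note: on a versioned bucket you would clear bucket.object_versions instead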

Solution 2

This used to require a dedicated API call per key (file), but has been greatly simplified due to the introduction of Amazon S3 - Multi-Object Delete in December 2011:

Amazon S3's new Multi-Object Delete gives you the ability to delete up to 1000 objects from an S3 bucket with a single request.

See my answer to the related question "delete from S3 using api php using wildcard" for more on this, plus corresponding examples in PHP (the AWS SDK for PHP has supported this since version 1.4.8).

Most AWS client libraries have since introduced dedicated support for this functionality in one way or another, e.g.:

Python

You can achieve this with the excellent boto Python interface to AWS roughly as follows (untested, off the top of my head):

import boto

# connect using credentials from the environment or boto config
s3 = boto.connect_s3()
bucket = s3.get_bucket("bucketname")

# list all keys under the prefix and delete them via Multi-Object Delete
bucketListResultSet = bucket.list(prefix="foo/bar")
result = bucket.delete_keys([key.name for key in bucketListResultSet])
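
boto classic is no longer maintained these days; with its successor boto3, the same prefix-based delete is roughly as follows (again an untested sketch, reusing the bucket name and prefix from the snippet above):

import boto3

s3 = boto3.resource("s3")
bucket = s3.Bucket("bucketname")

# filter keys by prefix and delete them; boto3 batches the deletes
# into Multi-Object Delete requests of up to 1000 keys each
bucket.objects.filter(Prefix="foo/bar").delete()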

Ruby

This has been available since version 1.24 of the AWS SDK for Ruby, and the release notes provide an example as well:

bucket = AWS::S3.new.buckets['mybucket']

# delete a list of objects by keys, objects are deleted in batches of 1k per
# request.  Accepts strings, AWS::S3::S3Object, AWS::S3::ObjectVersion and 
# hashes with :key and :version_id
bucket.objects.delete('key1', 'key2', 'key3', ...)

# delete all of the objects in a bucket (optionally with a common prefix as shown)
bucket.objects.with_prefix('2009/').delete_all

# conditional delete, loads and deletes objects in batches of 1k, only
# deleting those that return true from the block
bucket.objects.delete_if{|object| object.key =~ /\.pdf$/ }

# empty the bucket and then delete the bucket, objects are deleted in batches of 1k
bucket.delete!

Or:

AWS::S3::Bucket.delete('your_bucket', :force => true)

Solution 3

You might also consider using an Amazon S3 Lifecycle rule to create an expiration for files with the prefix foo/bar1.

Open the S3 console in your browser and click a bucket. Then click Properties and then Lifecycle.

Create an expiration rule for all files with the prefix foo/bar1 and set them to expire 1 day after creation.

Save, and all matching files will be gone within 24 hours.

Just don't forget to remove the rule after you're done!

No API calls, no third-party libraries, apps, or scripts.

I just deleted several million files this way.

[Screenshot: the Lifecycle Rule window. Note that in this shot the Prefix field has been left blank, which affects all keys in the bucket.]
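
If you would rather create the rule programmatically than click through the console, here is a hedged boto3 sketch of the same expiration rule (the bucket name and rule ID are placeholders):

import boto3

s3 = boto3.client("s3")

# expire everything under the foo/bar1 prefix one day after creation
s3.put_bucket_lifecycle_configuration(
    Bucket="your_bucket_name",  # placeholder
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "expire-foo-bar1",  # placeholder rule ID
                "Filter": {"Prefix": "foo/bar1/"},
                "Status": "Enabled",
                "Expiration": {"Days": 1},
            }
        ]
    },
)

As with the console approach, remember to remove the rule once the files are gone.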

Solution 4

The top-voted answer is missing a step.

Per aws s3 help:

Currently, there is no support for the use of UNIX style wildcards in a command's path arguments. However, most commands have --exclude "<value>" and --include "<value>" parameters that can achieve the desired result. [...] When there are multiple filters, the rule is the filters that appear later in the command take precedence over filters that appear earlier in the command. For example, if the filter parameters passed to the command were --exclude "*" --include "*.txt", all files will be excluded from the command except for files ending with .txt.

aws s3 rm --recursive s3://bucket/ --exclude="*" --include="/folder_path/*" 
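
If --exclude/--include are not flexible enough, here is a rough boto3 sketch (the bucket name and glob pattern are placeholders) that emulates wildcard deletes by filtering keys with fnmatch before batch-deleting:

import fnmatch
import boto3

s3 = boto3.client("s3")
bucket = "your_bucket_name"    # placeholder
pattern = "folder_path/*.txt"  # placeholder glob

# page through all keys and keep the ones matching the glob
paginator = s3.get_paginator("list_objects_v2")
matches = []
for page in paginator.paginate(Bucket=bucket):
    for obj in page.get("Contents", []):
        if fnmatch.fnmatch(obj["Key"], pattern):
            matches.append({"Key": obj["Key"]})

# Multi-Object Delete accepts at most 1000 keys per request
for i in range(0, len(matches), 1000):
    s3.delete_objects(Bucket=bucket, Delete={"Objects": matches[i:i + 1000]})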

Solution 5

With the s3cmd package installed on a Linux machine, you can do this:

s3cmd rm s3://foo/bar --recursive

Author: priya

Updated on June 09, 2021

Comments

  • priya
    priya about 3 years

    I have the following folder structure in S3. Is there a way to recursively remove all files under a certain folder (say foo/bar1 or foo or foo/bar2/1 ..)

    foo/bar1/1/..
    foo/bar1/2/..
    foo/bar1/3/..
    
    foo/bar2/1/..
    foo/bar2/2/..
    foo/bar2/3/..
    
  • xis
    xis about 10 years
    Great idea for using Lifecycle instead of some delete command.
  • Ryan
    Ryan about 10 years
    Exactly, let S3 do it for you.
  • Indolering
    Indolering over 9 years
    You can also apply this to the entire bucket, enabling you to delete the bucket.
  • Scott Gartner
    Scott Gartner over 9 years
    Thanks for posting this answer, I was trying to do this exact thing and had put -Key "%_.Key" which doesn't work.
  • Don Cheadle
    Don Cheadle over 9 years
should use the new aws cli like @number5's answer below: docs.aws.amazon.com/cli/latest/reference/s3/rm.html
  • Don Cheadle
    Don Cheadle over 9 years
    this should be the answer. It's a (new-ish) standard, powerful tool, designed for things just like this question
  • Naveen
    Naveen about 9 years
This is deleting the files just fine, but it's also deleting the bucket after deleting the files. Did I miss anything?
  • number5
    number5 about 9 years
    @Naveen as I said above, rm will only delete files but rb --force will delete the files and the bucket.
  • Naveen
    Naveen about 9 years
@number5 I used the following -> aws s3 rm --recursive s3://your_bucket_name and this deletes the bucket as well. I even tried aws s3 rm --recursive s3://your_bucket_name --exclude "*" --include "*.gz" and this did not help either.
  • Paul 'Joey' McMurdie
    Paul 'Joey' McMurdie almost 9 years
    According to the help it is either single-object delete s3cmd del s3://BUCKET/OBJECT or whole bucket delete s3cmd rb s3://BUCKET. There is no s3cmd rm, at least according to s3cmd --help.
  • oskarpearson
    oskarpearson over 8 years
    Perhaps it's worth updating the first example to show that you can specify a path? eg "aws s3 rm --recursive s3://your_bucket_name/foo/bar1"
  • Randy L
    Randy L about 8 years
Hoping I can use this to change the permissions on a set of objects! About to find out.
  • ryantuck
    ryantuck over 7 years
    using --recursive deletes the folder as well.
  • SuperUberDuper
    SuperUberDuper over 7 years
    how do you delete all your buckets?
  • Moseleyi
    Moseleyi over 7 years
    @RyanTuck do you know how to stop it from removing the folder?
  • ryantuck
    ryantuck over 7 years
@Moseleyi I believe that you can't actually have an empty folder in an S3 bucket.
  • lft93ryt
    lft93ryt over 6 years
@schmijos You are right, it does not work for versioning. Furthermore, I disabled versioning and tried --force, still no use. Any suggestions?
  • number5
    number5 over 6 years
    @MarcellodeSales AWS S3 API (or aws cli) doesn't support that, it's on many people's wish list, so hopefully AWS will implement it someday
  • Marcello de Sales
    Marcello de Sales over 6 years
    @number5 How come this solution has been tagged by 82 people? LOL
  • number5
    number5 over 6 years
    @MarcellodeSales at least 82 people found it useful with non-versioned S3 buckets?
  • Vitaly Zdanevich
    Vitaly Zdanevich over 6 years
    Note that AWS Lambda does not have aws-cli preinstalled.
  • eco
    eco over 5 years
NOTICE: You have to use aws s3 rm --recursive s3://bucket/ --exclude="*" --include="/folder_path/*", as per the aws s3 help manual's --include and --exclude usage. Notice there has to be a STAR after the folder path.
  • David Parks
    David Parks almost 5 years
s3cmd rm is in the help as of 2019 (as an alias for del); this is an excellent answer. The aws cli tools only work against a /-terminated prefix, not a folder plus partial filename prefix, whereas s3cmd works in both cases. This answer needs lots more upvotes; I had to scroll way too far to find the right solution.
  • Jivan
    Jivan about 4 years
    I can't find a more telling demonstration of what people don't like about Java than this answer...