How to abort all incomplete multipart uploads for a bucket


Solution 1

Assuming you have the AWS CLI set up and configured to output JSON, you can use jq to extract the needed keys:

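BUCKETNAME=<xxx>
# Emit one "key<TAB>upload-id" line per pending upload, then abort each one.
# Passing quoted variables instead of using eval keeps keys with spaces or quotes intact.
aws s3api list-multipart-uploads --bucket "$BUCKETNAME" \
| jq -r '.Uploads[]? | [.Key, .UploadId] | @tsv' \
| while IFS=$'\t' read -r key upload_id; do
    aws s3api abort-multipart-upload --bucket "$BUCKETNAME" --key "$key" --upload-id "$upload_id"
done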
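
The same loop can be sketched in Python with boto3, which sidesteps shell quoting altogether. This is only a minimal sketch: the function name abort_all_multipart_uploads is made up for illustration, and s3 is assumed to be a client created with boto3.client('s3').

```python
def abort_all_multipart_uploads(s3, bucket):
    """Abort every pending multipart upload in `bucket`.

    `s3` is assumed to be a boto3 S3 client, e.g. boto3.client('s3').
    Returns the list of (key, upload_id) pairs that were aborted.
    """
    aborted = []
    # A single list call returns a limited number of uploads,
    # so walk all result pages with the built-in paginator.
    paginator = s3.get_paginator('list_multipart_uploads')
    for page in paginator.paginate(Bucket=bucket):
        # 'Uploads' is missing from a page when nothing is pending.
        for upload in page.get('Uploads', []):
            s3.abort_multipart_upload(
                Bucket=bucket,
                Key=upload['Key'],
                UploadId=upload['UploadId'],
            )
            aborted.append((upload['Key'], upload['UploadId']))
    return aborted
```

With a real client it would be called as abort_all_multipart_uploads(boto3.client('s3'), 'my-bucket'), where 'my-bucket' is a placeholder name.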

Solution 2

If you are doing multipart uploads, you can also do the cleanup from the S3 Management Console.

a) Open your S3 bucket

b) Switch to Management Tab

c) Click Add Lifecycle Rule

d) Type a rule name in the first step and check the "Clean up incomplete multipart uploads" checkbox. You can also enter the number of days to keep incomplete parts.

That's it.

Solution 3

Here is my one-liner that will abort ALL multipart uploads regardless of status, assuming that you don't have any spaces in your keys / filenames.

BUCKETNAME=<xxx>; aws s3api list-multipart-uploads --bucket "$BUCKETNAME" --query 'Uploads[].[Key, UploadId]' --output text | awk -v bucket="$BUCKETNAME" '{print "aws s3api abort-multipart-upload --upload-id " $2 " --bucket " bucket " --key " $1}' | bash
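
Aborting everything also kills uploads that are still running. If that is too blunt, one option is to filter on the Initiated timestamp that list-multipart-uploads returns for each upload. The helper below is a hypothetical sketch (the name stale_uploads and its parameters are not from any library); it assumes each entry looks like the dicts boto3 returns under 'Uploads', with a timezone-aware 'Initiated' datetime.

```python
from datetime import datetime, timedelta, timezone


def stale_uploads(uploads, min_age_days, now=None):
    """Return the uploads initiated more than `min_age_days` days ago.

    `uploads` is assumed to be a list of dicts shaped like the 'Uploads'
    entries of a boto3 list_multipart_uploads response.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    cutoff = now - timedelta(days=min_age_days)
    # Keep only uploads that were started before the cutoff.
    return [u for u in uploads if u['Initiated'] < cutoff]
```

Only the entries returned by this filter would then be passed to abort-multipart-upload.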

Solution 4

You can set up lifecycle rules to automatically purge those after some amount of time. Here's a blog post demonstrating how to do it in the console:

https://aws.amazon.com/blogs/aws/s3-lifecycle-management-update-support-for-multipart-uploads-and-delete-markers/

To do this in boto3:

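import boto3
from botocore.exceptions import ClientError


s3 = boto3.client('s3')
try:
    # Keep only the rules; the response also contains ResponseMetadata,
    # which is not valid input for the put call.
    rules = s3.get_bucket_lifecycle_configuration(Bucket='bucket')['Rules']
except ClientError:
    # Assume the bucket has no lifecycle configuration yet.
    rules = []
rules.append({
    'ID': 'PruneAbandonedMultipartUploads',
    'Status': 'Enabled',
    'Prefix': '',  # empty prefix: the rule applies to the whole bucket
    'AbortIncompleteMultipartUpload': {
        'DaysAfterInitiation': 7
    }
})
# The *_configuration calls replace the deprecated get/put_bucket_lifecycle.
s3.put_bucket_lifecycle_configuration(
    Bucket='bucket',
    LifecycleConfiguration={'Rules': rules},
)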

Adding that configuration in the CLI would be much the same:

$ aws s3api get-bucket-lifecycle-configuration --bucket bucket > lifecycle.json
# Edit the lifecycle, adding the same configuration as in the boto3 sample
$ aws s3api put-bucket-lifecycle-configuration --bucket bucket --lifecycle-configuration file://lifecycle.json

If you have no lifecycle policy on your bucket, the get call raises a ClientError in boto3 (and the CLI command exits with an error). A robust implementation would check that the error code is NoSuchLifecycleConfiguration instead of treating every ClientError as "no policy".
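
One way to make that check is sketched below. The function name current_lifecycle_rules is hypothetical, s3 is assumed to be a boto3 S3 client, and the sketch uses the get_bucket_lifecycle_configuration call rather than the older get_bucket_lifecycle.

```python
def current_lifecycle_rules(s3, bucket):
    """Return the bucket's lifecycle rules, or [] if it has no policy.

    `s3` is assumed to be a boto3 S3 client. Any error other than a
    missing lifecycle configuration is re-raised unchanged.
    """
    try:
        response = s3.get_bucket_lifecycle_configuration(Bucket=bucket)
    except s3.exceptions.ClientError as err:
        # Only a missing policy means "start from an empty rule list";
        # access-denied or no-such-bucket errors should still surface.
        if err.response['Error']['Code'] == 'NoSuchLifecycleConfiguration':
            return []
        raise
    return response.get('Rules', [])
```

Catching s3.exceptions.ClientError keeps the function free of a direct botocore import; importing ClientError from botocore.exceptions would work just as well.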

A policy containing only that rule would look like this:

{
    "Rules": [
        {
            "ID": "PruneAbandonedMultipartUploads",
            "Status": "Enabled",
            "Prefix": "",
            "AbortIncompleteMultipartUpload": {
                "DaysAfterInitiation": 7
            }
        }
    ]
}
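
Because the boto3 sample appends the rule unconditionally, running it twice would leave two rules with the same ID, and S3 expects rule IDs to be unique within a configuration. A small helper along these lines could make the update repeatable; upsert_rule is a made-up name used here for illustration, not part of boto3.

```python
def upsert_rule(rules, new_rule):
    """Return a new rule list with `new_rule` added or replaced by ID.

    `rules` is assumed to be the 'Rules' list of a lifecycle
    configuration; the input list is left unmodified.
    """
    # Drop any existing rule that shares the new rule's ID ...
    kept = [rule for rule in rules if rule.get('ID') != new_rule['ID']]
    # ... and append the new version at the end.
    return kept + [new_rule]
```

The result would then be sent back as the 'Rules' value of the lifecycle configuration.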

Solution 5

Alternatively, you can use the Minio Client (mc). It is open source and compatible with AWS S3.

To list all incomplete uploads on a bucket:

$ mc ls -I s3/mybucketname

To remove all incomplete uploads from a bucket:

$ mc rm -I -r --force s3/mybucketname

-I = incomplete, -r = recursive, --force = force the removal

Hope it helps.

Disclaimer: I work for Minio.

Author: Jason

Updated on September 18, 2022

Comments

  • Jason (over 1 year)

    Sometimes multipart uploads hang or don't complete for some reason. In that case you are stuck with orphaned parts that are tricky to remove. You can list them with:

    aws s3api list-multipart-uploads --bucket $BUCKETNAME
    

    I am looking for a way to abort them all.

  • Himanshu Shekhar (over 3 years)
    Riz, one question: if today (13 August) I set a multipart cleanup rule (1 day), will this rule remove objects that were created more than 1 day earlier (e.g. on 10 August)?