Boto3 S3, sort bucket by last modified

Solution 1

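I made a small variation of what @helloV posted below. It's not 100% optimal, but it gets the job done within the limitations boto3 has at this time.

import boto3

s3 = boto3.resource('s3')
my_bucket = s3.Bucket('myBucket')

# Resource objects expose the timestamp as the last_modified attribute.
get_last_modified = lambda obj: obj.last_modified

# Fetch every object summary; the sort itself happens locally.
unsorted = list(my_bucket.objects.all())

# Keep the keys of the 10 most recently modified objects.
files = [obj.key for obj in sorted(unsorted, key=get_last_modified,
    reverse=True)][0:10]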

Solution 2

If there are not many objects in the bucket (a single list_objects_v2 call returns at most 1,000), you can use Python to sort them to your needs.

Define a lambda to get the last modified time:

# LastModified is a datetime, so it can be compared directly.
get_last_modified = lambda obj: obj['LastModified']

Get all objects and sort them by last modified time.

import boto3

s3 = boto3.client('s3')
# 'Contents' is absent when nothing matches, so fall back to an empty list.
objs = s3.list_objects_v2(Bucket='my_bucket').get('Contents', [])
[obj['Key'] for obj in sorted(objs, key=get_last_modified)]

If you want to reverse the sort:

[obj['Key'] for obj in sorted(objs, key=get_last_modified, reverse=True)]
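For buckets that hold more than one page of results, the same idea can be extended by walking every page and keeping only the newest few entries. The following is a minimal sketch, not a definitive implementation: `newest_keys` and `sample_pages` are hypothetical names introduced for illustration, and the sample data stands in for the pages that a `list_objects_v2` paginator would yield.

```python
import heapq
from datetime import datetime, timezone


def newest_keys(pages, n=10):
    """Return the keys of the n most recently modified objects.

    `pages` is any iterable of list_objects_v2-style response dicts.
    """
    # A page with no matches has no 'Contents' key, so default to [].
    objects = (obj for page in pages for obj in page.get('Contents', []))
    # nlargest scans every object once but holds only n of them at a time.
    newest = heapq.nlargest(n, objects, key=lambda obj: obj['LastModified'])
    return [obj['Key'] for obj in newest]


# Stand-in pages (hypothetical data). Against a real bucket the pages would
# come from:
#   boto3.client('s3').get_paginator('list_objects_v2').paginate(Bucket='my_bucket')
sample_pages = [
    {'Contents': [
        {'Key': 'a.txt', 'LastModified': datetime(2021, 1, 1, tzinfo=timezone.utc)},
        {'Key': 'b.txt', 'LastModified': datetime(2021, 1, 3, tzinfo=timezone.utc)},
    ]},
    {'Contents': [
        {'Key': 'c.txt', 'LastModified': datetime(2021, 1, 2, tzinfo=timezone.utc)},
    ]},
    {},  # an empty page
]

print(newest_keys(sample_pages, n=2))  # ['b.txt', 'c.txt']
```

Every key is still listed, because S3 does not sort by date on the server side; the saving is only in memory and in not sorting the full list.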

Solution 3

It seems there is no way to do the sort using boto3 itself. According to the documentation, boto3 only supports these methods for collections:

all(), filter(**kwargs), page_size(**kwargs), limit(**kwargs)

Hope this helps in some way. https://boto3.readthedocs.io/en/latest/reference/services/s3.html#S3.ServiceResource.buckets
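To illustrate how those collection methods combine with a local sort, here is a rough sketch. The helper name `newest_in_prefix` and its parameters are assumptions made for this example; it expects a `Bucket` object from `boto3.resource('s3')` and relies only on `filter()` and `page_size()` from the list above.

```python
from operator import attrgetter


def newest_in_prefix(bucket, prefix='', n=10, page_size=1000):
    """Return the keys of the n newest objects under `prefix`.

    `bucket` is assumed to be a boto3 S3 Bucket resource.
    """
    # filter() narrows the listing by key prefix; page_size() only sets how
    # many keys each underlying request fetches. Neither changes the order.
    summaries = bucket.objects.filter(Prefix=prefix).page_size(page_size)
    # The ordering therefore has to be done locally, newest first.
    ordered = sorted(summaries, key=attrgetter('last_modified'), reverse=True)
    return [summary.key for summary in ordered[:n]]


# Possible usage (requires AWS credentials and an existing bucket):
#   import boto3
#   bucket = boto3.resource('s3').Bucket('myBucket')
#   print(newest_in_prefix(bucket, prefix='folder_name/', n=10))
```

Note that `limit()` is deliberately not used here: it would cap the listing before the sort, so the result would be the newest of an arbitrary subset rather than of the whole prefix.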

Solution 4

A slight improvement on the above:

import boto3

s3 = boto3.resource('s3')
my_bucket = s3.Bucket('myBucket')
files = my_bucket.objects.filter()
files = [obj.key for obj in sorted(files, key=lambda x: x.last_modified, 
    reverse=True)]

Solution 5

To get the most recently modified files in a folder in S3:

import boto3

s3 = boto3.resource('s3')
my_bucket = s3.Bucket('bucket_name')
objects = my_bucket.objects.filter(Prefix='folder_name/subfolder_name/')
# Sort the keys under the prefix, newest first.
files = [obj.key for obj in sorted(objects, key=lambda x: x.last_modified,
    reverse=True)]

print(files)

To get only the two most recently modified files, slice the sorted list:

print(files[0:2])
Author: nate

I build cool things

Updated on January 23, 2021

Comments

  • nate, over 3 years ago

    I need to fetch a list of items from S3 using Boto3, but instead of the default sort order (descending), I want it returned in reverse order.

    I know you can do it via awscli:

    aws s3api list-objects --bucket mybucketfoo --query "reverse(sort_by(Contents,&LastModified))"
    

    and it's doable via the UI console (not sure if this is done client side or server side).

    I can't seem to see how to do this in Boto3.

    I am currently fetching all the files and then sorting, but that seems like overkill, especially if I only care about the 10 or so most recent files.

    The filter system seems to only accept the Prefix for s3, nothing else.