Boto script to download the latest file from an S3 bucket


Solution 1

You could list all of the files in the bucket and pick the most recently modified one (using the last_modified attribute).

>>> import boto
>>> c = boto.connect_s3()
>>> bucket = c.lookup('mybucketname')
>>> # cmp= was removed from sorted() in Python 3; key= works in both Python 2 and 3
>>> key_to_download = max(bucket, key=lambda k: k.last_modified)
>>> key_to_download.get_contents_to_filename('myfile')

Note, however, that this would be quite inefficient if you had lots of files in the bucket, because every key has to be listed. In that case, you might want to consider using a database to keep track of the files and dates to make querying more efficient.
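As a rough illustration of the database idea, here is a minimal sketch using Python's built-in sqlite3 module. The table name s3_files and the helpers open_index, record_upload and latest_key_name are hypothetical names made up for this example. It assumes you call record_upload yourself whenever a file is uploaded, and that timestamps are stored as ISO 8601 UTC strings in one consistent format, so that string ordering matches chronological ordering.

```python
import sqlite3


def open_index(path=":memory:"):
    """Create (if needed) a small table mapping S3 key names to upload times."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS s3_files ("
        " key_name TEXT PRIMARY KEY,"
        " last_modified TEXT NOT NULL)"
    )
    # The index lets the 'latest file' query avoid scanning every row.
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_last_modified ON s3_files (last_modified)"
    )
    return conn


def record_upload(conn, key_name, last_modified):
    """Record (or update) one key; last_modified is an ISO 8601 UTC string."""
    conn.execute(
        "INSERT OR REPLACE INTO s3_files (key_name, last_modified) VALUES (?, ?)",
        (key_name, last_modified),
    )
    conn.commit()


def latest_key_name(conn):
    """Return the most recently modified key name, or None if nothing is recorded."""
    row = conn.execute(
        "SELECT key_name FROM s3_files ORDER BY last_modified DESC LIMIT 1"
    ).fetchone()
    return row[0] if row else None


# Example data (made up for illustration).
conn = open_index()
record_upload(conn, "reports/a.csv", "2014-03-01T10:00:00.000Z")
record_upload(conn, "reports/b.csv", "2014-05-12T08:30:00.000Z")
record_upload(conn, "reports/c.csv", "2014-04-20T17:45:00.000Z")
print(latest_key_name(conn))  # reports/b.csv
```

With the name in hand, you would then only need to fetch that single key from the bucket (for example with bucket.get_key(name)) instead of listing everything.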

Solution 2

To add to @garnaat's answer, you may be able to address the inefficiency by using a prefix to reduce the number of matched files. This example uses bucket.list with a prefix, so it only examines keys under subdir/ whose names start with file_2014_:

>>> import boto
>>> c = boto.connect_s3()
>>> bucket = c.get_bucket('mybucketname')
>>> bucket_files = bucket.list('subdir/file_2014_')
>>> # cmp= was removed from sorted() in Python 3; key= works in both Python 2 and 3
>>> key_to_download = max(bucket_files, key=lambda k: k.last_modified)
>>> key_to_download.get_contents_to_filename('target_filename')
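If you want to try the selection logic without touching AWS, the following sketch may help. FakeKey is a hypothetical stand-in for boto's key objects, and newest_key is a helper name invented here. Note that in the real script the prefix filtering happens on the S3 side via bucket.list; the startswith check below only imitates that locally. It also assumes last_modified values are ISO 8601 UTC strings in a consistent format, so comparing them as strings gives chronological order.

```python
from collections import namedtuple

# Hypothetical stand-in for boto key objects, so the sketch runs without AWS access.
FakeKey = namedtuple("FakeKey", ["name", "last_modified"])


def newest_key(keys, prefix=""):
    """Return the key with the greatest last_modified among names starting with prefix.

    Returns None when no key matches.
    """
    matching = (k for k in keys if k.name.startswith(prefix))
    return max(matching, key=lambda k: k.last_modified, default=None)


# Example data (made up for illustration).
keys = [
    FakeKey("subdir/file_2014_01.csv", "2014-01-31T09:00:00.000Z"),
    FakeKey("subdir/file_2014_02.csv", "2014-02-28T09:00:00.000Z"),
    FakeKey("subdir/file_2013_12.csv", "2014-03-05T09:00:00.000Z"),
]

print(newest_key(keys, "subdir/file_2014_").name)  # subdir/file_2014_02.csv
print(newest_key(keys).name)                       # subdir/file_2013_12.csv
```

The second call shows why the prefix matters for correctness as well as speed: without it, a file outside the file_2014_ group wins because it was modified later.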

Solution 3

S3 supports versioning of the files in a bucket: http://docs.aws.amazon.com/AmazonS3/latest/dev/Versioning.html

You could get the latest n files by calling s3client.listVersions(request), specifying n if you want. See http://docs.aws.amazon.com/AmazonS3/latest/dev/list-obj-version-enabled-bucket.html

The example there is in Java. I am not sure whether boto has added an API for versioning.
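For what it is worth, here is a rough Python sketch of how the entries of a version listing could be reduced to the n most recently modified current versions. The Version tuple and the newest_current_versions helper are hypothetical names for this example. The field names (name, version_id, is_latest, last_modified) are an assumption about what a version listing provides; I believe boto 2 offers bucket.list_versions() with similar attributes, but please check the boto documentation before relying on that.

```python
from collections import namedtuple

# Hypothetical stand-in for the entries returned by a version listing.
Version = namedtuple("Version", ["name", "version_id", "is_latest", "last_modified"])


def newest_current_versions(versions, n=1):
    """Return up to n current (is_latest) versions, newest first."""
    current = [v for v in versions if v.is_latest]
    # ISO 8601 UTC strings in one format sort chronologically as plain strings.
    current.sort(key=lambda v: v.last_modified, reverse=True)
    return current[:n]


# Example data (made up for illustration).
versions = [
    Version("a.txt", "v1", False, "2014-01-01T00:00:00.000Z"),
    Version("a.txt", "v2", True, "2014-02-01T00:00:00.000Z"),
    Version("b.txt", "v1", True, "2014-03-01T00:00:00.000Z"),
    Version("c.txt", "v1", True, "2014-01-15T00:00:00.000Z"),
]

for v in newest_current_versions(versions, n=2):
    print(v.name, v.version_id)
```

Keep in mind that listing versions still walks the bucket, so this does not by itself remove the cost mentioned in Solution 1.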

Author by

user1386776

Updated on July 28, 2022

Comments

  • user1386776 almost 2 years

    I would like to write a Python script using boto to download the most recent file from an S3 bucket. For example, if I have 100 files in an S3 bucket, I need to download the most recently uploaded one.

    Is there a way to download the most recently modified file from S3 using Python and boto?