Reading data from S3 using Lambda

python json amazon-web-services amazon-s3 aws-lambda

83,703

Solution 1

You can use bucket.objects.all() to list all the objects in the bucket (alternatives such as filter, page_size and limit are also available, depending on your needs).

These methods return an iterable collection of S3.ObjectSummary objects; from there you can call each object's get method to retrieve the file.
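
A minimal sketch of that approach, assuming every object in the bucket is a UTF-8 JSON file as in the question. The helper name `iter_json_objects` is hypothetical; `bucket` is expected to be a boto3 `Bucket` resource.

```python
import json


def iter_json_objects(bucket):
    """Yield (key, parsed_json) for every object in the given bucket.

    `bucket` is assumed to be a boto3 Bucket resource, e.g.
    boto3.resource("s3").Bucket("jsondata").
    """
    for summary in bucket.objects.all():
        # ObjectSummary.get() fetches the object; the payload is under "Body".
        body = summary.get()["Body"].read()
        yield summary.key, json.loads(body.decode("utf-8"))


# Example usage (needs boto3 and AWS credentials; the bucket name is
# taken from the question):
#   import boto3
#   bucket = boto3.resource("s3").Bucket("jsondata")
#   for key, results in iter_json_objects(bucket):
#       print(key, len(results))
```

Note that each call to get is a separate request, so for large buckets it may be worth narrowing the listing with filter first.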

Solution 2

```
import boto3

# bucket and key hold the bucket name and the object key to read
s3 = boto3.client('s3')
response = s3.get_object(Bucket=bucket, Key=key)
emailcontent = response['Body'].read().decode('utf-8')
```

Author by

LearningSlowly

PhD Student Civil engineer now lost in the world of computers.

Updated on July 09, 2022

Comments

LearningSlowly almost 2 years
I have a range of JSON files stored in an S3 bucket on AWS.

I wish to use the AWS Lambda Python service to parse this JSON and send the parsed results to an AWS RDS MySQL database.

I have a stable Python script for doing the parsing and writing to the database. I need the Lambda script to iterate through the JSON files (when they are added).

Each JSON file contains a list, simply consisting of results = [content]

In pseudo-code what I want is:
1. Connect to the S3 bucket (jsondata)
2. Read the contents of the JSON file (results)
3. Execute my script for this data (results)
I can list the buckets I have with:
```
import boto3

s3 = boto3.resource('s3')

for bucket in s3.buckets.all():
    print(bucket.name)
```
Giving:
```
jsondata
```
But I cannot access this bucket to read its results.

There doesn't appear to be a read or load function.

I wish for something like
```
for bucket in s3.buckets.all():
   print(bucket.contents)
```
EDIT

I am misunderstanding something. Rather than reading the file in S3, lambda must download it itself.

From here it seems that you must give Lambda a download path, from which it can access the files itself.
```
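import uuid
from urllib.parse import unquote_plus

import boto3

s3_client = boto3.client('s3')

def process_results(path):
    # Placeholder for the existing parsing / database script.
    pass

def handler(event, context):
    # Each record describes one object that triggered the S3 event.
    for record in event['Records']:
        bucket = record['s3']['bucket']['name']
        # Object keys arrive URL-encoded in S3 event notifications.
        key = unquote_plus(record['s3']['object']['key'])
        # Drop any '/' so the file lands directly in /tmp.
        download_path = '/tmp/{}{}'.format(uuid.uuid4(), key.replace('/', ''))
        s3_client.download_file(bucket, key, download_path)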
```
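
As an alternative sketch, the object can also be read straight into memory with get_object rather than downloaded to /tmp first. This assumes the files are small enough to fit in the function's memory. `make_handler` and `process_results` are hypothetical names; `process_results` stands in for the existing parsing script.

```python
import json
from urllib.parse import unquote_plus


def make_handler(s3_client, process_results):
    """Build a Lambda handler that parses each uploaded JSON object in memory.

    `s3_client` is assumed to be boto3.client("s3"); `process_results` is a
    stand-in for the existing parse-and-store function.
    """
    def handler(event, context):
        processed = []
        for record in event["Records"]:
            bucket = record["s3"]["bucket"]["name"]
            # Object keys arrive URL-encoded in S3 event notifications.
            key = unquote_plus(record["s3"]["object"]["key"])
            response = s3_client.get_object(Bucket=bucket, Key=key)
            results = json.loads(response["Body"].read().decode("utf-8"))
            process_results(results)
            processed.append(key)
        return {"processed": processed}

    return handler


# In the Lambda module this might be wired up once at import time:
#   import boto3
#   handler = make_handler(boto3.client("s3"), my_parsing_function)
```

Passing the client and the callback in as arguments keeps the handler easy to exercise locally with stand-in objects, without touching AWS.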
ScottMcC over 6 years

It should also be noted that you need to create an S3 client to use in your response, i.e. s3 = boto3.client('s3')