Writing a file to S3 using Lambda in Python with AWS
Solution 1
Assuming Python 3.6: the way I usually do this is to wrap the bytes content in a `BytesIO` wrapper to create a file-like object. And, per the boto3 docs, you can use the transfer manager for a managed transfer:
```python
from io import BytesIO
import boto3

s3 = boto3.client('s3')
fileobj = BytesIO(response.content)  # response is the requests response from the question
s3.upload_fileobj(fileobj, 'mybucket', 'mykey')
```
If that doesn't work I'd double check all IAM permissions are correct.
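If `response` comes from `requests` as in the question, the pattern above can be wrapped in a small helper. This is just a sketch: the `upload_bytes` name and the optional `s3` parameter are my additions for testability, not part of the answer.

```python
from io import BytesIO

def upload_bytes(data, bucket, key, s3=None):
    """Wrap raw bytes in a file-like object and upload via upload_fileobj."""
    if s3 is None:
        import boto3  # imported lazily so a stub client can be injected in tests
        s3 = boto3.client('s3')
    fileobj = BytesIO(data)
    s3.upload_fileobj(fileobj, bucket, key)

# usage inside the Lambda handler would look like:
# upload_bytes(response.content, 'mybucket', 'mykey')
```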
Solution 2
You have a writable stream that you're asking boto3 to use as a readable stream, which won't work.
Write the file first, then use `bucket.upload_file()` afterwards, like so:
```python
s3 = boto3.resource('s3')
bucket = s3.Bucket('transportation.manifests.parsed')

# 'wb' because response.content is bytes
with open('/tmp/output2.csv', 'wb') as data:
    data.write(response.content)

key = 'csv/' + key
bucket.upload_file('/tmp/output2.csv', key)
```
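Since the question also asks about saving each file under a unique name in the `csv/` folder, one common option is to put a UUID in the key. A minimal sketch; the `unique_csv_key` helper and the naming scheme are my suggestions, not part of the answer:

```python
import uuid

def unique_csv_key(prefix='csv/'):
    """Build a collision-resistant S3 key under the given prefix."""
    return '{}{}.csv'.format(prefix, uuid.uuid4().hex)

# e.g. bucket.upload_file('/tmp/output2.csv', unique_csv_key())
```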
Updated on June 18, 2022

Comments
-
tskittles about 2 years
In AWS, I'm trying to save a file to S3 in Python using a Lambda function. While this works on my local computer, I am unable to get it to work in Lambda. I've been working on this problem for most of the day and would appreciate help. Thank you.
```python
def pdfToTable(PDFfilename, apiKey, fileExt, bucket, key):

    # parsing a PDF using an API
    fileData = (PDFfilename, open(PDFfilename, "rb"))
    files = {"f": fileData}
    postUrl = "https://pdftables.com/api?key={0}&format={1}".format(apiKey, fileExt)
    response = requests.post(postUrl, files=files)
    response.raise_for_status()

    # this code is probably the problem!
    s3 = boto3.resource('s3')
    bucket = s3.Bucket('transportation.manifests.parsed')
    with open('/tmp/output2.csv', 'rb') as data:
        data.write(response.content)
        key = 'csv/' + key
        bucket.upload_fileobj(data, key)
```
```python
# FYI, on my own computer, this saves the file
with open('output.csv', "wb") as f:
    f.write(response.content)
```
In S3, there is a bucket `transportation.manifests.parsed` containing the folder `csv` where the file should be saved. The type of `response.content` is bytes. From AWS, the error from the current set-up above is `[Errno 2] No such file or directory: '/tmp/output2.csv': FileNotFoundError`. In fact, my goal is to save the file to the csv folder under a unique name, so `/tmp/output2.csv` might not be the best approach. Any guidance? In addition, I've tried to use `wb` and `w` instead of `rb`, also to no avail. The error with `wb` is `Input <_io.BufferedWriter name='/tmp/output2.csv'> of type: <class '_io.BufferedWriter'> is not supported`. The documentation suggests that using `rb` is the recommended usage, but I do not understand why that would be the case. Also, I've tried `s3_client.put_object(Key=key, Body=response.content, Bucket=bucket)` but receive `An error occurred (404) when calling the HeadObject operation: Not Found`.
-
Alasdair: You have `open('/tmp/output2.csv', 'rb')` but you are trying to write to the file. Note you probably don't have to create a temporary file. The bucket has a `put_object` method you can use.
-
Alasdair: You need to use `'w'` or `'wb'` to write the file. The docs you link to are for uploading that file, which is a separate step. You haven't shown enough information to know why `put_object` failed. You already have the bucket, so I would do `bucket.put_object(Key=key, Body=response.content)`. If that doesn't work you should show the complete code you tried, and the full traceback.
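The `put_object` route the comment describes skips the temp file entirely. A minimal sketch under the question's setup (the bucket name and `csv/` prefix come from the question; the `put_csv` helper name is mine):

```python
def put_csv(bucket, key, body):
    """Write bytes straight to S3 under csv/ without touching /tmp."""
    return bucket.put_object(Key='csv/' + key, Body=body)

# usage:
# s3 = boto3.resource('s3')
# bucket = s3.Bucket('transportation.manifests.parsed')
# put_csv(bucket, 'output2.csv', response.content)
```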
-
-
Minerva almost 3 years: I am trying to write an Avro file to S3. I am using `DataFileWriter` from the Avro package. Let me know if I could do that without having to use a temp file.
-
abigperson almost 3 years: Sorry, I am not familiar with Avro. You could post this as a new question and I am sure it'll get better attention that way!