Reading and writing Excel files from S3 using boto3 in Lambda

Solution 1

boto3 provides the copy_from method on an S3 Object to copy an object directly to another location. The copy happens inside S3, so there is no need to download the file and upload it again.

target_object.copy_from(CopySource='from_bucket/from_file')
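
For context, here is a minimal sketch of how that call might sit inside a Lambda handler. The helper name copy_between_buckets is hypothetical, the bucket and key placeholders are the ones from the question, and CopySource is passed in its dict form rather than as a 'bucket/key' string. The boto3 import is placed inside the handler only so the helper can be tried with a stub resource.

```python
def copy_between_buckets(s3, src_bucket, src_key, dst_bucket, dst_key):
    """Server-side copy of one S3 object. `s3` is expected to be a boto3 S3 resource."""
    target = s3.Object(dst_bucket, dst_key)
    # CopySource may be a 'bucket/key' string or a dict with Bucket and Key.
    target.copy_from(CopySource={'Bucket': src_bucket, 'Key': src_key})
    return target


def lambda_handler(event, context):
    import boto3  # imported here so the helper above can be exercised without AWS

    s3 = boto3.resource('s3')
    copy_between_buckets(s3, '<first_bucket>', '<file_name>.xlsx',
                         '<second_bucket>', '<file_name>.xlsx')
    return 'Successfully written to new bucket'
```

Nothing is written to /tmp in this variant, so the file-handle problem described below does not arise.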

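You can use that, or make sure the file you pass as Body is open for reading and positioned at its first byte. In the code from the question, the file had already been closed when the with block ended, and it had been opened in write mode ('wb'), so there was nothing to read from it.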

with open('/tmp/file', 'rb') as file:
    target_object.put(Body=file)

Or, while the file is still open in a readable mode, reuse the same file handle by seeking back to the beginning:

file.seek(0)
target_object.put(Body=file)
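
The effect of the file position can be seen without S3 at all. The following is a small illustration using only the standard library; the temporary file and the sample bytes are made up for the demonstration. It opens the file in 'w+b' mode so that the same handle can be both written and read.

```python
import tempfile

with tempfile.TemporaryFile('w+b') as handle:   # readable and writable
    handle.write(b'spreadsheet bytes')          # position is now at the end
    at_end = handle.read()                      # nothing left to read from here
    handle.seek(0)                              # rewind to the first byte
    from_start = handle.read()                  # the full contents

print(at_end)      # b''
print(from_start)  # b'spreadsheet bytes'
```

A handle that is read from its end yields no bytes, which would be consistent with a 0-byte object appearing in the target bucket.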

Solution 2

You download the file to local temporary storage, but you never reference that local copy when uploading. The following code should work for you.

import boto3

def lambda_handler(event, context):
    s3 = boto3.resource('s3')
    # Download the source object to Lambda's writable /tmp directory
    s3.Bucket('<first_bucket>').download_file('<file_name>.xlsx', '/tmp/<file_name>.xlsx')

    # Upload the local copy to the second bucket
    s3.meta.client.upload_file('/tmp/<file_name>.xlsx', '<second_bucket>', '/path/to/bucket/<file_name>.xlsx')
    return 'Successfully written to new bucket'
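
If the local copy in /tmp is not needed for anything else, the same download-then-upload flow could also be sketched with an in-memory buffer. This is an assumption-laden variant, not the answer's method: copy_via_memory is a hypothetical helper, `s3` is assumed to be a boto3 S3 resource, and the whole file is assumed to fit in the function's memory.

```python
import io


def copy_via_memory(s3, src_bucket, src_key, dst_bucket, dst_key):
    """Download an object into memory and upload it again; returns the byte count."""
    buffer = io.BytesIO()
    s3.Object(src_bucket, src_key).download_fileobj(buffer)

    size = buffer.seek(0, io.SEEK_END)  # seeking to the end returns the total size
    buffer.seek(0)                      # rewind so the upload starts at the first byte
    s3.Object(dst_bucket, dst_key).put(Body=buffer)
    return size
```

Inside a handler it might be called as copy_via_memory(boto3.resource('s3'), '<first_bucket>', '<file_name>.xlsx', '<second_bucket>', '<file_name>.xlsx'). A returned size of 0 would point to a problem on the download side rather than the upload side.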

Author: Sujay DSa

Updated on June 04, 2022

Comments

  • Sujay DSa, almost 2 years

    I'm trying to read an Excel file from one S3 bucket and write it into another bucket using boto3 in AWS Lambda. I've given my role full S3 access and have written the following code:

    import boto3
    import botocore
    import io
    def lambda_handler(event, context):
        s3 = boto3.resource('s3')
        s3.Bucket('<first_bucket>').download_file('<file_name>.xlsx', '/tmp/<file_name>.xlsx')
        object = s3.Object('<first_bucket>','<file_name>.xlsx')
        with open('/tmp/<file_name>', 'wb') as data:
            object.download_fileobj(data)
        target_object =  s3.Object('<second_bucket>','<file_name>.xlsx')
        target_object.put(data)
    
    
        return 'Successfully written to new bucket'
    

    I executed this code in Lambda, and when I check my second bucket I can see that the file is present, but its size is 0. I'm not sure why, or how to correct this. Any pointers?

  • Sujay DSa, almost 6 years
    Your earlier suggestion to use target_object.put(Body = open('/tmp/file', 'rb')) is working; however, your current answer still gives me a file of size 0.
  • jspcal, almost 6 years
    Try with Body=file
  • Sujay DSa, almost 6 years
    Thanks. Works nicely now