Read excel file from S3 into Pandas DataFrame
10,229
Solution 1
It is perfectly normal! obj is a dictionnary, have u tried ?
df = pd.read_excel(obj['body'], header=2)
Solution 2
Pandas now supports s3 URL as a file path so it can read the excel file directly from s3 without downloading it first.
See here for a CSV example - https://stackoverflow.com/a/51777553/52954
Solution 3
try pd.read_excel(obj['Body'].read())
Author by
Raj
Updated on June 27, 2022Comments
-
Raj almost 2 years
I have a SNS notification setup that triggers a Lambda function when a .xlsx file is uploaded to S3 bucket.
The lambda function reads the .xlsx file into Pandas DataFrame.
import os import pandas as pd import json import xlrd import boto3 def main(event, context): message = event['Records'][0]['Sns']['Message'] parsed_message = json.loads(message) src_bucket = parsed_message['Records'][0]['s3']['bucket']['name'] filepath = parsed_message['Records'][0]['s3']['object']['key'] s3 = boto3.resource('s3') s3_client = boto3.client('s3') obj = s3_client.get_object(Bucket=src_bucket, Key=filepath) print(obj['Body']) df = pd.read_excel(obj, header=2) print(df.head(2))
I get an error as below:
Invalid file path or buffer object type: <type 'dict'>: ValueError Traceback (most recent call last): File "/var/task/handler.py", line 26, in main df = pd.read_excel(obj, header=2) File "/var/task/pandas/util/_decorators.py", line 178, in wrapper return func(*args, **kwargs) File "/var/task/pandas/util/_decorators.py", line 178, in wrapper return func(*args, **kwargs) File "/var/task/pandas/io/excel.py", line 307, in read_excel io = ExcelFile(io, engine=engine) File "/var/task/pandas/io/excel.py", line 376, in __init__ io, _, _, _ = get_filepath_or_buffer(self._io) File "/var/task/pandas/io/common.py", line 218, in get_filepath_or_buffer raise ValueError(msg.format(_type=type(filepath_or_buffer))) ValueError: Invalid file path or buffer object type: <type 'dict'>
How can I resolve this?
-
Raj over 5 yearsThat was it. df = pd.read_excel(obj['body'], header=2). Your post is missing closing ] for 'body'. Thank you for the help.
-
Tarik Elkalai over 5 yearsMy pleasure :) P.S: I have added the ]