Writing pandas dataframe to S3 bucket (AWS)


Solution 1

You can also use the boto3 package to store data in S3. Write the dataframe to an in-memory CSV buffer, then upload the buffer contents as an object:

from io import StringIO  # python3 (or BytesIO for python2)
import boto3

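bucket = 'info'  # already created on S3
csv_buffer = StringIO()
df.to_csv(csv_buffer)  # df is the dataframe to upload; the CSV is written to memory

s3_resource = boto3.resource('s3')
# Upload the buffered CSV text as the object 'df.csv' in the bucket
s3_resource.Object(bucket, 'df.csv').put(Body=csv_buffer.getvalue())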
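
The same idea can be wrapped in a small helper so it is easy to call from a Lambda handler and to test without touching AWS. This is only a sketch: the function names `dataframe_to_csv_text` and `upload_dataframe` and the optional `s3_client` parameter are assumptions made for illustration, not part of boto3 or pandas. It uses the client-level `put_object` call rather than the resource-level `Object(...).put(...)` shown above.

```python
from io import StringIO


def dataframe_to_csv_text(df, index=False):
    """Serialize a dataframe (anything with a to_csv method) to CSV text in memory."""
    buffer = StringIO()
    df.to_csv(buffer, index=index)
    return buffer.getvalue()


def upload_dataframe(df, bucket, key, s3_client=None):
    """Upload df as a CSV object to s3://bucket/key.

    s3_client is a hypothetical injection point: any object exposing
    put_object(Bucket=..., Key=..., Body=...) works, which keeps the
    helper testable. If omitted, a real boto3 client is created.
    """
    if s3_client is None:
        import boto3  # imported lazily so the helper can be exercised without AWS
        s3_client = boto3.client("s3")
    body = dataframe_to_csv_text(df).encode("utf-8")
    return s3_client.put_object(Bucket=bucket, Key=key, Body=body)
```

With real credentials and an existing bucket, a call would look like `upload_dataframe(df, 'info', 'df.csv')`.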

Solution 2

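This

"s3.console.aws.amazon.com/s3/buckets/info/test.csv"

is not an S3 URI; it is the address of the bucket in the AWS web console. To save to S3 you need to pass an S3 URI of the form s3://<bucket_name>/<obj_key>. Also, you do not need to import s3fs. It only needs to be installed, because pandas uses it to handle s3:// paths.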

Just try:

import pandas as pd

df = pd.DataFrame()
# df.to_csv("s3://<bucket_name>/<obj_key>")

# In your case
df.to_csv("s3://info/test.csv")

NOTE: You need to create the bucket on AWS S3 first.
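
To make the difference between a console address and an S3 URI concrete, here is a minimal sketch that builds and checks s3:// paths before handing them to `df.to_csv`. The helper names `make_s3_uri` and `is_s3_uri` are hypothetical and exist only for this example; the check is deliberately simple (scheme, bucket, and a non-empty key) and does not validate bucket naming rules.

```python
from urllib.parse import urlparse


def make_s3_uri(bucket, key):
    """Build an s3://<bucket_name>/<obj_key> URI from its two parts."""
    return f"s3://{bucket}/{key.lstrip('/')}"


def is_s3_uri(path):
    """Return True if path looks like s3://<bucket_name>/<obj_key>."""
    parsed = urlparse(path)
    has_bucket = bool(parsed.netloc)
    has_key = bool(parsed.path.lstrip("/"))
    return parsed.scheme == "s3" and has_bucket and has_key


uri = make_s3_uri("info", "test.csv")
print(uri, is_s3_uri(uri))
# The console address from the question has no s3:// scheme, so it is rejected
print(is_s3_uri("s3.console.aws.amazon.com/s3/buckets/info/test.csv"))
```

A path that passes this check can then be used as `df.to_csv(uri, index=False)`, provided s3fs is installed and the bucket exists.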

Author by

Jonas Palačionis


Updated on June 12, 2022

Comments

  • Jonas Palačionis
    Jonas Palačionis almost 2 years

    I have an AWS Lambda function that queries an API and creates a dataframe. I want to write this dataframe to an S3 bucket. I am using:

    import pandas as pd
    import s3fs
    
    df.to_csv('s3.console.aws.amazon.com/s3/buckets/info/test.csv', index=False)
    

    I am getting an error:

    No such file or directory: 's3.console.aws.amazon.com/s3/buckets/info/test.csv'

    But that location exists, because I am reading files from it. What is the problem here?

    I read the earlier files like this:

    import boto3
    s3_client = boto3.client('s3')
    s3_client.download_file('info', 'secrets.json', '/tmp/secrets.json')
    

    How can I upload the whole dataframe to an S3 bucket?

  • Anton Pomieshchenko
    Anton Pomieshchenko about 4 years
    A small note: for this to work, the s3fs package must be installed.
  • null
    null about 4 years
    Yes, I didn't state it, but of course pandas would ask for it. I will add it to the answer.
  • pc_pyr
    pc_pyr over 3 years
    Useful answer, @null. If AWS Lambda is used, how do I install s3fs? Thanks.
  • null
    null over 3 years
    @pc_pyr you may find this page useful.