Write csv to google cloud storage


Solution 1

The blob.upload_from_string(data) method creates a new object whose contents are exactly the contents of the string data. It overwrites any existing object of the same name rather than appending to it, which is why only the last row survives when it is called once per row.

The easiest solution would be to write your whole CSV to a temporary file and then upload that file to GCS with the blob.upload_from_filename(filename) function.
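A minimal sketch of the temporary-file approach described above. The helper name upload_rows_as_csv is hypothetical, and it assumes blob is a google.cloud.storage Blob (or any object with a compatible upload_from_filename method):

```python
import csv
import os
import tempfile


def upload_rows_as_csv(blob, rows):
    """Write all rows to one temporary CSV file, then upload it in one call."""
    # newline="" lets the csv module control line endings itself.
    with tempfile.NamedTemporaryFile(
        "w", suffix=".csv", newline="", delete=False
    ) as tmp:
        csv.writer(tmp).writerows(rows)
        tmp_path = tmp.name
    try:
        # A single upload creates the object with the complete contents.
        blob.upload_from_filename(tmp_path, content_type="text/csv")
    finally:
        # Remove the temporary file whether or not the upload succeeded.
        os.remove(tmp_path)


# Usage, assuming credentials are configured and the bucket exists:
# from google.cloud import storage
# blob = storage.Client().bucket("<mybucketname>").blob("Hummingbirds/trainingdata.csv")
# upload_rows_as_csv(blob, zip([1, 2, 3], ["a", "b", "c"]))
```

Because the upload happens once, after every row has been written locally, the overwrite behaviour no longer matters.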

Solution 2

The example below builds a pandas DataFrame, appends a row, writes it to a local CSV file, and copies that file to a bucket.

import pandas as pd
data = [['Alex','Feb',10],['Bob','jan',12]]
df = pd.DataFrame(data,columns=['Name','Month','Age'])
print(df)

Output

   Name Month  Age
0  Alex   Feb   10
1   Bob   jan   12

Add a row

row = ['Sally','Oct',15]
df.loc[len(df)] = row
print(df)

Output

     Name Month  Age
 0   Alex   Feb   10
 1    Bob   jan   12
 2  Sally   Oct   15

Write/copy to a GCS bucket using gsutil (the leading ! runs a shell command from a Jupyter/Colab cell)

df.to_csv('text.csv', index=False)
!gsutil cp 'text.csv' 'gs://BucketName/folderName/'

Python code (docs https://googleapis.dev/python/storage/latest/index.html )

from google.cloud import storage

def upload_to_bucket(bucket_name, blob_path, local_path):
    bucket = storage.Client().bucket(bucket_name)
    blob = bucket.blob(blob_path)
    blob.upload_from_filename(local_path)
    return blob.public_url  # Blob exposes public_url; it has no plain `url` attribute

# method call
bucket_name = 'bucket-name' # just the bucket name, without the gs:// prefix
blob_path = 'path/folder name inside bucket' # object name inside the bucket
local_path = 'local_machine_path_where_file_resides' # local file path
upload_to_bucket(bucket_name, blob_path, local_path)

Solution 3

from google.cloud import storage
import os

os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = "<pathtomycredentials>"

a=[1,2,3]

b=['a','b','c']

storage_client = storage.Client()
bucket = storage_client.get_bucket("<mybucketname>")

blob=bucket.blob("Hummingbirds/trainingdata.csv")

# build up the complete csv string
csv_string_to_upload = ''

for eachrow in range(3):
    # add the lines
    csv_string_to_upload = csv_string_to_upload + str(a[eachrow]) + ',' + b[eachrow] + '\n'

# upload the complete csv string
blob.upload_from_string(
    data=csv_string_to_upload,
    content_type='text/csv'
)
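
The string above is built by hand, which works for simple values but does not quote fields that contain commas or newlines. A possible variation, sketched here with a hypothetical helper name rows_to_csv_string, lets the standard csv module build the same string in memory:

```python
import csv
import io


def rows_to_csv_string(rows):
    """Render rows as a single CSV string, quoting fields where needed."""
    buffer = io.StringIO()
    # lineterminator="\n" matches the '\n' used in the loop above.
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()


# Usage with the blob from the example above (requires credentials):
# blob.upload_from_string(
#     data=rows_to_csv_string(zip(a, b)),
#     content_type='text/csv'
# )
```

The upload itself is unchanged: one call to upload_from_string with the complete contents.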
Author: bw4sz

Updated on June 16, 2022

Question (asked by bw4sz)

    I am trying to understand how to write a multi-line CSV file to Google Cloud Storage. I'm just not following the documentation.

    This is close to the question "Unable to read csv file uploaded on google cloud storage bucket".

    Example:

    from google.cloud import storage
    from oauth2client.client import GoogleCredentials
    import os
    
    os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = "<pathtomycredentials>"
    
    a=[1,2,3]
    
    b=['a','b','c']
    
    storage_client = storage.Client()
    bucket = storage_client.get_bucket("<mybucketname>")
    
    blob=bucket.blob("Hummingbirds/trainingdata.csv")
    
    for eachrow in range(3):
        blob.upload_from_string(str(a[eachrow]) + "," + str(b[eachrow]))
    

    That gets you a single line on google cloud storage

    3,c
    

    Clearly it created a new object on each call and wrote only that line.

    Okay, how about adding a new line delim?

    for eachrow in range(3):
        blob.upload_from_string(str(a[eachrow]) + "," + str(b[eachrow]) + "\n")
    

    That adds the line break, but again each call overwrites the object from the beginning.

    Can someone illustrate what the approach is? I could combine all my lines into one string, or write a temp file, but that seems very ugly.

    Perhaps with open as file?
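
    On the "with open as file" idea: recent releases of the google-cloud-storage library provide a file-like blob.open() method, so something along these lines may work. This is a sketch only; it assumes an installed version that includes Blob.open (added in 1.38, as far as I know), and write_csv_rows is a hypothetical helper name.

```python
import csv


def write_csv_rows(blob, rows):
    """Stream rows to a GCS object through a file-like writer."""
    # blob.open("w") returns a text-mode writer; the object is finalized
    # when the with-block closes it.
    # newline="" leaves line endings to the csv module.
    with blob.open("w", newline="") as handle:
        csv.writer(handle).writerows(rows)


# Usage, assuming credentials are configured and the bucket exists:
# from google.cloud import storage
# blob = storage.Client().bucket("<mybucketname>").blob("Hummingbirds/trainingdata.csv")
# write_csv_rows(blob, zip([1, 2, 3], ["a", "b", "c"]))
```

    This still replaces the whole object rather than appending to an existing one, but all rows go through a single writer, so none of them are lost.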