Write CSV to Google Cloud Storage
Solution 1
The blob.upload_from_string(data) method creates a new object whose contents are exactly the contents of the string data. It overwrites any existing object rather than appending to it.
The easiest solution would be to write your whole CSV to a temporary file and then upload that file to GCS with the blob.upload_from_filename(filename) method.
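A minimal sketch of the temporary-file approach, in case it helps. The helper name upload_rows_via_temp_file and the rows argument are made up for illustration; blob is assumed to be a google.cloud.storage Blob, though anything with an upload_from_filename(path) method would work here.

```python
import csv
import os
import tempfile


def upload_rows_via_temp_file(blob, rows):
    """Write all rows to a temporary CSV file, then upload it in one call.

    `blob` is assumed to be a google.cloud.storage Blob; this hypothetical
    helper only relies on its upload_from_filename(path) method.
    """
    # delete=False so the file can be reopened by path on every platform.
    with tempfile.NamedTemporaryFile(
        mode="w", suffix=".csv", newline="", delete=False
    ) as tmp:
        csv.writer(tmp, lineterminator="\n").writerows(rows)
        tmp_path = tmp.name
    try:
        # A single upload, so the object holds the complete file.
        blob.upload_from_filename(tmp_path)
    finally:
        # Clean up the local copy whether or not the upload succeeded.
        os.remove(tmp_path)
```

With a real bucket this would be called as upload_rows_via_temp_file(bucket.blob("Hummingbirds/trainingdata.csv"), zip(a, b)).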
Solution 2
This example builds a small pandas DataFrame, adds a row, writes it to a CSV file, and uploads that file to a bucket.
import pandas as pd
data = [['Alex','Feb',10],['Bob','jan',12]]
df = pd.DataFrame(data,columns=['Name','Month','Age'])
print(df)
Output
Name Month Age
0 Alex Feb 10
1 Bob jan 12
Add a row
row = ['Sally','Oct',15]
df.loc[len(df)] = row
print(df)
Output
Name Month Age
0 Alex Feb 10
1 Bob jan 12
2 Sally Oct 15
Write/copy to a GCP bucket using gsutil (the leading ! runs a shell command from a Jupyter/Colab notebook)
df.to_csv('text.csv', index=False)
!gsutil cp 'text.csv' 'gs://BucketName/folderName/'
Python code (docs: https://googleapis.dev/python/storage/latest/index.html)
from google.cloud import storage

def upload_to_bucket(bucket_name, blob_path, local_path):
    # Upload the local file to the given object path inside the bucket
    bucket = storage.Client().bucket(bucket_name)
    blob = bucket.blob(blob_path)
    blob.upload_from_filename(local_path)
    # Blob exposes public_url (there is no url attribute)
    return blob.public_url

# method call
bucket_name = 'bucket-name'  # just the bucket name, without the gs:// prefix
blob_path = 'path/folder name inside bucket'  # object path inside the bucket
local_path = 'local_machine_path_where_file_resides'  # local file path
upload_to_bucket(bucket_name, blob_path, local_path)
Solution 3
from google.cloud import storage
from oauth2client.client import GoogleCredentials
import os
os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = "<pathtomycredentials>"
a=[1,2,3]
b=['a','b','c']
storage_client = storage.Client()
bucket = storage_client.get_bucket("<mybucketname>")
blob=bucket.blob("Hummingbirds/trainingdata.csv")
# build up the complete csv string
csv_string_to_upload = ''
for eachrow in range(3):
    # add the lines
    csv_string_to_upload = csv_string_to_upload + str(a[eachrow]) + ',' + b[eachrow] + '\n'
# upload the complete csv string in a single call
blob.upload_from_string(
    data=csv_string_to_upload,
    content_type='text/csv'
)
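As a variation on the string concatenation above, the standard csv module can build the same string in memory and also takes care of quoting fields that contain commas. This is only a sketch: rows_to_csv_string is a hypothetical helper name, and the upload call is left as a comment because it needs a real blob.

```python
import csv
import io


def rows_to_csv_string(rows):
    """Serialize an iterable of rows into one CSV string (hypothetical helper)."""
    buffer = io.StringIO()
    # lineterminator="\n" matches the '\n' used in the loop above;
    # csv.writer would otherwise end each row with '\r\n'.
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


a = [1, 2, 3]
b = ["a", "b", "c"]
csv_string_to_upload = rows_to_csv_string(zip(a, b))

# With a real blob, the upload is then the same single call as above:
# blob.upload_from_string(data=csv_string_to_upload, content_type="text/csv")
```

Because the whole string is uploaded once, the object contains every row rather than only the last one.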
bw4sz
Updated on June 16, 2022
I am trying to understand how to write a multi-line CSV file to Google Cloud Storage. I'm just not following the documentation.
Related question: Unable to read csv file uploaded on google cloud storage bucket
Example:
from google.cloud import storage
from oauth2client.client import GoogleCredentials
import os
os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = "<pathtomycredentials>"
a=[1,2,3]
b=['a','b','c']
storage_client = storage.Client()
bucket = storage_client.get_bucket("<mybucketname>")
blob=bucket.blob("Hummingbirds/trainingdata.csv")
for eachrow in range(3):
    blob.upload_from_string(str(a[eachrow]) + "," + str(b[eachrow]))
That gets you a single line on Google Cloud Storage:
3,c
Clearly it opened a new file each time and wrote the line.
Okay, how about adding a new line delim?
for eachrow in range(3):
    blob.upload_from_string(str(a[eachrow]) + "," + str(b[eachrow]) + "\n")
That adds the line break, but again writes from the beginning.
Can someone illustrate what the approach is? I could combine all my lines into one string, or write a temp file, but that seems very ugly.
Perhaps with open as file?