How to check if file exists in Google Cloud Storage?
Solution 1
I guess there is no function to check directly if the file exists given its path.
I have created a function that uses the files.listdir() API function to list all the files in the bucket and match them against the file name that we want. It returns true if found and false if not.
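A minimal sketch of that list-and-match idea. It is written against the google-cloud-storage client rather than the older files.listdir() API mentioned above, which is a substitution on my part. The helper name blob_in_listing is hypothetical; it only assumes the client has a list_blobs(bucket_name, prefix=...) method yielding objects with a .name attribute, as storage.Client does.

```python
def blob_in_listing(client, bucket_name, blob_name):
    """Return True if blob_name appears in the bucket listing.

    `client` is expected to be a google.cloud.storage.Client
    (or any object with a compatible list_blobs method).
    """
    # Passing the full object name as the prefix keeps the listing small.
    # A prefix match is not an exact match, so compare names explicitly.
    for blob in client.list_blobs(bucket_name, prefix=blob_name):
        if blob.name == blob_name:
            return True
    return False


# Usage (needs the google-cloud-storage package and valid credentials):
#   from google.cloud import storage
#   blob_in_listing(storage.Client(), "my_bucket_name", "file_i_want_to_check.txt")
```

Listing is mostly useful when many names under the same prefix have to be checked; for a single file, the exists method shown in the later solutions is more direct.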
Solution 2
This post is old. You can now check whether a file exists on GCP using the Blob class. It took me a while to find an answer, so I am adding it here for others who are looking for a solution.
from google.cloud import storage
name = 'file_i_want_to_check.txt'
storage_client = storage.Client()
bucket_name = 'my_bucket_name'
bucket = storage_client.bucket(bucket_name)
stats = storage.Blob(bucket=bucket, name=name).exists(storage_client)
Documentation is here
Hope this helps!
Edit
As per the comment by @om-prakash, if the file is in a folder, then the name should include the path to the file:
name = "folder/path_to/file_i_want_to_check.txt"
Solution 3
It's as easy as using the exists method on a blob object:
from google.cloud import storage
def blob_exists(projectname, credentials, bucket_name, filename):
    client = storage.Client(projectname, credentials=credentials)
    bucket = client.get_bucket(bucket_name)
    blob = bucket.blob(filename)
    return blob.exists()
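Building on the exists method, the check-then-create flow from the original question could be sketched roughly like this. The helper name ensure_blob and the default_content parameter are hypothetical; the code assumes bucket is a google.cloud.storage.Bucket and uses only blob(), exists() and upload_from_string().

```python
def ensure_blob(bucket, blob_name, default_content=""):
    """Create blob_name with default_content if it is missing.

    `bucket` is expected to be a google.cloud.storage.Bucket.
    Returns True if the object was created, False if it already existed.
    """
    blob = bucket.blob(blob_name)
    if blob.exists():
        return False
    # Another writer could create the object between the check and the
    # upload, so this is not an atomic "create if absent".
    blob.upload_from_string(default_content)
    return True


# Usage (needs the google-cloud-storage package and valid credentials):
#   from google.cloud import storage
#   bucket = storage.Client().bucket("my_bucket_name")
#   ensure_blob(bucket, "folder/file_i_want_to_check.txt", "initial contents")
```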
Solution 4
The answer provided by @nickthefreak is correct, and so is the comment by Om Prakash. One other note is that the bucket_name should not include gs:// in front or a / at the end.
Piggybacking off @nickthefreak's example and Om Prakash's comment:
from google.cloud import storage
name = 'folder1/another_folder/file_i_want_to_check.txt'
storage_client = storage.Client()
bucket_name = 'my_bucket_name' # Do not put 'gs://my_bucket_name'
bucket = storage_client.bucket(bucket_name)
stats = storage.Blob(bucket=bucket, name=name).exists(storage_client)
stats will be a Boolean (True or False) depending on whether the file exists in the Storage Bucket.
(I don't have enough reputation points to comment, but I wanted to save other people some time because I wasted way too much time with this).
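If your paths arrive as full gs:// URIs, a small helper along these lines can split them into the bare bucket name and the object name before calling exists. This is only a sketch; split_gcs_uri is a hypothetical name and not part of the google-cloud-storage library.

```python
def split_gcs_uri(uri):
    """Split 'gs://bucket/path/to/file' into ('bucket', 'path/to/file')."""
    if uri.startswith("gs://"):
        uri = uri[len("gs://"):]
    # Everything before the first '/' is the bucket name;
    # the remainder (possibly empty) is the object name.
    bucket_name, _, blob_name = uri.partition("/")
    return bucket_name, blob_name


print(split_gcs_uri("gs://my_bucket_name/folder1/another_folder/file_i_want_to_check.txt"))
# ('my_bucket_name', 'folder1/another_folder/file_i_want_to_check.txt')
```

The two returned values can then be passed as bucket_name and name in the example above.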
Solution 5
If you are looking for a solution in Node.js, here it is:
var storage = require('@google-cloud/storage')();
var myBucket = storage.bucket('my-bucket');
var file = myBucket.file('my-file');
file.exists(function(err, exists) {});
// If the callback is omitted, this function returns a Promise.
file.exists().then(function(data) {
  var exists = data[0];
});
If you need more help, you can refer to this doc: https://cloud.google.com/nodejs/docs/reference/storage/1.5.x/File#exists
Tanvir Shaikh
Updated on July 09, 2022
Comments
-
Tanvir Shaikh almost 2 years
I have a script where I want to check if a file exists in a bucket and, if it doesn't, create one.
I tried using os.path.exists(file_path) where file_path = "/gs/testbucket", but I got a file not found error.
I know that I can use the files.listdir() API function to list all the files located at a path and then check if the file I want is one of them. But I was wondering whether there is another way to check whether the file exists.
-
AKs almost 6 years
Though there is a file in the bucket, this always returns 'False' for me.
-
Ardi Nusawan almost 6 years
@AjitK'sagar Not for me. If the file exists, GCS will return the URL of the file. Maybe your URL is incorrect?
-
Om Prakash about 5 years
The above solution may not work if the file is in a folder in Google Cloud Storage rather than in the root of the bucket. Do this instead:
stats = storage.Blob(bucket=bucket, name="folder_1/another_folder_2/your_file.txt").exists(storage_client)
-
Adam Hughes about 4 years
For thousands of URLs this is slow. Is there any way to submit a batch of key/buckets in one go?
-
David Valenzuela Urrutia almost 4 years
Thank you! This is exactly what I needed.
-
s2t2 almost 4 years
Also seems error prone if the file is large (anecdotal).
urllib3.exceptions.ProtocolError: ('Connection aborted.', OSError(0, 'Error'))
-
Tudor over 3 years
This should be the accepted answer. .exists() does not require that extra arg.
-
Dave Liu over 3 years
OP specifically asked for Python
-
confiq about 3 years
I'm not sure that is a "modern solution". If you are using Python, it's smarter to use the GCP API with it.
-
confiq about 3 years
It will also not work if the blob is a folder and not a file.
-
Jeremy Leipzig about 3 years
import cloudstorage as gcs
gcs.open("gs://foo/foo.bar")
AttributeError: module 'cloudstorage' has no attribute 'open'
-
bkanuka over 2 years
Don't call out to a subprocess if native libraries exist