How to check if file exists in Google Cloud Storage?


Solution 1

I don't believe there is a function that checks directly whether a file exists at a given path.
Instead, I wrote a function that uses the files.listdir() API function to list all the files in the bucket and compares each one against the file name we want. It returns true if the file is found and false otherwise.
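The list-and-match approach described above can be sketched roughly as follows. This is not the original author's function: `file_in_listing` and `fake_listing` are hypothetical names, and the listing is passed in as a callable so the sketch does not depend on any particular storage API.

```python
from typing import Callable, Iterable


def file_in_listing(list_names: Callable[[], Iterable[str]], target: str) -> bool:
    """Return True if `target` is among the names produced by `list_names`."""
    # Exact comparison, so 'readme' does not match 'readme.txt'.
    return any(name == target for name in list_names())


# Stand-in listing for illustration only; a real one would query the bucket.
def fake_listing():
    return ["reports/2021.csv", "reports/2022.csv", "readme.txt"]


found = file_in_listing(fake_listing, "readme.txt")
missing = file_in_listing(fake_listing, "readme")
```

With the google-cloud-storage client, `list_names` could plausibly wrap `client.list_blobs(bucket_name, prefix=target)` and yield each `blob.name`, which keeps the listing small. Listing a whole bucket just to test one name gets slow as the bucket grows.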

Solution 2

This post is old. You can now check whether a file exists on GCP using the Blob class. It took me a while to find an answer, so I'm adding it here for others who are looking for a solution:

from google.cloud import storage

name = 'file_i_want_to_check.txt'   
storage_client = storage.Client()
bucket_name = 'my_bucket_name'
bucket = storage_client.bucket(bucket_name)
stats = storage.Blob(bucket=bucket, name=name).exists(storage_client)

Documentation is here

Hope this helps!

Edit

As per the comment by @om-prakash, if the file is in a folder, then the name should include the path to the file:

name = "folder/path_to/file_i_want_to_check.txt"

Solution 3

It's as easy as using the exists method on a blob object:

from google.cloud import storage

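def blob_exists(projectname, credentials, bucket_name, filename):
    """Return True if the object `filename` exists in `bucket_name`."""
    client = storage.Client(projectname, credentials=credentials)
    # get_bucket() makes an API request and raises NotFound if the bucket is missing.
    bucket = client.get_bucket(bucket_name)
    # bucket.blob() only builds a local reference; exists() performs the actual check.
    blob = bucket.blob(filename)
    return blob.exists()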
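The asker's stated goal is to create the file when it does not exist yet. A minimal sketch of that check-then-create step is below. `create_if_missing` is a hypothetical helper name, and it assumes `blob` behaves like a google-cloud-storage Blob, using only its `exists()` and `upload_from_string()` methods.

```python
def create_if_missing(blob, data: str) -> bool:
    """Upload `data` to `blob` only if the object does not exist yet.

    Returns True if an upload happened, False if the object was already there.
    """
    if blob.exists():
        return False
    # Assumption: plain string content is enough for the new object.
    blob.upload_from_string(data)
    return True
```

It would be called as `create_if_missing(bucket.blob(filename), "initial contents")`. The check and the upload are two separate requests, so this is not atomic: another writer could create the object in between.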

Solution 4

The answer provided by @nickthefreak is correct, and so is the comment by Om Prakash. One other note is that the bucket_name should not include gs:// in front or a / at the end.

Piggybacking off @nickthefreak's example and Om Prakash's comment:

from google.cloud import storage

name = 'folder1/another_folder/file_i_want_to_check.txt'   

storage_client = storage.Client()
bucket_name = 'my_bucket_name'  # Do not put 'gs://my_bucket_name'
bucket = storage_client.bucket(bucket_name)
stats = storage.Blob(bucket=bucket, name=name).exists(storage_client)

stats will be a Boolean (True or False) depending on whether the file exists in the Storage Bucket.

(I don't have enough reputation points to comment, but I wanted to save other people some time because I wasted way too much time with this).
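If the path arrives as a full gs:// URI, the bucket-name rule above can be applied by splitting the URI first. This is a small sketch with a hypothetical helper, `split_gcs_path`, that uses only string operations and does not touch the storage API.

```python
from typing import Tuple


def split_gcs_path(path: str) -> Tuple[str, str]:
    """Split 'gs://bucket/folder/file.txt' into ('bucket', 'folder/file.txt')."""
    # Drop the scheme so the bucket name never carries 'gs://'.
    if path.startswith("gs://"):
        path = path[len("gs://"):]
    # Everything before the first '/' is the bucket; the rest is the object name.
    bucket_name, _, object_name = path.partition("/")
    return bucket_name, object_name


bucket_name, name = split_gcs_path(
    "gs://my_bucket_name/folder1/another_folder/file_i_want_to_check.txt"
)
```

The resulting `bucket_name` and `name` can then be passed to `storage_client.bucket(bucket_name)` and `storage.Blob(bucket=bucket, name=name)` as in the example above.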

Solution 5

If you are looking for a solution in NodeJS, then here it is:

var storage = require('@google-cloud/storage')();
var myBucket = storage.bucket('my-bucket');

var file = myBucket.file('my-file');

file.exists(function(err, exists) {});

// If the callback is omitted, then this function returns a Promise.
file.exists().then(function(data) {
  var exists = data[0];
});

If you need more help, you can refer to this doc: https://cloud.google.com/nodejs/docs/reference/storage/1.5.x/File#exists

Author: Tanvir Shaikh

Updated on July 09, 2022

Comments

  • Tanvir Shaikh
    Tanvir Shaikh almost 2 years

    I have a script where I want to check if a file exists in a bucket and if it doesn't then create one.

    I tried using os.path.exists(file_path) where file_path = "/gs/testbucket", but I got a file not found error.

    I know that I can use the files.listdir() API function to list all the files located at a path and then check if the file I want is one of them. But I was wondering whether there is another way to check whether the file exists.

  • AKs
    AKs almost 6 years
    Though there is a file in the bucket this always returns 'False' for me.
  • Ardi Nusawan
    Ardi Nusawan almost 6 years
    @AjitK'sagar Not for me. If the file exists, GCS will return the URL of the file. Maybe your URL is incorrect?
  • Om Prakash
    Om Prakash about 5 years
    The above solution may not work if the file is in a folder in Google Cloud Storage rather than in the root of the bucket. Do this instead: stats = storage.Blob(bucket=bucket, name="folder_1/another_folder_2/your_file.txt").exists(storage_client)
  • Adam Hughes
    Adam Hughes about 4 years
    For thousands of URLs this is slow. Is there any way to submit a batch of key/buckets in one go?
  • David Valenzuela Urrutia
    David Valenzuela Urrutia almost 4 years
    Thank you! This is exactly what I needed.
  • s2t2
    s2t2 almost 4 years
    This also seems error-prone if the file is large (anecdotal): urllib3.exceptions.ProtocolError: ('Connection aborted.', OSError(0, 'Error'))
  • Tudor
    Tudor over 3 years
    This should be the accepted answer. .exists() does not require that extra arg.
  • Dave Liu
    Dave Liu over 3 years
    OP specifically asked for Python
  • confiq
    confiq about 3 years
    I'm not sure that is "modern solution". If you are using python, it's smarter to use GCP API with it.
  • confiq
    confiq about 3 years
    it will also not work if the blob is a folder and not a file.
  • Jeremy Leipzig
    Jeremy Leipzig about 3 years
    import cloudstorage as gcs; gcs.open("gs://foo/foo.bar") gives AttributeError: module 'cloudstorage' has no attribute 'open'
  • bkanuka
    bkanuka over 2 years
    Don't call out to a subprocess if native libraries exist
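On the comment above about checking thousands of objects being slow: one option, sketched here under assumptions and not an official batch API, is to run the individual checks concurrently. `check_many` is a hypothetical helper, and `exists` is any callable that takes an object name and returns a Boolean.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable


def check_many(
    names: Iterable[str],
    exists: Callable[[str], bool],
    max_workers: int = 16,
) -> Dict[str, bool]:
    """Run `exists(name)` for every name in a thread pool and collect the results."""
    names = list(names)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so results line up with names.
        results = list(pool.map(exists, names))
    return dict(zip(names, results))
```

With the examples above, `exists` could be `lambda n: bucket.blob(n).exists()`. When most of the names share a folder, listing that prefix once and testing membership locally may need far fewer requests.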