How to count the number of files in a bucket folder with gsutil
Solution 1
The gsutil ls command with the -l (long listing) and -R (recursive listing) options will list the entire bucket recursively and then produce a total count of all objects, both files and directories, at the end:
$ gsutil ls -lR gs://pub
104413 2011-04-03T20:58:02Z gs://pub/SomeOfTheTeam.jpg
172 2012-06-18T21:51:01Z gs://pub/cloud_storage_storage_schema_v0.json
1379 2012-06-18T21:51:01Z gs://pub/cloud_storage_usage_schema_v0.json
1767691 2013-09-18T07:57:42Z gs://pub/gsutil.tar.gz
2445111 2013-09-18T07:57:44Z gs://pub/gsutil.zip
1136 2012-07-19T16:01:05Z gs://pub/gsutil_2.0.ReleaseNotes.txt
... <snipped> ...
gs://pub/apt/pool/main/p/python-socksipy-branch/:
10372 2013-06-10T22:52:58Z gs://pub/apt/pool/main/p/python-socksipy-branch/python-socksipy-branch_1.01_all.deb
gs://pub/shakespeare/:
84 2010-05-07T23:36:25Z gs://pub/shakespeare/rose.txt
TOTAL: 144 objects, 102723169 bytes (97.96 MB)
If you really just want the total, you can pipe the output to the tail command:
$ gsutil ls -lR gs://pub | tail -n 1
TOTAL: 144 objects, 102723169 bytes (97.96 MB)
UPDATE
gsutil now has a du command. This makes it even easier to get a count:
$ gsutil du gs://pub | wc -l
232
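One caveat, raised in the comments below: if the path contains subdirectories, du also prints a rollup line per directory, which inflates the count. A minimal sketch of filtering those out, assuming rollup lines end in a trailing slash and using hypothetical sample output:

```shell
# Hypothetical sample of `gsutil du` output; the last line is a
# subdirectory rollup, which ends in a trailing slash.
sample='104413  gs://bucket/a.jpg
172  gs://bucket/sub/b.json
10544  gs://bucket/sub/'

# Drop the rollup lines before counting, so only objects are counted.
printf '%s\n' "$sample" | grep -v '/$' | wc -l
```

This counts only the two object lines, not the directory rollup.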
Solution 2
If you have the option of not using gsutil, the easiest way is to check it on Google Cloud Platform. Go to Monitoring > Metrics explorer:
- Resource type: GCS Bucket
- Metric: Object count

Then, in the table below, you get the number of objects each bucket contains.
Solution 3
You want gsutil ls -count -recursive in gs://bucket/folder? Alright: gsutil ls gs://bucket/folder/** will list just the full URLs of the paths to files under gs://bucket/folder, without the footer or the lines ending in a colon. Piping that to wc -l will give you the line count of the result.
gsutil ls gs://bucket/folder/** | wc -l
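Since the original goal was comparing the bucket against a local sync folder, the same wc -l trick works on find output locally. A small runnable sketch using a temporary directory as a stand-in for the sync folder (the bucket side would be gsutil ls gs://bucket/folder/'**' | wc -l):

```shell
# Stand-in for the local sync folder (hypothetical layout).
dir=$(mktemp -d)
touch "$dir/a.txt" "$dir/b.txt"
mkdir "$dir/sub"
touch "$dir/sub/c.txt"

# -type f counts only regular files, so directories don't skew the total,
# matching what the gsutil ls ** listing counts on the bucket side.
find "$dir" -type f | wc -l    # 3 files

rm -rf "$dir"
```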
Solution 4
gsutil ls -lR gs://Folder1/Folder2/Folder3/** | tail -n 1
Solution 5
As someone who had 4.5M objects in a bucket, I used gsutil du gs://bucket/folder | wc -l, which took ~24 minutes.
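The time and cost scale with the object count: per the comments below, listing pages through results roughly 1,000 at a time, so a full count is about N/1000 Class A list calls. A quick ceiling-division sketch, assuming that 1,000-per-page figure:

```shell
# Assumed page size for object listing (per the discussion below).
PAGE=1000
N=4500000    # e.g. the 4.5M-object bucket above

# Ceiling division: number of list calls needed for one full count.
calls=$(( (N + PAGE - 1) / PAGE ))
echo "$calls list calls"    # 4500 list calls
```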
Admin
Updated on July 08, 2022

Comments
-
Admin almost 2 years
Is there an option to count the number of files in bucket-folders?
Like:
gsutil ls -count -recursive gs://bucket/folder
Result: 666 files
I just want a total number of files to compare with the amount in the sync folder on my server. I couldn't find it in the manual.
-
Admin over 10 years
Great, thanks ... just a little bit slow for 4 million files. Is this operation one call, or is it counted per number of bucket elements? ... could become expensive .. :-)
-
jterrace over 10 years
It does an object listing on the bucket and pages through the results, I think 1,000 at a time, so it will make N/1000 calls, where N is the number of objects you have. This is a Class A operation per the pricing page.
-
Syed Mudabbir over 8 years
Hello, just logged in to say thanks; this helped. I was trying to use find, but that was not supported, so when searching for an alternative I stumbled upon your answer. It's been a great help.
-
booleys1012 over 8 years
The gsutil solution works great in gsutil v4.15, @jterrace, but only if there are no "subdirectories" in the bucket/path you are listing. If there are subdirectories, du will roll up the size of the files below each directory and print a line to stdout for that directory (making the file count incorrect). Sorry for the late update to an old question.
-
mobcdi almost 8 years
While gsutil ls -l works, is there a way in Windows (no tail or wc) to get a summary without needing to list the entire bucket contents?
-
dlamblin about 7 years
du and ls aren't counting as much as wc -l is.
-
Yogesh Patil over 6 years
@jterrace Great, thanks. It also includes directories as objects and adds them to the count. Can we somehow count only files, excluding directories?
-
northtree over 5 years
Why use ** and not just *?
-
dlamblin over 5 years
@northtree I think in this case it might be equivalent, but ** does work for multiple levels at once, so I think /folder/**/*.js would find all js files under any depth of directories after folder (except in folder itself), while /folder/*/*.js would only work for js files within a directory in folder.
-
REdim.Learning about 5 years
@jterrace looks like du is giving file sizes, not counts!
-
jterrace about 5 years
@REdim.Learning - yes, but it prints one per line, which is why I pipe to wc -l
-
Miles Erickson over 4 years
@mobcdi If you have Git for Windows, you have Git Bash. Use that.
-
nroose about 3 years
Clearly GCP is using this to get more money from us. They clearly know the size and count. It should be available in the API. We should not accept less.
-
Yevgen Safronov over 2 years
This is an underappreciated answer.
-
ingernet over 2 years
This is WAY faster than using gsutil if you aren't doing something programmatically and you just need the count, AND it doesn't dip into your Class A operations quota.
-
Vishwas M.R about 2 years
Especially helpful when your bucket has more than a million objects and the total size exceeds a few GBs.
-
Jérémy about 2 years
Of course, this only works if you want to count the number of files in the entire bucket. You can't use this to check the number of files in a specific folder inside the bucket.