How to retrieve the most recent file in cloud storage bucket?

Solution 1

gsutil still doesn't seem to offer built-in sorting, but there is a workaround described in another post: pipe the listing through sort.

The command used is this one:

gsutil ls -l gs://[bucket-name]/ | sort -k 2

Because this sorts the listing by its date column, the most recent object ends up near the end of the output, and you can extract that line with another pipe if you need to.

Solution 2

gsutil ls -l gs://<bucket-name> | sort -k 2 | tail -n 2 | head -1 | cut -d ' ' -f 7

It will not work well if there are fewer than two objects in the bucket, though.
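
The pipeline above can be made a little less dependent on column widths and on the position of the summary line. The following is only a sketch, assuming the usual `gsutil ls -l` layout (size, ISO 8601 timestamp, object URL, followed by a `TOTAL:` line); the function name `newest_object` is made up for illustration.

```shell
# Hypothetical helper: reads `gsutil ls -l` output on stdin and prints
# the URL of the object with the latest timestamp (nothing if there are none).
newest_object() {
  # Keep only object rows, i.e. lines whose third field is a gs:// URL.
  # This drops the trailing "TOTAL: ..." summary and any prefix-only lines.
  awk '$3 ~ /^gs:\/\//' \
    | LC_ALL=C sort -k 2,2 \
    | tail -n 1 \
    | awk '{ print substr($0, index($0, "gs://")) }'  # URL only, spaces preserved
}
```

It could be used as `gsutil ls -l gs://<bucket-name> | newest_object`. Because the timestamps are ISO 8601 strings, a plain lexical sort on the second field puts them in chronological order, and filtering on the URL column means an empty bucket simply yields no output.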

Author: Chris Stryczynski

Updated on June 19, 2022

Comments

  • Chris Stryczynski (about 2 years ago)

    Is this something that can be done with gsutil?

    https://cloud.google.com/storage/docs/gsutil/commands/ls does not seem to mention any sorting functionality - only filtering by a date - which wouldn't work for my use case.

  • John Hanley (almost 3 years ago)
    Read this link regarding sequentially naming objects: cloud.google.com/storage/docs/best-practices#naming. It says: "Avoid using sequential object names such as timestamp-based object names if you are uploading many objects in parallel. Objects with sequential names are stored consecutively, so they are likely to hit the same backend server. When this happens, throughput is constrained. In order to achieve optimal throughput, add the hash of the sequence number as part of the object name to make it non-sequential."
  • Codemonkey (almost 3 years ago)
    I've been doing it this way for years with no issues... I have root folders 0001 0002 0003 0004 etc; each of those is limited to 75GB in size; when it fills, I move on to the next one. The filenames WITHIN the folders, are md5 hashes of the file contents, so maybe that's suitable given the wording above?
  • John Hanley (almost 3 years ago)
    Cloud Storage does not have folders. What you think is a folder is just a prefix that is part of the object name. Buckets are a flat namespace. Unless you need optimum performance, this probably does not matter for you. For customers that require high performance for millions/billions of objects: Objects with sequential names are stored consecutively, so they are likely to hit the same backend server. I commented on your answer so that others do not copy your naming scheme without understanding the impact on performance.
  • Codemonkey (almost 3 years ago)
    I know that, but I'm using this as a backup of my server. I should have clarified that I meant that's my file structure on the server.
  • John Hanley (almost 3 years ago)
    I am not trying to inform you. I am commenting for future readers of your answer.