Name of first file in directory obtained in optimal way

5,997

This is tricky. Two approaches:


Approach 1; find:

find . -mindepth 1 -print -quit

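find -prints the first entry it encounters and -quits immediately. Note that this is the first entry in the order find reads the directory, which is not necessarily sorted order. -mindepth 1 prevents matching ., the current directory itself.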

If you are interested in regular files only, add -type f:

find . -type f -print -quit

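-mindepth 1 can then be dropped: ., being a directory, would not match -type f anyway. Since find descends into subdirectories, add -maxdepth 1 if the file must sit directly inside the current directory.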
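
As a minimal sketch, the find approach can be wrapped in a small function. The name first_entry is hypothetical, and -maxdepth 1 is my addition to keep the search from descending into subdirectories; -mindepth, -maxdepth and -quit are assumed to be available (they are in GNU find):

```shell
# first_entry DIR: print one entry found directly inside DIR (default: .).
# The entry is whichever one find encounters first, not the alphabetically first.
first_entry() {
    find "${1:-.}" -mindepth 1 -maxdepth 1 -print -quit
}
```

Calling first_entry raw/all would print a path such as raw/all/NAME, or nothing if the directory is empty.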


Approach 2; sh, stdbuf, and awk:

Note that this can hit ARG_MAX when there are too many files: the expanded argument list becomes too long (over ARG_MAX bytes). In that case, use approach 1.

  • printf or echo to print the filenames (run through stdbuf, this is the external program rather than the shell builtin, which is why ARG_MAX applies)
  • shell globbing, *, to do the expansion (the collation order should match that of ls for the locale's LC_COLLATE)
  • stdbuf -o0 (stdbuf comes with GNU coreutils) to make the STDOUT stream of printf/echo unbuffered
  • pipe (|) the STDOUT of printf/echo to awk, which exits after printing the first record
  • After awk exits, the printf process started by stdbuf receives SIGPIPE on its next write and is killed
  • I would use printf to emit the filenames separated by ASCII NUL (\0), and use \0 as the record separator in awk, so that unusual filenames (for example, ones containing newlines) are handled

Putting these together:

stdbuf -o0 printf '%s\0' * | awk 'BEGIN{RS="\0"} {print; exit}'
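
If the sorted first name is what matters, one possible variation (not the pipeline above, but a plain shell loop swapped in for it) is to iterate over the glob and stop after the first match. No external command receives the expanded list, so ARG_MAX does not come into play, although the shell still has to expand and sort the whole glob first. The function name first_sorted is hypothetical:

```shell
# first_sorted DIR: print the first glob match in DIR (default: .),
# in the shell's collation order. Hidden files are not matched by *.
first_sorted() {
    for f in "${1:-.}"/*; do
        # An unmatched glob is left as a literal pattern; skip it.
        [ -e "$f" ] || [ -L "$f" ] || continue
        printf '%s\n' "$f"
        break
    done
}
```

For example, first_sorted raw/all prints the first match with the raw/all/ prefix, or nothing for an empty directory.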
Author by

Daniel

I write blogs in English https://preciselab.io and Polish https://gustawdaniel.com

Updated on September 18, 2022

Comments

  • Daniel
    Daniel over 1 year

    Usually, when I want to show the name of the first file in a directory, I type:

    ls raw/all | head -n 1
    

    But it takes a long time when there are many files in the directory.


    E.g. for a directory with nearly 900k files, three runs give the following measurements:

    time ls raw/all | head -n 1 
    
    real    0m17.250s | 0m10.328s | 0m6.334s
    user    0m3.224s  | 0m3.884s  | 0m3.192s
    sys     0m0.544s  | 0m0.664s  | 0m0.572s
    

    whereas going over all the files takes:

    time ls raw/all | wc -l
    
    real    0m6.455s | 0m5.869s  | 0m5.228s
    user    0m3.612s | 0m3.468s  | 0m4.072s
    sys     0m0.460s | 0m0.784s  | 0m0.624s
    

    How can I print the name of the first file efficiently?

  • Daniel
    Daniel over 6 years
    Looks awesome, but stdbuf -o0 printf '%s\0' raw/all/* | awk 'BEGIN{RS="\0"} {print; exit}' first froze my terminal, and on a directory with 200k files it prints: bash: /usr/bin/stdbuf: List of arguments too long. So if it is impossible to get the first file because of sorting, we can change the question: "How to get any filename, not necessarily the first?"
  • heemayl
    heemayl over 6 years
    @Daniel Ahhh, ARG_MAX is being triggered. Does find . -mindepth 1 -print -quit work? Prepend stdbuf -o0 as well...
  • Daniel
    Daniel over 6 years
    Yes. Please update your answer and explain why we want an unbuffered stream. It works with and without stdbuf -o0.
  • heemayl
    heemayl over 6 years
    @Daniel Updated. stdbuf -o0 is not needed here because find quits immediately, so there is no chance of output being held in a buffer and flushed later.