Make find search in alphabetical order

Solution 1

If you don't have too many directories in total, you can force find to traverse the paths in order by listing them all on the command line:

shopt -s globstar
find **/ -maxdepth 1 -name '*.tex' -exec cat {} \; > blub.txt

(Using bash syntax for enabling the recursive ** glob.) The glob expands in sorted order, so the years sort first, then each of the numerically prefixed months inside its year. The trailing slash asks bash to return only directories, leaving it to find to locate the files. Because ** already lists every subdirectory as its own starting point, -maxdepth 1 (a GNU and BSD find option) keeps find from descending again and concatenating the same file twice.
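To see which starting points find receives, it can help to print the expansion of **/ on its own. This is a minimal sketch, assuming bash with globstar and the year/month layout from the question; list_dirs is a hypothetical helper name, not part of the original answer.

```shell
shopt -s globstar   # make ** match directories recursively (bash 4+)

# list_dirs DIR: print every directory below DIR, one per line,
# in the same sorted order the shell would hand them to find.
list_dirs() {
  # Run in a subshell so the caller's working directory is unchanged.
  (cd "$1" && printf '%s\n' **/)
}
```

Called as list_dirs . in the top-level directory, it lists each year followed by that year's months.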

Alternatively, if you have a list of years as subdirectories, you could loop over that:

for year in *
do
  find "$year"/* -name '*.tex' -exec cat {} \;
done > blub.txt

On each iteration, the glob expands to the 12 months of that year, again forcing find to process them in order.
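As one of the comments points out, the recursive glob can do the whole job without find. The following is a sketch of that idea, assuming bash with globstar; concat_tex and its two parameters are hypothetical names introduced only for illustration.

```shell
shopt -s globstar nullglob   # ** recurses; nullglob drops patterns that match nothing

# concat_tex DIR OUTFILE: concatenate every .tex file below DIR into OUTFILE,
# in the sorted order produced by the glob expansion.
concat_tex() {
  local root=$1 out=$2
  local -a files
  files=("$root"/**/*.tex)
  if ((${#files[@]})); then
    cat -- "${files[@]}" > "$out"
  else
    : > "$out"   # nothing matched: just create an empty output file
  fi
}
```

For the layout in the question this would be called as concat_tex . blub.txt. With a very large number of files the expanded argument list could exceed the system limit, in which case a find-based approach is the safer choice.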

Solution 2

Something like this might work:

find . -type f -name '*.tex' -print0 | sort -z | xargs -0 cat > blub.txt

The -print0 option for find delimits the found paths with NUL characters; sort -z reads and writes the same NUL-delimited format, putting the paths into lexical order, and xargs -0 then passes that ordered list to cat.
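The ordering sort produces depends on the current locale. The sketch below pins it to plain byte order with LC_ALL=C, which is an assumption about what is wanted rather than part of the original answer; concat_sorted and its parameters are hypothetical names.

```shell
# concat_sorted DIR OUTFILE: concatenate every .tex file below DIR into OUTFILE,
# ordered by full path in byte order. NUL delimiters keep names containing
# spaces or newlines intact.
concat_sorted() {
  find "$1" -type f -name '*.tex' -print0 |
    LC_ALL=C sort -z |
    xargs -0 cat > "$2"
}
```

For the question's layout this would be run as concat_sorted . blub.txt. Since the whole path is the sort key, the years order first and the month directories order within each year.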

Author: miri queer

Updated on September 18, 2022

Comments

  • miri queer
    miri queer almost 2 years

    I am currently using a short find command to search a number of directories (and subdirectories) for files ending in "tex" and then cat them into one coherent text. The command that I use is this one:

    find . -name '*.tex' -exec cat {} \; > blub.txt

    However, the find command doesn't search the folders in the order I want. It jumps around a lot, and instead of first grabbing the folder "2011" it begins with "2013" etc. Is there a way of amending that, so that it begins with 2011 and the subdirectories therein, i.e., with the folder "01-january", then "02-february" etc.?

    • Jeff Schaller
      Jeff Schaller almost 6 years
      How many levels of directories do you care about ordering? Just the 2, year and month?
    • Jeff Schaller
      Jeff Schaller almost 6 years
      Does the shell matter here? bash and zsh have extended globbing facilities; are you in an environment that supports them?
  • Jeff Schaller
    Jeff Schaller almost 6 years
    With --null for xargs
  • Pankaj Goyal
    Pankaj Goyal almost 6 years
    sort is not returning null-delimited output though; see echo -e "b\0a" | sort -z.
  • Jeff Schaller
    Jeff Schaller almost 6 years
    interesting!? while testing, I ran find * -print0|sort -z|od -c and saw nulls between the filenames
  • Stephen Kitt
    Stephen Kitt almost 6 years
    @DopeGhoti Try it with three values: echo -e "b\0a\0c" | sort -z. The newline ends up being part of the value, so sort sorts b and a\n in your example, b, a, and c\n in my version. sort -z uses null-separated strings for input and output.
  • don_crissti
    don_crissti almost 6 years
    It's in the info page: '--zero-terminated' Delimit items with a zero byte rather than a newline (ASCII LF). I.e., treat input as items separated by ASCII NUL and terminate output items with ASCII NUL.
  • Pankaj Goyal
    Pankaj Goyal almost 6 years
    If globstar already does the work for you, can you simply cat **/*.tex?
  • Jeff Schaller
    Jeff Schaller almost 6 years
    Excellent point! I "thought" myself into a corner of using find, but -- you're right!
  • Pankaj Goyal
    Pankaj Goyal almost 6 years
    I will never ever remember to check both info and manual pages. Sigh. Also, tested with find -print0 | sort -z | xargs -0 cat and everything was ducky!
  • Stephen Kitt
    Stephen Kitt almost 6 years
    @DopeGhoti on BSD? So that means sort -z works the same way whether it’s BSD or GNU ;-).
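The sort -z behaviour discussed in these comments can be checked with a short sketch. It assumes a sort that supports -z (GNU and BSD both do, per the thread) and uses printf instead of echo -e so that no stray newline becomes part of the last item; the variable name sorted is only illustrative.

```shell
# Feed three NUL-terminated items to sort -z, then turn each NUL in the
# output into a comma so the delimiters become visible.
sorted=$(printf 'b\0a\0c\0' | sort -z | tr '\0' ',')
printf '%s\n' "$sorted"   # prints: a,b,c,
```

The trailing comma shows that the output items are NUL-terminated as well, which is what xargs -0 expects.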