Can regex capture groups be used in the GNU find command?

Solution 1

An alternative to l0b0's fine answer (shorter, but potentially slightly less efficient):

Assuming a (recent) GNU sed:

find pool -print0 |
  sed -znE 's|.*/mypackage-([[:alnum:].]+-[0-9]{1,2})-x86_64\.pkg\.tar\.xz$|\1|p'|
  tr '\0' '\n'

Note that the expensive part of find is walking the directory tree, which it has to do whether or not you use -regex. So here, the matching and reporting are done in sed instead.
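To try the pipeline without touching a real package pool, here is a minimal sketch. The temporary tree and its file names are made up for illustration, the trailing sort is added only to make the output order predictable, and a GNU sed with -z is assumed.

```shell
# Hypothetical test tree; the file names are made up for illustration.
tmp=$(mktemp -d)
mkdir -p "$tmp/pool/sub"
touch "$tmp/pool/mypackage-1.4.9-1-x86_64.pkg.tar.xz" \
      "$tmp/pool/sub/mypackage-2.0-12-x86_64.pkg.tar.xz" \
      "$tmp/pool/otherpackage-1.0-1-x86_64.pkg.tar.xz"

# find emits every path NUL-terminated; sed -z treats each path as one
# record and prints only the captured version of the paths that match.
versions=$(
  cd "$tmp" &&
  find pool -print0 |
    sed -znE 's|.*/mypackage-([[:alnum:].]+-[0-9]{1,2})-x86_64\.pkg\.tar\.xz$|\1|p' |
    tr '\0' '\n' |
    sort
)
printf '%s\n' "$versions"
rm -rf "$tmp"
```

With this tree, the two mypackage files yield 1.4.9-1 and 2.0-12, while pool itself, the sub directory and otherpackage produce no output because the substitution does not match them.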

Solution 2

If you use

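find pool -regextype posix-extended \
    -regex ".*/mypackage-([a-zA-Z0-9.]+-[0-9]{1,2})-x86_64\.pkg\.tar\.xz" \
    -printf '%f\n' |
  sed -E 's/^mypackage-([a-zA-Z0-9.]+-[0-9]{1,2})-x86_64\.pkg\.tar\.xz$/\1/'

(assuming a sed that supports -E, such as GNU sed), it should work for any path. -printf '%f\n' prints only the base name, and the part of the regex that matches the base name doesn't allow for any newlines, so a directory with a similar name that contains a newline cannot slip into the output. The sed expression is anchored to the whole file name on purpose: run unanchored (for example with grep -Eo), the group's pattern alone would also match the mypackage-1 prefix.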
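As a rough check of which names the find step reports, the sketch below builds a throwaway tree with one real package file and one decoy directory whose name looks like a package but contains a newline. All names here are hypothetical, and GNU find is assumed for -regextype and -printf.

```shell
# Hypothetical test tree; all names below are made up for illustration.
tmp=$(mktemp -d)
mkdir -p "$tmp/pool"
touch "$tmp/pool/mypackage-1.4.9-1-x86_64.pkg.tar.xz"
# A directory whose name resembles a package but contains a newline.
mkdir "$tmp/pool/mypackage-9.9-9-x86_64.pkg.tar.xz
decoy"

# -regex must match the whole path, so the decoy (which continues after
# ".xz") is rejected; %f prints only the base name of each match.
names=$(
  cd "$tmp" &&
  find pool -regextype posix-extended \
      -regex '.*/mypackage-([a-zA-Z0-9.]+-[0-9]{1,2})-x86_64\.pkg\.tar\.xz' \
      -printf '%f\n'
)
printf '%s\n' "$names"
rm -rf "$tmp"
```

Only mypackage-1.4.9-1-x86_64.pkg.tar.xz is printed, so whatever post-processes the output sees exactly one well-formed file name per line.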

Author: starfry

Updated on September 18, 2022

Comments

  • starfry, almost 2 years ago

    With the GNU find command (GNU findutils 4.4.2), a regular expression can be used to search for files. For example:

    $ find pool -regextype posix-extended -regex ".*/mypackage-([a-zA-Z0-9.]+-[0-9]{1,2})-x86_64.pkg.tar.xz+"
    

    Is it possible to extract the capture group defined by that expression and use it in a -printf argument?

    So, given a found file called pool/mypackage-1.4.9-1-x86_64.pkg.tar.xz, I would like to include the 1.4.9-1 part in a printf expression.

    Is this possible?

    • Stéphane Chazelas, about 10 years ago
      No, but you can use -print0 and pipe to GNU sed -Ez (possibly followed by tr '\0' '\n')
    • Stéphane Chazelas, about 10 years ago
      Note that [a-zA-Z] only makes sense in the C/POSIX locale.
    • starfry, about 10 years ago
      I expected the simple answer to be No, given copious amounts of searching and reading man pages before asking the question. The answers provide interesting alternative ways to achieve the desired output.
  • l0b0, about 10 years ago
    Yep, just thought it was nice to reuse the regex exactly