Regex for digits in Unix find command

Solution 1

This is what I have used in the past:

Year: (19|20)[0-9][0-9]
Month: (0[1-9]|1[012])
Day: (0[1-9]|[12][0-9]|3[01])

You can put these together in your regex. With find's default regex type you will, of course, have to escape the parentheses and pipes.
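As a rough sketch of how the pieces might be combined, the following assumes GNU find and selects -regextype posix-extended so that the parentheses and pipes need no escaping. The scratch directory and the file names in it are made up for illustration.

```shell
# Scratch tree with one valid and one invalid date (hypothetical file names).
tmp=$(mktemp -d)
mkdir -p "$tmp/reports/2010/10/10" "$tmp/reports/2010/13/40"
touch "$tmp/reports/2010/10/10/23.txt" "$tmp/reports/2010/13/40/23.txt"

# Year, month and day fragments from above.
year='(19|20)[0-9][0-9]'
month='(0[1-9]|1[012])'
day='(0[1-9]|[12][0-9]|3[01])'

# -regex has to match the whole path, hence the leading and trailing ".*".
matches=$(cd "$tmp" && find reports -regextype posix-extended -type f \
    -regex ".*/$year/$month/$day/.*")
echo "$matches"

rm -rf "$tmp"
```

Only the file under 2010/10/10 should be listed; the path with month 13 and day 40 fails the month fragment.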

Solution 2

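\d is an extension of regular expressions that is supported by neither Emacs regular expressions nor POSIX regular expressions (those are the flavours find supports). You can use [0-9] instead, or [[:digit:]] if the -regextype you select supports POSIX character classes (posix-extended, for example, does).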
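A small sketch of both alternatives, assuming GNU find. The file names a1.txt and ab.txt are hypothetical, and posix-extended is requested explicitly for the [[:digit:]] case rather than relying on the default type.

```shell
# Two scratch files: one with a digit in its name, one without.
tmp=$(mktemp -d)
touch "$tmp/a1.txt" "$tmp/ab.txt"

# [0-9] is a plain bracket expression and works with the default regex type.
plain=$(cd "$tmp" && find . -type f -regex '.*/a[0-9]\.txt')

# [[:digit:]] is a POSIX character class; posix-extended is selected here
# so the example does not depend on what the default type supports.
posix=$(cd "$tmp" && find . -regextype posix-extended -type f \
    -regex '.*/a[[:digit:]]\.txt')

echo "$plain"
echo "$posix"

rm -rf "$tmp"
```

Both commands should list only ./a1.txt.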

Solution 3

The following is ugly and does not weed out invalid dates, but might be close enough:

find reports/ -type f -regex "reports/[0-9][0-9][0-9][0-9]/[0-9][0-9]/[0-9][0-9]/[0-9][0-9]\.\(csv\|sql\|txt\|xls\|zip\)"

Solution 4

You can use the repeaters like this:

find ./ -regextype posix-egrep -iregex ".*\._[0-9]{8}-[0-9]{6}.*"

I use this to find backups of the form:

./foo._20140716-121745.OLD

Where foo is the original name and the numbers are the date and time.

(on CentOS 6.5)

P.S. -regextype posix-extended works too.
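The same repeaters could be applied to the YYYY/MM/DD layout from the question. This is only a sketch, assuming GNU find with -regextype posix-extended; the scratch tree and the file notes.txt are invented for the demonstration, and -mtime +90 is left out because freshly created files would never match it.

```shell
# Scratch tree mimicking the reports/YYYY/MM/DD layout (hypothetical files).
tmp=$(mktemp -d)
mkdir -p "$tmp/reports/2010/10/10"
touch "$tmp/reports/2010/10/10/23.txt" \
      "$tmp/reports/2010/10/10/26.csv" \
      "$tmp/reports/2010/10/10/notes.txt"

# {4} and {2} replace the repeated [0-9] groups; in posix-extended the
# parentheses and pipes are written without backslashes.
found=$(cd "$tmp" && find reports/ -regextype posix-extended -type f \
    -regex 'reports/[0-9]{4}/[0-9]{2}/[0-9]{2}/[0-9]{2}\.(csv|sql|txt|xls|zip)' | sort)
echo "$found"

rm -rf "$tmp"
```

notes.txt is skipped because its base name is not two digits. Like the pattern in Solution 3, this does not weed out invalid dates.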

Author: Teflon Ted

Updated on June 08, 2022

Comments

  • Teflon Ted, almost 2 years ago

    I have this command:

    find reports/ -type f -mtime +90 -regex ".*\.\(csv\|sql\|txt\|xls\|zip\)"
    

    And I need to beef it up so the part before the file extensions matches a YYYY/MM/DD pattern, like so:

    reports/2010/10/10/23.txt
    reports/2010/10/10/23.xls
    reports/2010/10/10/26.csv
    reports/2010/10/10/26.sql
    reports/2010/10/10/26.txt
    reports/2010/10/10/26.xls
    reports/2010/10/10/27.csv
    

    But I can't get any combination of \d and escaped parentheses to work.

    UPDATE: here's what worked for me based on the accepted answer below:

    find reports/ -type f -mtime +90 -regex "reports/201[01]/\([1-9]\|1[012]\)/\([1-9]\|[12][0-9]\|3[01]\)/.*\.\(csv\|sql\|txt\|xls\|zip\)"
    
  • Teflon Ted, over 13 years ago
    That looks good (and I'll test it in a bit) but is it possible to tighten up the ranges with something like "[0-9]{4}" instead of repeating it four times in a row?
  • David J. Liszewski, over 13 years ago
    The numeric quantifier "{4}" did not seem to work with the version of regexec in the version of libc used by find on my system (libc 2.3.4). YMMV.
  • zpea, over 6 years ago
    You can use [0-9], but whether you can use [[:digit:]] depends on which -regextype you use. For example, emacs (the default type) does not support it, whereas posix-extended does. See the GNU findutils manual, section 8.5 "Regular Expressions", for the syntax descriptions linked at the bottom.