Regex file name with multiple digits

Solution 1

With ksh, bash -O extglob and zsh -o kshglob only:

test_file-+([[:digit:]])-master.tar.gz

In bash, you have to set the extglob option first. The +(...) operator matches one or more occurrences of the given patterns. [:digit:], when used inside a [...] bracket expression, is a POSIX-defined character class that contains the Indo-Arabic decimal digits ([[:digit:]] is the same as [0123456789] or [0-9]).

It will match:

test_file-1234-master.tar.gz
test_file-1-master.tar.gz
test_file-123456789-master.tar.gz

It will not match:

test_file-1b-master.tar.gz
test_file--master.tar.gz
test_file-a1-master.tar.gz
test_file-abcd-master.tar.gz
test_file-Ⅵ-master.tar.gz # roman numeral
test_file-٨-master.tar.gz  # Eastern Arabic decimal digit
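
The two lists above can be checked without creating any files. What follows is only a minimal sketch, assuming bash: the matches function and the pattern variable are names made up for this illustration. The pattern is kept in a variable so that it is only interpreted at match time, after extglob has been turned on.

```shell
#!/usr/bin/env bash
# Sketch only: assumes bash; "matches" and "pattern" are illustrative names.
shopt -s extglob   # enable +(...) and the other ksh-style operators

# Stored as a plain string, so the parser never sees it as a pattern
pattern='test_file-+([[:digit:]])-master.tar.gz'

# Succeed if the name in $1 matches $pattern ($pattern is unquoted on purpose)
matches() {
    case $1 in
        $pattern) return 0 ;;
        *)        return 1 ;;
    esac
}

for name in \
    test_file-1234-master.tar.gz \
    test_file-1-master.tar.gz \
    test_file-1b-master.tar.gz \
    test_file--master.tar.gz \
    test_file-a1-master.tar.gz
do
    if matches "$name"; then
        printf 'match     %s\n' "$name"
    else
        printf 'no match  %s\n' "$name"
    fi
done
```

Only the first two names should be reported as matches.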

The tar command in your question should then be done like this (with a loop):

shopt -s extglob # bash
# setopt kshglob # zsh
for f in test_file-+([[:digit:]])-master.tar.gz; do
    tar xf "$f"
done

The more idiomatic short syntax in zsh is:

setopt extendedglob
for f (test_file-[0-9]##-master.tar.gz) tar xf $f

(# being the extendedglob equivalent of regexp *, and ## of +).

Solution 2

You're trying to use an extended regular expression quantifier (+) and a malformed POSIX character class ([:digit] instead of [[:digit:]]) in a globbing pattern. In a glob, + is an ordinary character, and [:digit] is a bracket expression that matches a single character out of :, d, i, g and t.

Assuming bash or similar, "basic" globbing only supports a handful of wildcards, such as:

  • ?: single character
  • *: zero or more characters
  • []: character class
  • {}: list of alternatives (brace expansion)
  • [!]: negated character class

Unlike the metacharacters of extended regular expressions, "basic" globbing has no quantifier: there's no way to make a wildcard match a given number of occurrences, or "one or more" of them.
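
To see how the shell reads the two patterns from the question when they are treated as globs, here is a small sketch in plain POSIX sh. The glob_match and report helpers are made up for this illustration. case is used so that nothing depends on which files exist in the current directory.

```shell
# Sketch only: glob_match and report are illustrative helpers, not standard commands.

# glob_match NAME PATTERN: succeed if NAME matches PATTERN read as a glob.
# $2 is left unquoted so that case treats it as a pattern, not a literal string.
glob_match() {
    case $1 in
        $2) return 0 ;;
        *)  return 1 ;;
    esac
}

report() {
    if glob_match "$1" "$2"; then
        printf 'match     %s  ~  %s\n' "$1" "$2"
    else
        printf 'no match  %s  ~  %s\n' "$1" "$2"
    fi
}

# "+" is a literal plus sign in a glob, so the real file name does not match...
report 'test_file-1234-master.tar.gz' 'test_file-[0-9]+-master.tar.gz'
# ...but a name made of one digit followed by an actual "+" does.
report 'test_file-7+-master.tar.gz'   'test_file-[0-9]+-master.tar.gz'

# "[:digit]" is a set containing the characters : d i g t, not a digit class.
report 'test_file-1234-master.tar.gz' 'test_file-[:digit]-master.tar.gz'
report 'test_file-d-master.tar.gz'    'test_file-[:digit]-master.tar.gz'
```

This is why both tar commands in the question fail: neither pattern expands to the existing file name, so tar receives the unexpanded pattern as its argument.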

So, using "basic" globbing, the shortest and tightest pattern (for a four-digit number, as in the file name from the question) would be:

tar -xf test_file-[0-9][0-9][0-9][0-9]-master.tar.gz
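
If the number of digits is not known in advance, one possible workaround that still uses only "basic" patterns is to filter names with case: first reject anything that has a non-digit between the two dashes, then require at least one digit. This is only a sketch. The function name is_digits_archive is made up for this illustration, and [0-9] is assumed to mean the ten ASCII digits (as in the C locale).

```shell
# Sketch only: is_digits_archive is an illustrative name, not a standard command.
# Succeeds when $1 looks like test_file-<one or more digits>-master.tar.gz.
is_digits_archive() {
    case $1 in
        test_file-*[!0-9]*-master.tar.gz) return 1 ;;  # a non-digit between the dashes
        test_file-[0-9]*-master.tar.gz)   return 0 ;;  # at least one digit, and only digits
        *)                                return 1 ;;  # anything else, e.g. no digits at all
    esac
}

for name in \
    test_file-1234-master.tar.gz \
    test_file-1-master.tar.gz \
    test_file-1b-master.tar.gz \
    test_file--master.tar.gz
do
    if is_digits_archive "$name"; then
        printf 'accept  %s\n' "$name"
    else
        printf 'reject  %s\n' "$name"
    fi
done

# Possible use with real files (left commented out so the sketch has no side effects):
# for f in test_file-[0-9]*-master.tar.gz; do
#     is_digits_archive "$f" && tar -xf "$f"
# done
```

The order of the case branches matters: the loose test_file-[0-9]* branch is only reached once the first branch has ruled out every name containing a non-digit.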

Author: Siva

“Computer science education cannot make anybody an expert programmer any more than studying brushes and pigment can make somebody an expert painter.” - Eric Raymond

Updated on September 18, 2022

Comments

  • Siva
    Siva almost 2 years

    Say I have the following file:

    test_file-1234-master.tar.gz
    

    I have tried to untar it using the following commands:

     tar -xf test_file-[0-9]+-master.tar.gz
     tar -xf test_file-[:digit]-master.tar.gz
    

    But no luck. How do I match this pattern?

    NOTE: There will be always one file. I'm not trying to open multiple files.

    • Alessio
      Alessio over 8 years
      to answer your question, you want: test_file-[0-9]*-master.tar.gz
    • chaos
      chaos over 8 years
      @cas This will match one digit followed by anything.
    • Alessio
      Alessio over 8 years
      yes, so it will. test_file-[0-9][0-9][0-9][0-9]-master.tar.gz then.