Matching exact number of digits in a string

Yes, this returns 0 (please quote the regex, or the shell may interpret it as a glob):

grep -Eq '[0-9]{7}' <<< "1505082"

grep is matching the 7 digits. You can see what is being matched by replacing the q with an o:

$ grep -Eo '[0-9]{7}' <<< "1505082"; echo "$?"
1505082
0

And yes, it will also match this:

$ grep -Eo '[0-9]{7}' <<< "150508299999"; echo "$?"
1505082
0

Only the first seven digits are matched; the trailing nines are simply left out of the match.
The problem is that the match is not anchored, so the regex succeeds on any run of 7 or more digits.

You can anchor with:

$ grep -Eq '^[0-9]{7}$' <<< "15050829999"; echo "$?"
1
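
As a minimal sketch of how the anchored test could be wrapped for reuse: the function name is_seven_digits is made up for illustration, and printf is used instead of a here-string only so that the snippet also runs in shells without <<<.

```shell
# Hypothetical helper: succeeds only when the whole string is exactly 7 digits.
is_seven_digits() {
    printf '%s\n' "$1" | grep -Eq '^[0-9]{7}$'
}

# Try it on a few sample strings.
for s in 1505082 15050821 1505082_CSE_322; do
    if is_seven_digits "$s"; then
        echo "$s: exactly 7 digits"
    else
        echo "$s: not exactly 7 digits"
    fi
done
```

Note that with both anchors in place, 1505082_CSE_322 is rejected as well, because the whole string has to consist of the 7 digits.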

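To match a 7-digit number anywhere in the string, as long as it is neither preceded nor followed by another digit, you need a different kind of anchor (lookarounds, which need grep -P, i.e. a grep with PCRE support):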

$ grep -oP '(?<=^|[^0-9])[0-9]{7}(?=[^0-9]|$)' <<< "1505082"; echo $?
1505082
0

$ grep -oP '(?<=^|[^0-9])[0-9]{7}(?=[^0-9]|$)' <<< "1505082_CSE_322"; echo $?
1505082
0

$ grep -oP '(?<=^|[^0-9])[0-9]{7}(?=[^0-9]|$)' <<< "1505082999_CSE_322"; echo $?
1

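Those are lookaround assertions. One is a lookbehind:

(?<=^|[^0-9])

which matches either the start of the string (^) or a non-digit just before the 7 digits. The other is a lookahead:

(?=[^0-9]|$)

which matches either a non-digit or the end of the string just after the 7 digits. Neither assertion consumes any characters, so only the 7 digits are printed by -o.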
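
The same condition can probably be written more compactly with negative lookarounds, "no digit before" and "no digit after". This is a sketch that assumes GNU grep built with PCRE support (-P); the function name extract_seven is made up for illustration.

```shell
# Hypothetical helper: print every run of exactly 7 digits found in the argument.
# Assumes GNU grep with PCRE support (-P).
#   (?<![0-9])  negative lookbehind: the run is not preceded by a digit
#   (?![0-9])   negative lookahead:  the run is not followed by a digit
extract_seven() {
    printf '%s\n' "$1" | grep -oP '(?<![0-9])[0-9]{7}(?![0-9])'
}

for s in "1505082_CSE_322" "id=1505082" "1505082999_CSE_322"; do
    extract_seven "$s" || echo "no 7-digit run in: $s"
done
```

The negative forms need no alternation with ^ or $, because "not preceded by a digit" is already true at the start of the string, and likewise at the end.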


Another alternative, using only the simpler Extended regex syntax, is to extract every run of 7 (or more) digits and then confirm that one of them is exactly 7 digits long:

$ echo "150508299_CSE_322" |
      grep -oE '[0-9]{7,}' |
      grep -qE '^[0-9]{7}$'; echo "$?"
1
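
If only a yes/no answer is needed (not the matched text), a single Extended regex also seems to be enough: instead of asserting the boundaries, it matches them as real characters. The sketch below assumes that approach; the function name has_seven_digit_run is made up for illustration.

```shell
# Hypothetical helper: succeeds when the argument contains a run of exactly
# 7 digits. The boundary on each side is either the start/end of the line
# or one non-digit character, which becomes part of the match.
has_seven_digit_run() {
    printf '%s\n' "$1" | grep -Eq '(^|[^0-9])[0-9]{7}([^0-9]|$)'
}

for s in 1505082 1505082_CSE_322 15050821 150508299_CSE_322; do
    if has_seven_digit_run "$s"; then
        echo "$s: yes"
    else
        echo "$s: no"
    fi
done
```

Because the neighbouring non-digits are consumed by the match, this form is less suitable with -o, where they would be printed along with the digits.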

Author: Robur_131

Updated on September 18, 2022

Comments

  • Robur_131
    Robur_131 over 1 year

    Given a string, how can I find out whether it contains a number of exactly 7 consecutive digits?

    Such as 1505082 or 1505082_CSE_322 but not 15050821 or 15050821_CSE_322

    I've tried

    grep -Eq [0-9]{7} <<< "1505082" 
    

    which returns 0 but

    grep -Eq [0-9]{7} <<< "15050821"
    

    also returns 0.

    What am I doing wrong?

  • Robur_131
    Robur_131 over 5 years
    Could you tell me why grep -oP '(?<=^|[^0-9])[0-9]{7}(?=[^0-9]|$)' <<< "1505082_CSE_322" | wc -m gives 8 as output? Shouldn't it give 7?
  • done
    done over 5 years
    For the same reason that echo | wc -m gives you 1, it counts the newline character. Try printf '\n\n\t\t' | wc -m for example. @Robur_131
  • Jeff Schaller
    Jeff Schaller about 4 years
    This would falsely match 15050821, for example.