Regex matching only numbers

sed -E 's/([0-9]+).*/\1/g'  <<< "$s" 

The above command means: find a sequence of digits followed by anything, and replace the whole match with only the digits. So it matches 12345 67890testing and replaces it with just 12345. The text before the first digit is not part of the match, so sed leaves it untouched.

The final string will be abcde 12345.
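To see this concretely, here is a minimal sketch. It assumes s holds the sample string from the question (the answer never shows the assignment, so that value is an assumption), and it uses a pipe instead of a here-string only so that it also runs in a plain POSIX shell.

```shell
# Assumption: s holds the sample string from the question.
s='abcde 12345 67890testing'

# The match starts at the first digit, so "abcde " lies outside the match
# and is copied to the output unchanged.
result=$(printf '%s\n' "$s" | sed -E 's/([0-9]+).*/\1/')

# Brackets make any leading or trailing whitespace visible.
printf '[%s]\n' "$result"   # prints [abcde 12345]
```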

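If you want to get only 12345, use grep with -o, which prints just the matched part of the line. The trailing space in the pattern keeps 67890 from matching (it is followed by a letter), but note that the space is also part of the output:

grep -oE '[0-9]+ ' <<< "$s"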

Or with sed you can use:

sed -E 's/[a-zA-Z ]*([0-9]+).*/\1/g'  <<< "$s"

Here the match also covers the letters and spaces before the number, so they are dropped along with everything after it, leaving only 12345.
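The class [a-zA-Z ] only covers letters and spaces. If the prefix may contain other characters, one possible variant is to skip every non-digit instead. The sketch below is not part of the original answer; the variable names first_sed and first_grep are illustrative, and s is again assumed to hold the sample string. Neither command relies on \s.

```shell
# Assumption: s holds the sample string from the question.
s='abcde 12345 67890testing'

# Skip any run of non-digits at the start, capture the first run of digits,
# and discard the rest of the line.
first_sed=$(printf '%s\n' "$s" | sed -E 's/^[^0-9]*([0-9]+).*/\1/')

# grep -o prints each run of digits on its own line; head keeps the first.
first_grep=$(printf '%s\n' "$s" | grep -oE '[0-9]+' | head -n 1)

printf '%s\n%s\n' "$first_sed" "$first_grep"   # prints 12345 twice
```

One difference worth knowing: if a line contains no digits at all, the sed command prints that line unchanged, while the grep pipeline prints nothing.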

Author by

Admin

Updated on June 04, 2022

Comments

  • Admin
    Admin almost 2 years

    I am having problems understanding what my regex in bash shell is doing exactly.

    I have the string abcde 12345 67890testing. I want to extract 12345 from this string using sed.

    However, using sed -re 's/([0-9]+).*/\1/' on the given string will give me abcde 12345.

    Alternatively, using sed -re 's/([\d]+).*/\1/' would actually only extract abcd.

    Am I wrong in assuming that the expression [0-9] and [\d] ONLY capture digits? I have no idea how abcd is being captured yet the string 67890 is not. Plus, I want to know why the space is being captured in my first query?

    In addition, sed -re 's/^.*([0-9]+).*/\1/' gives me 0. In this instance, I completely do not understand what the regex is doing. I'd thought that the expression ^.*[0-9]+ would only capture the first instance of a string of only numbers? However, it's matching only the last 0.

    All in all, I'd like to understand how I am wrong about all these. And how the problem should be solved WITHOUT using [\s] in the regex to isolate the first string of numbers.

  • Admin
    Admin about 10 years
    Thank you for the reply. However, using sed -r 's/([0-9]+).*/\1/g' <<< "$s" will yield me abcd 12345. I am unsure how it is capturing abcd.
  • anubhava
    anubhava about 10 years
    sed -r 's/([0-9]+).*/\1/g' <<< "$s" gives me 12345
  • chrylis -cautiouslyoptimistic-
    chrylis -cautiouslyoptimistic- about 10 years
    Can you please explain the idea behind using a here string? As far as I understand, that just runs the sed expression on the contents of the Bash variable s, which does not seem helpful.
  • anubhava
    anubhava about 10 years
    Let me ask first: why do you think it is not helpful? For one, using a here-string avoids creating a subshell.
  • Admin
    Admin about 10 years
    For my last attempt, is using the greedy expression .* eating into the rest of the string until it leaves only 0? It seems impossible to use greedy expressions to remove the preceding abcde part of the problem then?
  • drolando
    drolando about 10 years
    If you use .*([0-9]+).* it will capture only the last digit: the leading .* is greedy and takes as much as it can, and since + means one or more, the group is left with the minimum of one character. If you know the exact length of the digit sequence, you can use .*([0-9]{5}) .* instead. There must be a space between the ) and the following . in that pattern.
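The behaviors discussed in the question and comments above can be reproduced with a short sketch. It assumes s holds the sample string, and the variable names are illustrative only.

```shell
# Assumption: s holds the sample string from the question.
s='abcde 12345 67890testing'

# Greedy ^.* gives back only what it must: a single digit satisfies [0-9]+,
# so the group ends up holding the last digit on the line.
last_digit=$(printf '%s\n' "$s" | sed -E 's/^.*([0-9]+).*/\1/')

# A fixed length plus a required following space pins the group to 12345,
# because 67890 is followed by a letter.
five=$(printf '%s\n' "$s" | sed -E 's/.*([0-9]{5}) .*/\1/')

# POSIX bracket expressions have no \d shorthand. Here [\d] matches the
# letter d, so the match starts at the "d" of "abcde", not at a digit.
not_digits=$(printf '%s\n' "$s" | sed -E 's/([\d]+).*/\1/')

printf '%s\n' "$last_digit" "$five" "$not_digits"
# prints:
# 0
# 12345
# abcd
```

This also answers the follow-up about greedy expressions: the greedy .* is not the problem by itself, but whatever follows it has to be specific enough (a length, a delimiter, or a negated class such as [^0-9]) to stop it from swallowing most of the number.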