Extracting string via grep regex assertions

Solution 1

Unless the string you want to extract may itself contain ;, the simplest fix is probably to replace . (which matches any single character) with [^;] (which matches any character except ;), so that the match stops at the first semicolon:

$ printf '%s\n' "$my_string" | grep -oP '(?<=baz=)[^;]*'
222
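
If a value should only count when it is actually terminated by a semicolon, a lookahead can be added next to the lookbehind. The following is a minimal sketch, assuming GNU grep built with PCRE support; the second input (without a trailing ;) is a hypothetical example, not taken from the question.

```shell
my_string="foo bar=1ab baz=222;"

# The lookbehind anchors the match right after "baz=";
# the lookahead requires a ";" immediately after the matched text.
with_semicolon=$(printf '%s\n' "$my_string" | grep -oP '(?<=baz=)[^;]*(?=;)')
printf '%s\n' "$with_semicolon"

# Hypothetical input with no terminating ";": nothing matches, so the result is empty.
# "|| true" keeps the non-zero exit status of grep from aborting a script run with set -e.
without_semicolon=$(printf '%s\n' 'foo baz=222' | grep -oP '(?<=baz=)[^;]*(?=;)' || true)
printf '[%s]\n' "$without_semicolon"
```

Neither assertion consumes any characters, so the semicolon is checked for but never printed.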

With grep linked to libpcre 7.2 or newer, you can also simplify the lookbehind using the \K form:

$ printf '%s\n' "$my_string" | grep -oP 'baz=\K[^;]*'
222

Those will print all occurrences in the string and assume the matching text doesn't contain newline characters (since grep processes each line of input separately).
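
To illustrate the "all occurrences" behaviour, here is a small sketch with a hypothetical input that contains two baz= fields on one line (the sample string is an assumption for illustration). It again assumes GNU grep with PCRE support; piping through head -n 1 is one possible way to keep only the first match.

```shell
# Hypothetical input with two baz= fields on a single line.
sample='foo baz=x; bar=y; baz=z;'

# -o prints every match on its own line, so both values are reported.
all_matches=$(printf '%s\n' "$sample" | grep -oP 'baz=\K[^;]*')
printf '%s\n' "$all_matches"

# Keep only the first occurrence by truncating the output after one line.
first_match=$(printf '%s\n' "$sample" | grep -oP 'baz=\K[^;]*' | head -n 1)
printf '%s\n' "$first_match"
```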

Solution 2

Steeldriver's answer is accurate, but I have a hard time with lookaheads/behinds and would do it like this for readability (with bash):

my_string="foo bar=1ab baz=222;"
regex='baz=([0-9]+);'
[[ $my_string =~ $regex ]] &&
  echo "${BASH_REMATCH[1]}"
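
Since the question asks for an alphanumeric value rather than digits only, the same idea can be sketched with the [[:alnum:]] character class. The helper name extract_value and its arguments are hypothetical, introduced here only for illustration; the key is interpolated into the regex unescaped, so this assumes it contains no regex metacharacters.

```shell
# Requires bash: uses [[ =~ ]] and BASH_REMATCH.
# extract_value is a hypothetical helper, not part of the original answer.
# Usage: extract_value KEY STRING
extract_value() {
  local key=$1 text=$2
  # Capture one or more alphanumeric characters after "KEY=",
  # and require a ";" directly after them (the ";" is not captured).
  local regex="${key}=([[:alnum:]]+);"
  [[ $text =~ $regex ]] && printf '%s\n' "${BASH_REMATCH[1]}"
}

my_string="foo bar=1ab baz=222;"
extract_value baz "$my_string"
```

The function returns a non-zero status when nothing matches, so callers can test it directly in an if statement.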

Author: Michael Grünstäudl

Updated on September 18, 2022

Comments

  • Michael Grünstäudl, over 1 year ago

    Assume a text string my_string

    $ my_string="foo bar=1ab baz=222;"
    

    I would like to extract the alphanumeric string between keyword baz and the semi-colon.

    How should I modify the following grep command, which uses regex assertions, so that it also excludes the trailing semicolon?

    $ echo $my_string | grep -oP '(?<='baz=').*'
    222;
    
    • Stéphane Chazelas, almost 7 years ago
      What should be the outcome for foo baz=x; bar=y;? (x or x; bar=y?). And for baz=x; baz=y;?