Extracting string via grep regex assertions

grep regular-expression string

12,656

Solution 1

Unless the string you want to extract may itself contain ;, the simplest fix is probably to replace . (which matches any single character) with [^;] (which matches any character except ;):

$ printf '%s\n' "$my_string" | grep -oP '(?<=baz=)[^;]*'
222

With grep linked to libpcre 7.2 or newer, you can also replace the lookbehind with \K, which drops everything matched so far from the reported match:

$ printf '%s\n' "$my_string" | grep -oP 'baz=\K[^;]*'
222

Both commands print every occurrence in the string, one per line, and assume the text to extract contains no newline characters (since grep processes each line of input separately).
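
To illustrate the multiple-occurrence behaviour, here is a minimal sketch on a made-up input line. The sample string and the variable names sample, matches and first are assumptions for illustration, not part of the original question, and it assumes a grep built with PCRE support (-P).

```shell
# Hypothetical input with two baz= fields on a single line.
sample='foo baz=x; bar=y; baz=z;'

# -o prints each match on its own line; \K drops "baz=" from the reported match.
matches=$(printf '%s\n' "$sample" | grep -oP 'baz=\K[^;]*')
printf '%s\n' "$matches"

# To keep only the first occurrence, stop reading after one output line.
first=$(printf '%s\n' "$matches" | head -n 1)
printf '%s\n' "$first"
```

If only the first occurrence is wanted, piping through head -n 1 as above is one option among several.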

Solution 2

Steeldriver's answer is accurate, but I have a hard time with lookaheads/behinds and would do it like this for readability (with bash):

my_string="foo bar=1ab baz=222;"
regex='baz=([0-9]+);'
[[ $my_string =~ $regex ]] &&
  echo "${BASH_REMATCH[1]}"
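
The pattern above captures digits only, while the question asks for an alphanumeric value. A minimal variation, assuming bash and using the POSIX class [[:alnum:]], could look like this (the sample value 2b2 and the variable name value are illustrative assumptions):

```shell
# Requires bash: [[ ... =~ ... ]] and BASH_REMATCH are not POSIX sh features.
my_string="foo bar=1ab baz=2b2;"   # hypothetical input with letters in the value
regex='baz=([[:alnum:]]+);'        # capture one or more alphanumeric characters before ";"

value=""
if [[ $my_string =~ $regex ]]; then
  value=${BASH_REMATCH[1]}         # first capture group
  echo "$value"
fi
```

Keeping the regex in a variable, as in the original snippet, avoids quoting problems on the right-hand side of =~.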


Michael Grünstäudl

Updated on September 18, 2022

Comments

Michael Grünstäudl over 1 year
Assume a text string my_string
```
$ my_string="foo bar=1ab baz=222;"
```
I would like to extract the alphanumeric string between keyword baz and the semi-colon.

How do I have to modify the following grep code using regex assertions to also exclude the trailing semi-colon?
```
$ echo $my_string | grep -oP '(?<='baz=').*'
222;
```
- Stéphane Chazelas almost 7 years
  
  What should be the outcome for foo baz=x; bar=y;? (x or x; bar=y?). And for baz=x; baz=y;?