Get all regex matches between two patterns and print them to file
If GNU grep is an option, you could pass the -P (Perl-compatible regex) flag and use lookahead assertions, lookbehind assertions, and non-greedy matches to pull out what you need:
echo 'xxSTART relevanttext xxEND something else xxSTART even more relevant' |\
grep -oP '(?<=START).*?(?=xxEND|$)'
relevanttext
even more relevant
Or, as Stéphane Chazelas suggests, use the nifty \K in place of the lookbehind assertion:
echo 'xxSTART relevanttext xxEND something else xxSTART even more relevant' |\
grep -oP 'START\K.*?(?=xxEND|$)'
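Since the question asks for the matches to end up in a new file, the \K variant could be wrapped roughly as follows. This is only a sketch: the function name `extract_between` and the file arguments are hypothetical, the markers are assumed to be the literal strings xxSTART and xxEND, and the added `\s*` pieces (not part of the original pattern) are there to trim the spaces around each match. It assumes a GNU grep built with PCRE support.

```shell
# Hypothetical helper: extract_between INPUT OUTPUT
# Writes every span between "xxSTART" and "xxEND" (or the end of the line
# when there is no "xxEND") to OUTPUT, one match per line.
# Assumes GNU grep with PCRE support (-P) and support for \K.
extract_between() {
    # \s*\K            skip whitespace after the marker, then drop everything
    #                  matched so far from the reported match
    # .*?              non-greedy: stop at the first place the lookahead holds
    # (?=\s*(?:xxEND|$)) stop before optional whitespace plus xxEND or end of line
    grep -oP 'xxSTART\s*\K.*?(?=\s*(?:xxEND|$))' "$1" > "$2"
}
```

Calling `extract_between input.txt matches.txt` would then leave one match per line in matches.txt. Note that grep -o skips empty matches, so an xxSTART immediately followed by xxEND produces no output line.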
Author by
user48020
Updated on September 18, 2022
Comments
-
user48020 over 1 year
I've got a file with a bunch of long lines. I'd like to grab every group between two patterns and print them to a new file, one match per line. I could manage this with Python, but I'd prefer to use just command-line tools for this task. If there is no end pattern, I'd like to grab everything until the end of the line.
Something like:
input: xxSTART relevanttext xxEND something else xxSTART even more relevant
output:
relevanttext
even more relevant
-
Admin over 10 years
So START and END are both within the same long line?
-
Admin over 10 years
Yes! I used to have just one match per line, so I'd use sed to grab everything after xxSTART, but now the input data changed and I'm a bit stumped.
-
Stéphane Chazelas over 10 years
Or:
grep -oP 'START\K.*?(?=xxEND|$)'
-
Mathias Begert over 10 years
@StephaneChazelas, that's a good point, added in. My version of GNU grep (2.5.1) doesn't support \K, though.