Grep - how to output only the content of a capturing group
Solution 1
If you have either pcregrep or pcre2grep, you can use the -o1 command-line flag to request that only capture group 1 is output. (Or change 1 to some other number if there are more captures in the regex.)
You can use the -oN flag more than once if you want to output more than one capture group.
As far as I know, grep -P does not implement this extension. You'll find pcre2grep in Debian/Ubuntu package pcre2-utils. pcregrep is in package pcregrep.
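As a rough sketch of the repeated -oN usage described above, the following splits the sample text into two capture groups and prints both. The temporary file, the two-group pattern, and the variable names are assumptions made for illustration, and the check for pcre2grep is only there so the snippet degrades gracefully where the tool is not installed.

```shell
# Sample input (assumed): the two lines from the question.
tmp=$(mktemp)
printf '%s\n' 'hello1, please match me' 'hello2, please do not match me' > "$tmp"

if command -v pcre2grep >/dev/null 2>&1; then
    # -o1 prints only capture group 1 of each match.
    group1=$(pcre2grep -o1 '(hello[0-9]+), please match me' "$tmp")
    # -o1 -o2 prints groups 1 and 2 on one line, in the order the flags are given.
    both=$(pcre2grep -o1 -o2 '(hello)([0-9]+), please match me' "$tmp")
    printf '%s\n%s\n' "$group1" "$both"
else
    echo 'pcre2grep is not installed' >&2
fi
rm -f "$tmp"
```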
Solution 2
This question was asked ten years ago, so I won't mark it as a duplicate. I also noticed that no sed solution was given, since the OP asked for an answer without it:
sed -nr 's/(hello[0-9]+), please match me/\1/p' test.txt
- -n stands for quiet (won't print anything unless explicitly asked)
- -r allows use of extended regular expressions (here it avoids having to put \ before the parentheses)
- The s/reg/repl/p command means "if the regexp reg matches the current line, replace the matched text with repl (here the captured group \1), and print the result (/p)"
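One consequence of the s command replacing only the matched text is that anything else on the line survives. A minimal sketch, assuming a made-up input line with extra text around the match, pads the pattern with .* so that the whole line is replaced by the capture group. It uses -E, which both GNU and BSD sed accept for extended regular expressions, in place of -r.

```shell
# Hypothetical input line with text before and after the part we want.
line='log: hello42, please match me (end)'

# Without padding, only the matched text is replaced; the rest of the line stays.
partial=$(printf '%s\n' "$line" | sed -nE 's/(hello[0-9]+), please match me/\1/p')

# Padding with .* makes the match cover the whole line, leaving only the group.
exact=$(printf '%s\n' "$line" | sed -nE 's/.*(hello[0-9]+), please match me.*/\1/p')

printf '%s\n%s\n' "$partial" "$exact"
```

The first line printed is "log: hello42 (end)" and the second is "hello42".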
Solution 3
You can use ripgrep, which generally seems to be superior to grep, like this:
rg '(hello[0-9]+), please match me' -or '$1' <file>
where ripgrep uses -o or --only-matching and -r or --replace to output only the first capture group with $1 (quoted to avoid interpretation as a variable by the shell).
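For readers who prefer the long option names, the same call can be sketched as follows. The temporary file and variable names are assumptions for illustration, and the check for rg only keeps the snippet from failing where ripgrep is not installed.

```shell
# Sample input (assumed): the two lines from the question.
tmp=$(mktemp)
printf '%s\n' 'hello1, please match me' 'hello2, please do not match me' > "$tmp"

if command -v rg >/dev/null 2>&1; then
    # Long-form equivalent of -or '$1'; single quotes stop the shell expanding $1.
    first=$(rg --no-line-number --only-matching --replace '$1' \
        '(hello[0-9]+), please match me' "$tmp")
    printf '%s\n' "$first"
else
    echo 'rg is not installed' >&2
fi
rm -f "$tmp"
```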
Solution 4
grep, sed and awk have ancient regular expression engines that lack most modern regex features (lookarounds, non-greedy quantifiers, \d and so on). I don't really think they're fit for purpose anymore.
One thing Perl is still good for is as a replacement for those in pretty much all one-liners, as it has a very nice, modern regex engine, and a couple of handy command line switches, -ne and -pe.
The switches cause Perl to automatically apply your expression to each line of the input and either unconditionally print the result (-pe), or let you control printing of the result (-ne).
For instance, to print the first hello followed by a digit (hello\d) for all lines that have hello\d followed by a comma and please match me, you can do:
perl -ne 'm/(hello\d), please match me/ && print "$1\n"' <file>
There are many nice sites out there that list common tasks you can do with a Perl one-liner, such as this one.
I also think that ripgrep should be in everyone's toolbox.
Solution 5
Just an awk version.
awk -F, '/hello[0-9]+, please match me/ {print $1}' file
hello1
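Note that -F, prints the whole first comma-separated field, which equals the wanted text only when the line starts with it. A minimal sketch of a closer stand-in for a capture group, using the POSIX awk functions match(), substr() and sub(), is shown below. The input line and variable names are assumptions for illustration.

```shell
# Hypothetical input: the wanted text is not at the start of the line.
line='log: hello7, please match me'

# -F, prints the whole first comma-separated field, prefix included.
field=$(printf '%s\n' "$line" | awk -F, '/hello[0-9]+, please match me/ {print $1}')

# match() sets RSTART and RLENGTH; substr() cuts out the match, sub() trims the suffix.
group=$(printf '%s\n' "$line" | awk 'match($0, /hello[0-9]+, please match me/) {
    s = substr($0, RSTART, RLENGTH)
    sub(/, please match me$/, "", s)
    print s
}')

printf '%s\n%s\n' "$field" "$group"
```

The first line printed is "log: hello7" and the second is "hello7".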

Comments
-
Sami 12 months
I am trying to find a way for grep to output only the content of a capturing group. For instance, if I have the following file:
hello1, please match me
hello2, please do not match me
I would like
grep -Eo '(hello[0-9]+), please match me' file
to output hello1. However, it outputs hello1, please match me.
Now, I know that
grep -Po 'hello[0-9]+(?=, please match me)'
will do the trick, but I'm thinking there must be a way to simply return a capturing group, but I couldn't find any info (on the net and in man grep).
Is it possible, or are capturing groups only meant to be backreferenced? It would seem weird to me if there was no way of doing that.
Thank you for your time, and feel free to critique the way this post is constructed!
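The behavior described in the question can be reproduced with a short sketch. The temporary file and variable names are assumptions for illustration, and the -P check is there because not every grep build includes PCRE support.

```shell
# Sample input: the two lines from the question.
tmp=$(mktemp)
printf '%s\n' 'hello1, please match me' 'hello2, please do not match me' > "$tmp"

# -o prints the whole match, so the capture group makes no difference here.
whole=$(grep -Eo '(hello[0-9]+), please match me' "$tmp")

# With PCRE (-P), a lookahead keeps the trailing text out of the match.
if printf 'a\n' | grep -qP 'a' 2>/dev/null; then
    group=$(grep -Po 'hello[0-9]+(?=, please match me)' "$tmp")
else
    group='(grep -P not available)'
fi
rm -f "$tmp"

printf '%s\n%s\n' "$whole" "$group"
```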