How do I perform xargs grep on grep output that has spaces?
Solution 1
Use something like this, perhaps (if you have GNU grep):
grep -r 'content pattern' --include='*.cpp'
man grep
--include=GLOB Search only files whose base name matches GLOB (using wildcard matching as described under --exclude)
Also see the options for null delimiters.
-Z, --null Output a zero byte (the ASCII NUL character) instead of the character that normally follows a file name. For example, grep -lZ outputs a zero byte after each file name instead of the usual newline. This option makes the output unambiguous, even in the presence of file names containing unusual characters like newlines. This option can be used with commands like find -print0, perl -0, sort -z, and xargs -0 to process arbitrary file names, even those that contain newline characters.
-z, --null-data Treat the input as a set of lines, each terminated by a zero byte (the ASCII NUL character) instead of a newline. Like the -Z or --null option, this option can be used with commands like sort -z to process arbitrary file names.
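The -z option quoted above suggests a way to keep the find | grep | xargs shape from the question while staying NUL-delimited from end to end. The following is a minimal sketch, assuming GNU grep (where -z applies to both the input and the output records) and a find and xargs that support -print0 and -0. The scratch directory, file names, and the patterns 'foo' and 'needle' are made up for the demonstration.

```shell
#!/bin/sh
# Scratch tree with spaces in the paths (hypothetical names, illustration only).
demo=$(mktemp -d)
mkdir -p "$demo/my project"
printf 'int needle = 1;\n' > "$demo/my project/foo widget.cpp"
printf 'int other = 3;\n'  > "$demo/my project/foo other.cpp"
printf 'int needle = 2;\n' > "$demo/bar.cpp"

# find emits NUL-terminated names, grep -z filters them as NUL-terminated
# records (GNU grep), and xargs -0 hands each name to the final grep intact.
matches=$(cd "$demo" && find . -name '*.cpp' -print0 |
    grep -z 'foo' |
    xargs -0 grep -l -- 'needle')

printf '%s\n' "$matches"   # prints: ./my project/foo widget.cpp
rm -rf "$demo"
```

Because no stage of the pipeline splits on whitespace or newlines, this variant should also cope with file names that contain newlines.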
Solution 2
If you have to jump through a lot of hoops, then the efficiency of xargs is lost anyway. Here is one crude workaround:
find . -iname "*.cpp" | grep "<pattern>" | while IFS= read -r x; do grep exa "$x"; done
Every time I run into problems with spaces in file names, the answer is double quotes on a variable.
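To see why the double quotes matter, here is a small sketch for a POSIX-style shell (sh, bash, dash). The helper count_args and the sample path are hypothetical; the helper only reports how many arguments it received.

```shell
#!/bin/sh
# Hypothetical path containing a space in the directory and in the file name.
x='my project/foo widget.cpp'

# Helper for illustration: print the number of arguments received.
count_args() { echo "$#"; }

# Unquoted: the shell splits the value on whitespace into three words.
unquoted=$(count_args $x)
# Quoted: the value is passed through as a single argument.
quoted=$(count_args "$x")

echo "unquoted=$unquoted quoted=$quoted"   # prints: unquoted=3 quoted=1
```

A command such as grep would treat each of the three unquoted words as a separate file name, which is the failure the quotes prevent.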
Solution 3
Use find to do all the filename filtering. Rather than
find . -name "*.cpp" | grep "foo" | xargs grep …
do
find . -name "*.cpp" -name "*foo*" -print0 | xargs -0 grep …
If you want to do something slightly more complicated, like
find . -name "*.cpp" | egrep "foo|bar" | xargs grep …
you can do
find . -name "*.cpp" "(" -name "*foo*" -o -name "*bar*" ")" -print0 | xargs -0 grep …
Note that these should work even for files with newlines in their names.
And, if you need the power of full-blown regular expressions, you can use -regex.
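As a rough sketch of the -regex route, the example below assumes GNU find, where -regextype posix-extended selects ERE syntax (BSD find uses -E instead) and -regex is matched against the whole path rather than just the base name. The scratch directory, file names, and patterns are invented for the demonstration.

```shell
#!/bin/sh
# Scratch tree with spaces in the paths (hypothetical names, illustration only).
demo=$(mktemp -d)
mkdir -p "$demo/src dir"
printf 'needle\n' > "$demo/src dir/foo one.cpp"
printf 'needle\n' > "$demo/src dir/bar two.cpp"
printf 'needle\n' > "$demo/src dir/baz three.cpp"

# -regex must match the entire path, so anchor the name part after the last
# slash: any .cpp file whose base name contains "foo" or "bar".
found=$(cd "$demo" && find . -regextype posix-extended \
        -regex '.*/[^/]*(foo|bar)[^/]*\.cpp' -print0 |
    xargs -0 grep -l -- 'needle' | sort)

printf '%s\n' "$found"
rm -rf "$demo"
```

Here the two files with "foo" or "bar" in their names are searched and baz three.cpp is skipped, with no intermediate grep on the file list.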
Solution 4
This should work even without GNU tools:
#Find all C++ files that match a certain pattern and then search them
find . -name "*.cpp" | grep "<name regex>" | perl -pe 's/\n/\0/' \
| xargs -0 grep "<content regex>"
The perl call replaces line breaks with null characters, which allows xargs -0 to interpret the input on a per-line basis rather than a per-whitespace basis.
With GNU xargs, you can remove the perl call and change xargs -0 … to xargs -d "\n" …
Don't have perl or GNU? Try awk '{printf "%s%c", $0, 0}' instead.
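Another possible stand-in for the newline-to-NUL step is tr, which is swapped in here in place of perl or awk; it is not part of the original answer. A minimal sketch, assuming an xargs that supports -0, with a made-up scratch directory, file names, and patterns:

```shell
#!/bin/sh
# Scratch tree with spaces in the paths (hypothetical names, illustration only).
demo=$(mktemp -d)
mkdir -p "$demo/a b"
printf 'needle\n' > "$demo/a b/foo impl.cpp"
printf 'hay\n'    > "$demo/a b/foo stub.cpp"
printf 'needle\n' > "$demo/bar.cpp"

# Filter names line by line, then turn each newline into a NUL byte so that
# xargs -0 splits only on NUL and leaves embedded spaces alone.
hits=$(cd "$demo" && find . -name '*.cpp' |
    grep 'foo' |
    tr '\n' '\000' |
    xargs -0 grep -l -- 'needle')

printf '%s\n' "$hits"   # prints: ./a b/foo impl.cpp
rm -rf "$demo"
```

Like the perl and awk variants, this still treats every newline as a record separator, so a file name that itself contains a newline would be split in two.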
quanticle
Updated on September 18, 2022
Comments
-
quanticle over 1 year
I'm searching for files based on a regular expression, and then I'm trying to search those files for content. So, for example, I have something like
#Find all C++ files that match a certain pattern and then search them
find . -name "*.cpp" | grep "<name regex>" | xargs grep "<content regex>"
The problem I'm running into is that some of the paths have spaces in them, which confuses xargs. I know that if I was just using find, I could use the -print0 argument (along with the -0 argument on xargs) to keep xargs from treating spaces as delimiters. Is there something similar with grep?
Or am I approaching this problem in the wrong way entirely? Naively, find to grep to xargs grep makes sense to me, but I'm open to other approaches that yield the same results.
-
dhag about 9 years
This may not do the right thing if some of the file names include newlines (a rather unusual occurrence, for sure, but not impossible).
-
Adam Katz about 9 years
@dhag has a valid point regarding xargs -d "\n". That's a very unusual occurrence, but if you don't have control of the data and are worried about it being a security risk, be careful about output expectations.
-
mikeserv about 9 years
Note that grep -r --include='*.cpp' takes a shell glob, and so is feature-aligned with
find . -name '*.cpp' -exec grep -e 'content_pattern' -- {} \;
and not with
find . -name '*.cpp' | grep 'name_pattern' | xargs grep 'content_pattern'
-
Adam Katz almost 6 years
This runs the loop's inner grep uniquely for each line found by the outer grep. That's a lot of overhead.
-
Ganton about 2 years
As usual, beware if, e.g., some file names have any carriage return inside (see "How can I find and safely handle file names containing newlines, spaces or both?", grep -Z, etc.)