How do I perform xargs grep on grep output that has spaces?


Solution 1

If you have GNU grep, something like this may work:

grep -r 'content pattern' --include='*.cpp' .

From man grep:

--include=GLOB Search only files whose base name matches GLOB (using wildcard matching as described under --exclude)

Also see the options for null delimiters.

-Z, --null Output a zero byte (the ASCII NUL character) instead of the character that normally follows a file name. For example, grep -lZ outputs a zero byte after each file name instead of the usual newline. This option makes the output unambiguous, even in the presence of file names containing unusual characters like newlines. This option can be used with commands like find -print0, perl -0, sort -z, and xargs -0 to process arbitrary file names, even those that contain newline characters.

-z, --null-data Treat the input as a set of lines, each terminated by a zero byte (the ASCII NUL character) instead of a newline. Like the -Z or --null option, this option can be used with commands like sort -z to process arbitrary file names.
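
As a rough sketch of how these null-delimiter options could fit the original pipeline: assuming GNU grep (for -z) and a find and xargs that support -print0 and -0, the name filter can stay in grep while the paths remain NUL-terminated from end to end. The scratch directory, file names, and patterns below are made up for illustration.

```shell
# Scratch directory with hypothetical file names, one of which contains spaces.
demo=$(mktemp -d)
mkdir -p "$demo/my project"
printf 'int main() { return 0; }\n' > "$demo/my project/foo widget.cpp"
printf 'int main() { return 0; }\n' > "$demo/my project/bar.cpp"
printf 'void helper();\n' > "$demo/foo_other.cpp"

# find emits NUL-terminated paths, grep -z (GNU) filters them by name while
# keeping the NUL terminators, and xargs -0 passes each path as one argument.
matches=$(find "$demo" -name '*.cpp' -print0 \
  | grep -z '/foo[^/]*$' \
  | xargs -0 grep -l 'main')
printf '%s\n' "$matches"
```

Only the file whose base name starts with foo and whose content mentions main should be listed, spaces and all.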

Solution 2

If you have to jump through a lot of hoops, then the efficiency of xargs is lost anyway. Here is one crude workaround:

find . -iname "*.cpp" | grep "<pattern>" | while IFS= read -r x; do grep exa "$x"; done

Every time I run into problems with spaces in file names, the answer is double quotes on a variable.
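
To illustrate the quoting point, here is a minimal sketch of the same kind of loop run against a throwaway directory. The file names and the 'needle' pattern are hypothetical. It still splits on newlines, so it copes with spaces but not with newlines in file names.

```shell
# Scratch directory with hypothetical file names containing spaces.
demo2=$(mktemp -d)
printf 'needle\n' > "$demo2/has space foo.cpp"
printf 'nothing\n' > "$demo2/plain foo.cpp"

# IFS= preserves leading and trailing blanks, -r keeps backslashes literal,
# and the double quotes around "$path" stop the shell from splitting on spaces.
hits=$(find "$demo2" -name '*.cpp' | grep 'foo' | while IFS= read -r path; do
  # grep exits 1 when a file has no match; that is not an error here.
  grep -l 'needle' "$path" || true
done)
printf '%s\n' "$hits"
```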

Solution 3

Use find to do all the filename filtering. Rather than

find . -name "*.cpp" | grep "foo" | xargs grep …

do

find . -name "*.cpp" -name "*foo*" -print0 | xargs -0 grep …

If you want to do something slightly more complicated, like

find . -name "*.cpp" | egrep "foo|bar" | xargs grep …

you can do

find . -name "*.cpp" "(" -name "*foo*" -o -name "*bar*" ")" -print0 | xargs -0 grep …

Note that these should work even for files with newlines in their names.

And, if you need the power of full-blown regular expressions, you can use -regex.
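
A sketch of what the -regex route might look like, assuming GNU find (the -regextype option is a GNU extension; BSD find uses -E instead). Note that -regex has to match the whole path, not just the base name. The directory, file names, and TODO pattern are invented for the example.

```shell
# Scratch directory with hypothetical file names.
demo3=$(mktemp -d)
mkdir -p "$demo3/src dir"
printf 'TODO: fix\n' > "$demo3/src dir/foo parser.cpp"
printf 'TODO: fix\n' > "$demo3/src dir/baz.cpp"
printf 'done\n' > "$demo3/bar_lexer.cpp"

# The regex is anchored to the full path: any directory prefix, then a base
# name containing foo or bar and ending in .cpp. -print0 and xargs -0 keep
# each path intact even when it contains spaces.
found=$(find "$demo3" -regextype posix-extended \
  -regex '.*/[^/]*(foo|bar)[^/]*\.cpp' -print0 \
  | xargs -0 grep -l 'TODO')
printf '%s\n' "$found"
```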

Solution 4

This should work even without GNU tools:

#Find all C++ files that match a certain pattern and then search them
find . -name "*.cpp" | grep "<name regex>" | perl -pe 's/\n/\0/' \
  | xargs -0 grep "<content regex>"

The perl call replaces line breaks with null characters, which will allow xargs -0 to interpret the input on a per-line basis rather than a per-whitespace basis.

With GNU xargs, you can drop the perl call and change xargs -0 … to xargs -d "\n" …
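
For the GNU variant, a minimal sketch might look like this, assuming GNU xargs (the -d option is not in POSIX). The file names and patterns are hypothetical, and names containing newlines would still be split.

```shell
# Scratch directory with hypothetical file names containing spaces.
demo4=$(mktemp -d)
printf 'class Widget {};\n' > "$demo4/foo widget.cpp"
printf 'class Gadget {};\n' > "$demo4/foo gadget.cpp"

# xargs -d '\n' treats each input line as exactly one argument, so spaces in
# a path no longer split it into several arguments.
result=$(find "$demo4" -name '*.cpp' | grep 'foo' | xargs -d '\n' grep -l 'Widget')
printf '%s\n' "$result"
```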

Don't have perl or GNU? Try awk '{printf "%s%c", $0, 0}' instead.


Author: quanticle

Updated on September 18, 2022

Comments

  • quanticle
    quanticle over 1 year

    I'm searching for files based on a regular expression, and then I'm trying to search those files for content. So, for example, I have something like

    #Find all C++ files that match a certain pattern and then search them
    find . -name "*.cpp" | grep "<name regex>" | xargs grep "<content regex>"
    

    The problem I'm running into is that some of the paths have spaces in them, which confuses xargs. I know that if I was just using find, I could use the -print0 argument (along with the -0 argument on xargs) to keep xargs from treating spaces as delimiters. Is there something similar with grep?

    Or am I approaching this problem in the wrong way entirely? Naively, find to grep to xargs grep makes sense to me, but I'm open to other approaches that yield the same results.

  • dhag
    dhag about 9 years
    This may not do the right thing if some of the file names include newlines (a rather unusual occurrence, for sure, but not impossible).
  • Adam Katz
    Adam Katz about 9 years
    @dhag has a valid point regarding xargs -d "\n". That's a very unusual occurrence, but if you don't have control of the data and are worried about it being a security risk, be careful about output expectations.
  • mikeserv
    mikeserv about 9 years
    Note that grep -r --include='*.cpp' is a shell glob - and so is feature-aligned w/ find . -name '*.cpp' -exec grep -e 'content_pattern' -- {} \; not w/ find . -name '*.cpp' | grep 'name_pattern' | xargs grep 'content_pattern'
  • Adam Katz
    Adam Katz almost 6 years
    This runs the loop's inner grep uniquely for each line found by the outer grep. That's a lot of overhead.
  • Ganton
    Ganton about 2 years
    As usual, beware if e.g. some file names have any carriage return inside (How can I find and safely handle file names containing newlines, spaces or both?, grep -Z, etc.)