How to search in specific files matching a pattern

Solution 1

If you want to see the file name and line number, POSIXly:

find . -name 'CMake*' -type f -exec grep -nF version /dev/null {} +

(You don't want to use ; here, as that would run one grep per file.) That's the standard equivalent of the GNUism:

find . -name 'CMake*' -type f -print0 | xargs -r0 grep -nHF version

find (in the first) and xargs (in the second) will pass as many arguments to grep as possible without exceeding the limit on the number of arguments you can pass to a command. When doing that splitting, it could happen that only one file is passed to grep in the last run, in which case grep wouldn't print the file name. That's why you need /dev/null there (or -H with GNU grep).
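The effect of /dev/null is easy to check on a throwaway file. Here is a minimal sketch, assuming a scratch directory made with mktemp -d (not POSIX, but widely available); the file name and variable names are only illustrative:

```shell
# Scratch directory with one file that contains the search string.
demo=$(mktemp -d)
printf 'cmake version 3\n' > "$demo/CMakeLists.txt"

# One file operand: grep prints the line number but no file name.
single=$(cd "$demo" && grep -nF version CMakeLists.txt)

# /dev/null is a second operand that can never match, so grep now
# prefixes each hit with the name of the file it came from.
with_null=$(cd "$demo" && grep -nF version /dev/null CMakeLists.txt)

printf '%s\n' "$single" "$with_null"
rm -rf "$demo"
```

The first line printed is 1:cmake version 3, the second is CMakeLists.txt:1:cmake version 3.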

With -type f, we're only considering regular files (not devices, symlinks, pipes, directories...).
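To see what -type f filters out, here is a small sketch with a directory and a regular file that both match CMake*. The names CMakeFiles and CMakeLists.txt are just examples, and mktemp -d is assumed to be available:

```shell
demo=$(mktemp -d)
mkdir "$demo/CMakeFiles"                       # directory whose name matches
printf 'version 1\n' > "$demo/CMakeLists.txt"  # regular file whose name matches

# Without -type f, both the directory and the file are selected.
all=$(cd "$demo" && find . -name 'CMake*' | sort)

# With -type f, only the regular file is left for grep to read.
files=$(cd "$demo" && find . -name 'CMake*' -type f)

printf 'all:\n%s\nregular files:\n%s\n' "$all" "$files"
rm -rf "$demo"
```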

If you want to use GNUisms, you could use GNU grep's ability to descend a directory tree:

grep -rHnF --include='CMake*' version .

You don't want to use -R as that causes grep to follow symlinks when descending the directory tree and read from devices, fifos, sockets...

That version is safer and more efficient, but not portable.
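A quick way to check the --include filter, assuming GNU grep (other implementations may lack the option or behave differently) and a scratch directory from mktemp -d; the file names are made up for the test:

```shell
demo=$(mktemp -d)
mkdir "$demo/sub"
printf 'version 1\n' > "$demo/sub/CMakeLists.txt"
printf 'version 2\n' > "$demo/sub/notes.txt"

# --include limits the recursive search to files whose base name
# matches the glob, so notes.txt is skipped even though it matches.
hits=$(cd "$demo" && grep -rHnF --include='CMake*' version .)

printf '%s\n' "$hits"
rm -rf "$demo"
```

Only ./sub/CMakeLists.txt:1:version 1 should be reported.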

Solution 2

Use find to select the files then grep to search in the files:

find . -name "CMake*" -print0 | xargs -0 grep -F version

By using xargs, you avoid starting a separate grep for each and every file found.
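The batching can be made visible by putting echo in front of grep, so that xargs prints each command line it would run instead of running it. This is a sketch with three hard-coded names standing in for the output of find -print0; -0 is not POSIX but is supported by GNU and BSD xargs:

```shell
# Three NUL-terminated names, as find -print0 would produce them.
# xargs hands them all to a single command invocation.
batched=$(printf '%s\0' CMake1 CMake2 CMake3 | xargs -0 echo grep -F version)

# -n 1 forces one invocation per name, which is the cost xargs normally avoids.
one_each=$(printf '%s\0' CMake1 CMake2 CMake3 | xargs -0 -n 1 echo grep -F version)

printf '%s\n' "$batched" "$one_each"
```

The first form prints one command line listing all three files; the second prints three command lines, one per file.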

Solution 3

You can also do this without enlisting the help of find, by making use of grep's ability to recurse the filesystem. Assuming your version of grep has the -R switch:

$ grep -R version | awk -F: '/CMake.*:/ {print $1}'

Example

I created some sample data which included files named CMake1-3 which contained the string version, as well as a file named CMake0 which did not. I also created 2 files named afile that also contained the string version.

$ tree .
.
|-- CMake1
|-- dir1
|   |-- dirA1
|   |   `-- CMake1
|   |-- dirA2
|   `-- dirA3
|-- dir2
|   |-- dirA1
|   |   `-- CMake0
|   |-- dirA2
|   |   |-- afile
|   |   `-- CMake2
|   `-- dirA3
`-- dir3
    |-- dirA1
    |-- dirA2
    |   `-- afile
    `-- dirA3
        `-- CMake3

Now when I run the above command:

$ grep -R version | awk -F: '/CMake.*:/ {print $1}'
CMake1
dir2/dirA2/CMake2
dir3/dirA3/CMake3
dir1/dirA1/CMake1

Details

The above command produces a list like this from grep:

$ grep -R version 
CMake1:version
dir2/dirA2/CMake2:version
dir2/dirA2/afile:version
dir3/dirA2/afile:version
dir3/dirA3/CMake3:version
dir1/dirA1/CMake1:version

awk is then used to find all the lines that match CMake.*:, to split them on a colon (:), and to return just the first field from this split, i.e., the path of the corresponding CMake* file.
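The awk step can be tried on its own, without any files on disk. This sketch feeds it three lines in the path:match shape shown in the grep listing above; the variable names are illustrative:

```shell
# Three lines in the shape that grep -R prints.
sample='CMake1:version
dir2/dirA2/afile:version
dir3/dirA3/CMake3:version'

# -F: splits each line on colons; /CMake.*:/ keeps only lines where
# CMake appears somewhere before a colon; $1 is the path field.
paths=$(printf '%s\n' "$sample" | awk -F: '/CMake.*:/ {print $1}')

printf '%s\n' "$paths"
```

This prints CMake1 and dir3/dirA3/CMake3. Note that the pattern is applied to the whole line, so a non-CMake file inside a directory whose name contains CMake would also be kept, and a path containing a colon would be split in the wrong place.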

2 grep's + PCRE

More modern versions of grep will often include what's called PCRE - Perl Compatible Regular Expressions. So you could use 2 grep commands, with the 2nd one using PCRE to extract just the path portion of each line, omitting the trailing :version bit of the string from the 1st grep.

Example

$ grep -R version | grep -Po '.*CMake.*(?=:version)'
CMake1
dir2/dirA2/CMake2
dir3/dirA3/CMake3
dir1/dirA1/CMake1

The -o will return just the matching portion, while -P is what enables PCRE. Within the regular expression we're using a lookahead, (?=...), at the end to select only strings that have a trailing :version. This lookahead is only to help align our pattern, it isn't included in the results we return, nor is it part of the actual pattern we're grepping for, i.e. CMake.*.
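The lookahead can be tested in isolation on a few hand-written lines. This is a sketch that assumes a grep built with PCRE support (-P is a GNU extension and is not available everywhere):

```shell
# Three lines in the shape that the first grep prints.
sample='CMake1:version
dir2/dirA2/afile:version
dir3/dirA3/CMake3:version'

# -P enables PCRE and -o prints only the matched text. The lookahead
# (?=:version) has to succeed, but what it matches is not printed.
paths=$(printf '%s\n' "$sample" | grep -Po '.*CMake.*(?=:version)')

printf '%s\n' "$paths"
```

The afile line is dropped because it has no CMake in it, and the two remaining lines are printed without their :version suffix.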

Line numbers

You can also add the switch -n to the first grep so that the line number where the string version was encountered is included in the output.

Example

$ grep -Rn version | grep -Po '.*CMake.*(?=:version)'
CMake1:1
dir2/dirA2/CMake2:9
dir3/dirA3/CMake3:1
dir1/dirA1/CMake1:1

To make the first example work you'd need to change the awk command slightly:

$ grep -Rn version | awk -F: '/CMake.*:/ {print $1":"$2}'  
CMake1:1
dir2/dirA2/CMake2:9
dir3/dirA3/CMake3:1
dir1/dirA1/CMake1:1

The first example gives you the opportunity though to move the line number around, since we've already parsed it into a separate field via awk.

Example

Here we can put the number first.

$ grep -Rn version | awk -F: '/CMake.*:/ {print $2":"$1}'  
1:CMake1
9:dir2/dirA2/CMake2
1:dir3/dirA3/CMake3
1:dir1/dirA1/CMake1

Solution 4

find . -name "CMake*" -exec grep -H version {} \;

Solution 5

Use a combination of find and grep:

find . -name "CMake*" -exec grep version {} \;

The example searches recursively from current directory and executes grep on the matching files.
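Because the -exec clause ends in \;, find starts one grep per matching file, which is also why grep prints no file names here unless -H or /dev/null is added. The difference from the + form can be counted with a stand-in command. This is a sketch using three sample files in a scratch directory from mktemp -d, with sh -c 'echo run' in place of grep so that each invocation prints exactly one line:

```shell
demo=$(mktemp -d)
for n in 1 2 3; do printf 'version %s\n' "$n" > "$demo/CMake$n"; done

# With \; the command runs once per file: three files, three runs.
per_file=$(cd "$demo" && find . -name 'CMake*' -type f -exec sh -c 'echo run' sh {} \; | wc -l | tr -d ' ')

# With + the files are batched into as few runs as possible: here, one.
batched=$(cd "$demo" && find . -name 'CMake*' -type f -exec sh -c 'echo run' sh {} + | wc -l | tr -d ' ')

printf 'runs with \\; : %s\nruns with +  : %s\n' "$per_file" "$batched"
rm -rf "$demo"
```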

Author: B Faley

Updated on September 18, 2022

Comments

  • B Faley over 1 year
    How can I find a word in specific files matching a pattern, e.g. searching for version in CMake* files found recursively under the current directory?

  • B Faley over 10 years
    Thank you. Is it possible to have the file name and line number in which the word was found in the search result as well?

  • slm over 10 years
    @Meysam - sure, you can turn on the switch -n, which will include the line number. The examples I've provided should still work with this option turned on: grep -Rn version | ....