Recursively search a pattern/text only in the specified file name of a directory?
Solution 1
In the parent directory, you could use find and then run grep on only those files:
find . -type f -iname "file.txt" -exec grep -Hi "pattern" '{}' +
Solution 2
You could also use globstar.
Building grep commands with find, as in Zanna's answer, is a highly robust, versatile, and portable way to do this (see also sudodus's answer). And muru has posted an excellent approach of using grep's --include option. But if you want to use just the grep command and your shell, there is another way to do it. You can make the shell itself perform the necessary recursion:
shopt -s globstar # you can skip this if you already have globstar turned on
grep -H 'pattern' **/file.txt
The -H flag makes grep show the filename even if only one matching file is found. You can pass the -a, -i, and -n flags (from your example) to grep as well, if that's what you need. But don't pass -r or -R when using this method. It is the shell that recurses directories in expanding the glob pattern containing **, and not grep.
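To see that the shell, not grep, does the recursion, here is a minimal sketch. The scratch directory and the files in it are invented for illustration, and it assumes Bash 4 or later (where globstar is available).

```shell
#!/usr/bin/env bash
# Hypothetical scratch tree, for illustration only.
demo=$(mktemp -d)
mkdir -p "$demo/a/b" "$demo/c"
echo 'a pattern here'  > "$demo/a/b/file.txt"
echo 'nothing to see'  > "$demo/c/file.txt"
echo 'pattern, but in the wrong file' > "$demo/c/other.txt"
cd "$demo" || exit 1

shopt -s globstar

# The shell expands the glob before grep ever runs;
# echo shows the argument list grep will receive.
expanded=$(echo **/file.txt)
echo "grep receives: $expanded"

# No -r or -R: grep just searches the files it was handed.
matches=$(grep -H 'pattern' **/file.txt)
printf '%s\n' "$matches"
```

Only files named file.txt are searched, so the match in other.txt is not reported.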
These instructions are specific to the Bash shell. Bash is the default user shell in Ubuntu (and most other GNU/Linux operating systems), so if you're on Ubuntu and don't know what your shell is, it's almost certainly Bash. Although popular shells usually support directory-traversing ** globs, they don't always work the same way. For more information, see Stéphane Chazelas's excellent answer to "The result of ls * , ls ** and ls ***" on Unix.SE.
How It Works
Turning on the globstar bash shell option makes ** match paths containing the directory separator (/). It is thus a directory-recursing glob. Specifically, as man bash explains:
When the globstar shell option is enabled, and * is used in a pathname expansion context, two adjacent *s used as a single pattern will match all files and zero or more directories and subdirectories. If followed by a /, two adjacent *s will match only directories and subdirectories.
You should be careful with this, since you can run commands that modify or delete far more files than you intend, especially if you write ** when you meant to write *. (It's safe in this command, which doesn't change any files.) shopt -u globstar turns the globstar shell option back off.
There are a few practical differences between globstar and find.
find is far more versatile than globstar. Anything you can do with globstar, you can do with the find command too. I like globstar, and sometimes it's more convenient, but globstar is not a general alternative to find.
The method above does not look inside directories whose names start with a . (dot). Sometimes you don't want to recurse such folders, but sometimes you do.
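If you do want hidden directories included, one option is Bash's dotglob shell option, which makes globs (including **) match names that begin with a dot. A minimal sketch, assuming Bash 4 or later and an invented scratch tree:

```shell
#!/usr/bin/env bash
# Hypothetical scratch tree, for illustration only.
demo=$(mktemp -d)
mkdir -p "$demo/visible" "$demo/.hidden"
echo 'pattern in visible dir' > "$demo/visible/file.txt"
echo 'pattern in hidden dir'  > "$demo/.hidden/file.txt"
cd "$demo" || exit 1

shopt -s globstar

# Default behaviour: ** skips directories whose names start with a dot.
without=$(grep -H 'pattern' **/file.txt)
printf 'without dotglob:\n%s\n' "$without"

# With dotglob on, ** also descends into .hidden.
shopt -s dotglob
with=$(grep -H 'pattern' **/file.txt)
printf 'with dotglob:\n%s\n' "$with"

shopt -u dotglob   # turn it back off so later globs behave as usual
```

Since dotglob affects every glob in the session, it is probably worth switching it off again once you are done.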
As with an ordinary glob, the shell builds a list of all matching paths and passes them as arguments to your command (grep) in place of the glob itself. If you have so many files called file.txt that the resulting command would be too long for the system to execute, then the method above will fail. In practice you'd need (at least) thousands of such files, but it could happen.
The methods that use find are not subject to this restriction, because:
- Zanna's way builds and runs a grep command with potentially many path arguments. But if more files are found than can be listed in a single command, the +-terminated -exec action runs the command with some of the paths, then runs it again with some more paths, and so forth. In the case of grepping for a string in multiple files, this produces the correct behavior. Like the globstar method covered here, this prints all matching lines, with paths prepended to each.
- sudodus's way runs grep separately for each file.txt found. If there are many files, it might be slower than some other methods, but it works. That method finds files and prints their paths, followed by matching lines if any. This is a different output format from the format produced by my method, Zanna's, and muru's.
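The difference between the two -exec terminators can be made visible by counting how often the command is started. This is only a sketch: the scratch tree is invented, and a small sh -c command that prints one line per invocation stands in for grep.

```shell
# Hypothetical scratch tree with three files named file.txt.
demo=$(mktemp -d)
for i in 1 2 3; do
  mkdir -p "$demo/$i"
  echo "pattern $i" > "$demo/$i/file.txt"
done
cd "$demo" || exit 1

# 'echo run' prints one line each time find starts the command.
# With +, find packs as many paths as fit into each invocation.
batched=$(find . -name file.txt -type f -exec sh -c 'echo run' sh {} + | wc -l | tr -d ' ')

# With \;, find starts the command once per file found.
single=$(find . -name file.txt -type f -exec sh -c 'echo run' sh {} \; | wc -l | tr -d ' ')

printf 'terminated by +: %s run(s); terminated by ;: %s run(s)\n' "$batched" "$single"
```

With only three short paths everything fits into one batch; find would only split the work into several runs once the argument list approached the system limit (which getconf ARG_MAX reports).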
Getting color with find
One of the immediate benefits of using globstar is that, by default on Ubuntu, grep will produce colorized output. But you can easily get this with find, too.
User accounts in Ubuntu are created with an alias that makes grep really run grep --color=auto (run alias grep to see). It's a good thing that aliases are pretty much only expanded when you issue them interactively, but it means that if you want find to invoke grep with the --color flag, you'll have to write it explicitly. For example:
find . -name file.txt -exec grep --color=auto -H 'pattern' {} +
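One caveat, sketched below under the assumption that GNU grep is in use (as on Ubuntu): --color=auto colors output only when it goes to a terminal, so the colors disappear as soon as you pipe or capture the result. GNU grep's --color=always forces them. The scratch tree is invented for illustration.

```shell
# Hypothetical scratch tree, for illustration only.
demo=$(mktemp -d)
mkdir -p "$demo/sub"
echo 'a pattern here' > "$demo/sub/file.txt"
cd "$demo" || exit 1

# Captured output is not a terminal, so --color=auto emits plain text.
auto=$(find . -name file.txt -exec grep --color=auto -H 'pattern' {} +)

# --color=always emits the escape sequences regardless of where output goes.
always=$(find . -name file.txt -exec grep --color=always -H 'pattern' {} +)

printf '%s\n' "$auto"
printf '%s\n' "$always" | cat -v   # cat -v makes the escape sequences visible
```

Forcing color is mainly useful when piping into something that understands the escapes, such as less -R; otherwise --color=auto is usually the safer choice.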
Solution 3
You don't need find for this; grep can handle this perfectly fine on its own:
grep "pattern" . -airn --include="file.txt"
From man grep:
--exclude=GLOB
Skip files whose base name matches GLOB (using wildcard
matching). A file-name glob can use *, ?, and [...] as
wildcards, and \ to quote a wildcard or backslash character
literally.
--exclude-from=FILE
Skip files whose base name matches any of the file-name globs
read from FILE (using wildcard matching as described under
--exclude).
--exclude-dir=DIR
Exclude directories matching the pattern DIR from recursive
searches.
--include=GLOB
Search only files whose base name matches GLOB (using wildcard
matching as described under --exclude).
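As a quick check of how --include narrows a recursive search, here is a minimal sketch. It assumes GNU grep, and the scratch tree is invented for illustration.

```shell
# Hypothetical scratch tree, for illustration only.
demo=$(mktemp -d)
mkdir -p "$demo/1" "$demo/2"
echo 'Pattern one'       > "$demo/1/file.txt"
echo 'pattern elsewhere' > "$demo/1/notes.txt"
echo 'no match in here'  > "$demo/2/file.txt"
cd "$demo" || exit 1

# -r recurses, -i ignores case, -n prints line numbers;
# --include limits the search to files whose base name is file.txt.
result=$(grep -rin --include='file.txt' 'pattern' .)
printf '%s\n' "$result"
```

notes.txt contains the pattern too, but it is skipped because its base name does not match the glob.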
Solution 4
The method given in muru's answer, of running grep with the --include flag to specify a filename, is often the best choice. However, this can also be done with find.
The approach in this answer uses find to run grep separately for each file found, and prints the path to each file exactly once, above the matching lines found in each file. (Methods that print the path in front of every matching line are covered in other answers.)
You can change directory to the top of the directory tree where you have those files. Then run:
find . -name "file.txt" -type f -exec echo "##### {}:" \; -exec grep -i "pattern" {} \;
That prints the path (relative to the current directory, ., and including the filename itself) of each file named file.txt, followed by all matching lines in the file. This works because {} is a placeholder for the file found. Each file's path is set apart from its contents by being prefixed with #####, and is printed only once, before the matching lines from that file. (Files called file.txt that contain no matches still have their paths printed.) You might find this output less cluttered than what you get from methods that print a path at the beginning of every matching line.
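If the headings for files without matches bother you, one possible variation (not part of the original answer) is to put a quiet grep in front as a test. find evaluates its actions left to right and stops at the first one that fails, so the heading is printed only when grep -q succeeds. A sketch with an invented scratch tree:

```shell
# Hypothetical scratch tree, for illustration only.
demo=$(mktemp -d)
mkdir -p "$demo/1" "$demo/2"
echo 'a Pattern here'   > "$demo/1/file.txt"
echo 'nothing relevant' > "$demo/2/file.txt"
cd "$demo" || exit 1

# 1st -exec: grep -q succeeds only if the file contains a match.
# 2nd -exec: print the heading (reached only for matching files).
# 3rd -exec: print the matching lines themselves.
result=$(find . -name file.txt -type f \
  -exec grep -qi 'pattern' {} \; \
  -exec printf '##### %s:\n' {} \; \
  -exec grep -i 'pattern' {} \;)
printf '%s\n' "$result"
```

The cost is that each matching file is read twice, which is unlikely to matter unless the files are large.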
Using find like this will almost always be faster than running grep on every file (grep -arin "pattern" *), because find searches for the files with the correct name and skips all other files.
Ubuntu uses GNU find, which always expands {} even when it appears in a larger string, like "##### {}:". If you need your command to work with find on systems that might not support this, or you prefer to use the -exec action only when absolutely necessary, you can use:
find . -name "file.txt" -type f -printf '##### %p:\n' -exec grep -i "pattern" {} \;
To make the output easier to read, you can use ANSI escape sequences to get coloured file names. This makes each file's path heading stand out better from the matching lines that get printed under it:
find . -name file.txt -printf $'\e[32m%p:\e[0m\n' -exec grep -i "pattern" {} \;
That causes your shell to turn the escape code for green into the actual escape sequence that produces green in a terminal, and to do the same thing with the escape code for normal colour. These escapes are passed to find, which uses them when it prints a filename. ($'...' quotation is necessary here because find's -printf action doesn't recognize \e for interpreting ANSI escape codes.)
If you prefer, you could instead use -exec with the system's printf command (which does support \e). So another way to do the same thing is:
find . -name file.txt -exec printf '\e[32m%s:\e[0m\n' {} \; -exec grep -i "pattern" {} \;
Rajesh Keladimath
Updated on September 18, 2022
Comments
-
Rajesh Keladimath over 1 year
I have a directory (e.g., abc/def/efg) with many sub-directories (e.g., abc/def/efg/(1..300)). All of these sub-directories have a common file (e.g., file.txt). I want to search a string only in this file.txt, excluding other files. How can I do this?
I used grep -arin "pattern" *, but it is very slow if we have many sub-directories and files.
-
Eliah Kagan over 7 years
Related (on Unix & Linux): find and echo file names only with pattern found
-
Eliah Kagan over 7 years
I suggest also passing -H to grep so that, in cases when only one path is passed to it, that path is still printed (rather than just the matching lines from the file).
-
kcdtv over 7 years
I was going to make a "for loop" with an array and I didn't think about find's native -exec option. Good one! But I think that using dot will locate you in the directory where you already are. Correct me if I am wrong. Wouldn't it be better to specify the directory to parse in the find command?
find abc/def/efg -name "file.txt" -type f -exec echo -e "##### {}:" \; -exec grep -i "pattern" {} \;
-
sudodus over 7 years
Sure, that will eliminate the cd abc/def/efg 'change directory' command :-)
-
Eliah Kagan over 7 years
Nice--this seems like the best way. Simple and efficient. I wish I had known about (or thought to check the manpage for) this method. Thanks!
-
muru over 7 years
@EliahKagan I'm more surprised Zanna didn't post this - I had shown an example of this option for another answer some time ago. :)
-
Zanna over 7 years
slow learner, alas, but I get there eventually, your teachings aren't completely wasted on me ;)
-
G-Man Says 'Reinstate Monica' over 7 years
(1) Why are you specifying the -e option to echo? That will cause it to mangle any filenames that contain backslashes. (2) Using {} as part of an argument is not guaranteed to work. It would be better to say -exec echo "#####" {} \; or -exec printf "##### %s:\n" {} \;. (3) Why not just use -print or -printf? (4) Consider also grep -H.
-
sudodus over 7 years
@G-Man, 1) Because I used ANSI colour originally:
find . -name "file.txt" -type f -exec echo -e "\0033[32m{}:\0033[0m" \; -exec grep -i "pattern" {} \;
2) You may be right, but so far this is working for me. 3) -print and -printf are also alternatives. 4) This is already there in the main answer. Anyway, you are welcome with your own answer :-)
-
Rajesh Keladimath over 7 years
This is very simple and easy to remember. Thank you.
-
sudodus over 7 years
I agree that this is the best answer. Should I remove my answer to decrease confusion, or let it stay to show that there are alternatives, and what can be done with find?
-
muru over 7 years
@sudodus I don't see any reason for deleting your answer - it's not wrong or harmful or anything bad. It is informative, so keep it.
-
terdon over 7 years
You don't need the two -exec calls. Just use grep -H and that will print the file name (in color) as well as the matched text.
-
sudodus over 7 years
I know, read the first four lines of my answer (the 'Edit' section)!
-
Stig Hemmer over 7 years
You might want to state more clearly that you need to be using the bash shell for this to work. You do say it implicitly in "the globstar bash shell option" but it can be easily missed by people reading too quickly.
-
sudodus over 7 years
I removed my answer because it caused a lot of critical comments. So you should remove the reference to it in your answer.
-
Eliah Kagan over 7 years
@StigHemmer Thanks -- I've clarified that not all shells have this feature. Although many shells (not just bash) do support directory-traversing ** globs, your core critique is correct: the presentation of ** in this answer is specific to bash, with shopt being bash only and the term "globstar" being (I think) bash and tcsh only. I'd glossed over this originally because of those complexities, but you're right that it's somewhat confusing. Rather than discuss it at length in this answer, I've linked to another (quite thorough) post that does the heavy lifting.
-
Eliah Kagan over 7 years
@sudodus I've done so, but I hope this is temporary. I, and others, have found your answer valuable. It's true -e shouldn't be applied to paths, but this is easily fixed. For the first command, just omit -e. For the second, use find . -name file.txt -printf $'\e[32m%p:\e[0m\n' -exec grep -i "pattern" {} \; or find . -name file.txt -exec printf '\e[32m%s:\e[0m\n' {} \; -exec grep -i "pattern" {} \;. Users will sometimes prefer your way (with -e usage fixed) to the others, which print one path per matching line; yours prints one path per file found followed by grep results.
-
Eliah Kagan over 7 years
@sudodus So grep itself won't do what you're doing. Some other criticisms were wrong too. grep -H run by -exec won't colorize without --color (or GREP_COLOR). IEEE 1003.1-2008 doesn't guarantee {} expands in "##### {}:", but Ubuntu has GNU find, which does. If it's OK with you I'll edit your post to fix the -e bug (and clarify its use case) and you can see if you want to undelete. (I have the rep to view/edit deleted posts.)
-
sudodus over 7 years
OK, go ahead :-)
-
Eliah Kagan over 7 years
@sudodus I've finally edited your answer. Since the post will remain deleted until you choose to undelete it, I went ahead and made significant changes, with the hope of showing the method really is valuable and distinct from others. You should definitely feel free to apply your own edits if you want to say all or part of it differently, take anything out, put more in, etc. (You can also, if you prefer, view my edit as just an example of a possible edit, roll it back, and start anew. My efforts still won't have been wasted. And everything can still be retrieved from the post's edit history.)
-
Eliah Kagan over 7 years
@sudodus No problem! I'm glad this is back -- it's good to have an answer that covers this method.
-
Zanna over 7 years
Looks great now :D
-
muru over 7 years
@EliahKagan woah, thanks! But why bounty this post? It was getting decent attention. :) Now I have to find another post to bounty on.
-
user867560 over 6 years
@muru: very cool! +1
-
rsmets over 3 years
Generalized as the shell function:
findAndGrep() { find . -type f -iname "$1" -exec grep -Hi "$2" '{}' +; }