How to recursively find a .doc file that contains a specific word?
Solution 1
Use find for recursive searches:
find -name '*.doc' -exec catdoc {} + | grep "specificword"
This will also output the file name:
find -name '*.doc' | while IFS= read -r file; do
    catdoc "$file" | grep -H --label="$file" "specificword"
done
(Normally I would use find ... -print0 | while read -rd "" file, but there's maybe a .0001% chance that it would be necessary, so I stopped caring.)
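For completeness, the null-delimited variant mentioned above could look roughly like the sketch below. It is wrapped in a function so it can be reused; the function name search_docs and the DOC2TXT override are hypothetical names introduced here for illustration (DOC2TXT defaults to catdoc), and --label assumes GNU grep, as in the loop above.

```shell
#!/usr/bin/env bash
# search_docs DIR WORD
# Print "file:matching line" for every .doc file below DIR whose text contains WORD.
# DOC2TXT is a hypothetical override for the converter command (default: catdoc).
search_docs() {
    local dir=$1 word=$2 file
    # -print0 and read -d '' keep file names with spaces or newlines intact.
    find "$dir" -type f -name '*.doc' -print0 |
        while IFS= read -r -d '' file; do
            # Redirect stdin so the converter cannot consume find's output.
            "${DOC2TXT:-catdoc}" "$file" </dev/null |
                grep -H --label="$file" -- "$word"
        done
}
```

A call such as search_docs . "specificword" searches the current directory tree. The exit status is that of the last grep, so it is not a reliable "found anything" signal.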
Solution 2
You might want to look at recoll which is a full-text search tool for Linux and Unix systems supporting many different document formats. However, it is index-based, i.e., it has to index the documents you want to search in before the actual search. (Thanks to pabouk for pointing this out).
There is a GUI and a command line, too.
See the documentation for further information.
Solution 3
Grep should find binary matches with:
find /path/to/dir -name '*.doc' -exec grep -l "specificword" {} \;
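Searching the raw bytes may miss matches, because a .doc file can store its text in a 16-bit encoding that a plain ASCII pattern does not match. A minimal sketch that converts each file first and prints only the names of matching files follows; list_matching_docs and the DOC2TXT override are hypothetical names used for illustration, with catdoc as the assumed default converter.

```shell
#!/usr/bin/env bash
# list_matching_docs DIR WORD
# Print only the names of .doc files below DIR whose converted text contains WORD.
# DOC2TXT is a hypothetical override for the converter command (default: catdoc).
list_matching_docs() {
    local dir=$1 word=$2
    # File names are passed to sh as positional parameters rather than pasted
    # into the script text, so quotes or spaces in names cannot break the command.
    find "$dir" -type f -name '*.doc' -exec sh -c '
        conv=$1 word=$2
        shift 2
        for f do
            if "$conv" "$f" </dev/null | grep -q -- "$word"; then
                printf "%s\n" "$f"
            fi
        done
    ' sh "${DOC2TXT:-catdoc}" "$word" {} +
}
```

For example, list_matching_docs /path/to/dir "specificword" prints one matching file name per line.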
Tom
Updated on September 18, 2022
Comments
-
Tom over 1 year
I'm using bash under Ubuntu.
Currently this works well for the current directory:
catdoc *.doc | grep "specificword"
But I have lots of subdirectories with .doc files.
How can I search for, let's say, "specificword" recursively?
-
Tom over 12 years
Thanks grawity, the first suggestion works quite well. Is there a way to print the file name? It only prints the phrase in which the word was found.
-
user1686 over 12 years
@user: Try the second suggestion, which, by the way, is titled "This will also output the file name".
-
glenn jackman over 12 years
You can probably simplify a bit:
find -name '*.doc' -exec sh -c 'catdoc "$1" | grep -q "specificword" && echo "$1"' sh {} \;
-
pabouk - Ukraine stay strong over 10 years
It may be worth noting that Recoll provides indexed search: it first has to index the documents, and only then can it search through the index.