How to match once per file in grep?
Solution 1
I think you can just do something like
grep -ri -m1 --include '*.coffee' 're' . | head -n 2
to e.g. pick the first match from each file, and pick at most two matches total.
Note that this requires your grep to treat -m as a per-file match limit; GNU grep does do this, but BSD grep apparently treats it as a global match limit.
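A minimal sketch of the behavior described above, assuming GNU grep (where -m is a per-file limit). The sandbox directory and file contents are made up for illustration; which two files show up depends on the directory traversal order.

```shell
# Hypothetical sandbox: three small .coffee files, each with two lines containing "re".
demo_dir=$(mktemp -d)
printf "express = require 'express'\npassport = require 'passport'\n" > "$demo_dir/app.coffee"
printf "session_secret: 'nyan cat'\nstore = require 'store'\n" > "$demo_dir/config.coffee"
printf "fs = require 'fs'\npath = require 'path'\n" > "$demo_dir/util.coffee"

# Assumes GNU grep: -m1 stops after the first match in EACH file,
# and head -n 2 caps the total number of lines printed.
first_matches=$(cd "$demo_dir" && grep -ri -m1 --include '*.coffee' 're' . | head -n 2)
printf '%s\n' "$first_matches"

rm -rf "$demo_dir"
```

With a grep that treats -m as a global limit, the same pipeline would print a single line instead of two.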
Solution 2
So, using grep, you just need the option -l (long form: --files-with-matches), which prints only the name of each file that contains a match.
All those answers about find, awk or shell scripts stray from the question.
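For reference, a small sketch of what -l produces: file names only, one per file with at least one match, without the matching line itself. The sandbox files are hypothetical.

```shell
# Hypothetical sandbox: two files containing "re", one without.
demo_dir=$(mktemp -d)
printf "express = require 'express'\npassport = require 'passport'\n" > "$demo_dir/app.coffee"
printf "session_secret: 'nyan cat'\n" > "$demo_dir/config.coffee"
printf "nothing to see\n" > "$demo_dir/empty.coffee"

# -l prints each matching file name once, however many lines match in it.
# sort is only there to make the order predictable.
matching_files=$(cd "$demo_dir" && grep -rli --include '*.coffee' 're' . | sort)
printf '%s\n' "$matching_files"

rm -rf "$demo_dir"
```

This fits when only the list of files matters; if the first matching line is needed too, one of the other approaches is required.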
Solution 3
I would do this in awk instead.
find . -name \*.coffee -exec awk '/re/ {print FILENAME ":" $0;exit}' {} \;
If you didn't need to recurse, you could just do it with awk:
awk '/re/ {print FILENAME ":" $0;nextfile}' *.coffee
Or, if you're using a current enough bash, you can use globstar:
shopt -s globstar
awk '/re/ {print FILENAME ":" $0;nextfile}' **/*.coffee
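A possible variant, sketched under assumptions: it adds a cap on the total number of matches (the max_total variable is a name chosen here, not from the answer) and resets a flag on FNR == 1 instead of using nextfile, in case the local awk lacks nextfile. The sandbox files are hypothetical.

```shell
# Hypothetical sandbox: three files, all containing "re" somewhere.
demo_dir=$(mktemp -d)
printf "express = require 'express'\npassport = require 'passport'\n" > "$demo_dir/app.coffee"
printf "port = 3000\nsession_secret: 'nyan cat'\n" > "$demo_dir/config.coffee"
printf "fs = require 'fs'\n" > "$demo_dir/util.coffee"

# FNR == 1 marks the first line of each new file, so the per-file flag is reset there.
# Only the first match per file is printed, and awk exits once max_total lines are out.
capped_matches=$(cd "$demo_dir" && awk -v max_total=2 '
  FNR == 1 { seen = 0 }
  !seen && /re/ {
    print FILENAME ":" $0
    seen = 1
    if (++printed >= max_total) exit
  }' *.coffee)
printf '%s\n' "$capped_matches"

rm -rf "$demo_dir"
```

Unlike nextfile, this keeps reading the rest of each file after its first match, so it trades some speed for portability.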
Solution 4
Using find and xargs: find every .coffee file and pass them to grep -m1.
find . -name '*.coffee' -print0 | xargs -0 grep -m1 -ri 're'
Test without -m1:
linux# find . -name '*.txt'|xargs grep -ri 'oyss'
./test1.txt:oyss
./test1.txt:oyss1
./test1.txt:oyss2
./test2.txt:oyss1
./test2.txt:oyss2
./test2.txt:oyss3
Add -m1:
linux# find . -name '*.txt'|xargs grep -m1 -ri 'oyss'
./test1.txt:oyss
./test2.txt:oyss1
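A hedged sketch of a variant that copes with spaces in file names and does not depend on how grep interprets -m. The sandbox files are hypothetical; -n 1 and -H are additions made here, not part of the answer.

```shell
# Hypothetical sandbox, including a file name that contains a space.
demo_dir=$(mktemp -d)
printf "oyss\noyss1\n" > "$demo_dir/test1.txt"
printf "oyss2\noyss3\n" > "$demo_dir/foo bar.txt"   # file name containing a space

# -print0 and -0 delimit names with NUL bytes, so the space is harmless.
# -n 1 runs grep once per file, so -m1 applies to each file separately;
# -H forces the file name prefix even though each grep sees a single file.
safe_matches=$(cd "$demo_dir" && find . -name '*.txt' -print0 | xargs -0 -n 1 grep -H -m1 -i 'oyss' | sort)
printf '%s\n' "$safe_matches"

rm -rf "$demo_dir"
```

Running one grep per file is slower on large trees, so this is a trade of speed for predictable behavior.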
Solution 5
find . -name \*.coffee -exec grep -m1 -i 're' {} \;
find's -exec option runs the command once for each matched file (unless you use + instead of \;, which makes it act like xargs).
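To see the difference between \; and + concretely, here is a small sketch that swaps grep for a stand-in command reporting how many file names each invocation received. The sandbox files and the echoed message are made up for illustration.

```shell
# Hypothetical sandbox with three empty .coffee files.
demo_dir=$(mktemp -d)
touch "$demo_dir/a.coffee" "$demo_dir/b.coffee" "$demo_dir/c.coffee"

# With \; the command runs once per file, so each run sees one argument.
per_file_runs=$(cd "$demo_dir" && find . -name '*.coffee' -exec sh -c 'echo "run with $# file(s)"' sh {} \;)
# With + the file names are batched into as few runs as possible.
batched_runs=$(cd "$demo_dir" && find . -name '*.coffee' -exec sh -c 'echo "run with $# file(s)"' sh {} +)

printf 'per file:\n%s\nbatched:\n%s\n' "$per_file_runs" "$batched_runs"

rm -rf "$demo_dir"
```

Because \; hands grep a single file at a time, grep omits the file name prefix unless -H is given, and -m1 is effectively per file whichever grep is installed.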
pathikrit
Updated on July 05, 2022
Comments
-
pathikrit almost 2 years
Is there any grep option that lets me control the total number of matches but stops at the first match in each file?
Example:
If I do this
grep -ri --include '*.coffee' 're' .
I get this:
./app.coffee:express = require 'express'
./app.coffee:passport = require 'passport'
./app.coffee:BrowserIDStrategy = require('passport-browserid').Strategy
./app.coffee:app = express()
./config.coffee: session_secret: 'nyan cat'
And if I do
grep -ri -m2 --include '*.coffee' 're' .
I get this:
./app.coffee:config = require './config'
./app.coffee:passport = require 'passport'
But, what I really want is this output:
./app.coffee:express = require 'express'
./config.coffee: session_secret: 'nyan cat'
Doing -m1 does not work; for grep -ri -m1 --include '*.coffee' 're' . I get this:
./app.coffee:express = require 'express'
I also tried not using grep, e.g. this
find . -name '*.coffee' -exec awk '/re/ {print;exit}' {} \;
produced:
config = require './config'
session_secret: 'nyan cat'
UPDATE: As noted below, the -m option of GNU grep counts matches per file, whereas -m for BSD grep is a global match count.
-
pathikrit over 11 years
-m1 stops at first match globally for me. In any case, if there are millions of matches and I only want 100 of them then this is inefficient, as grep would still go through the first million matches before piping the result into head
-
nneonneo over 11 years
head stops reading input after the first hundred lines, and grep streams them match-by-match. After head stops reading input, grep will stop finding matches.
-
pathikrit over 11 years
I just tried and it just prints the first result for me with or without the | head -n 2 part. If I change the option to -m2 I see 2 results.
-
nneonneo over 11 years
-m is clearly documented in my man grep as a per-file option... what grep are you using?
-
pathikrit over 11 years
grep --v says grep (BSD grep) 2.5.1-FreeBSD on my Mountain Lion
-
nneonneo over 11 years
Funny, I'm on Lion and my /usr/bin/grep is grep (GNU grep) 2.5.1 (and it does per-file -m).
-
pathikrit over 11 years
Did not print as expected. Here's what I got:
bash-3.2$ find . -name \*.coffee -exec awk '/re/ {print;exit}' {} \;
config = require './config'
session_secret: 'nyan cat'
-
ruakh over 11 years
One problem with that is, at least on my system, you can't really pipe the output of find -exec to head, because the SIGPIPE goes to the process that find launches, rather than to find itself, so it just keeps re-launching the program long after it's found two matches.
-
ghoti over 11 years
Updated the answer to include filenames, as well as globstar as an alternate way to recurse. As for piping to head, why would you need to do that here? I don't see a requirement for that in the question. The awk script takes care of stopping after the first match in each file.
-
ghoti over 11 years
@wrick - just a note about globstar; I gather you're using an older bash, since your prompt is bash-3.2$. Globstar was added to bash in version 4.0. You can either skip globstar, or install a more recent bash using MacPorts. Also, I don't see the problem with your output. While comments suck for code/output formatting, it appears you're seeing lines with re in them. If you like, you can edit your question to include a better formatted result for this attempt.
-
Schwern over 11 years
I can confirm, /usr/bin/grep on OS X 10.8.2 is (BSD grep) 2.5.1-FreeBSD and its -m is global, not per file. GNU grep is per file. nneonneo, you must have overwritten /usr/bin with GNU tools. @wrick I'd suggest getting GNU tools, the BSD ones that OS X comes with are kinda janky. It will make your life much easier in the long run. Use MacPorts or homebrew.
-
nneonneo over 11 years
@Schwern: I guess I must've, but I don't recall ever doing it :-\
-
pathikrit over 11 years
Done - still did not work - did not print results from other file
-
Graham over 11 years
This won't work if there are special characters in filenames. See the parsing ls problem.
-
Graham over 11 years
@Schwern - I would NEVER recommend overwriting system-provided tools with GNU ones. The system ones get updated by Apple. Much better to put GNU tools in a different location, then adjust your $PATH accordingly.
-
Schwern over 11 years
@Graham Use find -print0 and xargs -0, as in my answer, to get around that.
-
nneonneo over 11 years
@Graham: easily amended, use find -print0 and xargs -0.
-
ghoti over 11 years
@wrick, what was the output you were expecting?
-
pathikrit over 11 years
The GNU one makes so much more sense than the BSD interpretation of -m IMO - thanks for catching this
-
Schwern over 11 years
This solution shares nneonneo's problem, it only works on GNU grep. BSD grep's -m is global, not per file.
-
Schwern over 11 years
@Graham I suspect you meant "I don't recommend overwriting". MacPorts and Homebrew take care of all that, they live on their own paths and handle the environment adjustments. Violent agreement.
-
oyss over 11 years
@Graham example please. I'm not familiar with this issue. A simple test with filenames like test1?.txt is still OK.
-
nneonneo over 11 years
OK, this is starting to get a bit weird. I consulted a friend, who also uses 10.7, and has GNU grep in /usr/bin/grep. Furthermore, the man page for grep on Apple's site says it's GNU grep. Did Apple suddenly change the default in 10.8?
-
pathikrit over 11 years
I expect 1 unique file per line (question has what I expect). Thanks!
-
oyss over 11 years
@Schwern xargs is per file. The grep is executed on each file that find matches. There should not be a global -m issue.
-
Graham over 11 years
@oyss - create a file with touch foo.coffee\ bar.coffee. It's a single file, with a space in the filename. Using xargs the way you've suggested, xargs will interpret it as two files. Check the link on my first comment for more details.
-
Schwern over 11 years
@nneonneo Yep, I'm seeing complaints on the internet about grep being reverted to BSD in 10.8. Reason number 230823 to use MacPorts or Homebrew.
-
pathikrit over 11 years
Maybe you have an older Mac? Found this: github.com/schmittjoh/JMSDiExtraBundle/issues/41
-
nneonneo over 11 years
@wrick: Yes I do. I have 10.7. So it seems that grep was indeed "downgraded" to BSD grep in 10.8.
-
Schwern over 11 years
@oyss Tested it with the BSD and GNU greps on my system. I think I see the confusion. xargs does not call the command once for each file, but just once with a list of files. Only if xargs thinks the file list is going to overflow the exec buffer will it do multiple calls. You can test this by writing a program which prints each time it starts and then prints all its arguments.
-
ghoti over 11 years
Well, I haven't seen your input, so I can't tell whether the output matches. "re" is in "REquires", but it's also in "secREt". Did you try with the updated find line that includes FILENAME?
-
ruakh over 11 years
@ghoti: Re: "As for piping to head, why would you need to do that here?": The question asks to "control total number of matches", and gives the example of -m2 to limit the total number of matches to two.
-
Graham over 11 years
FreeBSD has been using GNU grep 2.5.1 for years. OSX pre 10.8 and 10.8 both use GNU grep 2.5.1 as well. In FreeBSD 9.0 and OSX 10.8 the behaviour I see with -m 1 is one line per file. @Schwern - please re-check your "confirmed" results, as I can't replicate them.
-
ghoti over 11 years
Ah, right you are. So the correct answer to the OP's initial question is simply "no".
-
pathikrit over 11 years
I confirm schwern's results - for me -m does this: github.com/schmittjoh/JMSDiExtraBundle/issues/41
-
Schwern over 11 years
-
Graham over 11 years
@Schwern - here are my results in FreeBSD 9.0-RELEASE: pastebin.com/RiECy9CE This is the same version of grep reported in OSX 10.8. I don't have access to an OSX 10.8 box just at the moment, but I believe -m1 would be treated the same way on it. Do you see anything wrong with my test?
-
Schwern over 11 years
@Graham Your test is fine... except it's using GNU grep. We know GNU grep works. The only point of contention is what OS X ships with. Prior to 10.8 it was GNU grep. 10.8 introduced BSD grep as confirmed on my machine and all the posts I linked to previously. /usr/bin/grep --version reports grep (BSD grep) 2.5.1-FreeBSD, and uname -s -r reports Darwin 12.2.0. Are you sure you're looking at /usr/bin/grep on your OS X 10.8 machine and you haven't overwritten it?
-
ghoti over 11 years
I, for one, have never heard of -m acting globally rather than per-file. If this happens in OSX 10.8, it's an Apple-ism, not something to do with the port of GNU grep that is part of FreeBSD. (Note that if there really is such a thing as "BSD grep", it's not from FreeBSD. FreeBSD still uses a port of GNU grep 2.5.1, as it (and OSX) has for years.)
-
Graham over 11 years
Okay, I can confirm that OSX 10.8.2 behaves differently from FreeBSD. @Schwern, sorry to doubt you, but as ghoti said, this isn't how grep behaves in any BSD operating system I've seen before; it seems to be unique to OSX.
-
Ross Brasseaux over 9 years
All this time and I finally realize I was asking the wrong question. Thanks!
-
Megan B about 7 years
This is exactly what I was looking for, and definitely the best answer to this question! Thanks :)
-
ceiling cat almost 6 years
This is the easiest method. For the lazy, the option -l is the abbreviation of --files-with-matches. So you don't need both.
-
Dalker almost 6 years
This is definitely simpler than the accepted answer
-
Moltres over 5 years
Man thank you so much for this! Definitely agree with @Dalker