How can I grep while avoiding 'Too many arguments'
Solution 1
Run several instances of grep. Instead of
grep -i user@domain 1US* | awk '{...}' | xargs rm
do
(for i in 1US*; do grep -li user@domain "$i"; done) | xargs rm
Note the -l flag, since we only want the file name of the match. This both speeds up grep (it stops at the first match in each file) and makes the awk step unnecessary. It could be improved further by checking grep's return status and calling rm directly rather than using xargs (xargs is very fragile, IMO). I'll give you the better version if you ask.
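For reference, here is a sketch of that improved loop, run in a scratch directory (the filenames and the user@domain pattern are illustrative):

```shell
# Create a scratch directory with one spammy and one clean file.
d1=$(mktemp -d)
printf 'spam from user@domain\n' > "$d1/1USa"
printf 'legitimate mail\n'       > "$d1/1USb"

(
  cd "$d1"
  # The glob is expanded inside bash, so no ARG_MAX limit applies.
  for i in 1US*; do
      # grep -q: exit 0 on first match, print nothing; -i: ignore case
      if grep -qi 'user@domain' "$i"; then
          rm -- "$i"   # -- guards against filenames starting with '-'
      fi
  done
)
```

Only the clean file survives; no pipeline and no xargs involved.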
Hope it helps.
Solution 2
You can use find to locate all files whose names start with the pattern '1US', then pipe the output to xargs, which makes sure the argument list never grows too large and handles the grep calls. Note that I've used a null byte to separate the filenames for xargs; this avoids problems with awkward file names. ;)
find -maxdepth 1 -name '1US*' -printf '%f\0' | xargs -0 grep -i user@domain | awk ...
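A quick scratch-directory sketch of why the null-byte separation matters, using illustrative filenames (note that -printf is a GNU find extension):

```shell
# One file with a space in its name, one clean file.
d2=$(mktemp -d)
printf 'mail from user@domain\n' > "$d2/1US with space"
printf 'clean message\n'         > "$d2/1USclean"

(
  cd "$d2"
  # %f prints the bare filename; \0 terminates it with a NUL byte,
  # which xargs -0 uses as the separator, so spaces survive intact.
  find . -maxdepth 1 -name '1US*' -printf '%f\0' \
      | xargs -0 grep -li 'user@domain'
)
```

With whitespace-separated names, xargs would have split "1US with space" into three bogus arguments; with -0 it arrives as one.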
Solution 3
The -exec argument to find is useful here; I've used this myself in similar situations.
E.g.
# List the files that match
find /path/to/input/ -type f -exec grep -qiF user@domain \{\} \; -print
# Once you're sure you've got it right
find /path/to/input/ -type f -exec grep -qiF user@domain \{\} \; -delete
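A scratch-directory sketch of that two-step workflow, with illustrative filenames (-delete is a GNU/BSD find extension):

```shell
# One spammy file, one clean one.
d3=$(mktemp -d)
printf 'offer from user@domain\n' > "$d3/spammy"
printf 'regular mail\n'           > "$d3/clean"

# Dry run: -print only fires for files where the grep -q test succeeded.
find "$d3" -type f -exec grep -qiF 'user@domain' {} \; -print

# Real run: same predicate, but -delete instead of -print.
find "$d3" -type f -exec grep -qiF 'user@domain' {} \; -delete
```

Because find's predicates short-circuit left to right, -delete only ever sees files that grep actually matched.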
Solution 4
Using xargs is more efficient than using "find ... -exec grep" because it creates fewer processes.
One way to go about this would be:
ls 1US* | xargs grep -i user@domain | awk -F: '{print $1}' | xargs rm
But easier would be:
find . -iname "1US*" -exec rm {} \;
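One thing to keep in mind, sketched below with illustrative filenames: the name-based form deletes every file matching the glob, whether or not it contains the pattern, so it only fits when the name alone identifies the spam.

```shell
# Three empty files; two match the name pattern, one does not.
d4=$(mktemp -d)
: > "$d4/1USa"
: > "$d4/1USb"
: > "$d4/other"

# Deletes by name only -- file contents are never inspected.
find "$d4" -iname '1US*' -exec rm {} \;
```

Both 1US* files are removed regardless of content; only "other" remains.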
Justin S
Updated on July 10, 2022

Comments
- Justin S almost 2 years: I was trying to clean out some spam email and ran into an issue. The number of files in the queue was so large that my usual command was unable to process them; it gave me an error about too many arguments. I usually do this:
grep -i user@domain 1US* | awk -F: '{print $1}' | xargs rm
1US* can be anything in the range 1US[a-zA-Z]. The only thing I could make work was running this horrible contraption, one command per prefix, with 1USa, 1USA, 1USb, and so on through the entire alphabet. I know there has to be a way to run this more efficiently:
grep -s $SPAMMER /var/mailcleaner/spool/exim_stage1/input/1USa* | awk -F: '{print $1}' | xargs rm
grep -s $SPAMMER /var/mailcleaner/spool/exim_stage1/input/1USA* | awk -F: '{print $1}' | xargs rm
- chepner about 11 years: -print0 is a shortcut for -printf '%f\0'.
- chepner about 11 years: This just shifts the too-long argument list from the grep to the for statement.
- Guido about 11 years: But those are not actual arguments; they're internal to bash. I just made a folder with 1M files and tested:
guido@solid:~/a$ ls *
-bash: /bin/ls: Argument list too long
guido@solid:~/a$ for i in *; do ls $i; done
1 10 100 ...
- Justin S about 11 years: How would I get around xargs?
- Guido almost 11 years: (for i in 1US*; do if grep -qi user@domain "$i"; then rm "$i"; fi; done). This checks the return value of grep to see if there was a match and deletes the file if so. I also added the -q option to suppress grep's output (we don't need to pipe it, and we don't want it). If you ever get a 'too many arguments' error again (I tried with a million files with no problem, but just in case), you'd be better off using find to delete the files.