Recursive grep using Python
Solution 1
You should use the os.walk function to go through your files, and string methods or a regex to filter the results. See http://docs.python.org/library/os.html for information on how to use os.walk.
import os
import re

def findfiles(path, regex):
    # Return the paths of all files under `path` whose name matches `regex`.
    regObj = re.compile(regex)
    res = []
    for root, dirs, fnames in os.walk(path):
        for fname in fnames:
            # match() anchors at the start of the file name
            if regObj.match(fname):
                res.append(os.path.join(root, fname))
    return res

print(findfiles('.', r'my?(reg|ex)'))
Now for the grep part, you can loop over the lines of a file opened with the open function:
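def grep(filepath, regex):
    # Return the lines of `filepath` that contain a match for `regex`.
    regObj = re.compile(regex)
    res = []
    with open(filepath) as f:
        for line in f:
            # search() finds the pattern anywhere in the line, as grep does;
            # match() would only succeed at the start of the line.
            if regObj.search(line):
                res.append(line)
    return res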
If you want to get the line numbers, you may want to look into the enumerate function.
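Putting the two pieces together, a recursive grep with line numbers might look like the sketch below. This is not part of the original answer: the name grep_recursive is hypothetical, Python 3 is assumed, and errors="replace" is one possible choice for coping with files that are not valid text.

```python
import os
import re


def grep_recursive(path, regex):
    """Yield (filepath, line_number, line) for each matching line under path."""
    pattern = re.compile(regex)
    for root, dirs, fnames in os.walk(path):
        for fname in fnames:
            filepath = os.path.join(root, fname)
            try:
                # errors="replace" avoids UnicodeDecodeError on binary-looking files
                with open(filepath, errors="replace") as f:
                    # enumerate(..., start=1) gives grep-style 1-based line numbers
                    for lineno, line in enumerate(f, start=1):
                        if pattern.search(line):
                            yield filepath, lineno, line.rstrip("\n")
            except OSError:
                # Skip files that cannot be opened or read
                continue
```

Iterating over grep_recursive('.', r'import') and printing each tuple gives output roughly comparable to grep -rn. Using a generator means results are produced as they are found instead of being collected into one large list first.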
edited to add the grep function
Solution 2
You can use python-textops3:
For example, to grep for 'import' in all .py files under the current directory:
from textops import *
print('\n'.join(('.' | find('*.py') | cat() | grep('import'))))
It is pure Python, so there is no need to fork a process.
Kiran
Updated on June 04, 2022
Comments
-
Kiran almost 2 years
I am new to Python and trying to learn. I am trying to implement a simple recursive grep using Python for processing, and here is what I have come up with so far.
p = subprocess.Popen('find . -name [ch]', shell=True, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
for line in p.stdout.readlines():
    q = subprocess.Popen('grep searchstring %s', line, shell=True, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    print q.stdout.readlines()
Can someone please tell me how to fix this so that it does what it is supposed to?
-
Rosh Oxymoron over 12 years
This can still be very dangerous if you had a file named '; rm /porn -rf; wget -r http://www.google.com/search?tbm=isch\&q=ponies --directory-prefix=/ponies; .py' in the directory. Popen(['grep', 'import', line] ...) is always preferable.
-
Mark Gemmill over 12 years
You could even shorten this to:
Popen('find . -print | grep "python"', stdout=PIPE, shell=True).communicate()[0]
-
jarvisteve almost 9 years
This is really more of a "find", not a "recursive grep".
-
Stephan over 7 years
This is not recursive grep at all; it's just looking at filenames.
-
Simon Bergot over 7 years
@Stephan At the time I just wanted to give some hints on regex and directory traversal. But you are right that grep was a bad function name. I improved my answer a bit.
-
AdeleGoldberg over 4 years
This is find, not grep. regObj.match does a match against the filename.
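
To make the point about regObj.match in the comments concrete, here is a small sketch (not from the original thread; the sample line is made up) of how re.match and re.search differ. match only succeeds at the start of the string it is given, so a grep-like filter on file contents generally wants search instead:

```python
import re

pattern = re.compile(r"import")
line = "from os import path"  # hypothetical sample line

# match() only tries at position 0, so it returns None here
anchored = pattern.match(line)

# search() scans the whole string and finds "import" at index 8
anywhere = pattern.search(line)

print(anchored)          # None
print(anywhere.start())  # 8
```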