Using an index to make grep faster?
Solution 1
What about cscope? Does it fit your needs?
It allows searching code for:
- all references to a symbol
- global definitions
- functions called by a function
- functions calling a function
- text string
- regular expression pattern
- a file
- files including a file
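As a rough sketch of how those searches look from the command line (assuming cscope is installed; the function names cscope_build and cscope_query are made up for illustration), the cross-reference can be built once and then queried non-interactively. The numeric field follows the order of the list above, counting from 0, except that field 5 (change text) is not a search and is skipped:

```shell
# cscope_build DIR: build (or rebuild) the cross-reference under DIR.
#   -R recurse into subdirectories, -b build only, -q add an inverted index
cscope_build() {
    (cd "$1" && cscope -R -b -q)
}

# cscope_query DIR FIELD PATTERN: run one line-oriented search.
#   -d reuse the existing cross-reference, -L print the results and exit
#   FIELD: 0 symbol, 1 global definition, 2 functions called by,
#          3 functions calling, 4 text string, 6 egrep pattern,
#          7 file, 8 files #including
cscope_query() {
    (cd "$1" && cscope -d -L "-$2" "$3")
}

# Hypothetical usage:
#   cscope_build /foo/myproject
#   cscope_query /foo/myproject 4 searchtext
```

Field 4 (text string) is probably the closest match to a plain grep, but whether it is fast enough on a given code base would need to be measured.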
Solution 2
Full-text indexing
There are tools such as recoll, swish-e and sphinx but you'd have to check if they can support the sort of search criteria you need.
Recoll
Recoll is a personal full text search tool for Unix/Linux.
Swish-e
Swish-e is a fast, flexible, and free open source system for indexing collections of Web pages or other files.
Sphinx
Sphinx lets you batch index and search data stored in an SQL database, NoSQL storage, or just files, quickly and easily.
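For simple whole-word searches, the idea behind such indexers can be sketched in a few lines of shell. This is a toy and not how Recoll, Swish-e or Sphinx work internally. The names build_index and indexed_grep are hypothetical, only *.c files are indexed, and file names containing tabs or newlines are not handled:

```shell
# build_index DIR INDEX: write sorted, unique "word<TAB>file" pairs to INDEX
build_index() {
    find "$1" -type f -name '*.c' -exec awk '
        {
            # split each line into identifier-like words
            n = split($0, words, /[^A-Za-z0-9_]+/)
            for (i = 1; i <= n; i++)
                if (words[i] != "") print words[i] "\t" FILENAME
        }' {} + | LC_ALL=C sort -u > "$2"
}

# indexed_grep WORD INDEX: look up the files that contain WORD in the
# index, then run grep on those files only
indexed_grep() {
    awk -F '\t' -v w="$1" '$1 == w { print $2 }' "$2" |
        while IFS= read -r f; do
            grep -H -n -w -- "$1" "$f"
        done
}

# Hypothetical usage:
#   build_index /foo/myproject /tmp/myproject.idx
#   indexed_grep searchtext /tmp/myproject.idx
```

The index has to be rebuilt whenever the files change, and it only helps for whole words; arbitrary substrings or regular expressions would still need a full scan or a more elaborate index.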
grep
I'm surprised grep is as slow as you describe. Can you reduce the number of files being searched? For example, when I only need to search the source files for one executable (out of many in a project), I feed grep the names from a command that lists the source files for that program:
grep expression `sources myprogram`
sources is a program specific to my development environment, but you may have (or be able to construct) something equivalent.
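If nothing like sources is available, one possible equivalent is a plain text file listing the relevant source files, one per line. The sketch below assumes such a list exists; grep_listed and the list file name are hypothetical, and the -0 option of xargs is assumed to be available (GNU and BSD versions have it):

```shell
# grep_listed PATTERN LISTFILE: search only the files named, one per
# line, in LISTFILE; print matches as file:line:text
grep_listed() {
    tr '\n' '\0' < "$2" | xargs -0 grep -H -n -- "$1"
}

# Hypothetical usage:
#   grep_listed expression myprogram.files
```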
I'm assuming you've tried obvious techniques such as
find /foo/myproject -name "*.c" -exec fgrep -l searchtext {} +
I've read a suggestion that the -P option of current grep can speed up searches significantly.
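A variant of the find approach that may be worth timing is sketched below. It assumes a find with -print0 and an xargs with -0 (GNU and BSD versions have them); search_sources is a made-up name. Running grep in the C locale with a fixed-string search is sometimes reported to be faster, but that depends on the grep version and the data, so treat it as something to measure rather than a guarantee:

```shell
# search_sources DIR TEXT: list the *.c files under DIR that contain
# the fixed string TEXT
search_sources() {
    find "$1" -type f -name '*.c' -print0 |
        LC_ALL=C xargs -0 grep -F -l -- "$2"
}

# Hypothetical usage:
#   search_sources /foo/myproject searchtext
```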
Solution 3
You could copy your codebase on a RAM disk.
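A minimal sketch of that idea, assuming a Linux system where /dev/shm is a memory-backed tmpfs (other systems need a different RAM disk path) and that the code base fits in memory; copy_to_ram is a hypothetical helper:

```shell
# copy_to_ram SRC [RAMDIR]: copy the contents of SRC into a fresh
# directory under RAMDIR (default /dev/shm) and print the copy's path
copy_to_ram() {
    ramdir=${2:-/dev/shm}
    dest=$(mktemp -d "$ramdir/codebase.XXXXXX") || return 1
    cp -R "$1"/. "$dest"/ && printf '%s\n' "$dest"
}

# Hypothetical usage:
#   copy=$(copy_to_ram /foo/myproject)
#   grep -R searchtext "$copy"
```

The copy goes stale as soon as the original changes, so it would have to be refreshed before searching. If the files are already in the operating system's page cache, the gain may be small.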
Solution 4
Not grep, no. But there are several programs that use indexes and are aimed at code bases. ctags (a version is provided with vim), etags (aimed at use with emacs) and global (more independent of the editor) are the ones I can think of now, but there are probably others.
Solution 5
If you want to use a full-text search engine, use one:
Peltier
Author of autojump, the fastest way to move around your filesystem from the command line.
Updated on September 18, 2022
Comments
-
Peltier over 1 year
I find myself grepping the same codebase over and over. While it works great, each command takes about 10 seconds, so I am thinking about ways to make it faster.
So can grep use some sort of index? I understand an index probably won't help for complicated regexps, but I use mostly very simple patterns. Does an indexer exist for this case?
EDIT: I know about ctags and the like, but I would like to do full-text search.
-
Michał Šrajer over 12 years
Are you using the recursive option for grep, or some find/xargs-like approach?
-
Peltier over 12 years
@Michał: yes, -R
-
Peltier over 12 years
I use ctags, but isn't that limited to searching function names? I want to do full-text search.
-
Peltier over 12 years
AFAIK locate is only for filenames. recoll would work, but I would prefer a command-line tool. The code base is pretty big, and since I'm looking for a string, I don't know where it is, so it's hard to limit the number of files to be searched :)
-
Peltier over 12 years
That's always an option, but I was wondering if a more lightweight, quick and dirty grep speedup option would exist.
-
user5249203 over 12 years
I think swish-e is command-line. I haven't tried any (grep is fast enough on my projects).
-
akira over 12 years
'More lightweight' and 'want to have my stuff fully indexed' are a bit of two extremes :) ctags is the best match for what you want, if you just want to go quick and dirty. With everything else you end up using a real full-text search engine. E.g., 'recoll', mentioned in @RedGrittyBrick's answer, uses xapian as the backend.
-
Peltier over 12 years
They're not necessarily incompatible. Imagine if ctags had a --full-text option, for instance, and grep a --tag-file option. Of course the fact that it could exist doesn't mean that it does :)
-
Peltier over 12 years
That could be what I'm looking for, I'll take a look. Thanks!
-
Peltier over 11 years
Ack is pretty cool. But I really doubt it's any faster than grep, since it is based on the same mechanisms.
-
neves over 6 years
It looks like it just works well for C, maybe C++ and Java.