Grep from the end of a file to the beginning
Solution 1
tac only helps if you also use grep -m 1 (assuming GNU grep) to have grep stop after the first match:
tac accounting.log | grep -m 1 foo
From man grep:
-m NUM, --max-count=NUM
Stop reading a file after NUM matching lines.
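As a rough illustration of what -m 1 buys you, the sketch below builds a small throwaway log and asks for the last line containing foo. The helper name last_match and the sample lines are made up for this example, and it assumes GNU tac plus a grep that supports -m.

```shell
#!/bin/sh
# Print the last line of a file that matches a pattern by reading it backward.
# last_match is a hypothetical helper; $1 is the pattern, $2 is the file.
last_match() {
    # tac emits the last line first, and grep -m 1 quits at the first hit,
    # so only the tail end of the file needs to be read.
    tac "$2" | grep -m 1 -e "$1"
}

# Throwaway sample data standing in for accounting.log.
log=$(mktemp)
printf '%s\n' 'alice foo 1' 'bob bar 2' 'carol foo 3' 'dave baz 4' > "$log"

last_match foo "$log"    # prints: carol foo 3
rm -f "$log"
```

On a file this small the difference is invisible; the gain only shows up when the match sits near the end of a large file.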
In the example in your question, both tac and grep need to process the entire file so using tac is kind of pointless.
So, unless you use grep -m, don't use tac at all, just parse the output of grep to get the last match:
grep foo accounting.log | tail -n 1
Another approach would be to use Perl or any other scripting language. For example (where $pattern=foo):
perl -ne '$l=$_ if /foo/; END{print $l}' file
or
awk '/foo/{k=$0}END{print k}' file
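For readers wondering what the awk one-liner does, here is a commented sketch of the same idea. The helper name last_match_awk and the found flag are additions for this example (the flag just avoids printing an empty line when nothing matches). Note that awk still reads the whole file from the top.

```shell
#!/bin/sh
# Print the last line of a file that matches an ERE, scanning forward.
# last_match_awk is a hypothetical helper; $1 is the pattern, $2 is the file.
last_match_awk() {
    # The pattern is handed to awk through the environment as ENVIRON["P"].
    P=$1 awk '
        $0 ~ ENVIRON["P"] { last = $0; found = 1 }  # remember every matching line
        END { if (found) print last }               # at end of input, print the latest one
    ' "$2"
}

# Throwaway sample data.
log=$(mktemp)
printf '%s\n' 'foo first' 'bar' 'foo last' 'baz' > "$log"

last_match_awk foo "$log"    # prints: foo last
rm -f "$log"
```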
Solution 2
The reason why
tac file | grep foo | head -n 1
doesn't stop at the first match is because of buffering.
Normally, head -n 1 exits after reading a line. So grep should get a SIGPIPE and exit as well as soon as it writes its second line.
But what happens is that because its output is not going to a terminal, grep buffers it. That is, it's not writing it until it has accumulated enough (4096 bytes in my test with GNU grep). What that means is that grep will not exit before it has produced 8192 bytes of output (the first 4096-byte block is written successfully, and only the second write can fail), so probably quite a few lines.
With GNU grep, you can make it exit sooner by using --line-buffered, which tells it to write lines as soon as they are found regardless of whether the output goes to a terminal or not. So grep would then exit upon the second line it finds.
But with GNU grep anyway, you can use -m 1 instead as @terdon has shown, which is better as it exits at the first match.
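To put the three variants discussed here side by side, the sketch below runs each of them on generated data. The function names and the seq-generated file are assumptions for the example, and GNU grep is assumed for --line-buffered and -m. All three print the same line; they differ only in how much of the input grep reads before it stops.

```shell
#!/bin/sh
# $1 is the pattern, $2 is the file, for all three hypothetical helpers.

# grep block-buffers its output here, so it may keep scanning after head has exited.
first_via_head() { tac "$2" | grep -e "$1" | head -n 1; }

# Each match is flushed at once, so grep can exit when it writes its second match.
first_via_line_buffered() { tac "$2" | grep --line-buffered -e "$1" | head -n 1; }

# grep itself stops right after the first match.
first_via_max_count() { tac "$2" | grep -m 1 -e "$1"; }

# Sample data: the numbers 1 to 100000, one per line.
log=$(mktemp)
seq 1 100000 > "$log"

first_via_head '7$' "$log"
first_via_line_buffered '7$' "$log"
first_via_max_count '7$' "$log"
rm -f "$log"
```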
If your grep is not the GNU grep, then you can use sed or awk instead. But tac being a GNU command, I doubt you'll find a system with tac where grep is not GNU grep.
tac file | sed "/$pattern/!d;q" # BRE
tac file | P=$pattern awk '$0 ~ ENVIRON["P"] {print; exit}' # ERE
Some systems have tail -r to do the same thing as GNU tac does.
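A script that has to run on both kinds of system could pick whichever tool is installed. The following is only a sketch: the helper names reverse_lines and last_match_portable are invented here, and it assumes that a system without tac has a tail that accepts -r.

```shell
#!/bin/sh
# Print a file with its lines in reverse order, using whichever tool exists.
reverse_lines() {
    if command -v tac >/dev/null 2>&1; then
        tac "$1"
    else
        tail -r "$1"    # assumption: this tail supports -r
    fi
}

# First match from the end; $1 is an ERE, $2 is the file.
# Uses the awk form so that grep -m is not required.
last_match_portable() {
    reverse_lines "$2" | P=$1 awk '$0 ~ ENVIRON["P"] { print; exit }'
}

# Throwaway sample data.
log=$(mktemp)
printf '%s\n' 'foo 1' 'bar 2' 'foo 3' 'baz 4' > "$log"

last_match_portable foo "$log"    # prints: foo 3
rm -f "$log"
```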
Note that, for regular (seekable) files, tac and tail -r are efficient because they do read the files backward; they're not just reading the file fully into memory before printing it backward (as @slm's sed approach or tac on non-regular files would).
On systems where neither tac nor tail -r is available, the only options are to implement the backward-reading by hand with programming languages like perl or use:
grep -e "$pattern" file | tail -n1
Or:
sed -e "/$pattern/h" -e '$!d' -e g file
But those mean finding all the matches and printing only the last one.
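If no backward-reading tool is at hand, one way to approximate it in plain shell is to search only the last chunk of the file and widen the chunk until a match turns up. This is a sketch under assumptions, not a tuned implementation: the helper name last_match_from_end and the 64 KiB starting window are arbitrary choices, and it relies on tail -c to fetch the last N bytes.

```shell
#!/bin/sh
# Find the last line matching an ERE by scanning a growing window at the
# end of the file. $1 is the pattern, $2 is a regular file.
last_match_from_end() {
    pattern=$1
    file=$2
    size=$(( $(wc -c < "$file") ))   # arithmetic strips any padding from wc
    window=65536                     # starting window in bytes (arbitrary)
    while :; do
        if [ "$window" -ge "$size" ]; then
            # The window now covers the whole file: plain forward search.
            grep -e "$pattern" "$file" | tail -n 1
            return
        fi
        # The first line of the window may be cut in half, so drop it.
        hit=$(tail -c "$window" "$file" | sed 1d | grep -e "$pattern" | tail -n 1)
        if [ -n "$hit" ]; then
            printf '%s\n' "$hit"
            return
        fi
        window=$((window * 2))       # nothing found yet: look further back
    done
}

# Sample data: the numbers 1 to 100000, one per line.
log=$(mktemp)
seq 1 100000 > "$log"
last_match_from_end '7$' "$log"      # prints: 99997
rm -f "$log"
```

Because the window doubles each time, the total amount read stays within about twice the size of the final window, which is cheap when the match is near the end and degrades to a full scan when it is not.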
Solution 3
Here is a possible solution that locates the first occurrence of the pattern counting from the end of the file:
tac -s "$pattern" -r accounting.log | head -n 1
This makes use of the -s and -r switches of tac which are as follows:
-s, --separator=STRING
use STRING as the separator instead of newline
-r, --regex
interpret the separator as a regular expression
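To see what this command actually prints, it can be tried on a two-line file. The sketch below assumes GNU tac, and the helper name and sample lines are made up. Because the pattern is treated as the record separator, the first record tac emits is whatever follows the last match, so the beginning of the matching line is not part of the output (as one of the comments points out).

```shell
#!/bin/sh
# Show the first line tac emits when the pattern is used as the separator.
# after_last_match is a hypothetical helper; $1 is the pattern, $2 is the file.
after_last_match() {
    tac -s "$1" -r "$2" | head -n 1
}

# Throwaway sample data with the pattern in the middle of each line.
log=$(mktemp)
printf '%s\n' 'a foo 1' 'b foo 2' > "$log"

# Expected to show only the text after the last "foo", not the whole line.
after_last_match foo "$log"
rm -f "$log"
```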
Solution 4
Using sed
Showing some alternative methods to @Terdon's fine answer using sed:
$ sed '1!G;h;$!d' file | grep -m 1 "$pattern"
$ sed -n '1!G;h;$p' file | grep -m 1 "$pattern"
Examples
$ seq 10 > file
$ sed '1!G;h;$!d' file | grep -m 1 5
5
$ sed -n '1!G;h;$p' file | grep -m 1 5
5
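The sed program is terse, so here is the first variant split into one command per -e with a note on each. The helper name reverse_with_sed is made up for this sketch. Note that sed builds the entire reversed file in its hold space before printing anything, so grep -m 1 cannot save any reading here.

```shell
#!/bin/sh
# Reverse the lines of the input with sed.
# reverse_with_sed is a hypothetical helper; it reads the given files or stdin.
reverse_with_sed() {
    # 1!G : on every line but the first, append the hold space to the current line
    # h   : copy the result back into the hold space
    # $!d : on every line but the last, print nothing and start the next cycle
    # On the last line nothing is deleted, so the accumulated text is printed.
    sed -e '1!G' -e 'h' -e '$!d' "$@"
}

seq 1 5 | reverse_with_sed    # prints 5, 4, 3, 2, 1 on separate lines
```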
Using Perl
As a bonus, here's a notation in Perl that is a little easier to remember:
$ perl -e 'print reverse <>' file | grep -m 1 "$pattern"
Example
$ perl -e 'print reverse <>' file | grep -m 1 5
5
eric dykstra
Updated on September 18, 2022
Comments
-
eric dykstra over 1 year
I have a file with about 30.000.000 lines (Radius Accounting) and I need to find the last match of a given pattern.
The command:
tac accounting.log | grep $pattern
gives what I need, but it's too slow because the OS has to first read the whole file and then send it to the pipe.
So, I need something fast that can read the file from the last line to the first.
-
eric dykstra over 10 years
I'm using tac because I need to find the last match of a given pattern. Using your suggestion "grep -m1" the execution time goes from 0m0.597s to 0m0.007s \o/. Thanks everybody!
-
camh over 10 years
Why do you say "tac [...] needs to process the entire file"? The first thing tac does is seek to the end of the file and read a block from the end. You can verify this yourself with strace(1). When combined with grep -m, it should be quite efficient.
-
Stéphane Chazelas over 10 years
That's (especially the sed one) likely to be several orders of magnitude slower than grep 5 | tail -n1 or sed '/5/h;$!d;g'. It will also potentially use a lot of memory. It's not a lot more portable as you're still using GNU's grep -m.
-
terdon over 10 years
@camh when combined with grep -m it is. The OP was not using -m so both grep and tac were processing the whole thing.
-
Arlene Mariano almost 6 years
Could you please expand on the meaning of the awk line?
-
Arlene Mariano almost 6 years
OK, Terdon, thank you. So I understand the full file must/will be read (browsed/processed) with your awk method, even when we are only searching for the last appearance.
-
ychaouche over 5 years
Except you will lose everything that is between the start of the line and the pattern.
-
Scott Prive about 5 years
+1 million for the perl -ne example because it uses NO PIPES. That's hugely important if you're running the command via Ansible (as pipes will contaminate $? exit status).
-
Admin about 2 years
Giving grep files to read will make it ignore its standard input stream. In your first example command, the input from tac would be ignored. The user in the question only has a single log file, and their issue is the speed of reading it with tac to find the last match of their pattern. You do not seem to address this.