How to 'grep' a continuous stream?

Solution 1

Turn on grep's line buffering mode when using BSD grep (FreeBSD, Mac OS X etc.)

tail -f file | grep --line-buffered my_pattern

Some time ago, --line-buffered did not seem to matter for GNU grep (used on pretty much any Linux), as it flushed by default (YMMV for other Unix-likes such as SmartOS, AIX or QNX). However, as of November 2020, --line-buffered is needed (at least with GNU grep 3.5 on openSUSE, and judging by the comments below it seems to be needed in general).
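Since the behavior depends on which grep is installed, it can help to check the flavor first. The following is only a sketch: it assumes that the first line of grep --version mentions either "GNU grep" or "BSD grep", which is what common builds print but is not guaranteed everywhere. The variable names are made up for illustration.

```shell
# Read the first line of the version banner (empty if --version is unsupported).
version_line=$(grep --version 2>/dev/null | head -n 1)

# Classify the banner; the matched strings are assumptions about common builds.
case "$version_line" in
    *"GNU grep"*) flavor="GNU" ;;
    *"BSD grep"*) flavor="BSD" ;;
    *)            flavor="unknown" ;;
esac

echo "grep flavor: $flavor ($version_line)"
```

Whatever the result, passing --line-buffered explicitly is the safer choice when grep's output goes into a pipe.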

Solution 2

I use tail -f <file> | grep <pattern> all the time.

It will wait until grep flushes, not until it finishes (I'm using Ubuntu).

Solution 3

I think your problem is that grep buffers its output. Try:

tail -f file | stdbuf -o0 grep my_pattern

This sets grep's output buffering mode to unbuffered.
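stdbuf comes from GNU coreutils and may be missing on some systems, so a small wrapper can fall back to grep's own flag. This is a rough sketch, not a definitive recipe: the function name filter is hypothetical, and it uses stdbuf -oL (line buffering) rather than the -o0 (unbuffered) shown above, which is usually enough for line-oriented logs.

```shell
# Use stdbuf when it is installed; otherwise rely on grep --line-buffered.
if command -v stdbuf >/dev/null 2>&1; then
    filter() { stdbuf -oL grep "$@"; }
else
    filter() { grep --line-buffered "$@"; }
fi

# Chain two filters; only lines containing both words pass through.
result=$(printf 'alpha\nbeta\nalpha beta\n' | filter alpha | filter beta)
echo "$result"
```

With a live stream you would put tail -f file in place of the printf.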

Solution 4

If you want to find matches in the entire file (not just the tail), and you want it to sit and wait for any new matches, this works nicely:

tail -c +0 -f <file> | grep --line-buffered <pattern>

The -c +0 flag says that the output should start 0 bytes (-c) from the beginning (+) of the file.
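To see what -c +0 does without leaving a command running, here is a small sketch that drops -f so that it terminates. The temporary file and its contents are made up for the example.

```shell
# Create a throwaway log file with a few lines in it.
log=$(mktemp)
printf 'old error\nold info\nrecent error\n' > "$log"

# -c +0 starts at the beginning of the file, so every matching line is found,
# not only those in the last few lines that plain tail would print.
matches=$(tail -c +0 "$log" | grep --line-buffered error)
echo "$matches"

rm -f "$log"
```

Adding -f back makes tail keep the file open and pass along any lines appended later.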

Solution 5

In most cases, you can run tail -f /var/log/some.log | grep foo and it will work just fine.

If you need to use multiple greps on a running log file and you find that you get no output, you may need to stick the --line-buffered switch into your middle grep(s), like so:

tail -f /var/log/some.log | grep --line-buffered foo | grep bar
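
If you want to check the effect yourself, the sketch below times when each line comes out of a chain of greps. It assumes a grep that supports --line-buffered (GNU or BSD); the produce function stands in for tail -f and is purely illustrative. Because the last grep also writes into a pipe here (the timing loop), it gets the flag as well.

```shell
# Stand-in for a slow log: one matching line, a pause, then another.
produce() {
    echo "foo bar 1"
    sleep 2
    echo "foo bar 2"
}

start=$(date +%s)

# Prefix each line with the number of seconds since start at which it arrived.
delays=$(produce | grep --line-buffered foo | grep --line-buffered bar |
    while IFS= read -r line; do
        echo "$(( $(date +%s) - start )) $line"
    done)

echo "$delays"
```

With line buffering, the first line should arrive well before the second. If you remove the flags, both lines typically show up together once the producer exits.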
Author by

Matthieu Napoli

Updated on December 30, 2021

Comments

  • Matthieu Napoli
    Matthieu Napoli over 2 years

    Is it possible to use grep on a continuous stream?

    What I mean is sort of a tail -f <file> command, but with grep on the output in order to keep only the lines that interest me.

    I've tried tail -f <file> | grep pattern but it seems that grep can only be executed once tail finishes, that is to say never.

  • glglgl
    glglgl over 12 years
    Which can last quite a while, so try not to get impatient.
  • Matthieu Napoli
    Matthieu Napoli over 12 years
    How long can it take approximately?
  • tripleee
    tripleee over 12 years
    @Matthieu: Depends mainly on what you grep for, and how large the buffers are on your OS. If the grep only matches a short string every few hours, it will be days before the first flush.
  • XzKto
    XzKto over 12 years
    Tail doesn't use output buffering - grep does.
  • Peter V. Mørch
    Peter V. Mørch almost 12 years
    And this has the advantage that it can be used for many other commands besides grep.
  • Peter V. Mørch
    Peter V. Mørch almost 12 years
    However, as I've discovered after playing more with it, some commands only flush their output when connected to a tty, and for that, unbuffer (in the expect-dev package on debian) is king. So I'd use unbuffer over stdbuf.
  • XzKto
    XzKto almost 12 years
    @Peter V. Mørch Yes, you are right, unbuffer can sometimes work where stdbuf can't. But I think you are trying to find a 'magic' program that will always fix your problems instead of understanding your problem. Creating a virtual tty is an unrelated task. Stdbuf does exactly what we want (sets the standard output buffer to a given value), while unbuffer does a lot of hidden stuff that we may not want (compare interactive top with stdbuf and unbuffer). And there is really no 'magic' solution: unbuffer fails sometimes too, for example awk uses a different buffer implementation (stdbuf will fail too).
  • Peter V. Mørch
    Peter V. Mørch almost 12 years
    "But I think you are trying to find a 'magic' program that will always fix your problems instead of understanding your problem." - I think you're right! ;-)
  • tripleee
    tripleee about 9 years
    This is not correct; Awk out of the box performs line buffering, just like most other standard Unix tools. (Moreover, the {print $0} is redundant, as printing is the default action when a condition passes.)
  • Tor Klingberg
    Tor Klingberg about 9 years
    Some more info about stdbuf, unbuffer, and stdio buffering at pixelbeat.org/programming/stdio_buffering
  • Michael Niemand
    Michael Niemand almost 9 years
    what happens if I tail a log file that gets rotated, while this is running? Will logrotate be able to rotate the file?
  • jcfrei
    jcfrei almost 9 years
    @MichaelNiemand you could use tail -F file | grep --line-buffered my_pattern
  • Colin
    Colin almost 9 years
    and make sure to take out your usual flags that you don't want like -r (for other dunces out there)
  • Michael Goldshteyn
    Michael Goldshteyn over 8 years
    No, grep does not do output buffering when the output is going to a tty device, as it clearly is in this answer. It does line buffering! This is the correct answer and should be the accepted answer. See my longer comment to the currently accepted (wrong) answer for more details.
  • raine
    raine about 8 years
    @MichaelGoldshteyn Take it easy. People upvote it because they find this page when they google "grep line buffered" and it solves a problem for them which may not exactly be the one posed as the question.
  • harry
    harry about 8 years
    That's not actually correct. If grep is the last command in the pipe chain, it will act as you explain. However, if it's in the middle it will buffer around 8k output at a time.
  • caesarsol
    caesarsol about 8 years
    @MichaelGoldshteyn it appears to be true in certain situations. For example, I'm executing the command remotely in an automatic ssh session (specifying the command in the arguments). Do you have a more complete explanation? thanks!
  • M. Justin
    M. Justin over 7 years
    @MichaelGoldshteyn — I suspect the issue here is that people confirm that it gives the correct output by running the command themselves, and then upvote it, not realizing that the "--line-buffered" part is completely superfluous.
  • sjas
    sjas over 7 years
    I came here trying to grep the output of strace. Without the --line-buffered, it won't work.
  • aneroid
    aneroid over 7 years
    This is the solution which worked for me in Git Bash on Windows. (Deleted my similar answer below which included the optional - for grep, un-needed.)
  • Aasmund Eldhuset
    Aasmund Eldhuset over 7 years
    @MichaelGoldshteyn (and the upvoters of his comment): I have always had this problem with tail -f | grep, and --line-buffered solves it for me (on Ubuntu 14.04, GNU grep version 2.16). Where is the "use line buffering if stdout is a tty" logic implemented? In git.savannah.gnu.org/cgit/grep.git/tree/src/grep.c, line_buffered is set only by the argument parser.
  • AKS
    AKS about 7 years
    grep -C 3 <pattern>, replaces -A <N> and -B <N> if N is same.
  • rmeden
    rmeden about 7 years
    I don't think --line-buffered is the default option, at least not on ssh connections. I've always had this problem and never knew about --line-buffered until today... works great!
  • tripleee
    tripleee over 6 years
    The assumption that the next process started on the system ($BASHPID+1) will be yours is false in many situations, and this does nothing to solve the buffering problem which is probably what the OP was trying to ask about. In particular, recommending sed over grep here seems like merely a matter of (dubious) preference. (You can get p;q behavior with grep -m 1 if that's the point you are attempting to deliver.)
  • Richard Waite
    Richard Waite over 6 years
    @MichaelGoldshteyn I'm on macOS using BSD grep and without --line-buffered I get no output. However, after testing, it looks like GNU grep does what you describe. So like most things Unix, it depends on your platform's implementation. Since the question did not specify platform, your information appears to be false - after reviewing the code for BSD grep and comparing it to GNU grep, the behavior is definitely controlled by the --line-buffered option. It's just that only GNU grep flushes by default.
  • Wes Mason
    Wes Mason about 6 years
    @MichaelGoldshteyn as pointed out multiple times in reply to your longer answer your information only applies to GNU grep, not BSD or other implementations, so can you check your aggression and cock-suredness hey?
  • MUY Belgium
    MUY Belgium over 5 years
    Works: the sed command prints each line as soon as it is ready, while the grep command with --line-buffered did not. I sincerely do not understand the minus 1.
  • Christian Herr
    Christian Herr over 5 years
    It is heretofore established that buffering is the problem with grep. No special action is required to handle line buffering using sed, it is default behavior, hence my emphasis of the word stream. And true, there is no guarantee $BASHPID+1 will be the correct pid to follow, but since pid allocation is sequential and the piped command is assigned a pid immediately following, it is utterly probable.
  • shellter
    shellter almost 5 years
    @PeterV.Mørch "But I think you are trying to find a 'magic' ... Aren't we all ;-? Good luck to all.
  • Dudi Boy
    Dudi Boy over 2 years
    Excellent and exhaustive answer. Thanks