How to stop sed from buffering?

Solution 1

An alternative way to stop sed from buffering is to run the script through the s2p sed-to-Perl translator and insert a directive that makes its output command-buffered (flushed after every print), perhaps like

BEGIN { $| = 1 }

The other reason to do this is that it gives you the more convenient ERE notation instead of the backslash-heavy legacy BREs. You also get the full complement of Unicode properties, which is often critical.

But you don't need the translator for such a simple sed command. And you do not need both grep and sed, either. These all work:

perl -nle 'BEGIN{$|=1} if (/good:/) { s/.*:(.*)/I got: $1/; print }'

perl -nle 'BEGIN{$|=1} next unless /good:/; s/.*:(.*)/I got: $1/; print'

perl -nle 'BEGIN{$|=1} next unless /good:/; s/.*:/I got: /; print'

Now you also have access to the minimal quantifiers *?, +?, ??, {N,}?, and {N,M}?. These allow things like .*? or \S+? or [\p{Pd}.]??, which may well be preferable.

Solution 2

I think I found it. For some reason, grep doesn't automatically do line buffering. I added a --line-buffered option to grep and now it responds immediately.

Solution 3

You only need to tell grep and sed not to buffer their output:

grep --line-buffered 

and

sed -u
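Put together, the pipeline from the question might look like the sketch below. It assumes GNU grep and GNU sed, since --line-buffered and -u are not available everywhere, and it uses a plain pipe rather than fd 3 so that the output can be captured.

```shell
# Same filter as in the question, but with both stages told not to hold
# output back. With fd 3 this would be written as:
#   exec 3> >(grep --line-buffered "good:" | sed -u 's/.*:\(.*\)/I got: \1/')
result=$(printf 'bad:data1\ngood:data2\n' |
  grep --line-buffered 'good:' |
  sed -u 's/.*:\(.*\)/I got: \1/')
echo "$result"
```

The important part is grep: its stdout is a pipe, so without --line-buffered it holds output until its buffer fills or its input is closed.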

Solution 4

You can merge the grep into the sed like so:

exec 3> >(sed -une '/^good:/s//I got: /p')
echo "bad:data1">&3
echo "good:data2">&3

Unpacking that a bit: You can put a regexp (between slashes as usual) before any sed command, which makes it only be applied to lines that match that regexp. If the first regexp argument to the s command is the empty string (s//whatever/) then it will reuse the last regexp that matched, which in this case is the prefix, so that saves having to repeat yourself. And finally, the -n option tells sed to print only what it is specifically told to print, and the /p suffix on the s command tells it to print the result of the substitution.
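To try the merged command on its own, without the fd 3 plumbing, a minimal sketch with the same two sample lines could be:

```shell
# /^good:/ selects the lines; the empty regexp in s//.../ reuses that same
# pattern; -n plus the p flag prints only the lines where the substitution ran.
result=$(printf 'bad:data1\ngood:data2\n' | sed -n -e '/^good:/s//I got: /p')
echo "$result"
```

The -u flag is left out here because buffering does not matter when the input ends right away.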

The -e option is not strictly necessary, but it is good style: it just means "the next argument is the sed script, not a filename".

Always put sed scripts in single quotes unless you need to substitute a shell variable in there, and even then I would put everything but the shell variable in single quotes (the shell variable is, of course, double-quoted). You avoid a bunch of backslash-related grief that way.
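That quoting advice can be sketched like this. The variable prefix is hypothetical, and its value is still interpreted by sed, so it must not contain /, & or backslashes unless you escape them.

```shell
# Hypothetical shell variable holding the replacement text.
prefix='I got: '
# Everything is single-quoted except the variable, which is double-quoted.
result=$(echo 'good:data2' | sed -n -e '/^good:/s//'"$prefix"'/p')
echo "$result"
```

The shell joins the three adjacent quoted pieces into one argument before sed sees it.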

Solution 5

On a Mac, brew install coreutils and use gstdbuf to control buffering of grep and sed.
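A rough sketch of that approach, written with the GNU/Linux name stdbuf (substitute gstdbuf if that is what Homebrew installed). It assumes stdbuf is on the PATH and that grep and sed use the standard C library's buffering, which is what stdbuf adjusts.

```shell
# -oL asks each command to line-buffer its standard output.
result=$(printf 'bad:data1\ngood:data2\n' |
  stdbuf -oL grep 'good:' |
  stdbuf -oL sed 's/.*:\(.*\)/I got: \1/')
echo "$result"
```

This is mostly useful for tools that have no buffering option of their own.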

Author by

Sousou

Updated on February 11, 2021

Comments

  • Sousou
    Sousou almost 3 years

    I have a program that writes to fd3 and I want to process that data with grep and sed. Here is how the code looks so far:

    
    exec 3> >(grep "good:"|sed -u "s/.*:\(.*\)/I got: \1/")
    echo "bad:data1">&3
    echo "good:data2">&3
    

    Nothing is output until I do a

    exec 3>&-

    Then, everything that I wanted finally arrives as I expected:

    I got: data2
    

    It seems to reply immediately if I use only a grep or only a sed, but mixing them seems to cause some sort of buffering. How can I get immediate output from fd3?

  • zwol
    zwol almost 13 years
    grep expects to be fed either a huge file or a huge pipe-stream. Fully buffered is much more efficient in those cases. (Be glad you're on Linux. This OSX computer I'm typing this on doesn't have line-buffering options for either grep or sed.)
  • Stephen P
    Stephen P almost 13 years
    Great info -- I used to know all that!
  • Sousou
    Sousou almost 13 years
    Wow. I'm so sorry about your OSX machine. Even more sorry that OSX actually costs money. Less features, more money, how did Apple pull that off?
  • Sousou
    Sousou almost 13 years
    sed, awk, grep, perl..they are all so powerful. I wish it were all just one utility because I just don't have bandwidth to learn all of them.
  • zwol
    zwol almost 13 years
    The right way to think about it is, /bin/sh + everything in "coreutils" is one utility that happens to be divided into many executables. Awk and perl are their own things, though, and honestly, nowadays I don't think it's worth bothering to learn awk, because perl is strictly better, especially if you learn its command line options. (You used to have to worry about systems that didn't have perl, but I don't think that's relevant anymore.) Also, a critical skill in this area is knowing when to abandon shell and use perl or python for the whole thing instead.
  • zwol
    zwol almost 13 years
    OSX isn't really for command line hacking. I'm glad they kept the BSD underpinnings around, but on this computer I spend far more time in things which are not CLI.
  • Sousou
    Sousou almost 13 years
    Really? I love the BSD license concept, but from the little that I've seen, much of BSD software is missing several nice features. Are there some nice features in BSD not in GNU or are you just saying that BSD bash is nicer than AppleScript (or whatever Mac used to have)?
  • Sousou
    Sousou almost 13 years
    Yeah, I'm starting to notice the failings of awk myself. I really liked its C-ish syntax, but perl definitely has the extra features when you need them. I'm still so-so on python. Does it have useful extras missing from perl?
  • zwol
    zwol almost 13 years
    Perl versus Python is almost entirely a matter of personal taste, IMO; both have nearly identical capabilities, and oodles of libraries. Personally I find Python a much better match for the way I want to think about problems that need nontrivial data structures, and nowadays I reach for it for text bashing (beyond what sed can do reasonably) as well, just because I'm more in practice with it. But I know plenty of people who feel just the opposite.
  • zwol
    zwol almost 13 years
    I think you misunderstand. I have a Mac ... mainly because my job standardized on Macs. And it is nice for the things it is actually optimized for, i.e. big GUI applications. I am a Unix weenie at heart, though, and I'm happy that there is a Unix shell environment for dinking around with, but if I want to do anything serious I go use my Linux box. The BSDness of that environment is accidental; that's just what Apple happened to use for their kernel. (Apple does have a serious hate-on for the GPL, alas. I do think the GNU shell environment is superior.)
  • Sousou
    Sousou almost 13 years
    I see. I was actually hoping there were some cool BSD features unknown to Linux geeks like me.
  • Sousou
    Sousou almost 13 years
    I actually went the perl route as you recommended. I was not aware of its usefulness in bash scripts. perl and bash definitely go well together.
  • kostix
    kostix almost 6 years
    JFTR, it's just stdbuf on GNU/Linux systems.
