How do I use tee to redirect to grep?

Solution 1

$ ps aux | tee >(head -n1) | grep syslog
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND 
syslog     806  0.0  0.0  34600   824 ?        Sl   Sep07   0:00 rsyslogd -c4

The grep and head commands start at about the same time, and each reads the same input at its own pace, generally as the data becomes available. Several things can make the output 'unsynchronized', so that lines appear flipped; for example:

  1. The multiplexed data from tee actually gets sent to one process before the other, depending primarily on the implementation of tee. A simple tee implementation will read some amount of input, and then write it twice: Once to stdout and once to its argument. This means that one of those destinations will get the data first.

    However, pipes are buffered. The buffering is by bytes rather than by lines (a Linux pipe typically holds 64 KiB by default), and tee reads and writes in blocks, so one of the receiving commands can see everything it needs for its output (i.e. the grepped line) before the other command (head) has received any data at all.

  2. Notwithstanding the above, it's also possible that one of these commands receives the data but is unable to do anything with it in time, and then the other command receives more data and processes it quickly.

    For example, even if head and grep are sent the data one line at a time, if head is slow to deal with it (or gets delayed by kernel scheduling), grep can show its results before head even gets a chance to. To demonstrate, try adding a delay:

    ps aux | tee >(sleep 1; head -n1) | grep syslog

    This will almost certainly print the grep output first.

$ ps aux | tee >(grep syslog) | head -n1
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND

I believe you often only get one line here because head reads the first line of input, prints it, and exits, which closes its stdin. The next time tee writes to its stdout, the write fails because the reading end of the pipe is gone (by default tee is killed by SIGPIPE), so tee exits and stops reading the output from ps. The details could be implementation-dependent.

Effectively, the only data that is guaranteed to get through is the first line (because head controls this), plus possibly some further lines before head and tee exit.

The inconsistency with whether the second line appears is introduced by timing: head closes stdin, but ps is still sending data. These two events are not well-synchronized, so the line containing syslog still has a chance of making it to tee's argument (the grep command). This is similar to the explanations above.
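
The truncation can be reproduced without ps. The following is a minimal sketch, assuming a POSIX shell with mkfifo and awk; the scratch directory, the FIFO name side, and the 100000-line input are made up for illustration. A named FIFO stands in for >(grep ...), in the way one of the comments describes, and the side branch counts lines instead of grepping so that any loss is visible.

```shell
# Hypothetical scratch directory and FIFO name; neither comes from the original post.
dir=$(mktemp -d)
mkfifo "$dir/side"

# Side branch: count how many lines actually arrive through the FIFO.
grep -c '' "$dir/side" > "$dir/count" &

# Main branch: head exits after one line, which usually ends tee early.
first=$(awk 'BEGIN { for (i = 1; i <= 100000; i++) print i }' | tee "$dir/side" | head -n1)

# Wait for the background grep to finish counting.
wait

reached=$(cat "$dir/count")
echo "head printed: $first"
echo "lines that reached the side branch: $reached of 100000"
rm -r "$dir"
```

The count usually comes out well below 100000 and changes from run to run, which is the same race that makes the syslog line appear only sometimes.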

You can avoid this problem altogether by using commands that wait for all input before closing stdin/exiting. For example, use awk instead of head, which will read and process all its lines (even if they cause no output):

ps aux | tee >(grep syslog) | awk 'NR == 1'

But note that the lines can still appear out-of-order, as above, which can be demonstrated by:

ps aux | tee >(grep syslog) | (sleep 1; awk 'NR == 1')
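
If the order matters, one way to remove the race is to let a single process print both the header and the matches, so there is nothing left to synchronize. A minimal sketch, with a made-up sample function standing in for ps aux so that the result is reproducible:

```shell
# Canned stand-in for `ps aux` output (hypothetical data, not from a real system).
sample() {
    printf '%s\n' \
        'USER  PID COMMAND' \
        'root    1 init' \
        'syslog 806 rsyslogd -c4' \
        'henry 2290 pulseaudio --log-target=syslog'
}

# Print line 1 unconditionally, plus every line that matches the pattern.
# Lines are emitted in input order, so the header always comes first.
sample | awk 'NR == 1 || /syslog/'
```

With real data this would be ps aux | awk 'NR == 1 || /syslog/'. The awk process itself may also show up in the result, since its command line contains the pattern.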

Hope this wasn't too much detail, but there are a lot of simultaneous things interacting with each other. Separate processes run simultaneously without any synchronization, so their actions on any particular run can vary; sometimes it helps to dig deep into the underlying processes to explain why.

Solution 2

grep syslog is not always shown because it depends on timing. When you use a shell pipeline, the commands run almost simultaneously, and the key word here is "almost". If ps finishes scanning all processes before grep is launched, grep won't be in the list. You can get different results from run to run depending on the load of the system and so on.

A similar thing happens with your tee. The command inside >(...) runs in the background in a subshell, and it may start before or after grep. This is why the output order is inconsistent.

As for the tee question, its behavior is quite strange. This is because it is not being used in its normal way. It is run without any arguments, which means it should just copy data from its stdin to its stdout. But its stdout is redirected to a subshell running head (in the first case) or grep (in the second case), and it is also piped to the next command. I think that what happens in this case is actually implementation-dependent. For example, on my bash 4.2.28, nothing is ever written to the subshell's stdin. On zsh, it works reliably the way you would like (printing both the first line of ps and the searched lines) each time I try.
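
One way to check what tee actually receives is to echo the substitution itself, as suggested in the comments. The sketch below assumes bash is installed; the sample lines and the demo function are made up for illustration. It also shows a commonly used workaround for bash, where the command inside >(...) appears to inherit tee's stdout (the pipe into grep): sending head's output to stderr keeps the header away from grep.

```shell
# Show the argument that bash substitutes for >(cmd); the exact path is system-dependent.
bash -c 'echo >(true)'

# Canned stand-in for `ps aux` (hypothetical data), so the run is reproducible.
demo() {
    bash -c 'printf "%s\n" "USER PID COMMAND" "root 1 init" "syslog 806 rsyslogd" |
        tee >(head -n1 >&2) | grep syslog'
}

# The header goes to stderr and bypasses grep; the matches arrive on stdout.
# The relative order of the two streams on a terminal is still not guaranteed.
demo
```

Because the header and the matches travel on different streams, they can also be captured separately, for example with demo 2>header.txt.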

Author by

Rqomey

Senior Support Engineer in virtzilla

Updated on September 18, 2022

Comments

  • Rqomey
    Rqomey over 1 year

    I don't have much experience of using tee, so I hope this is not very basic.

    After viewing one of the answers to this question I came across a strange behaviour with tee.

    In order for me to output the first line, and a found line, I can use this:

    ps aux | tee >(head -n1) | grep syslog
    USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
    syslog     806  0.0  0.0  34600   824 ?        Sl   Sep07   0:00 rsyslogd -c4
    

    However, the first time I ran this (in zsh) the result was in the wrong order: the column headers were below the grep results (this did not happen again, however). So I tried to swap the commands around:

    ps aux | tee >(grep syslog) | head -n1
    USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
    

    Only the first line is printed, and nothing else! Can I use tee to redirect to grep, or am I doing this in the wrong manner?

    As I was typing this question, the second command actually worked once for me. I ran it again five times and it went back to the one-line result. Is this just my system? (I am running zsh within tmux.)

    Finally, why with the first command is "grep syslog" not shown as a result (there is only one result)?

    For control, here is the grep without the tee:

    ps aux | grep syslog
    syslog     806  0.0  0.0  34600   824 ?        Sl   Sep07   0:00 rsyslogd -c4
    henry    2290  0.0  0.1  95220  3092 ?        Ssl  Sep07   3:12 /usr/bin/pulseaudio --start --log-target=syslog
    henry   15924  0.0  0.0   3128   824 pts/4    S+   13:44   0:00 grep syslog
    

    Update: It seems that head is causing the whole command to be truncated (as indicated in the answer below). The command below now returns the following:

    ps aux | tee >(grep syslog) | head -n1
    USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
    syslog     806
    
    • Admin
      Admin over 11 years
      Not a direct answer to your question but it would be much cleaner to just do something like ps aux | sed -n -e '1p' -e '/syslog/p'.
  • Rqomey
    Rqomey over 11 years
    Excellent answer! I actually asked because I am interested in the underlying processes. When things are inconsistent I find it interesting. Would there be a better way to run ps aux | tee >(grep syslog) | head -n1 which would stop head closing stdout? Wow, this command has started to give output now, but, in line with your answer, it seems to be truncated: USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND syslog 806
  • mrb
    mrb over 11 years
    You can use something that doesn't close stdin instead of head. I've updated the answer with this example: ps aux | tee >(grep syslog) | awk 'NR == 1'
  • Krzysztof Adamski
    Krzysztof Adamski over 11 years
    I think you're not quite right about tee writing twice. It writes only once, as it technically does not have any arguments. Without arguments it only writes to its stdout, which is redirected to the subshell.
  • mrb
    mrb over 11 years
    @KrzysztofAdamski, when you use >(cmd), the shell creates a named pipe and passes that as an argument to the command (tee). Then tee is writing to stdout (piped to awk) and also to that argument. It is the same as mkfifo a_fifo ; grep ... a_fifo in one shell and ps | tee a_fifo | awk ... in another.
  • Krzysztof Adamski
    Krzysztof Adamski over 11 years
    @mrb: It seems that my bash is not working this way (but zsh is). Do you have any links where I could read more about this feature?
  • mrb
    mrb over 11 years
    @KrzysztofAdamski gnu.org/software/bash/manual/html_node/… — Try echo >(exit 0), which will echo the actual argument passed by the shell (in my case, it becomes /dev/fd/63). This should work the same on bash and zsh.
  • Krzysztof Adamski
    Krzysztof Adamski over 11 years
    @mrb: it is a very interesting feature that I didn't know about before, thank you. It is working in some strange way in bash, however; see pastebin.com/xFgRcJdF. Unfortunately I don't have time to investigate this now, but I will do it tomorrow.
  • Krzysztof Adamski
    Krzysztof Adamski over 11 years
    @mrb: it seems that bash does connect stdout from >(cmd) back to the pipe. Try running bash -c "ps aux | tee >(head -n 1) | grep PID". This is why it doesn't work on bash.