How do you use the command coproc in various shells?
Solution 1
Co-processes are a ksh feature (already in ksh88). zsh has had the feature from the start (early 90s), while it was only added to bash in 4.0 (2009).
However, the behaviour and interface differ significantly among the three shells.
The idea is the same, though: it allows starting a job in the background while being able to send it input and read its output, without having to resort to named pipes.
That is done with unnamed pipes in most shells, and with socketpairs in recent versions of ksh93 on some systems.
In a | cmd | b, a feeds data to cmd and b reads its output. Running cmd as a co-process allows the shell to be both a and b.
ksh co-processes
In ksh, you start a coprocess as:
cmd |&
You feed data to cmd by doing things like:
echo test >&p
or
print -p test
And read cmd's output with things like:
read var <&p
or
read -p var
cmd is started like any background job; you can use fg, bg, kill on it and refer to it by %job-number or via $!.
To close the writing end of the pipe cmd is reading from, you can do:
exec 3>&p 3>&-
And to close the reading end of the other pipe (the one cmd is writing to):
exec 3<&p 3<&-
You cannot start a second co-process unless you first save the pipe file descriptors to some other fds. For instance:
tr a b |&
exec 3>&p 4<&p
tr b c |&
echo aaa >&3
echo bbb >&p
zsh co-processes
In zsh, co-processes are nearly identical to those in ksh. The only real difference is that zsh co-processes are started with the coproc keyword.
coproc cmd
echo test >&p
read var <&p
print -p test
read -p var
Doing:
exec 3>&p
Note: this doesn't move the coproc file descriptor to fd 3 (as it would in ksh), but duplicates it. So there's no explicit way to close the feeding or reading pipe, other than starting another coproc.
For instance, to close the feeding end:
coproc tr a b
echo aaaa >&p # send some data
exec 4<&p # preserve the reading end on fd 4
coproc : # start a new short-lived coproc (runs the null command)
cat <&4 # read the output of the first coproc
In addition to pipe-based co-processes, zsh (since 3.1.6-dev19, released in 2000) has pseudo-tty based constructs like expect. To interact with most programs, ksh-style co-processes won't work, since programs start buffering when their output is a pipe.
Here are some examples.
Start the co-process x:
zmodload zsh/zpty
zpty x cmd
(Here, cmd is a simple command. But you can do fancier things with eval or functions.)
Feed a co-process data:
zpty -w x some data
Read co-process data (in the simplest case):
zpty -r x var
Like expect, it can wait for some output from the co-process matching a given pattern.
bash co-processes
The bash syntax is a lot newer, and builds on top of a new feature recently added to ksh93, bash, and zsh that provides a syntax to allow handling of dynamically-allocated file descriptors above 10.
bash offers a basic coproc syntax, and an extended one.
Basic syntax
The basic syntax for starting a co-process looks like zsh's:
coproc cmd
In ksh or zsh, the pipes to and from the co-process are accessed with >&p and <&p.
But in bash, the file descriptors of the pipe from the co-process and of the other pipe to the co-process are returned in the $COPROC array (respectively ${COPROC[0]} and ${COPROC[1]}). So…
Feed data to the co-process:
echo xxx >&"${COPROC[1]}"
Read data from the co-process:
read var <&"${COPROC[0]}"
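Putting those two together, here is a sketch of one complete round trip with the basic syntax (it assumes bash ≥ 4.3 for the {var}>&- closing form discussed below; the read end is duplicated first because bash tears the COPROC variable down as soon as the co-process exits):

```shell
#!/usr/bin/env bash
# Round trip through a coproc running tr, with two precautions:
# 1. duplicate the read end before the co-process can exit (bash unsets
#    COPROC when it reaps the child);
# 2. close the write end so tr sees EOF and flushes its pipe-buffered output.
coproc tr a-z A-Z
exec {out}<&"${COPROC[0]}"      # keep our own copy of the read end
printf 'hello\n' >&"${COPROC[1]}"
exec {COPROC[1]}>&-             # send EOF (bash >= 4.3 syntax)
IFS= read -r line <&"$out"
exec {out}<&-
echo "$line"                    # prints HELLO
```

The explicit close before the read is what avoids the buffering deadlock described further down.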
With the basic syntax, you can start only one co-process at a time.
Extended syntax
In the extended syntax, you can name your co-processes (like in zsh zpty co-processes):
coproc mycoproc { cmd; }
The command has to be a compound command. (Notice how the example above is reminiscent of function f { ...; }.)
This time, the file descriptors are in ${mycoproc[0]} and ${mycoproc[1]}.
You can start more than one co-process at a time—but you do get a warning when you start a co-process while one is still running (even in non-interactive mode).
You can close the file descriptors when using the extended syntax.
coproc tr { tr a b; }
echo aaa >&"${tr[1]}"
exec {tr[1]}>&-
cat <&"${tr[0]}"
Note that closing that way doesn't work in bash versions prior to 4.3, where you have to write it instead as:
fd=${tr[1]}
exec {fd}>&-
As in ksh and zsh, those pipe file descriptors are marked as close-on-exec. But in bash, the only way to pass those to executed commands is to duplicate them to fds 0, 1, or 2. That limits the number of co-processes you can interact with for a single command. (See below for an example.)
yash process and pipeline redirection
yash doesn't have a co-process feature per se, but the same concept can be implemented with its pipeline and process redirection features. yash has an interface to the pipe() system call, so this kind of thing can be done relatively easily by hand there.
You'd start a co-process with:
exec 5>>|4 3>(cmd >&5 4<&- 5>&-) 5>&-
That first creates a pipe(4,5) (5 the writing end, 4 the reading end), then redirects fd 3 to a pipe to a process that runs with its stdin at the other end, and its stdout going to the pipe created earlier. Then we close the writing end of that pipe in the parent, which we won't need. So now in the shell we have fd 3 connected to cmd's stdin and fd 4 connected to cmd's stdout with pipes.
Note that the close-on-exec flag is not set on those file descriptors.
To feed data:
echo data >&3 4<&-
To read data:
read var <&4 3>&-
And you can close fds as usual:
exec 3>&- 4<&-
Now, why they are not so popular
hardly any benefit over using named pipes
Co-processes can easily be implemented with standard named pipes. I don't know when exactly named pipes were introduced, but it's possible it was after ksh came up with co-processes (probably in the mid 80s; ksh88 was "released" in 88, but I believe ksh was used internally at AT&T a few years before that), which would explain why.
cmd |&
echo data >&p
read var <&p
Can be written with:
mkfifo in out
cmd <in >out &
exec 3> in 4< out
echo data >&3
read var <&4
Interacting with those is more straightforward—especially if you need to run more than one co-process. (See examples below.)
The only benefit of using coproc is that you don't have to clean up those named pipes after use.
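To make the equivalence concrete, here is the named-pipe pattern above as a runnable sketch (the temporary directory and the tr command are just for illustration):

```shell
#!/usr/bin/env bash
# Same round trip as a co-process, but with named pipes: no special shell
# syntax, and the fds behave like any other redirection.
dir=$(mktemp -d)
mkfifo "$dir/in" "$dir/out"
tr a-z A-Z <"$dir/in" >"$dir/out" &
exec 3>"$dir/in" 4<"$dir/out"   # opening the fifos unblocks tr's opens
echo hello >&3
exec 3>&-                        # EOF: tr flushes its output and exits
IFS= read -r line <&4
exec 4<&-
rm -rf "$dir"
echo "$line"                     # prints HELLO
```

Note the ordering: the background tr opens the input fifo first, so the shell's exec 3>"$dir/in" does not block indefinitely.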
deadlock-prone
Shells use pipes in a few constructs:
- shell pipes: cmd1 | cmd2,
- command substitution: $(cmd),
- and process substitution: <(cmd), >(cmd).
In those, the data flows in only one direction between different processes.
With co-processes and named pipes, though, it's easy to run into deadlock. You have to keep track of which command has which file descriptor open, to prevent one staying open and holding a process alive. Deadlocks can be tricky to investigate, because they may occur non-deterministically; for instance, only when enough data to fill one pipe up has been sent.
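The "pipe fills up" case can be demonstrated safely with a timeout. This hedged sketch assumes GNU timeout and dd and Linux's default 64 KiB pipe buffer: a writer whose co-process never reads blocks as soon as the kernel buffer is full:

```shell
#!/usr/bin/env bash
# Fill the pipe of a co-process that never reads its stdin: the writer
# blocks once the kernel pipe buffer (64 KiB by default on Linux) is full,
# so the 1 MiB write cannot complete and timeout kills it.
coproc noread { exec sleep 30; }   # holds the read end open, never reads
if timeout 2 dd if=/dev/zero bs=1M count=1 >&"${noread[1]}" 2>/dev/null; then
  status="no deadlock"
else
  status="writer blocked"          # timeout killed the stuck dd
fi
kill "$noread_PID" 2>/dev/null
echo "$status"                     # prints: writer blocked
```

In a real script there is no timeout watching over the write, and this is exactly the hang you get.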
works worse than expect for what it's been designed for
The main purpose of co-processes was to provide the shell with a way to interact with commands. However, it does not work so well.
The simplest form of deadlock mentioned above is:
tr a b |&
echo a >&p
read var <&p
Because its output doesn't go to a terminal, tr buffers its output. So it won't output anything until either it sees end-of-file on its stdin, or it has accumulated a buffer-full of data to output. So above, after the shell has output a\n (only 2 bytes), the read will block indefinitely because tr is waiting for the shell to send it more data.
In short, pipes aren't good for interacting with commands. Co-processes can only be used to interact with commands that don't buffer their output, or commands which can be told not to buffer their output; for example, by using stdbuf with some commands on recent GNU or FreeBSD systems.
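A sketch of that stdbuf workaround (it assumes GNU stdbuf from coreutils, and it only helps with programs that use plain stdio buffering, such as GNU sed here):

```shell
#!/usr/bin/env bash
# With stdbuf forcing line-buffered output, the co-process answers each
# line immediately, so the read no longer deadlocks waiting for a full
# buffer or EOF.
coproc { stdbuf -oL sed 's/a/b/'; }
echo a >&"${COPROC[1]}"
IFS= read -r var <&"${COPROC[0]}"   # returns right away with "b"
exec {COPROC[1]}>&-
echo "$var"                          # prints b
```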
That's why expect or zpty use pseudo-terminals instead. expect is a tool designed for interacting with commands, and it does it well.
File descriptor handling is fiddly, and hard to get right
Co-processes can be used to do some more complex plumbing than what simple shell pipes allow.
That other Unix.SE answer has an example of coproc usage.
Here's a simplified example: imagine you want a function that feeds a copy of a command's output to 3 other commands, and then has the output of those 3 commands concatenated, all using pipes.
For instance: feed the output of printf '%s\n' foo bar to tr a b, sed 's/./&&/g', and cut -b2- to obtain something like:
foo
bbr
ffoooo
bbaarr
oo
ar
First, it's not necessarily obvious, but there's a possibility for deadlock there, and it will start to happen after only a few kilobytes of data.
Then, depending on your shell, you'll run into a number of different problems that have to be addressed differently.
For instance, with zsh, you'd do it with:
f() (
coproc tr a b
exec {o1}<&p {i1}>&p
coproc sed 's/./&&/g' {i1}>&- {o1}<&-
exec {o2}<&p {i2}>&p
coproc cut -c2- {i1}>&- {o1}<&- {i2}>&- {o2}<&-
tee /dev/fd/$i1 /dev/fd/$i2 >&p {o1}<&- {o2}<&- &
exec cat /dev/fd/$o1 /dev/fd/$o2 - <&p {i1}>&- {i2}>&-
)
printf '%s\n' foo bar | f
Above, the co-process fds have the close-on-exec flag set, but not the ones that are duplicated from them (as in {o1}<&p). So, to avoid deadlocks, you'll have to make sure they're closed in any processes that don't need them.
Similarly, we have to use a subshell and use exec cat at the end, to ensure there's no shell process lying around holding a pipe open.
With ksh (here ksh93), that would have to be:
f() (
tr a b |&
exec {o1}<&p {i1}>&p
sed 's/./&&/g' |&
exec {o2}<&p {i2}>&p
cut -c2- |&
exec {o3}<&p {i3}>&p
eval 'tee "/dev/fd/$i1" "/dev/fd/$i2"' >&"$i3" {i1}>&"$i1" {i2}>&"$i2" &
eval 'exec cat "/dev/fd/$o1" "/dev/fd/$o2" -' <&"$o3" {o1}<&"$o1" {o2}<&"$o2"
)
printf '%s\n' foo bar | f
(Note: that won't work on systems where ksh uses socketpairs instead of pipes, and where /dev/fd/n works like on Linux.)
In ksh, fds above 2 are marked with the close-on-exec flag, unless they're passed explicitly on the command line. That's why we don't have to close the unused file descriptors as with zsh, but it's also why we have to do {i1}>&$i1 and use eval for that new value of $i1, to be passed to tee and cat…
In bash this cannot be done, because you can't avoid the close-on-exec flag.
Above, it's relatively simple, because we use only simple external commands. It gets more complicated when you want to use shell constructs in there instead, and you start running into shell bugs.
Compare the above with the same using named pipes:
f() {
mkfifo p{i,o}{1,2,3}
tr a b < pi1 > po1 &
sed 's/./&&/g' < pi2 > po2 &
cut -c2- < pi3 > po3 &
tee pi{1,2} > pi3 &
cat po{1,2,3}
rm -f p{i,o}{1,2,3}
}
printf '%s\n' foo bar | f
Conclusion
If you want to interact with a command, use expect, or zsh's zpty, or named pipes.
If you want to do some fancy plumbing with pipes, use named pipes.
Co-processes can do some of the above, but be prepared to do some serious head scratching for anything non-trivial.
Solution 2
Co-processes were first introduced in a shell scripting language with the ksh88 shell (1988), and later in zsh at some point before 1993.
The syntax to launch a co-process under ksh is command |&. From there, you can write to command's standard input with print -p and read its standard output with read -p.
More than a couple of decades later, bash which was lacking this feature finally introduced it in its 4.0 release. Unfortunately, an incompatible and more complex syntax was selected.
Under bash 4.0 and newer, you can launch a co-process with the coproc command, e.g.:
$ coproc awk '{print $2;fflush();}'
You can then pass something to the command stdin that way:
$ echo one two three >&${COPROC[1]}
and read awk output with:
$ read -ru ${COPROC[0]} foo
$ echo $foo
two
Under ksh, that would have been:
$ awk '{print $2;fflush();}' |&
$ print -p "one two three"
$ read -p foo
$ echo $foo
two
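The bash session above can also be run as a self-contained script. This is just a sketch of the same commands; awk's fflush() forces each result line out of the pipe immediately, so no EOF is needed before reading (the final close assumes bash ≥ 4.3):

```shell
#!/usr/bin/env bash
# Interactive coproc round trip: awk echoes the second field of each input
# line, and fflush() defeats the pipe buffering that would otherwise make
# the read block.
coproc awk '{print $2; fflush()}'
echo one two three >&"${COPROC[1]}"
read -ru "${COPROC[0]}" foo
exec {COPROC[1]}>&-
echo "$foo"    # prints two
```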
Solution 3
Here is another good (and working) example: a simple server written in BASH. Please note that you need OpenBSD's netcat; the classic one won't work. Of course, you could also use an inet socket instead of a unix one.
server.sh:
#!/usr/bin/env bash
SOCKET=server.sock
PIDFILE=server.pid
(
exec </dev/null
exec >/dev/null
exec 2>/dev/null
coproc SERVER {
exec nc -l -k -U $SOCKET
}
echo $SERVER_PID > $PIDFILE
{
while read ; do
echo "pong $REPLY"
done
} <&${SERVER[0]} >&${SERVER[1]}
rm -f $PIDFILE
rm -f $SOCKET
) &
disown $!
client.sh:
#!/usr/bin/env bash
SOCKET=server.sock
coproc CLIENT {
exec nc -U $SOCKET
}
{
echo "$@"
read
} <&${CLIENT[0]} >&${CLIENT[1]}
echo $REPLY
Usage:
$ ./server.sh
$ ./client.sh ping
pong ping
$ ./client.sh 12345
pong 12345
$ kill $(cat server.pid)
$
Updated on September 18, 2022

Comments
- slm over 1 year: Can someone provide a couple of examples on how to use coproc?
- slm almost 11 years: Can you add some of the article to your answer? I was trying to get this topic covered in U&L since it seemed under-represented. Thanks for your answer! Also notice I set the tag as Bash, not zsh.
- Stéphane Chazelas almost 11 years: They're not special kinds of pipes; they are the same pipes as used with | (that is, pipes in most shells, and socketpairs in ksh93). Pipes and socketpairs are first-in, first-out; they're all FIFO. mkfifo makes named pipes; coprocesses don't use named pipes.
- cblemuel almost 11 years: @slm sorry for zsh... actually I work on zsh. I tend to go with the flow sometimes. It works fine in Bash too...
- cblemuel almost 11 years: @Stephane Chazelas I am pretty sure I read somewhere that its I/O is connected with special kinds of pipes called FIFOs...
- slm almost 11 years: @MunaiDasUdasin - if you can find the source, add it as a reference.
- jlliagre almost 11 years: @MunaiDasUdasin as already stated, all pipes are FIFO by design. Otherwise they wouldn't have been called pipes in the first place.
- cblemuel almost 11 years: @jlliagre as I said, "I read it somewhere", and I included the reference for it... and Stephane Chazelas already mentioned that pipes are FIFO by design... thank you for stating it again... And for future reference, I got the fact that pipes are FIFO... I am including the link again...
- Thomas Nyman almost 11 years: @MunaiDasUdasin In Unix, named pipes are commonly called FIFOs, although all pipes are first-in-first-out by design. Named pipes differ from traditional, unnamed pipes in that named pipes can persist beyond the lifetime of the process that created them, while unnamed pipes persist only for the lifetime of the process. It seems that the zsh implementation of coprocesses involves a named pipe p, but bash coprocesses, which the question is explicitly about, use traditional unnamed pipes.
- Stéphane Chazelas almost 11 years: @ThomasNyman, zsh or ksh co-processes don't use named pipes. It's >&p, not >p. The >&p is a special syntax that means: redirect to the co-process (using unnamed pipes). Unnamed pipes can persist after the lifetime of the process (in its children, or in a process that has opened /proc/pid/fd/n on Linux, for instance).
- Thomas Nyman almost 11 years: @StephaneChazelas You are of course correct, although whereas unnamed pipes are mostly used for parent-child communication, opening them through /proc/pid/fd/ is perhaps not typical usage. A better choice of words would probably have been that named pipes persist as long as the corresponding file system object exists, or the pipe is kept open by the processes accessing it, whereas unnamed pipes have no associated file system object (apart from the file descriptor in /proc/pid/fd/), and persist only as long as they are kept open by processes.
- mklement0 about 9 years: Great answer indeed. I don't know when specifically it was fixed, but as of at least bash 4.3.11 you can now close coproc file descriptors directly, without the need for an aux. variable; in terms of the example in your answer, exec {tr[1]}<&- would now work (to close the coproc's stdin; note that your code (indirectly) tries to close {tr[1]} using >&-, but {tr[1]} is the coproc's stdin, and must be closed with <&-). The fix must have come somewhere between 4.2.25, which still exhibits the problem, and 4.3.11, which doesn't.
- Stéphane Chazelas almost 9 years: @mklement0, thanks. exec {tr[1]}>&- does indeed seem to work with newer versions and is referenced in a CWRU/changelog entry (allow words like {array[ind]} to be valid redirection... 2012-09-01). exec {tr[1]}<&- (or the more correct >&- equivalent, though that makes no difference, as both just call close()) doesn't close the coproc's stdin, but the writing end of the pipe to that coproc.
- mklement0 almost 9 years: Thanks for researching, clarifying, and thanks for updating your answer. Can I suggest that you mention the 4.3 thing not only in the code comment, but also in the text above, such as "a bit cumbersome in Bash versions before 4.3"?
- Otheus over 7 years: One advantage over mkfifo is that you don't have to worry about race conditions and security for pipe access. You still have to worry about deadlock with fifos.
- shub about 7 years: About deadlocks: the stdbuf command can help to prevent at least some of them. I used it under Linux and bash. Anyway, I believe @StéphaneChazelas is right in the Conclusion: the "head scratching" phase ended for me only when I switched back to named pipes.
- mosvy almost 5 years: As to why they didn't use named pipes in ksh: named pipes were introduced very early on, long before ksh (I have a Unix System III running in an emulator, and it does have them), but they were adopted very late in BSD (in 4.4BSD), so they weren't a portable solution.
- maxschlepzig almost 5 years: "But in bash, the only way to pass those to executed commands is to duplicate them to fds 0, 1, or 2. That limits the number of co-processes you can interact with for a single command." Really? This works for me: exec 3<&${p[0]}; cat /proc/$$/fd/3 in bash 4.4.23 (for a coproc named p). Also, I don't understand: "In bash this cannot be done, because you can't avoid the close-on-exec flag." I mean, can't you avoid it by duplicating the file descriptor (because the close-on-exec flag isn't duplicated...)?
- maxschlepzig almost 5 years: Ok, regarding the 2nd part: I've straced a bash script, and bash directly sets FD_CLOEXEC on the duplicated file descriptor with fcntl() after the dup2() call.
- Miles Rout almost 4 years: Truly a brilliant answer
- ddekany about 3 years: There should be a huge warning telling that if the co-process exits, it will concurrently unset the variables through which you access the file descriptors and PID. So, you must ensure that the consumer controls when the co-process will exit, or else the script will "randomly" fail (depending on timing). At least bash (4.2.46) does this.