Why pipe to cat only to redirect?
Solution 1
cat file | wc | cat > file2
would usually be two useless uses of cat
as that's functionally equivalent to:
< file wc > file2
However, there may be a case for:
cat file | wc -c
over
< file wc -c
That is to disable the optimisation that many wc
implementations do for regular files.
For regular files, the number of bytes in the file can be obtained without reading its whole content, just by doing a stat() system call on it and retrieving the size as stored in the inode.
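To make that concrete, here is a minimal sketch that compares the size recorded in the inode with what wc -c reports when reading from the file directly and through a pipe. The scratch file and the variable names are made up for the illustration, and it assumes mktemp and awk are available.

```shell
# Hypothetical scratch file used only for this illustration.
f=$(mktemp)
printf 'hello world\n' > "$f"

# Size as recorded in the inode (5th field of ls -ln); no data is read.
inode_size=$(ls -ln "$f" | awk '{print $5}')

# wc with the regular file on stdin: free to use the stat() shortcut.
direct=$(wc -c < "$f")

# wc with a pipe on stdin: it has to consume every byte to count them.
piped=$(cat "$f" | wc -c)

echo "inode=$inode_size direct=$direct piped=$piped"
rm -f "$f"
```

For an ordinary file like this one all three numbers should be 12 (some wc implementations pad the count with spaces). The forms only disagree when the stat() information does not reflect what a read would return.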
Now, one may want the file to be read for instance because:
- the stat() information cannot be trusted (like for some files in /proc or /sys on Linux):
  $ < /sys/class/net/lo/mtu wc -c
  4096
  $ cat /sys/class/net/lo/mtu | wc -c
  6
- one wants to check how much of the data can be read (like in case of a failing hard drive).
- one just wants to obtain benchmarks on how fast the data can be read.
- one wants for the content of the file to be cached in memory.
Of course, those are exceptions. In the general case, you'd rather use < file wc -c
for performance reasons.
Now, you can imagine even more far-fetched scenarios where one may want to use cat file | wc | cat > file2:
- maybe wc has an apparmor profile or other security mechanism that prohibits it from reading or writing to files while it's allowed for cat (that would be unheard of).
- maybe cat is able to deal with large (as in > 2^32 bytes) files, but not wc on that system (things like that have been needed for some commands on some systems in the past).
- maybe one wants wc (and the first cat) to run and read the whole file (and be killed at the very last minute) even if file2 can't be opened for writing.
- maybe one wants to hide the failure (exit status) of opening or reading the content of file. Though wc < file > file2 || : would make more sense.
- maybe one wants to hide (from the output of lsof (list open files)) the fact that he's getting a word count from file or that he's storing a word count in file2.
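The exit-status point in the list above can be checked with a small sketch. The path in missing is a hypothetical one assumed not to exist, the variable names are illustrative, and the shell is assumed to run without an option such as pipefail, so that a pipeline reports the status of its last command.

```shell
out=$(mktemp)
missing=/nonexistent/file   # hypothetical path, assumed not to exist

# cat fails to open the file, but the pipeline reports the status of wc.
cat "$missing" 2>/dev/null | wc > "$out"
pipeline_status=$?

# The shell itself fails to open the file, so wc is never started.
{ wc < "$missing" > "$out"; } 2>/dev/null
redirect_status=$?

# Same command, with the failure explicitly discarded by "|| :".
{ wc < "$missing" > "$out"; } 2>/dev/null || :
hidden_status=$?

echo "pipeline=$pipeline_status redirect=$redirect_status hidden=$hidden_status"
rm -f "$out"
```

The pipeline and the "|| :" form should both report 0, while the plain redirection reports a non-zero status whose exact value depends on the shell.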
Solution 2
Both of those examples are useless uses of cat. Both are equivalent to wc < file1 > file2. There is no reason to use cat in this example, unless you are using cat file as a temporary stand-in for something that dynamically generates output.
Solution 3
Let's suppose prog forks a new subprocess and exits, and the new subprocess writes something to its standard output and then exits.
Then the command
prog
won't wait for the subprocess to exit, and it will display the shell prompt early. But the command
prog | cat
will wait for an EOF on the stdin of cat, which effectively waits for the subprocess to exit. So this is a useful use of cat.
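A minimal sketch of that behaviour, with a shell function standing in for prog (the function body, the one-second delay and the variable names are assumptions made for the illustration, not part of the answer):

```shell
# Hypothetical stand-in for prog: start a background writer, return at once.
prog() {
    ( sleep 1; echo "from the subprocess" ) &
}

out=$(mktemp)

# Without cat: prog returns immediately, before anything has been written.
prog > "$out"
early=$(cat "$out")
wait    # reap the background writer before moving on

# With cat: cat only sees EOF once every writer has closed the pipe.
prog | cat > "$out"
late=$(cat "$out")

echo "early='$early' late='$late'"
rm -f "$out"
```

In shells such as bash or dash this should leave early empty and late set to the subprocess's line, since cat keeps reading until the background writer exits.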
OJFord
Updated on September 18, 2022
Comments
-
OJFord almost 2 years
I occasionally see things like:
cat file | wc | cat > file2
Why do this?
When will the results (or performance) differ (favourably) from simply:
cat file | wc > file2
-
OJFord almost 9 years
Ha, didn't even catch the first one. Thanks for link :) bookmarked to read the other "useless uses of ~" later.
-
alephzero almost 9 years
The first usage of cat is not necessarily useless here. The command wc file prints the counters followed by the name of the file. The command cat file | wc does not print the name of the file. The second cat is useless. wc file1 file2 prints two lines of counts, one for each file (plus the file names). cat file1 file2 | wc prints one line with the total counts, and no file names.
-
Lily Chung almost 9 years
@alephzero Reread the answer: cat file | wc is equivalent to wc < file1.
-
Nate Eldredge almost 9 years
@IstvanChung: Interestingly, on my system they're actually not equivalent. cat file |wc separates the line/word/character counts with more spaces than wc <file does. I don't know why.
-
R.. GitHub STOP HELPING ICE almost 9 years
@IstvanChung: They're not equivalent. wc < file1 causes wc to run with stdin being a file descriptor for a regular seekable, mmappable file, file1. cat file1 | wc causes wc to run with a non-seekable pipe on stdin.
-
Reid almost 9 years
+1 for the last sentence. Often, a "useless" cat is a handy placeholder to be able to pop other commands in and out without rearranging the pipeline.
-
jimmij almost 9 years
Valid point, at least for bash. To test it one may run ( (while ((i<10)); do echo $((i++)); sleep 1; done) & exit ; ) | cat with and without the final cat.
-
Iwillnotexist Idonotexist almost 9 years
@alephzero +1, but, danger danger: If file1 does not end in a linefeed, then cat file1 file2 | wc will count one fewer lines, and potentially one fewer words, than would wc file1 file2 (which sees the "break" between file1 and file2).
-
Stéphane Chazelas almost 9 years
< file wc > file2 is Bourne and POSIX and also works in (t)csh, rc and es. The only shell I could find that doesn't support it is fish (which is the most modern of them all). It even worked in the pre-Bourne sh of Unix V1 in 1970!
-
Ian Ringrose almost 9 years
@StéphaneChazelas, you are assuming that scripts only have to work on "unix"!
-
Stéphane Chazelas almost 9 years
As far as this unix.stackexchange.com Q&A site is concerned, that's Unix-like systems yes (though other POSIX systems are also covered). And as far as this question is concerned, you'd expect a system where cat (a typical Unix command) is available.
-
Stéphane Chazelas almost 9 years
On the other hand, with shells like the Bourne shell, AT&T ksh or yash, prog | cat could return before prog has returned. (In those shells, it will return as soon as cat returns, which will happen as soon as prog (and its children if any) has closed all its fds to the pipe.) Try for instance with prog being sh -c 'echo A; exec >&-; sleep 2; echo >&2 B'.