Why does redirecting the output of a file to itself produce a blank file?
Solution 1
When you use >, the file is opened in truncation mode, so its contents are removed before the command attempts to read it.
When you use >>, the file is opened in append mode, so the existing data is preserved. It is, however, still pretty risky to use the same file as input and output in this case. If the file is too large to fit in the read input buffer, its size might grow indefinitely until the file system is full (or your disk quota is reached).
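A quick way to see the truncation for yourself, as a minimal sketch: it works in a throwaway directory created with mktemp -d, and the file name and sample text are placeholders chosen for the demonstration.

```shell
#!/bin/sh
# Work in a scratch directory so no real file is touched.
scratch=$(mktemp -d) || exit 1
printf 'some text\n' > "$scratch/foo.txt"

# Size before: 10 bytes ("some text" plus a newline).
before=$(( $(wc -c < "$scratch/foo.txt") ))

# The shell truncates foo.txt while setting up ">", then starts fold,
# which finds nothing left to read.
fold "$scratch/foo.txt" > "$scratch/foo.txt"

after=$(( $(wc -c < "$scratch/foo.txt") ))
echo "before: $before bytes, after: $after bytes"

rm -rf "$scratch"
```

The sizes are captured in variables before the scratch directory is removed, so the result can still be inspected afterwards.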
Should you want to use a file both as input and output with a command that doesn't support in-place modification, you can use a couple of workarounds:
- Use an intermediary file and overwrite the original one when done, and only if no error occurred while running the utility (this is the safest and most common way).
fold foo.txt > fold.txt.$$ && mv fold.txt.$$ foo.txt
- Avoid the intermediary file at the expense of a potential partial or complete data loss should an error or interruption happen. In this example, the contents of foo.txt are passed as input to a subshell (inside the parentheses) before the file is deleted. The previous inode stays alive because the subshell keeps it open while reading data. The file written by the inner utility (here fold), while having the same name (foo.txt), points to a different inode because the old directory entry has been removed, so technically there are two different "files" with the same name during the process. When the subshell ends, the old inode is released and its data is lost. Make sure you have enough space to temporarily store both the old file and the new one at the same time, otherwise you'll lose data.
(rm foo.txt; fold > foo.txt) < foo.txt
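The intermediary-file workaround can be wrapped in a small function. The sketch below is only one possible shape: rewrite_in_place is a hypothetical helper name, it uses mktemp instead of the $$ suffix shown above, and it assumes the command reads standard input and writes standard output.

```shell
#!/bin/sh
# rewrite_in_place FILE COMMAND [ARGS...]
# Hypothetical helper: runs COMMAND with FILE on stdin, collects its output
# in a temporary file next to FILE, and replaces FILE only on success.
# Note: plain sh has no "local", so target and tmp are global variables.
rewrite_in_place() {
    target=$1
    shift
    tmp=$(mktemp "$target.XXXXXX") || return 1
    if "$@" < "$target" > "$tmp"; then
        mv -- "$tmp" "$target"
    else
        rm -f -- "$tmp"   # leave the original untouched on failure
        return 1
    fi
}

# Demo in a scratch directory: one 100-character line, folded at 80 columns.
scratch=$(mktemp -d) || exit 1
printf '%0100d\n' 0 > "$scratch/foo.txt"
rewrite_in_place "$scratch/foo.txt" fold -w 80
lines=$(( $(wc -l < "$scratch/foo.txt") ))

# A failing command must not clobber the file.
rewrite_in_place "$scratch/foo.txt" false
status=$?
lines_after_failure=$(( $(wc -l < "$scratch/foo.txt") ))

echo "lines: $lines, failed run status: $status, lines kept: $lines_after_failure"
rm -rf "$scratch"
```

Because mv replaces the directory entry, the rewritten file is a new inode carrying the permissions mktemp gave it (owner read/write only), and any hard links to the old file keep pointing at the old contents. Restoring the original mode with chmod may be needed in real use.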
Solution 2
The file is opened for writing by the shell before the application has a chance to read it. Opening the file for writing truncates it.
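To illustrate that the shell, not the application, does the truncating, here is a small sketch using true, a command that reads nothing and writes nothing. The scratch directory and file contents are assumptions made for the demonstration.

```shell
#!/bin/sh
scratch=$(mktemp -d) || exit 1
printf 'some text\n' > "$scratch/foo.txt"

# "true" ignores its input and produces no output, yet the file ends up
# empty: the truncation happens while the shell sets up the redirection.
true > "$scratch/foo.txt"
size_after_true=$(( $(wc -c < "$scratch/foo.txt") ))

echo "size after 'true > foo.txt': $size_after_true bytes"
rm -rf "$scratch"
```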
Solution 3
In bash, the stream redirection operator ... > foo.txt empties foo.txt before evaluating the left operand.
One might use command substitution and print its result as a workaround. This solution takes fewer additional characters than those in other answers:
printf '%s\n' "$(less foo.txt)" > foo.txt
Beware: This command does not preserve any trailing newline(s) in foo.txt. Have a look at the comment section below for more information.
Here, the command substitution $(...) is evaluated before the stream redirection operator >, hence the preservation of information.
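The trailing-newline loss, and the sentinel trick suggested in the comments, can be compared side by side. This is a sketch under a few assumptions: it uses cat rather than less to read the file, the sample content and scratch directory are made up for the demonstration, and the sentinel character q is arbitrary.

```shell
#!/bin/sh
scratch=$(mktemp -d) || exit 1
file=$scratch/foo.txt

# Sample content: one line followed by two empty lines (7 bytes in total).
printf 'line\n\n\n' > "$file"

# Plain command substitution drops every trailing newline; printf adds one back.
printf '%s\n' "$(cat "$file")" > "$file"
plain_size=$(( $(wc -c < "$file") ))

# Sentinel variant: the appended "q" shields the trailing newlines from
# being stripped, and ${content%q} removes it again afterwards.
printf 'line\n\n\n' > "$file"
content=$(cat "$file"; printf q)
printf '%s' "${content%q}" > "$file"
sentinel_size=$(( $(wc -c < "$file") ))

echo "plain: $plain_size bytes, sentinel: $sentinel_size bytes"
rm -rf "$scratch"
```

Shell variables cannot hold NUL bytes, so even the sentinel variant is only suitable for text files.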
seewalker
Updated on September 18, 2022
Comments
-
seewalker almost 2 years
Why does redirecting the output of a file to itself produce a blank file?
Stated in Bash, why do less foo.txt > foo.txt and fold foo.txt > foo.txt produce an empty foo.txt? Since an append such as less eggs.py >> eggs.py produces two copies of the text in eggs.py, one might expect that an overwrite would produce one copy of the text.
Note, I'm not saying this is a bug; it is more likely a pointer to something deep about Unix.
-
Scott - Слава Україні about 5 years
Addressed in U&L’s canonical What are the shell's control and redirection operators? question.
-
slhck about 11 years
sponge from moreutils can also help. fold foo.txt | sponge foo.txt, or fold foo.txt | sponge !$, should also do.
-
jlliagre about 11 years
@slhck Indeed, sponge could do the job too. However, being neither specified by POSIX nor mainstream in Unix-like OSes, it is unlikely to be present.
-
slhck about 11 years
It's not like it can't be made present though ;)
-
Scott - Слава Україні about 5 years
@KamilMaciorowski: Actually, there is tmp=$(cmd; printf q); printf '%s' "${tmp%q}". But you missed another issue with this answer: it says “subshell” when it means “command substitution”. Yes, command substitutions are generally subshells, but not vice versa, and subshells, in general, are no help for this problem.
-
ljleb about 5 years
@KamilMaciorowski I feel so bad for missing all of this. Thanks for pointing all of this out. For your (4)th point: would backquotes do the trick, i.e. preserve trailing newline(s)?
-
ljleb about 5 years
@Scott thanks for your reply. I changed "subshell" to "command substitution". By the way, I wonder what the exact difference between the two is.
-
Kamil Maciorowski about 5 years
No, backquotes (backticks) strip trailing newline characters as well.
-
ljleb about 5 years
Alright then, I added a warning message for now. I'll remove it if I find a solution.
-
Kamil Maciorowski about 5 years
Well, now the answer is not that bad. Even with the warning there's one more problem: POSIX requires any non-empty text file to end with a newline character (otherwise the last line is incomplete). So %s\n as format would be better. But if the file is binary, %s may be better. In any case you're risking that the new content is not exactly what it should be. Scott's approach can fix this; it's far from being elegant though.