Why do newline characters get lost when using command substitution?
Solution 1
The newlines were lost because the shell performed field splitting on the result of the command substitution.
From the Command Substitution section of the POSIX specification:
The shell shall expand the command substitution by executing command in a subshell environment (see Shell Execution Environment) and replacing the command substitution (the text of command plus the enclosing "$()" or backquotes) with the standard output of the command, removing sequences of one or more <newline> characters at the end of the substitution. Embedded <newline> characters before the end of the output shall not be removed; however, they may be treated as field delimiters and eliminated during field splitting, depending on the value of IFS and quoting that is in effect. If the output contains any null bytes, the behavior is unspecified.
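To make the quoted rule concrete, here is a minimal sketch (the variable name out is only an illustration): trailing newlines are stripped from the captured output, while the embedded one survives as long as the expansion is quoted.

```shell
# Command substitution strips trailing newlines but keeps embedded ones.
out=$(printf 'a\nb\n\n\n')   # the command prints "a", "b" and two empty lines

# Quoting the expansion prevents field splitting, so the raw value is visible.
printf '[%s]\n' "$out"
# prints:
# [a
# b]

# The value is 3 characters long: "a", one embedded newline, "b".
echo "${#out}"               # 3
```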
Default IFS value (at least in bash):
$ printf '%q\n' "$IFS"
$' \t\n'
In your case, you neither set IFS nor use double quotes, so the newline characters are treated as field delimiters and eliminated during field splitting.
You can preserve the newlines, for example by setting IFS to the empty string:
$ IFS=
$ a=$(cat links.txt)
$ echo "$a"
link1
link2
link3
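As a rough illustration of where the splitting actually happens, the sketch below counts the arguments a command receives. The helper count_args is hypothetical, and printf stands in for cat links.txt. Note that the assignment a=$(...) itself does not split anything; the splitting happens later, when $a is expanded without quotes.

```shell
# Hypothetical helper: print how many arguments it received.
count_args() { echo "$#"; }

a=$(printf 'link1\nlink2\nlink3\n')   # stands in for $(cat links.txt)

default_split=$(count_args $a)        # unquoted, default IFS: split into 3 fields
quoted=$(count_args "$a")             # quoted: 1 field, newlines intact

# An empty IFS disables field splitting. The change is made inside the
# command substitution's subshell, so the outer IFS is left untouched.
empty_ifs=$(IFS=; count_args $a)

echo "$default_split $quoted $empty_ifs"   # 3 1 1
```

This assumes IFS has its default value when the snippet starts.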
Solution 2
Newlines get replaced at some points because they are special characters: when an expansion is unquoted, the shell treats them as field separators. To keep them, quote every expansion that carries the data:
$ a="$(cat links.txt)"
$ echo "$a"
link1
link2
link3
Now, since I used quotes whenever I was manipulating the data, the newline characters (\n) were never used as field separators by the shell, and therefore remained. If you forget the quotes at some point, these special characters will be lost.
The very same behaviour will occur if you use your loop on lines containing spaces. For instance, given the following file...
mypath1/file with spaces.txt
mypath2/filewithoutspaces.txt
The output will depend on whether or not you use quotes:
$ for i in $(cat links.txt); do echo $i; done
mypath1/file
with
spaces.txt
mypath2/filewithoutspaces.txt
$ for i in "$(cat links.txt)"; do echo "$i"; done
mypath1/file with spaces.txt
mypath2/filewithoutspaces.txt
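One caveat about the quoted variant is worth spelling out: the output looks right, but the loop body runs only once, with the whole file as a single value. A small sketch, using a temporary file as a stand-in for links.txt:

```shell
file=$(mktemp)
printf '%s\n' 'mypath1/file with spaces.txt' 'mypath2/filewithoutspaces.txt' > "$file"

n=0
for i in "$(cat "$file")"; do
    # $i holds both lines at once, embedded newline included.
    n=$((n + 1))
done
echo "iterations: $n"   # iterations: 1

rm -f "$file"
```

So quoting keeps the newlines, but it does not give you one iteration per line.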
Now, if you don't want to use quotes, there is a special shell variable which can be used to change the shell field separator (IFS). If you set this separator to the newline character, you will get rid of most problems.
$ IFS=$'\n'; for i in $(cat links.txt); do echo $i; done
mypath1/file with spaces.txt
mypath2/filewithoutspaces.txt
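A slightly more defensive sketch of the same idea follows. It makes two assumptions beyond the example above: the IFS change is confined to a subshell so it does not leak into the rest of the script, and set -f is added because unquoted expansions are also subject to filename expansion. The sample file and the variable names are illustrative only.

```shell
file=$(mktemp)
printf '%s\n' 'mypath1/file with spaces.txt' 'mypath2/*.txt' > "$file"

nl='
'   # a single literal newline; in bash, nl=$'\n' is equivalent

lines=$(
    IFS=$nl    # split on newlines only
    set -f     # disable filename expansion so the * stays literal
    n=0
    for i in $(cat "$file"); do
        n=$((n + 1))
    done
    echo "$n"
)
echo "$lines"   # 2

rm -f "$file"
```

Note that writing IFS=\n without quoting does not work: the shell removes the backslash and IFS ends up as the letter n.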
For the sake of completeness, here is another example, which does not rely on command output substitution. After some time, I found out that this method was considered more reliable by most users due to the very behaviour of the read utility.
$ cat links.txt | while read i; do echo $i; done
Here is an excerpt from read's man page:
The read utility shall read a single line from standard input.
Since read gets its input line by line, you're sure it won't break whenever a space shows up. Just pass it the output of cat through a pipe, and it'll iterate over your lines just fine.
Edit: I can see from other answers and comments that people are quite reluctant when it comes to the use of cat. As jasonwryan said in his comment, a more proper way to read a file in shell is to use stream redirection (<), as you can see in val0x00ff's answer here. However, since the question isn't "how to read/process a file in shell programming", my answer focuses more on the quotes behaviour, and not the rest.
Solution 3
To add my emphasis: for loops iterate over words. If your file is:
one two
three four
Then this will emit four lines:
for word in $(cat file); do echo "$word"; done
To iterate over the lines of a file, do this:
while IFS= read -r line; do
    # do something with "$line" <-- quoted almost always
done < file
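To show why the loop above uses both IFS= and -r, here is a small comparison on an illustrative temporary file: a bare read strips leading whitespace and consumes backslashes, while IFS= read -r returns each line unchanged.

```shell
file=$(mktemp)
printf '%s\n' '  indented line' 'back\slash' > "$file"

plain=''
while read line; do              # default IFS, no -r
    plain="$plain[$line]"
done < "$file"

safe=''
while IFS= read -r line; do      # empty IFS for this read only, raw mode
    safe="$safe[$line]"
done < "$file"

printf '%s\n' "$plain"   # [indented line][backslash]
printf '%s\n' "$safe"    # [  indented line][back\slash]

rm -f "$file"
```

Because IFS= is written as a prefix to read, it applies to that command only and does not change IFS for the rest of the script.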
Solution 4
You can use read from bash (also have a look at mapfile):
while read -r link
do
    printf '%s\n' "$link"
done < links.txt
Or using mapfile:
mapfile -t myarray < links.txt
for link in "${myarray[@]}"; do printf '%s\n' "$link"; done
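As a quick check that mapfile really yields one array element per line, even when lines contain spaces, here is a sketch on an illustrative temporary file. It assumes bash 4.0 or later, where the mapfile builtin is available.

```shell
#!/usr/bin/env bash
file=$(mktemp)
printf '%s\n' 'one two' 'three four' > "$file"

# -t removes the trailing newline from each element.
mapfile -t myarray < "$file"

echo "${#myarray[@]}"              # 2
printf '<%s>\n' "${myarray[@]}"
# prints:
# <one two>
# <three four>

rm -f "$file"
```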
user3138373
Updated on September 18, 2022
Comments
-
user3138373 almost 2 years
I have a text file named links.txt which looks like this:
link1
link2
link3
I want to loop through this file line by line and perform an operation on every line. I know I can do this using a while loop, but since I am learning, I thought I would use a for loop. I used command substitution like this:
a=$(cat links.txt)
Then used the loop like this
for i in $a; do ###something###;done
Also I can do something like this
for i in $(cat links.txt); do ###something###; done
Now my question: when I stored the output of the cat command in a variable a, the newline characters between link1, link2 and link3 were removed and replaced by spaces.
echo $a
outputs
link1 link2 link3
and then I used the for loop. Is a newline always replaced by a space when we do a command substitution?
Regards
-
jasonwryan over 9 years
See Bash FAQ 001...
-
Angel Todorov over 9 years
Unquoted variables are subject to word splitting and filename expansion.
-
G-Man Says 'Reinstate Monica' about 7 years
If you look closely, you'll see that this question is not a duplicate. This question is about the newlines between the lines of output from the command (i.e., at the ends of lines 1 through n − 1). That question, as its title suggests, is about the newline at the end of the output from the command (i.e., at the end of the last line).
-
user3138373 over 9 years
Also, let's say I am not using quotes. When I apply the for loop, is it implicit that variable i will hold the first file name only up to the first space, which tells it that the first file name ends?
-
cuonglm over 9 years
I think "The newlines are replaced with spaces because that's how echo works" seems to be wrong.
-
mikeserv over 9 years
@cuonglm - it could be clearer, but the \newlines are replaced with field delimiters, and echo replaces the field delimiters with spaces - it concatenates its arguments on spaces. That's how echo works.
-
mikeserv over 9 years
for loops iterate over arguments. If you do IFS=\n; for word in `cat file`; do echo "$word"; done you'll get two loops and two lines printed. $IFS applies globally all of the time in much the same way as it does to read - except that the read/\newline relationship is pretty special.
-
Marek Zakrzewski over 9 years
With all due respect to John WH Smith, I'm not sure who is upvoting the answer. for i in $(cat ..) is wrong. See the comment of jasonwryan. That is the way to read lines from a file. cat(1) is used to concatenate multiple files together. It should NOT be used to feed file data to processes. There are far better ways to achieve this. The application might take a file as argument (e.g. grep ^foo file); or you might want to use file redirection (e.g. read line < file).
-
mikeserv over 9 years
@val0x00ff - it's not wrong because you say it is, certainly. What is wrong about it?
-
yorkshiredev over 9 years
@val0x00ff I used cat because that is what the OP was using in his question ;) The question isn't really about "how to read a file", but "why are newlines lost". As far as I'm concerned, I would always use read, which is why I edited my answer afterwards to add this solution. I understand that cat shouldn't be used to read a single file, but since it isn't the main topic, I didn't spend too much time on it.
-
Marek Zakrzewski over 9 years
I upvoted the second explanation (while IFS) because it shows yet another way of feeding lines from a file. Again, @mikeserv: for word in $(cat file) is wrong and should not be used in bash scripts or in any other form. Let me emphasise once again: Never do this: for x in $(command) or `command` or $var. for-in is used for iterating arguments, not (output) strings. Instead, use a glob (e.g. *.txt), arrays (e.g. "${names[@]}") or a while-read loop (e.g. while read -r line). See mywiki.wooledge.org/BashPitfalls#pf1 and mywiki.wooledge.org/DontReadLinesWithFor
-
mikeserv over 9 years
@val0x00ff - I don't think you understand - $IFS is about arguments. Specifically, $IFS splits fields into arguments - that's its job. There are potential problems with that approach - but they are handled as easily as set -f; IFS=$delimiter - that's all you need do. For example, you could do the very slow while read -r line thing or you could do set -f; IFS=\n; set -- $(cat file). If you did that you'd get an array of the file's non-blank lines, each intact, in $1 $2 $3... "$@". The wooledge wiki is typically an awful source of information - you should try to wean off of it.
-
geirha over 9 years
@mikeserv - it treats data as code, which is generally considered wrong in any language. Or, any language except bash, apparently.
-
mikeserv over 9 years
@geirha - this is not a true statement at all. It delimits fields on specified delimiters. If it is such an unpopular behavior, how is it that awk is so ubiquitous? From the POSIX rationale: If the IFS variable is unset or is <space> <tab> <newline>, the operation is equivalent to the way the System V shell splits words. Using characters outside the \s \n \t set yields the KornShell behavior, where each of the non- \s \n \t is significant. This behavior .. was taken from the way the original awk handled field splitting.
-
geirha over 9 years
@mikeserv - "Take the data in this file and split it into words based on the characters in IFS, then for each of those words that happen to contain glob characters, attempt to replace those words with matching filenames". That certainly doesn't sound like treating data as data.
-
mikeserv over 9 years
Yes - @geirha - globbing is a problem. That is a very excellent point. This is why the shell offers the set -f option. You can either expand filenames with set +f or not do so with set -f. I specifically address that in my own answer here. And, as far as I can tell, it's the only one here that mentions it.
-
Angel Todorov over 9 years
To properly set IFS to a newline, use ANSI-C quoting: IFS=$'\n' -- this (IFS=\n) sets IFS to the letter "n"
-
Mike Q over 5 years
The IFS=$'\n'; is needed for looping properly; quotes alone will not iterate line by line without it (Bash 4+).
-
Oly Dungey almost 5 years
Note: If you use printf instead of echo you avoid the IFS issue entirely
-
cuonglm almost 5 years
@OliverDungey it's not about echo or printf, it's about the double quotes in "$a". The original question uses a for loop; that's where field splitting occurs after command substitution.