Why can't you use cat to read a file line by line where each line has delimiters?

53,404

Solution 1

The problem is not in cat, nor in the for loop per se; it is in the use of command substitution (back quotes or $(...)) without quoting. When you write either:

for i in `cat file`

or (better):

for i in $(cat file)

or (in bash):

for i in $(<file)

the shell executes the command and captures the output as a string, separating the words at the characters in $IFS. If you want lines input to $i, you either have to fiddle with IFS or use the while loop. The while loop is better if there's any danger that the files processed will be large; it doesn't have to read the whole file into memory all at once, unlike the versions using $(...).
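As a minimal sketch of the while-loop alternative mentioned above (the helper name read_lines and the sample data are illustrative assumptions, not part of the original answer), each line is read whole and printed between brackets so the preserved spacing is visible:

```shell
#!/bin/sh
# read_lines is a hypothetical helper name used only for this sketch.
read_lines() {
    # IFS= keeps leading/trailing whitespace; -r keeps backslashes literal.
    # The file is streamed line by line rather than captured as one string.
    while IFS= read -r line
    do
        printf '[%s]\n' "$line"
    done < "$1"
}

# Hypothetical sample data written to a temporary file for the demo.
tmp=$(mktemp)
printf '%s\n' 'abc   123, comma' 'the quick brown fox' > "$tmp"
read_lines "$tmp"
rm -f "$tmp"
```

The for-loop variant with a modified IFS, by contrast, looks like this: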

IFS='
'
for i in $(<file)
do echo "$i"
done

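The quotes around "$i" are generally a good idea. In this context, with the modified $IFS, they aren't critical for preserving spacing (although an unquoted $i is still subject to glob expansion, so a line containing * would be expanded to file names), but good habits are good habits even so. They do matter in the following script: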

old="$IFS"
IFS='
'
for i in $(<file)
do
   (
   IFS="$old"
   echo "$i"
   )
done

when the data file contains multiple spaces between words:

$ cat file
abc                  123,         comma
the   quick   brown   fox
jumped   over   the   lazy   dog
comma,   comma
$ 

Output:

$ sh bq.sh
abc                  123,         comma
the   quick   brown   fox
jumped   over   the   lazy   dog
comma,   comma
$

Without the double quotes:

$ cat bq.sh
old="$IFS"
IFS='
'
for i in $(<file)
do
   (
   IFS="$old"
   echo $i
   )
done
$ sh bq.sh
abc 123, comma
the quick brown fox
jumped over the lazy dog
comma, comma
$

Solution 2

You can use the IFS variable to specify that you want a newline as the field separator:

IFS=$'\n'
for i in `cat file`
do
   echo $i
done
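Setting IFS to a newline stops splitting on blanks, but an unquoted expansion is still subject to glob expansion. A hedged sketch of one way to guard against that, assuming a hypothetical helper named for_each_line and made-up sample data, is to disable globbing with set -f inside a subshell so neither change leaks into the rest of the script. Note that empty lines are still skipped, because consecutive newlines collapse during field splitting.

```shell
#!/bin/sh
# for_each_line is a hypothetical helper name used only for this sketch.
for_each_line() {
    (
        # Subshell: the IFS and set -f changes vanish when it exits.
        IFS='
'
        set -f    # disable pathname (glob) expansion of *, ?, [...]
        for i in $(cat "$1")
        do
            printf '%s\n' "$i"
        done
    )
}

# Hypothetical sample data; the first line contains a glob character.
tmp=$(mktemp)
printf '%s\n' 'comma, * comma' 'the quick brown fox' > "$tmp"
for_each_line "$tmp"
rm -f "$tmp"
```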

Solution 3

cat filename | while read i
do
    echo $i
done
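A plain read strips leading and trailing blanks and treats backslashes as escape characters. The following sketch, using a made-up sample line, contrasts it with IFS= read -r, which keeps the line exactly as written:

```shell
#!/bin/sh
# Hypothetical input line: leading/trailing spaces plus a backslash.
sample='   path\tname   '

# Plain read: trims surrounding blanks and consumes the backslash.
plain=$(printf '%s\n' "$sample" | { read i; printf '[%s]' "$i"; })

# IFS= read -r: no trimming, backslash kept literally.
exact=$(printf '%s\n' "$sample" | { IFS= read -r i; printf '[%s]' "$i"; })

printf 'plain: %s\n' "$plain"   # plain: [pathtname]
printf 'exact: %s\n' "$exact"   # exact: [   path\tname   ]
```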

Solution 4

A for loop coupled with a change of the internal field separator (IFS) will read the file as intended.

For this input:

abc 123, comma
the quick brown fox
jumped over the lazy dog
comma, comma

a for loop coupled with an IFS change:

old_IFS=$IFS
IFS=$'\n'
for i in `cat file`
do
        echo $i
done
IFS=$old_IFS

results in

abc 123, comma
the quick brown fox
jumped over the lazy dog
comma, comma

Solution 5

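The IFS (internal field separator) variable can be set to get what you want.

To read a whole line at once with read, use: IFS=""

Note that an empty IFS disables field splitting entirely, so it suits while IFS="" read ...; with for i in $(cat file) the loop would run once with the whole file as a single word.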
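To make the effect of different IFS values concrete, here is a small sketch that counts how many times a for loop over $(cat file) iterates. The helper name count_iterations and the sample file are assumptions made for illustration only.

```shell
#!/bin/sh
# count_iterations is a hypothetical helper: it prints how many times a
# for loop over $(cat FILE) runs when IFS is set to the second argument.
count_iterations() {
    (
        IFS=$2
        set -f    # keep glob characters from affecting the count
        n=0
        for i in $(cat "$1")
        do
            n=$((n + 1))
        done
        printf '%s\n' "$n"
    )
}

# Hypothetical two-line sample file.
tmp=$(mktemp)
printf '%s\n' 'abc 123, comma' 'the quick brown fox' > "$tmp"

nl='
'
# Build the default IFS (space, tab, newline); the trailing "_" protects
# the newline from being stripped by command substitution.
default=$(printf ' \t\n_'); default=${default%_}

count_iterations "$tmp" "$default"   # 7: one iteration per word
count_iterations "$tmp" "$nl"        # 2: one iteration per line
count_iterations "$tmp" ''           # 1: the whole file as a single word
rm -f "$tmp"
```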

Author by

Classified

Updated on April 02, 2021

Comments

  • Classified
    Classified about 3 years

    I have a text file that contains something like this:

    abc 123, comma
    the quick brown fox
    jumped over the lazy dog
    comma, comma
    

    I wrote a script

    for i in `cat file`
    do
       echo $i
    done
    

    For some reason, the output of the script doesn't output the file line by line but breaks it off at the commas, as well as the newline. Why is cat or "for blah in cat xyz" doing this and how can I make it NOT do this? I know I can use a

    while read line
    do
       blah blah blah
    done < file
    

    but I want to know why cat or the "for blah in" is doing this to further my understanding of unix commands. Cat's man page didn't help me and looking at for or looping in the bash manual didn't yield any answers (http://www.gnu.org/software/bash/manual/bashref.html). Thanks in advance for your help.

  • chepner
    chepner almost 11 years
    Simply use IFS= read -r line to preserve all whitespace in the line.
  • Jonathan Leffler
    Jonathan Leffler almost 11 years
    The only reason the spacing is 'lost' with the while loop is because you used echo $line rather than echo "$line". If spacing is important, enclose the variable reference in double quotes.
  • Charles Duffy
    Charles Duffy almost 11 years
    As chepner says, this should be read -r to avoid unintended side effects (evaluating backslash escape sequences).
  • Charles Duffy
    Charles Duffy almost 11 years
    Unsafe -- you've prevented string-splitting, but you haven't prevented glob expansion. If a line contains *, that will be expanded to a list of names in the current directory during the echo.
  • Classified
    Classified almost 11 years
    thx for your help and reply. I'm a little confused here with bash/*nix. I didn't change IFS. It's set as a newline by default. I checked it with echo "IFS = $IFS word test" and the string "word test" got printed to the following line so we know it's \n by default. In any case, using the default IFS, it breaks my line at the comma even though IFS=\n. When I do as you suggested above, by setting the IFS explicitly to \n, then it prints my whole line without breaking over the comma. Any idea why it works when it's explicitly set as \n and not work when by default IFS is already \n? Thanks again.
  • Jonathan Leffler
    Jonathan Leffler almost 11 years
    The default value of IFS is (using a piece of bash-speak) $' \t\n'; that is, it consists of blank, tab, newline. This probably alters your analysis. When you say 'breaks at the comma', you mean it breaks at the space after the comma, I believe, which is consistent with IFS containing blank (and tab and newline).