Looping through the content of a file in Bash


Solution 1

One way to do it is:

while read p; do
  echo "$p"
done <peptides.txt

As pointed out in the comments, this has the side effects of trimming leading and trailing whitespace, interpreting backslash sequences, and skipping the last line if it's missing a terminating linefeed. If these are concerns, you can do:

while IFS="" read -r p || [ -n "$p" ]
do
  printf '%s\n' "$p"
done < peptides.txt

As an exception, if the loop body itself reads from standard input, you can feed the file to the loop through a different file descriptor:

while read -r -u 10 p; do
  ...
done 10<peptides.txt

Here, 10 is just an arbitrary number (different from 0, 1, 2).
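For instance, a minimal sketch (the sample file and the `head` call are just stand-ins for a real input file and a stdin-consuming command such as ssh):

```shell
# Sample input standing in for peptides.txt
printf 'AAA\nBBB\n' > sample.txt

count=0
while read -r -u 10 p; do
  # stand-in for any command that reads standard input (ssh, ffmpeg, ...);
  # it consumes fd 0, not the file on fd 10
  head -n 1 >/dev/null
  count=$((count + 1))
done 10<sample.txt </dev/null

echo "$count"   # prints 2: both lines reached the loop despite head reading stdin
```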

Solution 2

cat peptides.txt | while read line 
do
   # do something with $line here
done

and the one-liner variant:

cat peptides.txt | while read line; do something_with "$line"; done

These options will skip the last line of the file if there is no trailing line feed.

You can avoid this by the following:

cat peptides.txt | while read line || [[ -n $line ]];
do
   # do something with $line here
done

Solution 3

Option 1a: While loop: Single line at a time: Input redirection

#!/bin/bash
filename='peptides.txt'
echo Start
while read p; do 
    echo "$p"
done < "$filename"

Option 1b: While loop: Single line at a time:
Open the file, read from a file descriptor (in this case file descriptor #4).

#!/bin/bash
filename='peptides.txt'
exec 4<"$filename"
echo Start
while read -u4 p ; do
    echo "$p"
done
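As discussed in the comments, the descriptor can also be closed explicitly when the loop sits inside a larger script and the fd number should be reusable; a sketch (the sample file is illustrative):

```shell
filename='sample.txt'            # stand-in for peptides.txt
printf 'AAA\nBBB\n' > "$filename"

exec 4<"$filename"               # open the file on fd 4
while read -u4 -r p; do
  echo "$p"
done
exec 4<&-                        # close fd 4 so the number can be reused later
```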

Solution 4

This is no better than the other answers, but it is one more way to get the job done on a file whose lines contain no spaces (see comments). I find that I often need one-liners to dig through lists in text files without the extra step of using separate script files.

for word in $(cat peptides.txt); do echo $word; done

This format allows me to put it all in one command-line. Change the "echo $word" portion to whatever you want and you can issue multiple commands separated by semicolons. The following example uses the file's contents as arguments into two other scripts you may have written.

for word in $(cat peptides.txt); do cmd_a.sh $word; cmd_b.py $word; done

Or if you intend to use this like a stream editor (learn sed) you can dump the output to another file as follows.

for word in $(cat peptides.txt); do cmd_a.sh $word; cmd_b.py $word; done > outfile.txt

I've used these as written above because my text files contain one word per line. (See comments.) If you have spaces that you don't want splitting your words/lines, it gets a little uglier, but the same command still works as follows:

OLDIFS=$IFS; IFS=$'\n'; for line in $(cat peptides.txt); do cmd_a.sh $line; cmd_b.py $line; done > outfile.txt; IFS=$OLDIFS

This just tells the shell to split on newlines only, not spaces, then returns the environment back to what it was previously. At this point, you may want to consider putting it all into a shell script rather than squeezing it all into a single line, though.
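As an alternative sketch, the save/restore dance can be avoided by confining the IFS change to a subshell with parentheses (the sample file contents are illustrative):

```shell
printf 'foo bar\nbaz qux\n' > sample.txt   # lines that contain spaces

# The parentheses open a subshell, so the IFS change cannot leak
# into the rest of the session
( IFS=$'\n'; for line in $(cat sample.txt); do echo "[$line]"; done )
# prints [foo bar] then [baz qux] - each line stays whole
```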

Best of luck!

Solution 5

A few more things not covered by other answers:

Reading from a delimited file

# ':' is the delimiter here, and there are three fields on each line in the file
# IFS set below is restricted to the context of `read`, it doesn't affect any other code
while IFS=: read -r field1 field2 field3; do
  # process the fields
  # if the line has less than three fields, the missing fields will be set to an empty string
  # if the line has more than three fields, `field3` will contain everything from the third field to the end of the line, delimiters included
done < input.txt
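For instance, a small sketch with hypothetical colon-delimited user records (the file and field names are illustrative):

```shell
# Fabricated records: name:uid:home
printf 'alice:1001:/home/alice\nbob:1002:/home/bob\n' > users.txt

while IFS=: read -r name uid home; do
  echo "user=$name uid=$uid home=$home"
done < users.txt
```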

Reading from the output of another command, using process substitution

while read -r line; do
  # process the line
done < <(command ...)

This approach is better than command ... | while read -r line; do ... because here the while loop runs in the current shell rather than in a subshell, as it does with a pipeline. See the related post A variable modified inside a while loop is not remembered.
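A quick sketch of the difference, using a throwaway sample file: the counter survives the loop with process substitution, but not with a pipeline:

```shell
printf 'a\nb\nc\n' > sample.txt

# Pipeline: the loop body runs in a subshell, so the counter is lost
n=0
cat sample.txt | while read -r line; do n=$((n + 1)); done
echo "pipeline: n=$n"      # prints: pipeline: n=0

# Process substitution: the loop runs in the current shell
n=0
while read -r line; do n=$((n + 1)); done < <(cat sample.txt)
echo "procsub:  n=$n"      # prints: procsub:  n=3
```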

Reading from a null delimited input, for example find ... -print0

while read -r -d '' line; do
  # logic
  # use a second 'read ... <<< "$line"' if we need to tokenize the line
done < <(find /path/to/dir -print0)

Related read: BashFAQ/020 - How can I find and safely handle file names containing newlines, spaces or both?
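A small sketch of why the null delimiter matters (the demo directory and file names are made up): even a file name that contains a newline comes through intact:

```shell
rm -rf demo && mkdir demo
touch demo/plain "demo/with space" $'demo/with\nnewline'

n=0
while IFS= read -r -d '' f; do
  n=$((n + 1))
done < <(find demo -type f -print0)

echo "$n"   # prints 3: every name is delivered whole, newline and all
```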

Reading from more than one file at a time

while read -u 3 -r line1 && read -u 4 -r line2; do
  # process the lines
  # note that the loop will end when we reach EOF on either of the files, because of the `&&`
done 3< input1.txt 4< input2.txt

Based on @chepner's answer here:

-u is a bash extension. For POSIX compatibility, each call would look something like read -r X <&3.
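A POSIX-compatible sketch of the same two-file loop, with each read redirected explicitly (the sample files are illustrative):

```shell
printf '1\n2\n' > input1.txt
printf 'a\nb\n' > input2.txt

# Each read takes its input from an explicit descriptor instead of -u
while read -r line1 <&3 && read -r line2 <&4; do
  echo "$line1 $line2"
done 3< input1.txt 4< input2.txt
# prints: 1 a
#         2 b
```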

Reading a whole file into an array (Bash versions earlier than 4)

while read -r line; do
    my_array+=("$line")
done < my_file

If the file ends with an incomplete line (newline missing at the end), then:

while read -r line || [[ $line ]]; do
    my_array+=("$line")
done < my_file

Reading a whole file into an array (Bash 4.x and later)

readarray -t my_array < my_file

or

mapfile -t my_array < my_file

And then

for line in "${my_array[@]}"; do
  # process the lines
done
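Once loaded, the array also gives you a line count and indexed access; a small sketch (the sample file is illustrative):

```shell
printf 'alpha\nbeta\ngamma\n' > my_file

readarray -t my_array < my_file    # -t drops each trailing newline

echo "count: ${#my_array[@]}"      # prints: count: 3
echo "first: ${my_array[0]}"       # prints: first: alpha
echo "last:  ${my_array[-1]}"      # prints: last:  gamma  (negative index needs Bash 4.3+)
```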


Author: Peter Mortensen

Updated on July 08, 2022

Comments

  • Peter Mortensen
    Peter Mortensen almost 2 years

    How do I iterate through each line of a text file with Bash?

    With this script:

    echo "Start!"
    for p in (peptides.txt)
    do
        echo "${p}"
    done
    

    I get this output on the screen:

    Start!
    ./runPep.sh: line 3: syntax error near unexpected token `('
    ./runPep.sh: line 3: `for p in (peptides.txt)'
    

    (Later I want to do something more complicated with $p than just output to the screen.)


    The environment variable SHELL is (from env):

    SHELL=/bin/bash
    

    /bin/bash --version output:

    GNU bash, version 3.1.17(1)-release (x86_64-suse-linux-gnu)
    Copyright (C) 2005 Free Software Foundation, Inc.
    

    cat /proc/version output:

    Linux version 2.6.18.2-34-default (geeko@buildhost) (gcc version 4.1.2 20061115 (prerelease) (SUSE Linux)) #1 SMP Mon Nov 27 11:46:27 UTC 2006
    

    The file peptides.txt contains:

    RKEKNVQ
    IPKKLLQK
    QYFHQLEKMNVK
    IPKKLLQK
    GDLSTALEVAIDCYEK
    QYFHQLEKMNVKIPENIYR
    RKEKNVQ
    VLAKHGKLQDAIN
    ILGFMK
    LEDVALQILL
    
  • JesperE
    JesperE over 14 years
    In general, if you're using "cat" with only one argument, you're doing something wrong (or suboptimal).
  • Warren Young
    Warren Young over 14 years
    Yes, it's just not as efficient as Bruno's, because it launches another program, unnecessarily. If efficiency matters, do it Bruno's way. I remember my way because you can use it with other commands, where the "redirect in from" syntax doesn't work.
  • Peter Mortensen
    Peter Mortensen over 14 years
    How should I interpret the last line? File peptides.txt is redirected to standard input and somehow to the whole of the while block?
  • Warren Young
    Warren Young over 14 years
    "Slurp peptides.txt into this while loop, so the 'read' command has something to consume." My "cat" method is similar, sending the output of a command into the while block for consumption by 'read', too, only it launches another program to get the work done.
  • Peter Mortensen
    Peter Mortensen over 14 years
    For option 1b: does the file descriptor need to be closed again? E.g. the loop could be an inner loop.
  • Stan Graves
    Stan Graves over 14 years
    The file descriptor will be cleaned up when the process exits. An explicit close can be done to reuse the fd number. To close an fd, use another exec with the &- syntax, like this: exec 4<&-
  • Gordon Davisson
    Gordon Davisson over 14 years
    There's another, more serious problem with this: because the while loop is part of a pipeline, it runs in a subshell, and hence any variables set inside the loop are lost when it exits (see bash-hackers.org/wiki/doku.php/mirroring/bashfaq/024). This can be very annoying (depending on what you're trying to do in the loop).
  • Ogre Psalm33
    Ogre Psalm33 over 12 years
    @JesperE would you care to elaborate with an alternative example?
  • Warren Young
    Warren Young over 12 years
    @Ogre: He means you should be doing it like Bruno did in his accepted answer. Both work. Bruno's way is just a bit more efficient, since it doesn't run an external command to do the file reading bit. If the efficiency matters, do it Bruno's way. If not, then do it whatever way makes the most sense to you.
  • JesperE
    JesperE over 12 years
    @OgrePsalm33: Warren is right. The "cat" command is used for concatenating files. If you are not concatenating files, chances are that you don't need to use "cat".
  • Ogre Psalm33
    Ogre Psalm33 over 12 years
    Ok, makes sense. I wanted to make a point of it because I see a lot of overused examples in scripts and such, where "cat" simply serves as an extra step to get the contents of a single file.
  • Karl Katzke
    Karl Katzke almost 11 years
    This didn't work for me. The second ranked answer, which used cat and a pipe, did work for me.
  • Joao Costa
    Joao Costa over 10 years
    This doesn't meet the requirement (iterate through each line) if the file contains spaces or tabs, but can be useful if you want to iterate through each field in a tab/space separated file.
  • xastor
    xastor over 10 years
    This method seems to skip the last line of a file.
  • maxpolk
    maxpolk over 10 years
    The bash $(<peptides.txt) is perhaps more elegant, but it's still wrong; what Joao said is correct: you are performing command substitution logic where a space or a newline is the same thing. If a line has a space in it, the loop executes TWICE or more for that one line. So your code should properly read: for word in $(<peptides.txt); do .... If you know for a fact there are no spaces, then a line equals a word and you're okay.
  • mightypile
    mightypile over 10 years
    @JoaoCosta,maxpolk : Good points that I hadn't considered. I've edited the original post to reflect them. Thanks!
  • mklement0
    mklement0 over 10 years
    Using for makes the input tokens/lines subject to shell expansions, which is usually undesirable; try this: for l in $(echo '* b c'); do echo "[$l]"; done - as you'll see, the * - even though originally a quoted literal - expands to the files in the current directory.
  • Dss
    Dss over 10 years
    Can this be done in reverse starting at the bottom of the file?
  • Bruno De Fraine
    Bruno De Fraine over 10 years
    @Dss Then I would use a solution based on cat but replace cat by tac.
  • Dss
    Dss over 10 years
    @BrunoDeFraine I've tried that but tac seems to make each space a new line. I need the full line delimited by the newline char. maybe I'm doing it wrong.
  • Dss
    Dss over 10 years
    @BrunoDeFraine Ok I found this: unix.stackexchange.com/a/7012 ..change cat to tac and it works. Thanks!
  • Bruno De Fraine
    Bruno De Fraine over 10 years
    @Dss I meant the solution from Warren Young stackoverflow.com/a/1521470/6918 ; just replace cat by tac and you should read the lines in reverse.
  • mat kelcey
    mat kelcey about 10 years
    I use "cat file | " as the start of a lot of my commands purely because I often prototype with "head file |"
  • masgo
    masgo almost 10 years
    Thank you for Option 2. I ran into huge problems with Option 1 because I needed to read from stdin within the loop; in such a case Option 1 will not work.
  • ACK_stoverflow
    ACK_stoverflow almost 10 years
    @matkelcey Also, how else would you put an entire file into the front of a pipeline? Bash gives you here strings, which are awesome (especially for things like if grep -q 'findme' <<< "$var") but not portable, and I wouldn't want to start a large pipeline with one. Something like cat ifconfig.output | grep inet[^6] | grep -v '127.0.0.1' | awk '{print $2}' | cut -d':' -f2 is easier to read, since everything follows from left to right. It's like strtoking with awk instead of cut because you don't want empty tokens - it's sort of an abuse of the command, but that's just how it's done.
  • Mike Q
    Mike Q over 9 years
    Double quote the lines !! echo "$p" and the file.. trust me it will bite you if you don't!!! I KNOW! lol
  • Savage Reader
    Savage Reader over 9 years
    This may be not that efficient, but it's much more readable than other answers.
  • Toby Speight
    Toby Speight almost 9 years
    This answer needs the caveats mentioned in mightypile's answer, and it can fail badly if any line contains shell metacharacters (due to the unquoted "$x").
  • Jahid
    Jahid almost 9 years
    @DavidC.Rankin The -r option prevents backslash interpretation. Note #2 is a link where it is described in detail...
  • tishma
    tishma over 8 years
    +1 for readability, and also modularity - this code can easily be put into a more complex pipeline by replacing 'cat ...' with output of something else.
  • dblanchard
    dblanchard over 8 years
    Joao and maxpolk, you are addressing the issue I'm having, but I'm still getting a separate iteration for each half of each line with a space: > cat linkedin_OSInt.txt linkedin.com/vsearch/f?type=all&keywords="foo bar" linkedin.com/vsearch/f?type=all&keywords="baz bux" > for url in $(<linkedin_OSInt.txt); do echo "$url"; done linkedin.com/vsearch/f?type=all&keywords="foo bar" linkedin.com/vsearch/f?type=all&keywords="baz bux" I'll try the other approaches here, but would like understand why this one doesn't work.
  • mightypile
    mightypile over 8 years
    @dblanchard: The last example, using $IFS, should ignore spaces. Have you tried that version?
  • Znik
    Znik over 8 years
    This is a much better solution than the one Bruno has written. It is especially useful when the data is created dynamically by a command. With Bruno's solution, the loop receives the data only after the command has completed. This solution feeds the command's output into the loop line by line, without buffering it all in the system first. For example, replace 'cat peptides.txt' with 'find /', or in the previous solution 'done <peptides.txt' with 'done < $(find /)': the latter can fail by overflowing a buffer or consuming all memory.
  • Znik
    Znik over 8 years
    Please note what happens when you change cat peptides.txt to find /: the for loop will not start before the inner command finishes, and between those steps a buffer overflow is possible.
  • Jose Antonio Alvarez Ruiz
    Jose Antonio Alvarez Ruiz over 7 years
    That -u option made my day ;) Thanks!
  • dawg
    dawg over 7 years
    Both versions fail to read a final line if it is not terminated with a newline. Always use while read p || [[ -n $p ]]; do ...
  • codeforester
    codeforester over 7 years
    This answer is defeating all the principles set by the good answers above!
  • Veda
    Veda over 7 years
    This does not work for lines that end with a backslash "\". Lines ending with a backslash will be prepended to the next line (and the \ will be removed).
  • Florin Andrei
    Florin Andrei about 7 years
    Combine this with the "read -u" option in another answer and then it's perfect.
  • Jahid
    Jahid about 7 years
    @FlorinAndrei : The above example doesn't need the -u option, are you talking about another example with -u?
  • dawg
    dawg about 7 years
    Please delete this answer.
  • Egor Hans
    Egor Hans over 6 years
    Now guys, don't exaggerate. The answer is bad, but it seems to work, at least for simple use cases. As long as that's provided, being a bad answer doesn't take away the answer's right to exist.
  • Egor Hans
    Egor Hans over 6 years
    While the answer is correct, I do understand how it ended up down here. The essential method is the same as proposed by many other answers. Plus, it completely drowns in your FPS example.
  • Egor Hans
    Egor Hans over 6 years
    I'm actually surprised people didn't yet come up with the usual Don't read lines with for...
  • Egor Hans
    Egor Hans over 6 years
    The way this command gets a lot more complex as crucial issues are fixed shows very well why using for to iterate over file lines is a bad idea. Plus, there is the expansion aspect mentioned by @mklement0 (even though that probably can be circumvented by bringing in escaped quotes, which again makes things more complex and less readable).
  • Egor Hans
    Egor Hans over 6 years
    @Veda Now that's weird. What I would expect is, you get an extra n after the backslash, and the lines get concatenated. Because that would mean, the backslash escapes the backslash of \n, causing it to be interpreted literally rather than as a newline. But the fact that the backslash disappears, as well as the newline, means it's consumed for some kind of escaping like expected, but gets merged with the original newline character into something that isn't printed... Do you have a tool that displays unprinted characters in some way? Would interest me what that results in.
  • Egor Hans
    Egor Hans over 6 years
    You should point out more clearly that Option 2 is strongly discouraged. @masgo Option 1b should work in that case, and can be combined with the input redirection syntax from Option 1a by replacing done < $filename with done 4<$filename (which is useful if you want to read the file name from a command parameter, in which case you can just replace $filename by $1).
  • Egor Hans
    Egor Hans over 6 years
    Looked through your links, and was surprised there's no answer that simply links your link in Note 2. That page provides everything you need to know about that subject. Or are link-only answers discouraged or something?
  • Jahid
    Jahid over 6 years
    @EgorHans : link only answers are generally deleted.
  • Veda
    Veda over 6 years
    @EgorHans the \ escapes the "\n" character which is a single character. Google for an "ascii table". Character 10 is \n and character 13 is \r. Linux "xxd" tool will show you the characters. A file with a\na\n\\n will look like: 610a 610a 5c0a (0a is hex for 10, so \n). So the last case the "5c" character or the "\" is escaping a single character.
  • Egor Hans
    Egor Hans over 6 years
    @Veda Ah OK, now I understand better. Didn't realize the file content gets dumped into the execution flow the way it's inside the file, where of course \n is one single character. For some reason I've been thinking it gets backresolved to the control sequence while processed. Still, it's somewhat weird that an escaped \n is something without a printed representation. One would expect it to resolve to the char sequence "\n" when escaped.
  • Egor Hans
    Egor Hans over 6 years
    Ah. Alright, never suggesting a link-only answer again. Maybe there even were some, we'll never know.
  • Ryan
    Ryan about 6 years
    By the time you care about the difference in performance you won't be asking SO these sorts of questions.
  • David Tabernero M.
    David Tabernero M. almost 6 years
    This is, in a readable way, the only answer that also reads the latest line of a file, which is a pro.
  • Cory Ringdahl
    Cory Ringdahl over 5 years
    This is, however, great for grep, sed, or any other text manipulation prepending the read.
  • Charles Duffy
    Charles Duffy over 5 years
    @EgorHans, I disagree strongly: The point of answers is to teach people how to write software. Teaching people to do things in a way that you know is harmful to them and the people who use their software (introducing bugs / unexpected behaviors / etc) is knowingly harming others. An answer known to be harmful has no "right to exist" in a well-curated teaching resource (and curating it is exactly what we, the folks who are voting and flagging, are supposed to be doing here).
  • Charles Duffy
    Charles Duffy over 5 years
    @EgorHans, ...incidentally, the worst data-loss incident I've been personally witness to was caused by ops staff doing something that "seemed to work" in a script (using an unquoted expansion for a filename to be deleted -- when that name was supposed to be able to contain only hex digits). Except a bug in a different piece of software wrote a name with random contents, which had a whitespace-surrounded *, and a massive trove of billing-data backups was lost.
  • Shmiggy
    Shmiggy over 5 years
    Your soul be blessed for that different file descriptor command, made me happy, wasted 8 days on an error generated by standard input replacement. +1
  • user5359531
    user5359531 over 5 years
    this does not work if any of the commands inside your loop run commands via ssh; the stdin stream gets consumed (even if ssh is not using it), and the loop terminates after the first iteration.
  • user5359531
    user5359531 over 5 years
    I need to loop over file contents such as tail -n +2 myfile.txt | grep 'somepattern' | cut -f3, while running ssh commands inside the loop (consumes stdin); option 2 here appears to be the only way?
  • tripleee
    tripleee over 5 years
    As in the accepted answer, this will have unpleasant surprises without read -r in some corner cases. Basically always use read -r unless you specifically require the quirky behavior of plain legacy read.
  • masterxilo
    masterxilo about 5 years
    note that instead of command < input_filename.txt you can always do input_generating_command | command or command < <(input_generating_command)
  • januarvs
    januarvs about 5 years
    It skips the last line. As a workaround, you must add an empty line at the end.
  • Warren Young
    Warren Young about 5 years
    @januarvs: It only does that if the last line of your file has no LF terminator, which will cause lots of other things to fail, too.
  • Alexander Mills
    Alexander Mills almost 5 years
    can you please mention what the -r flag does?
  • Bruno De Fraine
    Bruno De Fraine almost 5 years
    @AlexanderMills -r disables the interpretation of backslashes as escape sequences. The empty IFS disables that read splits up the line in fields. And because read fails when it encounters end-of-file before the line ends, we also test for a non-empty line.
  • Vladimir Sh.
    Vladimir Sh. almost 5 years
    Option 2 seems to be the best when some actions should be performed on remote host via SSH.
  • Vladimir Sh.
    Vladimir Sh. almost 5 years
    This will work in some cases, but the proposed solution is far from good.
  • void.pointer
    void.pointer almost 5 years
    read is ignoring the \r character when the file uses windows line endings (i.e. \r\n). How can I make read treat \r as part of the newline sequence?
  • Kramer
    Kramer over 4 years
    If you are like me and you do these things in the shell in a single line, you need to add another semi-colon like so: while read p; do echo "$p"; done <peptides.txt
  • tripleee
    tripleee about 4 years
    Don't do this. Looping over line numbers and fetching each individual line by way of sed or head + tail is incredibly inefficient, and of course begs the question why you don't simply use one of the other solutions here. If you need to know the line number, add a counter to your while read -r loop, or use nl -ba to add a line number prefix to each line before the loop.
  • frank_108
    frank_108 about 4 years
    Thanks for the tip on reading a file into an array. Exactly what I need, because I need to parse each line twice, add to new variables, do some validations, etc.
  • user5359531
    user5359531 almost 4 years
    this is by far the most useful version I think
  • All The Rage
    All The Rage over 3 years
    This answer is exactly what I need. The other answers that use while to iterate over lines are okay when one wants to deal with lines of text. But I'd use Python for that. In a shell I often have files containing hundreds of items that will be treated as tokens. I need the removal of newlines, which this syntax provides. Thanks for the answer.
  • Virtimus
    Virtimus over 3 years
    Option 1a is fine for me as soon as one adds "echo $p" at the end (for the last line of the file).
  • vatosarmat
    vatosarmat almost 3 years
    No need for OLDIFS if you use subshell parentheses () around IFS=$'\n'; for ...; done
  • ingyhere
    ingyhere almost 3 years
    This really doesn't work in any general way. Bash splits each line on spaces which is very unlikely a desired outcome.
  • madD7
    madD7 over 2 years
    @tripleee I have clearly mentioned "this may not be the best way". I have not limited the discussion to "the best or the most efficient solution".
  • Matt
    Matt about 2 years
    The question asks specifically for how to do it with bash
  • Chris
    Chris about 2 years
    @Matt I am interpreting the intent here as a "how do I do it in bash" rather than "how do I do it with bash". And I've been frustrated enough with overly literal interpretations of my questions that I'm happy to wait for the OP to weigh in.
  • Erwann
    Erwann about 2 years
    `read -r -d ''` works for null delimited input in combination with while, not standalone (`read -r -d '' foo bar`). See here.
  • scandel
    scandel about 2 years
    Iterating over the lines of a file with a for loop can be useful in some situations. For example some commands can make a while loop break. See stackoverflow.com/a/64049584/2761700
  • Charles Duffy
    Charles Duffy about 2 years
    There are serious security problems with this approach. What if your peptides.txt contains something that unescapes to $(rm -rf ~), or even worse, $(rm -rf ~)'$(rm -rf ~)'?
  • Nino DELCEY
    Nino DELCEY almost 2 years
    @GordonDavisson Your link is broken, the new one is : mywiki.wooledge.org/BashFAQ/024