Looping through the content of a file in Bash
Solution 1
One way to do it is:
while read p; do
echo "$p"
done <peptides.txt
As pointed out in the comments, this has the side effects of trimming leading and trailing whitespace, interpreting backslash escape sequences, and skipping the last line if it's missing a terminating linefeed. If these are concerns, you can do:
while IFS="" read -r p || [ -n "$p" ]
do
printf '%s\n' "$p"
done < peptides.txt
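To make the difference concrete, here is a minimal, self-contained sketch (the temporary file and counter variables are illustrative, not part of the original answer) comparing the naive loop with the robust one on a file whose last line has no trailing newline:

```shell
# Create a demo file whose final line lacks a terminating newline.
tmpfile=$(mktemp)
printf 'first\nsecond' > "$tmpfile"

# Naive loop: plain `read` returns nonzero on the unterminated last
# line, so the body never runs for "second".
naive_count=0
while read p; do
  naive_count=$((naive_count + 1))
done < "$tmpfile"

# Robust loop: `|| [ -n "$p" ]` still enters the body when `read`
# hits EOF but captured a partial line.
robust_count=0
while IFS="" read -r p || [ -n "$p" ]; do
  robust_count=$((robust_count + 1))
done < "$tmpfile"

echo "naive=$naive_count robust=$robust_count"   # naive=1 robust=2
rm -f "$tmpfile"
```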
As an exception, if the loop body may also read from standard input, you can open the file on a different file descriptor:
while read -u 10 p; do
...
done 10<peptides.txt
Here, 10 is just an arbitrary number (different from 0, 1, 2).
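As a sketch of why the separate descriptor matters (the file name and variables are invented for the demo), the loop below reads the file on descriptor 10, so the loop body's standard input stays untouched and remains available for commands like ssh or an interactive read:

```shell
tmpfile=$(mktemp)
printf 'alpha\nbeta\n' > "$tmpfile"

collected=""
while read -r -u 10 p; do
  # stdin is free here; a `read answer` or `ssh host cmd` in the body
  # would not consume the file being iterated
  collected="$collected$p,"
done 10< "$tmpfile"

echo "$collected"   # alpha,beta,
rm -f "$tmpfile"
```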
Solution 2
cat peptides.txt | while read line
do
# do something with $line here
done
and the one-liner variant:
cat peptides.txt | while read line; do something_with "$line"; done
These options will skip the last line of the file if there is no trailing line feed.
You can avoid this as follows:
cat peptides.txt | while read line || [[ -n $line ]];
do
# do something with $line here
done
Solution 3
Option 1a: While loop: Single line at a time: Input redirection
#!/bin/bash
filename='peptides.txt'
echo Start
while read p; do
echo "$p"
done < "$filename"
Option 1b: While loop: Single line at a time:
Open the file, read from a file descriptor (in this case file descriptor #4).
#!/bin/bash
filename='peptides.txt'
exec 4<"$filename"
echo Start
while read -u4 p ; do
echo "$p"
done
exec 4<&- # close file descriptor 4 so the number can be reused (see comments)
Solution 4
This is no better than the other answers, but it is one more way to get the job done in a file without spaces (see comments). I find that I often need one-liners to dig through lists in text files without the extra step of using separate script files.
for word in $(cat peptides.txt); do echo "$word"; done
This format allows me to put it all in one command line. Change the echo "$word" portion to whatever you want, and you can issue multiple commands separated by semicolons. The following example uses the file's contents as arguments to two other scripts you may have written.
for word in $(cat peptides.txt); do cmd_a.sh "$word"; cmd_b.py "$word"; done
Or if you intend to use this like a stream editor (learn sed) you can dump the output to another file as follows.
for word in $(cat peptides.txt); do cmd_a.sh "$word"; cmd_b.py "$word"; done > outfile.txt
I've used these as written above because I have used text files where I've created them with one word per line. (See comments.) If you have spaces that you don't want splitting your words/lines, it gets a little uglier, but the same command still works:
OLDIFS=$IFS; IFS=$'\n'; for line in $(cat peptides.txt); do cmd_a.sh "$line"; cmd_b.py "$line"; done > outfile.txt; IFS=$OLDIFS
This tells the shell to split on newlines only, not spaces, then restores the environment to what it was previously. At this point, you may want to consider putting it all into a shell script rather than squeezing it all into a single line, though.
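As suggested in the comments, a variation on the same idea (file and variable names illustrative) is to run the IFS change inside subshell parentheses, so nothing needs to be restored afterwards:

```shell
tmpfile=$(mktemp)
printf 'one two\nthree four\n' > "$tmpfile"

# The parentheses create a subshell: the IFS change dies with it.
# Note that the unquoted $(cat ...) is still subject to glob expansion.
result=$( (IFS=$'\n'; for line in $(cat "$tmpfile"); do printf '[%s]' "$line"; done) )

echo "$result"   # [one two][three four]
rm -f "$tmpfile"
```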
Best of luck!
Solution 5
A few more things not covered by other answers:
Reading from a delimited file
# ':' is the delimiter here, and there are three fields on each line in the file
# IFS set below is restricted to the context of `read`, it doesn't affect any other code
while IFS=: read -r field1 field2 field3; do
# process the fields
# if the line has fewer than three fields, the missing fields are set to an empty string
# if the line has more than three fields, `field3` gets the remainder of the line, including any extra delimiters
done < input.txt
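A self-contained sketch of the pattern above, using an invented colon-delimited file (the field names are assumptions for the demo, not part of the original answer):

```shell
tmpfile=$(mktemp)
printf 'alice:x:1000\nbob:x:1001\n' > "$tmpfile"

# IFS=: applies only to this `read`; fields split on colons.
users=""
while IFS=: read -r name pass uid; do
  users="$users$name=$uid "
done < "$tmpfile"

echo "$users"
rm -f "$tmpfile"
```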
Reading from the output of another command, using process substitution
while read -r line; do
# process the line
done < <(command ...)
This approach is better than command ... | while read -r line; do ... because the while loop here runs in the current shell rather than in a subshell, as it does in the pipeline form. See the related post A variable modified inside a while loop is not remembered.
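A minimal demonstration of that difference (temp file and counters invented for the demo): the counter incremented in the piped loop is lost with its subshell, while the process-substitution loop keeps it:

```shell
tmpfile=$(mktemp)
printf 'a\nb\nc\n' > "$tmpfile"

count_pipe=0
cat "$tmpfile" | while read -r line; do
  count_pipe=$((count_pipe + 1))   # increments a copy inside the subshell
done
# count_pipe is still 0 here: the subshell's copy is gone

count_ps=0
while read -r line; do
  count_ps=$((count_ps + 1))       # runs in the current shell
done < <(cat "$tmpfile")

echo "pipe=$count_pipe ps=$count_ps"   # pipe=0 ps=3
rm -f "$tmpfile"
```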
Reading from a null delimited input, for example find ... -print0
while read -r -d '' line; do
# logic
# use a second 'read ... <<< "$line"' if we need to tokenize the line
done < <(find /path/to/dir -print0)
Related read: BashFAQ/020 - How can I find and safely handle file names containing newlines, spaces or both?
Reading from more than one file at a time
while read -u 3 -r line1 && read -u 4 -r line2; do
# process the lines
# note that the loop will end when we reach EOF on either of the files, because of the `&&`
done 3< input1.txt 4< input2.txt
Based on @chepner's answer here: -u is a bash extension. For POSIX compatibility, each call would look something like read -r X <&3.
Reading a whole file into an array (Bash versions earlier than 4)
while read -r line; do
my_array+=("$line")
done < my_file
If the file ends with an incomplete line (newline missing at the end), then:
while read -r line || [[ $line ]]; do
my_array+=("$line")
done < my_file
Reading a whole file into an array (Bash 4.x and later)
readarray -t my_array < my_file
or
mapfile -t my_array < my_file
And then
for line in "${my_array[@]}"; do
# process the lines
done
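A short, self-contained sketch (file name invented) of the readarray/mapfile route and the -t flag, which strips the trailing newline from each stored line:

```shell
tmpfile=$(mktemp)
printf 'one\ntwo\nthree\n' > "$tmpfile"

mapfile -t my_array < "$tmpfile"   # -t drops each line's trailing newline

echo "${#my_array[@]} lines; second is ${my_array[1]}"   # 3 lines; second is two
rm -f "$tmpfile"
```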
More about the shell builtins read and readarray commands: GNU Bash manual; BashFAQ/001 - How can I read a file (data stream, variable) line-by-line (and/or field-by-field)?
Peter Mortensen
Updated on July 08, 2022

Comments
-
Peter Mortensen almost 2 years
How do I iterate through each line of a text file with Bash?
With this script:
echo "Start!"
for p in (peptides.txt)
do
echo "${p}"
done
I get this output on the screen:
Start!
./runPep.sh: line 3: syntax error near unexpected token `('
./runPep.sh: line 3: `for p in (peptides.txt)'
(Later I want to do something more complicated with $p than just output to the screen.)
The environment variable SHELL is (from env):
SHELL=/bin/bash
/bin/bash --version output:
GNU bash, version 3.1.17(1)-release (x86_64-suse-linux-gnu) Copyright (C) 2005 Free Software Foundation, Inc.
cat /proc/version output:
Linux version 2.6.18.2-34-default (geeko@buildhost) (gcc version 4.1.2 20061115 (prerelease) (SUSE Linux)) #1 SMP Mon Nov 27 11:46:27 UTC 2006
The file peptides.txt contains:
RKEKNVQ
IPKKLLQK
QYFHQLEKMNVK
IPKKLLQK
GDLSTALEVAIDCYEK
QYFHQLEKMNVKIPENIYR
RKEKNVQ
VLAKHGKLQDAIN
ILGFMK
LEDVALQILL
-
fedorqui over 7 yearsOh, I see many things have happened here: all the comments were deleted and the question being reopened. Just for reference, the accepted answer in Read a file line by line assigning the value to a variable addresses the problem in a canonical way and should be preferred over the accepted one here.
-
Peyman Mohamadpour about 3 years: for $IFS see What is the exact meaning of IFS=$'\n'
-
Chris about 2 years: don't use bash, use awk: gnu.org/software/gawk/manual/gawk.html
-
-
JesperE over 14 yearsIn general, if you're using "cat" with only one argument, you're doing something wrong (or suboptimal).
-
Warren Young over 14 yearsYes, it's just not as efficient as Bruno's, because it launches another program, unnecessarily. If efficiency matters, do it Bruno's way. I remember my way because you can use it with other commands, where the "redirect in from" syntax doesn't work.
-
Peter Mortensen over 14 yearsHow should I interpret the last line? File peptides.txt is redirected to standard input and somehow to the whole of the while block?
-
Warren Young over 14 years"Slurp peptides.txt into this while loop, so the 'read' command has something to consume." My "cat" method is similar, sending the output of a command into the while block for consumption by 'read', too, only it launches another program to get the work done.
-
Peter Mortensen over 14 yearsFor option 1b: does the file descriptor need to be closed again? E.g. the loop could be an inner loop.
-
Stan Graves over 14 years: The file descriptor will be cleaned up when the process exits. An explicit close can be done to reuse the fd number. To close a fd, use another exec with the &- syntax, like this: exec 4<&-
-
Gordon Davisson over 14 yearsThere's another, more serious problem with this: because the while loop is part of a pipeline, it runs in a subshell, and hence any variables set inside the loop are lost when it exits (see bash-hackers.org/wiki/doku.php/mirroring/bashfaq/024). This can be very annoying (depending on what you're trying to do in the loop).
-
Ogre Psalm33 over 12 years@JesperE would you care to elaborate with an alternative example?
-
Warren Young over 12 years@Ogre: He means you should be doing it like Bruno did in his accepted answer. Both work. Bruno's way is just a bit more efficient, since it doesn't run an external command to do the file reading bit. If the efficiency matters, do it Bruno's way. If not, then do it whatever way makes the most sense to you.
-
JesperE over 12 years@OgrePsalm33: Warren is right. The "cat" command is used for concatenating files. If you are not concatenating files, chances are that you don't need to use "cat".
-
Ogre Psalm33 over 12 yearsOk, makes sense. I wanted to make a point of it because I see a lot of overused examples in scripts and such, where "cat" simply serves as an extra step to get the contents of a single file.
-
Karl Katzke almost 11 yearsThis didn't work for me. The second ranked answer, which used cat and a pipe, did work for me.
-
Joao Costa over 10 yearsThis doesn't meet the requirement (iterate through each line) if the file contains spaces or tabs, but can be useful if you want to iterate through each field in a tab/space separated file.
-
xastor over 10 yearsThis method seems to skip the last line of a file.
-
maxpolk over 10 yearsThe bash $(<peptides.txt) is perhaps more elegant, but it's still wrong, what Joao said correct, you are performing command substitution logic where space or newline is the same thing. If a line has a space in it, the loop executes TWICE or more for that one line. So your code should properly read: for word in $(<peptides.txt); do .... If you know for a fact there are no spaces, then a line equals a word and you're okay.
-
mightypile over 10 years@JoaoCosta,maxpolk : Good points that I hadn't considered. I've edited the original post to reflect them. Thanks!
-
mklement0 over 10 years: Using for makes the input tokens/lines subject to shell expansions, which is usually undesirable; try this: for l in $(echo '* b c'); do echo "[$l]"; done - as you'll see, the * - even though originally a quoted literal - expands to the files in the current directory.
-
Dss over 10 yearsCan this be done in reverse starting at the bottom of the file?
-
Bruno De Fraine over 10 years: @Dss Then I would use a solution based on cat but replace cat by tac.
-
Dss over 10 years@BrunoDeFraine I've tried that but tac seems to make each space a new line. I need the full line delimited by the newline char. maybe I'm doing it wrong.
-
Dss over 10 years@BrunoDeFraine Ok I found this: unix.stackexchange.com/a/7012 ..change cat to tac and it works. Thanks!
-
Bruno De Fraine over 10 years@Dss I meant the solution from Warren Young stackoverflow.com/a/1521470/6918 ; just replace cat by tac and you should read the lines in reverse.
-
mat kelcey about 10 yearsI use "cat file | " as the start of a lot of my commands purely because I often prototype with "head file |"
-
masgo almost 10 yearsThank you for Option 2. I ran into huge problems with Option 1 because I needed to read from stdin within the loop; in such a case Option 1 will not work.
-
ACK_stoverflow almost 10 years: @matkelcey Also, how else would you put an entire file into the front of a pipeline? Bash gives you here strings, which are awesome (especially for things like if grep -q 'findme' <<< "$var") but not portable, and I wouldn't want to start a large pipeline with one. Something like cat ifconfig.output | grep inet[^6] | grep -v '127.0.0.1' | awk '{print $2}' | cut -d':' -f2 is easier to read, since everything follows from left to right. It's like strtok-ing with awk instead of cut because you don't want empty tokens - it's sort of an abuse of the command, but that's just how it's done.
-
Mike Q over 9 yearsDouble quote the lines !! echo "$p" and the file.. trust me it will bite you if you don't!!! I KNOW! lol
-
Savage Reader over 9 yearsThis may be not that efficient, but it's much more readable than other answers.
-
Toby Speight almost 9 yearsThis answer needs the caveats mentioned in mightypile's answer, and it can fail badly if any line contains shell metacharacters (due to the unquoted "$x").
-
Jahid almost 9 years: @DavidC.Rankin The -r option prevents backslash interpretation. Note #2 is a link where it is described in detail...
-
tishma over 8 years+1 for readability, and also modularity - this code can easily be put into a more complex pipeline by replacing 'cat ...' with output of something else.
-
dblanchard over 8 yearsJoao and maxpolk, you are addressing the issue I'm having, but I'm still getting a separate iteration for each half of each line with a space: > cat linkedin_OSInt.txt linkedin.com/vsearch/f?type=all&keywords="foo bar" linkedin.com/vsearch/f?type=all&keywords="baz bux" > for url in $(<linkedin_OSInt.txt); do echo "$url"; done linkedin.com/vsearch/f?type=all&keywords="foo bar" linkedin.com/vsearch/f?type=all&keywords="baz bux" I'll try the other approaches here, but would like understand why this one doesn't work.
-
mightypile over 8 years@dblanchard: The last example, using $IFS, should ignore spaces. Have you tried that version?
-
Znik over 8 yearsIt is much better resolve than Bruno has written. It is specially usefull when data is created dynamically by command. Using Bruno's solution, loop will receive any data after command will completly done. Your solution gives command result on line into loop, without taking buffer from system. for example replace 'cat peptides.txt' by 'find /', or in previous solution 'done <peptides.txt' by 'done < $(find /)' . it can fail execution because there is a chance for overflow buffer or consume all memory.
-
Znik over 8 yearsplease notice for changing cat peptides.txt by find / . for loop will not start before internal cat finishes. between these steps it is possible buffer overflow.
-
fedorqui almost 8 yearsThis is very bad! Why you don't read lines with "for".
-
Jose Antonio Alvarez Ruiz over 7 yearsThat -u option made my day ;) Thanks!
-
dawg over 7 years: Both versions fail to read a final line if it is not terminated with a newline. Always use while read p || [[ -n $p ]]; do ...
-
codeforester over 7 yearsThis answer is defeating all the principles set by the good answers above!
-
Veda over 7 yearsThis does not work for lines that end with a backslash "\". Lines ending with a backslash will be prepended to the next line (and the \ will be removed).
-
Florin Andrei about 7 yearsCombine this with the "read -u" option in another answer and then it's perfect.
-
Jahid about 7 years: @FlorinAndrei The above example doesn't need the -u option, are you talking about another example with -u?
-
dawg about 7 yearsPlease delete this answer.
-
Egor Hans over 6 yearsNow guys, don't exaggerate. The answer is bad, but it seems to work, at least for simple use cases. As long as that's provided, being a bad answer doesn't take away the answer's right to exist.
-
Egor Hans over 6 yearsWhile the answer is correct, I do understand how it ended up down here. The essential method is the same as proposed by many other answers. Plus, it completely drowns in your FPS example.
-
Egor Hans over 6 yearsI'm actually surprised people didn't yet come up with the usual Don't read lines with for...
-
Egor Hans over 6 years: The way this command gets a lot more complex as crucial issues are fixed presents very well why using for to iterate file lines is a bad idea. Plus, the expansion aspect mentioned by @mklement0 (even though that probably can be circumvented by bringing in escaped quotes, which again makes things more complex and less readable).
-
Egor Hans over 6 years: @Veda Now that's weird. What I would expect is, you get an extra n after the backslash, and the lines get concatenated. Because that would mean the backslash escapes the backslash of \n, causing it to be interpreted literally rather than as a newline. But the fact that the backslash disappears, as well as the newline, means it's consumed for some kind of escaping like expected, but gets merged with the original newline character into something that isn't printed... Do you have a tool that displays unprinted characters in some way? Would interest me what that results in.
-
Egor Hans over 6 years: You should point out more clearly that Option 2 is strongly discouraged. @masgo Option 1b should work in that case, and can be combined with the input redirection syntax from Option 1a by replacing done < $filename with done 4<$filename (which is useful if you want to read the file name from a command parameter, in which case you can just replace $filename by $1).
-
Egor Hans over 6 yearsLooked through your links, and was surprised there's no answer that simply links your link in Note 2. That page provides everything you need to know about that subject. Or are link-only answers discouraged or something?
-
Jahid over 6 years@EgorHans : link only answers are generally deleted.
-
Veda over 6 years: @EgorHans the \ escapes the "\n" character which is a single character. Google for an "ascii table". Character 10 is \n and character 13 is \r. Linux "xxd" tool will show you the characters. A file with a\na\n\\n will look like: 610a 610a 5c0a (0a is hex for 10, so \n). So in the last case the "5c" character or the "\" is escaping a single character.
-
Egor Hans over 6 years: @Veda Ah OK, now I understand better. Didn't realize the file content gets dumped into the execution flow the way it's inside the file, where of course \n is one single character. For some reason I've been thinking it gets backresolved to the control sequence while processed. Still, it's somewhat weird that an escaped \n is something without a printed representation. One would expect it to resolve to the char sequence "\n" when escaped.
-
Egor Hans over 6 yearsAh. Alright, never suggesting a link-only answer again. Maybe there even were some, we'll never know.
-
Ryan about 6 yearsBy the time you care about the difference in performance you won't be asking SO these sorts of questions.
-
David Tabernero M. almost 6 yearsThis is, in a readable way, the only answer that also reads the latest line of a file, which is a pro.
-
Cory Ringdahl over 5 yearsThis is, however, great for grep, sed, or any other text manipulation prepending the read.
-
Charles Duffy over 5 years@EgorHans, I disagree strongly: The point of answers is to teach people how to write software. Teaching people to do things in a way that you know is harmful to them and the people who use their software (introducing bugs / unexpected behaviors / etc) is knowingly harming others. An answer known to be harmful has no "right to exist" in a well-curated teaching resource (and curating it is exactly what we, the folks who are voting and flagging, are supposed to be doing here).
-
Charles Duffy over 5 years: @EgorHans, ...incidentally, the worst data-loss incident I've been personally witness to was caused by ops staff doing something that "seemed to work" in a script (using an unquoted expansion for a filename to be deleted -- when that name was supposed to be able to contain only hex digits). Except a bug in a different piece of software wrote a name with random contents, which had a whitespace-surrounded *, and a massive trove of billing-data backups was lost.
-
Shmiggy over 5 yearsYour soul be blessed for that different file descriptor command, made me happy, wasted 8 days on an error generated by standard input replacement. +1
-
user5359531 over 5 yearsthis does not work if any of the commands inside your loop run commands via ssh; the stdin stream gets consumed (even if ssh is not using it), and the loop terminates after the first iteration.
-
user5359531 over 5 years: I need to loop over file contents such as tail -n +2 myfile.txt | grep 'somepattern' | cut -f3, while running ssh commands inside the loop (consumes stdin); option 2 here appears to be the only way?
-
tripleee over 5 years: As in the accepted answer, this will have unpleasant surprises without read -r in some corner cases. Basically always use read -r unless you specifically require the quirky behavior of plain legacy read.
masterxilo about 5 years: note that instead of command < input_filename.txt you can always do input_generating_command | command or command < <(input_generating_command)
-
januarvs about 5 yearsIt skips the last line. So as workaround, must add empty line at the last.
-
Warren Young about 5 years@januarvs: It only does that if the last line of your file has no LF terminator, which will cause lots of other things to fail, too.
-
Alexander Mills almost 5 years: can you please mention what the -r flag does?
-
Bruno De Fraine almost 5 years: @AlexanderMills -r disables the interpretation of backslashes as escape sequences. The empty IFS disables read splitting the line up into fields. And because read fails when it encounters end-of-file before the line ends, we also test for a non-empty line.
-
Vladimir Sh. almost 5 yearsOption 2 seems to be the best when some actions should be performed on remote host via SSH.
-
Vladimir Sh. almost 5 yearsThis will work in some cases, but proposed solution is far from good.
-
void.pointer almost 5 years: read is ignoring the \r character when the file uses windows line endings (i.e. \r\n). How can I make read treat \r as part of the newline sequence?
-
Kramer over 4 yearsIf you are like me and you do these things in the shell in a single line, you need to add another semi-colon like so: while read p; do echo "$p"; done <peptides.txt
-
tripleee about 4 years: Don't do this. Looping over line numbers and fetching each individual line by way of sed or head + tail is incredibly inefficient, and of course begs the question why you don't simply use one of the other solutions here. If you need to know the line number, add a counter to your while read -r loop, or use nl -ba to add a line number prefix to each line before the loop.
-
frank_108 about 4 yearsThanks for reading file into array. Exactly what I need, because I need each line to parse twice, add to new variables, do some validations etc.
-
user5359531 almost 4 yearsthis is by far the most useful version I think
-
All The Rage over 3 yearsThis answer is exactly what I need. The other answers that use while to iterate over lines or ok when one wants to deal with lines of text. But I'd use Python for that. In a shell I often have files containing hundreds of items that will be treated as tokens. I need the removal of newlines, which this syntax provides. Thanks for the answer.
-
Virtimus over 3 years1a for me is fine as soon as One add "echo $p" at end (last line from file)
-
vatosarmat almost 3 years: No need for OLDIFS if you use subshell parentheses () around IFS=$'\n'; for ...; done
-
ingyhere almost 3 yearsThis really doesn't work in any general way. Bash splits each line on spaces which is very unlikely a desired outcome.
-
tripleee over 2 yearsSee also now stackoverflow.com/questions/65538947/…
-
madD7 over 2 years@tripleee i have clearly mentioned "this may not be the best way". I have not limited the discussion to "the best or the most efficient solution".
-
Matt about 2 yearsThe question asks specifically for how to do it with bash
-
Chris about 2 years@Matt I am interpreting the intent here as a "how do I do it in bash" rather than "how do I do it with bash". And I've been frustrated enough with overly literal interpretations of my questions that I'm happy to wait for the OP to weigh in.
-
Erwann about 2 years: read -r -d '' works for null delimited input in combination with while, not standalone (read -r -d '' foo bar). See here.
-
scandel about 2 yearsIterating over the lines of a file with a for loop can be useful in some situations. For example some commands can make a while loop break. See stackoverflow.com/a/64049584/2761700
-
Charles Duffy about 2 years: There are serious security problems with this approach. What if your peptides.txt contains something that unescapes to $(rm -rf ~), or even worse, $(rm -rf ~)'$(rm -rf ~)'?
-
Nino DELCEY almost 2 years@GordonDavisson Your link is broken, the new one is : mywiki.wooledge.org/BashFAQ/024
-
Ed Morton almost 2 yearsThis is the right answer, see why-is-using-a-shell-loop-to-process-text-considered-bad-practice