Read a file line by line and if condition is met continue reading till next condition

bash read shell-script text-processing

106,449

Solution 1

You need to make some changes to your script (in no particular order):

- Use IFS= before read so that leading and trailing whitespace is not stripped.
- As $line is not changed anywhere, there is no need for the variable readLine.
- Do not call read again in the middle of the loop: the extra read consumes a line that the loop's own tests then never see.
- Use a Boolean variable to control printing.
- Mark the start and end of printing clearly.
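As a quick illustration of the first point, here is a minimal sketch comparing a plain read with IFS= read -r on the same input. The sample string and the variable names (sample, plain, raw) are made up for the demonstration.

```bash
#!/bin/bash
# Made-up sample: two leading spaces, a literal backslash-t, two trailing spaces.
sample='  a\tb  '

# Plain read: surrounding whitespace is stripped and the backslash is consumed.
plain=$(printf '%s\n' "$sample" | { read line; printf '%s' "$line"; })

# IFS= read -r: the line is kept exactly as it was written.
raw=$(printf '%s\n' "$sample" | { IFS= read -r line; printf '%s' "$line"; })

printf '[%s]\n[%s]\n' "$plain" "$raw"
```

The first bracketed line should come out as [atb], the second with the spaces and the backslash intact.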

With those changes, the script becomes:

#!/bin/bash

filename="foo.txt"

#While loop to read line by line
while IFS= read -r line; do
    # If the line starts with qwe, set the flag to yes.
    if [[ $line == qwe* ]] ; then
        printline="yes"
        # Just to make the start of each block very clear; remove in real use.
        echo "----------------------->>"
    fi
    # If variable is yes, print the line.
    if [[ $printline == "yes" ]] ; then
        echo "$line"
    fi
    # If the line starts with ewq, set the flag to no.
    if [[ $line == ewq* ]] ; then
        printline="no"
        # Just to make each line end very clear, remove in use.
        echo "----------------------------<<"
    fi
done < "$filename"

Which could be condensed in this way:

#!/bin/bash
filename="foo.txt"
while IFS= read -r line; do
    [[ $line == qwe* ]]       && printline="yes"
    [[ $printline == "yes" ]] && echo "$line"
    [[ $line == ewq* ]]       && printline="no"
done < "$filename"

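That prints the start and end lines too (an inclusive range).
To leave them out, swap the start and end tests, so that the flag is cleared before the end line can be printed and set only after the start line has been passed: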

#!/bin/bash
filename="foo.txt"
while IFS= read -r line; do
    [[ $line == ewq* ]]       && printline="no"
    [[ $printline == "yes" ]] && echo "$line"
    [[ $line == qwe* ]]       && printline="yes"
done < "$filename"
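To check the exclusive variant without a file, the same loop can be wrapped in a function that reads standard input. This is only a sketch: the function name between and the short sample input are invented for illustration.

```bash
#!/bin/bash
# Hypothetical helper: print the lines strictly between qwe and ewq markers.
between() {
    local printline=no line
    while IFS= read -r line; do
        [[ $line == ewq* ]]     && printline=no      # stop before the end marker prints
        [[ $printline == yes ]] && printf '%s\n' "$line"
        [[ $line == qwe* ]]     && printline=yes     # start after the start marker
    done
    return 0
}

# Made-up input: two marked blocks with unrelated lines around them.
result=$(printf '%s\n' test qwe asd xca ewq skipped qwe a ewq | between)
printf '%s\n' "$result"
```

With this input, only asd, xca and a should be printed; the markers and the lines outside the blocks are dropped.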

However, if you have bash version 4.0 or later, it would be better to use readarray and loop over the array elements:

#!/bin/bash
filename="infile"

readarray -t lines < "$filename"


for line in "${lines[@]}"; do
    [[ $line == ewq* ]]       && printline="no"
    [[ $printline == "yes" ]] && echo "$line"
    [[ $line == qwe* ]]       && printline="yes"
done

That will avoid most of the issues of using read.
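A small sketch of what readarray -t does with awkward lines, assuming bash 4.0 or later. The three sample lines (one indented, one containing a backslash, one empty) are made up for the demonstration.

```bash
#!/bin/bash
# Made-up input: an indented line, a line with a backslash, and an empty line.
readarray -t lines < <(printf '%s\n' '  indented' 'back\slash' '')

# Each line becomes one array element, unchanged apart from the removed newline.
printf '%d elements\n' "${#lines[@]}"
printf '[%s]\n' "${lines[@]}"
```

All three lines should arrive intact, including the leading spaces, the backslash and the empty line.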

Of course, you could use the sed line recommended in the comments (thanks, @Costas) to get only the lines to be processed:

#!/bin/bash
filename="foo.txt"

readarray -t lines <<< "$(sed -n '/^qwe.*/,/^ewq.*/p' "$filename")"

for line in "${lines[@]}"; do

     : # Do all your additional processing here, with a clean input.

done

Solution 2

As @Costas pointed out, the correct tool to use for this job is sed:

sed '/qwe/,/ewq/ w other.file' foo.txt

There may be other processing needed on the lines to be printed. That's fine; just do it like so:

sed -e '/qwe/,/ewq/{w other.file' -e 'other processing;}' foo.txt

(Of course, "other processing" isn't a real sed command.) The above is the pattern to use if you need to do your processing after you print the line. If you want to do some other processing and then print a changed version of the line (which seems more likely), you would use:

sed -e '/qwe/,/ewq/{processing;w other.file' -e '}' foo.txt

(Note that it is necessary to put the } into its own argument, otherwise it will be interpreted as part of the other.file name.)
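For a concrete feel of that pattern, here is a sketch with a real sed command standing in for "processing". The substitution s/a/A/, the temporary directory and the four-line sample file are assumptions made for the demonstration; -n is added only to keep standard output quiet.

```bash
#!/bin/bash
# Work in a throwaway directory so no real files are touched.
tmpdir=$(mktemp -d)
printf '%s\n' test qwe asd ewq tail > "$tmpdir/foo.txt"

# Inside the range: change the first "a" to "A", then write the line out.
# The closing brace sits in its own -e so it is not read as part of the file name.
sed -n -e '/qwe/,/ewq/{s/a/A/;w '"$tmpdir/other.file" -e '}' "$tmpdir/foo.txt"

saved=$(cat "$tmpdir/other.file")
printf '%s\n' "$saved"
rm -r "$tmpdir"
```

The output file should hold qwe, Asd and ewq, with the lines outside the range left out.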

You (the OP) haven't stated what "other processing" you have to do on the lines, or I could be more specific. But whatever that processing is, you can definitely do it in sed—or if that becomes too unwieldy, you could do it in awk with very little change to the above code:

awk '/qwe/,/ewq/ { print > "other.file" }' foo.txt

Then, you have all the power of the awk programming language at your disposal to do processing on the lines before you execute that print statement. And of course awk (and sed) are designed for text processing, unlike bash.
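As a sketch of processing before the print, the following prefixes each line in the range with its line number and upper-cases it. The choice of processing, the out variable and the sample file are all made up for illustration.

```bash
#!/bin/bash
# Work in a throwaway directory so no real files are touched.
tmpdir=$(mktemp -d)
printf '%s\n' test qwe asd ewq tail > "$tmpdir/foo.txt"

# Range pattern as above; the action transforms the line before writing it.
awk -v out="$tmpdir/other.file" \
    '/qwe/,/ewq/ { print NR ": " toupper($0) > out }' "$tmpdir/foo.txt"

saved=$(cat "$tmpdir/other.file")
printf '%s\n' "$saved"
rm -r "$tmpdir"
```

Passing the file name with -v keeps shell quoting out of the awk program itself.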

Solution 3

qwe(){ printf %s\\n "$1"; }
ewq(){ :; }
IFS=   ### prep  the  loop, only IFS= once
while  read -r  in
do     case $in in
       (qwe|ewq)
           set "$in"
       ;;
       ("$processing"?)
           "$process"
       esac
       "$1" "$in"
done

That's one really slow way to do it. With a GNU grep and a regular infile:

IFS=
while grep  -xm1 qwe
do    while read  -r  in  &&
            [ ewq != "$in" ]
      do    printf %s\\n "$in"
            : some processing
      done
done <infile

...would at least optimize out half of the inefficient reads...

sed  -ne '/^qwe$/,/^ewq$/H;$!{/^qwe$/!d;}' \
      -e "x;s/'"'/&\\&&/g;s/\n/'"' '/g"    \
      -e "s/\(.*\) .e.*/p '\1/p" <input    |
sh    -c 'p(){  printf %s\\n "$@"
                for l do : process "$l"
                done
          }; . /dev/fd/0'

And that would avoid the inefficiencies of read altogether for most sh's out there, though it does have to print the output twice - once quoted to sh and once unquoted to stdout. It works differently because the . command tends to read input by block rather than by byte for most implementations. Still, it elides ewq - qwe altogether, and would work for streamed input - such as a FIFO.

qwe
asd
xca
asdfarrf
sxcad
asdfa
sdca
dac
dacqa
ea
sdcv
asgfa
sdcv
qwe
a
df
fa
vas
fg
fasdf
qwe
aefawasd
adfae
asdfwe
asdf
era
fbn
tsgnjd
nuydid
hyhnydf
gby
asfga
dsg
qwe
rtargt
raga
adfgasgaa
asgarhsdtj
shyjuysy
sdgh
jstht
qwe
asfdg5ab
fgshtsadtyh
wafbvg
nasfga
ghafg
qwe
afghta
asg56ang
adfg643
5aasdfgr5
asdfg
fdagh5t

Updated on September 18, 2022

Comments

gkmohit almost 2 years
I have a file foo.txt
```
test
qwe
asd
xca
asdfarrf
sxcad
asdfa
sdca
dac
dacqa
ea
sdcv
asgfa
sdcv
ewq
qwe
a
df
fa
vas
fg
fasdf
eqw
qwe
aefawasd
adfae
asdfwe
asdf
era
fbn
tsgnjd
nuydid
hyhnydf
gby
asfga
dsg
eqw
qwe
rtargt
raga
adfgasgaa
asgarhsdtj
shyjuysy
sdgh
jstht
ewq
sdtjstsa
sdghysdmks
aadfbgns,
asfhytewat
bafg
q4t
qwe
asfdg5ab
fgshtsadtyh
wafbvg
nasfga
ghafg
ewq
qwe
afghta
asg56ang
adfg643
5aasdfgr5
asdfg
fdagh5t
ewq
```
I want to print all the lines between qwe and ewq in a separate file. This is what I have so far :
```
#!/bin/bash

filename="foo.txt"

#While loop to read line by line
while read -r line
do
    readLine=$line
    #If the line starts with ST then echo the line
    if [[ $readLine = qwe* ]] ; then
        echo "$readLine"
        read line
        readLine=$line
        if [[ $readLine = ewq* ]] ; then
            echo "$readLine"
        fi
    fi
done < "$filename"
```
- Costas over 8 years
  
  sed '/qwe/,/ewq/ w other.file' foo.txt
- gkmohit over 8 years
  
  @Costas I cant use that because I need to do some logic between these lines . .
- don_crissti over 8 years
  
  Sure you can... If you're using while..read you're almost always doing it wrong.
- gkmohit over 8 years
  
  @don_crissti what do you mean ?
- gkmohit over 8 years
  
  @don_crissti well I just wanted to know why while .. read is wrong ?
- don_crissti over 8 years
  
  read this
- Admin over 8 years
  
  @don_crissti And the solution to such problem is?
- don_crissti over 8 years
  
  @BinaryZebra - what problem ? If you know what the real problem is here please explain it so that we can understand it because the question is (obviously) some sort of XY question but the OP doesn't even bother explaining what X is (the real problem).
msw over 8 years

This answer very nicely answers the question of "how do I cope with most of the issues of using read". Sometimes though, the OP needs to be unasked: using read for this task requires all your explanation and is manifestly the wrong tool for the job when a half-line sed comment just works. (±0 if it matters).
Admin over 8 years

The solutions do not provide a way for the OP to "I need to do some logic between these lines" in shell. Yes, sed or awk may probably do the same (unknown) processing.
Wildcard over 8 years

@BinaryZebra, see the last sentence of my answer. He didn't say he needs to do the logic in shell, just that he needs to do some logic. But it's good that your answer provided a way he can do the logic in shell if he needs to.
terdon over 8 years

@mikeserv BinaryZebra, please stop having this sort of discussion in the comment threads. Comments are not for extended discussion; this conversation has been moved to chat.
Kusalananda over 6 years

Brevity is acceptable, but fuller explanations are better.