Remove lines based on pattern but keeping first n lines that match
Solution 1
If you want to delete all lines starting with % but preserve the first two lines of input, you could do:
sed -e 1,2b -e '/^%/d'
Though the same would be more legible with awk:
awk 'NR <= 2 || !/^%/'
Or, if you're after performance:
{ head -n 2; grep -v '^%'; } < input-file
If you want to preserve the first two lines matching the pattern, and they may not be the first lines of the input, awk would certainly be a better option:
awk '!/^%/ || ++n <= 2'
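The same awk condition extends to any count. A minimal sketch, assuming a hypothetical helper name keep_first_n and a made-up sample input (neither comes from the answer above):

```shell
# keep_first_n N: hypothetical helper. Reads stdin and drops every line
# starting with % except the first N such lines, wherever they occur.
keep_first_n() {
    awk -v n="$1" '!/^%/ || ++seen <= n'
}

# Made-up sample: three % lines interleaved with ordinary text.
printf '%s\n' 'text0' '% a' '% b' 'text1' '% c' 'text2' | keep_first_n 2
# prints text0, % a, % b, text1, text2 (one per line); "% c" is dropped
```

Passing the count with -v keeps the awk program itself constant, so the shell never has to splice a number into the script text.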
With sed, you could use tricks like:
sed -e '/^%/!b' -e 'x;/xx/{h;d;}' -e 's/^/x/;x'
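That is, use the hold space to count the number of occurrences of the pattern matched so far: each printed match appends one x to the hold space, and once it contains xx, further matching lines are deleted. Not terribly efficient or legible.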
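To make the hold-space counting easier to follow, here is the same sed script spread over several lines with comments. It is only a sketch: the wrapper function keep_first_two and the sample input are hypothetical, while the sed commands are the ones from the one-liner above.

```shell
# keep_first_two: hypothetical wrapper around the sed script shown above.
keep_first_two() {
    sed '
        # Lines not starting with % branch to the end and are printed as is.
        /^%/!b
        # Swap: the pattern space now holds the counter, the hold space the line.
        x
        # Two matches already counted: put the counter back and drop the line.
        /xx/{
            h
            d
        }
        # Otherwise record one more match by prepending an x to the counter,
        s/^/x/
        # then swap back so the line is printed and the counter stays held.
        x
    '
}

printf '%s\n' 'text0' '% a' '% b' 'text1' '% c' 'text2' | keep_first_two
# prints text0, % a, % b, text1, text2 (one per line)
```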
Solution 2
I'm afraid sed alone is a bit too simple for this (not that it would be impossible, just rather complicated; see e.g. sed sokoban for what can be done).
How about awk?
#!/bin/awk -f
BEGIN { c = 0; }
{
    if (/^%/) {
        if (c++ < 3) {
            print;
        }
    } else {
        print;
    }
}
If you can rely on a recent enough Bash (one that supports regular expressions in [[ ]]), the awk script above can be translated to:
#!/bin/bash -
c=0
while IFS= read -r line; do
    if [[ $line =~ ^% ]]; then
        if ((c++ < 3)); then
            printf '%s\n' "$line"
        fi
    else
        printf '%s\n' "$line"
    fi
done
You can also use sed or grep to do the pattern matching instead of the =~ operator.
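A variant that needs neither =~ nor an external matcher is the shell's own case pattern matching. The following is a rough sketch in plain POSIX sh, not part of the original answer: the function name keep_first_n_sh and the count parameter are hypothetical (the script above hard-codes 3).

```shell
# keep_first_n_sh N: hypothetical helper. Reads stdin and prints every line,
# except that only the first N lines starting with % are kept.
keep_first_n_sh() {
    n=$1
    c=0
    # The "|| [ -n ... ]" part also handles a last line without a newline.
    while IFS= read -r line || [ -n "$line" ]; do
        case $line in
            %*)
                c=$((c + 1))
                if [ "$c" -le "$n" ]; then
                    printf '%s\n' "$line"
                fi
                ;;
            *)
                printf '%s\n' "$line"
                ;;
        esac
    done
}

printf '%s\n' 'text0' '% a' '% b' 'text1' '% c' 'text2' | keep_first_n_sh 2
# prints text0, % a, % b, text1, text2 (one per line)
```

Like any shell read loop, this is much slower than awk on large files; it is shown only to illustrate the matching.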
Solution 3
A Perl one-liner solution:
# in-place editing
perl -i -pe '$.>2 && s/^%.*//s' filename.txt
# print to the standard output
perl -ne '$.>2 && /^%/ || print' filename.txt
Solution 4
sed '/^%/{
3,$d
}' << 'EOF'
% 1
% 2
% 3
% 4
% 5
text1
text2
text3
EOF
One way of removing the extra lines.
Edit: my answer works under the same condition as Stephane Chazelas's: if the % rows don't occur first, it won't work.
Nerd sniping.
sed -n '/^% [^12]*$/!{
/^% [12][[:digit:]]\{1,\}/n
p}' file.txt
Will work regardless of where the "% number" string is found in the stream.
The first address matches any line that starts with % and is followed only by characters other than 1 or 2, and we negate it. The negated address therefore matches anything besides /% [A-Za-z3-9]*/, leaving a blind spot: numbers between 10 and 29 would still print. So we nest a second address to match that range and skip the line. Note that this relies on the % lines being numbered as in the sample data; it does not count matches.
But awk would still be better.
Solution 5
tr '\n' ';' < input | sed 's/% /##/3g' | tr ';' '\n' | sed '/##/d'
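I replaced newline characters with ';' to obtain a single-line string, then turned all but the first two occurrences of the pattern into a ## marker with sed 's/pattern/##/3g' (replace from the third to the last occurrence of the pattern in the line), changed ';' back to '\n', and finally removed the marked lines. Combining a number with g in the s flags is a GNU sed extension, and the approach assumes the input contains no ';' characters.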
Updated on September 18, 2022
Comments
Stéphane Chazelas over 11 years
To match a line starting with % in shell, there is no need for regexps or ksh/bash-specific features like [[; you can use case $line in %*). Doing it this way with shells, especially bash, is going to be terribly inefficient. Using loops in shells is generally considered bad practice.
-
Jana over 11 years
Thanks @Stephane. It worked. Thanks for the additional info as well.
-
Jana over 11 years
Thanks @peterph. Since my files are huge, I was really looking for something like Stephane's answer. Thanks again.
-
Jana over 11 years
Thanks @Nykakin. The pattern replacing for my data won't be efficient. Thank you for your input.
-
peterph over 11 years
@Jana No problem. It just wasn't really clear to me whether the lines matching the pattern were supposed to be only at the beginning of the file or interspersed with the rest. That's why I used the loops.