How to remove multiple blank lines from a file?


Solution 1

Case 1:

awk '!NF {if (++n <= 2) print; next}; {n=0;print}'
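
The question also asks whether the limit can be tweaked. A minimal sketch, assuming a hypothetical wrapper name squeeze_blank and an awk variable max passed with -v (neither is part of the original answer):

```shell
# squeeze_blank is a hypothetical helper; max is an assumed awk variable.
# Keeps at most "max" consecutive blank lines (defaults to 2, as in Case 1).
squeeze_blank() {
    awk -v max="${1:-2}" '!NF { if (++n <= max) print; next } { n = 0; print }'
}

# Example: allow only one blank line in a row.
printf 'a\n\n\n\nb\n' | squeeze_blank 1
```

Calling it without an argument should behave like the one-liner above.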

Case 2:

awk '!NF {s = s $0 "\n"; n++; next}
     {if (n>1) printf "%s", s; n=0; s=""; print}
     END {if (n>1) printf "%s", s}'
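
A quick way to check the Case 2 behaviour on a small sample. The function name drop_single_blank is made up for this sketch; the awk program is the one above:

```shell
# drop_single_blank is a hypothetical wrapper around the Case 2 script.
# Blank lines are buffered in s; the buffer is printed only when
# the run contained more than one blank line.
drop_single_blank() {
    awk '!NF { s = s $0 "\n"; n++; next }
         { if (n > 1) printf "%s", s; n = 0; s = ""; print }
         END { if (n > 1) printf "%s", s }'
}

# The single blank line between a and b goes away,
# the two blank lines between b and c are kept.
printf 'a\n\nb\n\n\nc\n' | drop_single_blank
```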

Solution 2

You can use uniq to collapse multiple consecutive blank lines into one blank line, but it will also collapse adjacent lines that contain identical text.
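
A small sample input (made up for illustration) that shows both effects, the blank lines being squeezed and the repeated text line being lost:

```shell
# Three blank lines collapse to one, but the duplicated "b" line
# is also reduced to a single "b".
uniq_demo() {
    printf 'a\n\n\n\nb\nb\nc\n' | uniq
}

uniq_demo
```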

Solution 3

Case 1:

perl -i -ane '$n=(@F==0) ? $n+1 : 0; print if $n<=2'

Case 2:

perl -i -ane '$n=(@F==0) ? $n+1 : 0; print $n==2 ? "\n$_" : $n==1 ? "" : $_ '

Solution 4

You can address Case #1 like this with GNU sed:

sed -r ':a; /^\s*$/ {N;ba}; s/( *\n *){2,}/\n\n/'

That is, collect empty lines in the pattern space and, if there are three or more of them, reduce them to two.

To remove a single blank line between two lines of text, as in Case #2, you can do it like this:

sed -r '/^ *\S/!b; N; /\n *$/!b; N; /\S *$/!b; s/\n *\n/\n/'

Or in commented form:

sed -r '
  /^ *\S/!b        # non-empty line
  N                # append the next line
  /\n *$/!b        # followed by empty line
  N                # append the next line
  /\S *$/!b        # non-empty line
  s/\n *\n/\n/     # remove the empty line
'
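
One thing to be aware of, as far as I can tell from tracing the script: each cycle consumes both text lines, so in a longer run of double-spaced lines only every other blank line is removed. A sketch that shows this, assuming GNU sed (for -r and \S); the function name is made up:

```shell
# join_double_spaced is a hypothetical wrapper; requires GNU sed.
join_double_spaced() {
    sed -r '/^ *\S/!b; N; /\n *$/!b; N; /\S *$/!b; s/\n *\n/\n/'
}

# A single blank line between two text lines is removed.
printf 'a\n\nb\n' | join_double_spaced

# Here the blank line between b and c survives, because b was
# already printed together with a in the first cycle.
printf 'a\n\nb\n\nc\n' | join_double_spaced
```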

Solution 5

Following Anthon's suggestion to use "uniq"...

Remove leading, trailing and duplicate blank lines.

# Get large random string.
rand_str=; while [[ ${#rand_str} -lt 40 ]]; do rand_str=$rand_str$RANDOM; done

# Add extra lines at beginning and end of stdin.
(echo $rand_str; cat; echo $rand_str) |

# Convert empty lines to random strings.
sed "s/^$/$rand_str/" |

# Remove duplicate lines.
uniq |

# Remove first and last line.
sed '1d;$d' |

# Convert random strings to empty lines.
sed "s/$rand_str//"

In one long line:

(rand_str=; while [[ ${#rand_str} -lt 40 ]]; do rand_str=$rand_str$RANDOM; done; (echo $rand_str; cat; echo $rand_str) | sed "s/^$/$rand_str/" | uniq | sed '1d;$d' | sed "s/$rand_str//")

Or just use "cat -s".

I switched from parentheses to curly braces in order to stay in the current shell context, which I assume is more efficient. Note that curly braces require a semicolon after the last command and a space after the opening brace.

# Add extra blank lines at beginning and end.
# These will be removed in final step.
{ echo; cat; echo; } |

# Replace multiple blank lines with a single blank line.
cat -s |

# Remove first and last line.
sed '1d;$d'

In a single line:

{ { echo; cat; echo; } | cat -s | sed '1d;$d'; }
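
Since this is likely to be reused, it can be wrapped in a shell function. A minimal sketch; the name trim_blank_lines is just a suggestion:

```shell
# trim_blank_lines is a hypothetical name for the pipeline above.
# Removes leading and trailing blank lines and squeezes repeated ones.
trim_blank_lines() {
    { echo; cat; echo; } | cat -s | sed '1d;$d'
}

# Leading, trailing and repeated blank lines in one sample.
printf '\n\nfirst\n\n\n\nsecond\n\n' | trim_blank_lines
```

Usage on a file would then be something like trim_blank_lines < notes.txt > notes.clean, where both file names are placeholders.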
Author by

Baard Kopperud


Updated on September 18, 2022

Comments

  • Baard Kopperud over 1 year

    I have some text-files I use to take notes in - just plain text, usually just using cat >> file. Occasionally I use a blank line or two (just return - the new-line character) to specify a new subject/line of thought. At the end of each session, before closing the file with Ctrl+D, I typically add lots (5-10) blank lines (return-key) just to separate the sessions.

    This is obviously not very clever, but it works for me for this purpose. I do however end up with lots and lots of unnecessary blank lines, so I'm looking for a way to remove (most of) the extra lines. Is there a Linux command (cut, paste, grep, ...?) that could be used directly with a few options? Alternatively, does anybody have an idea for a sed, awk or perl script (well, in any scripting language really, though I'd prefer sed or awk) that would do what I want? Writing something in C++ (which I actually could do myself) just seems like overkill.

    Case #1: What I need is a script/command that would remove more than two (3 or more) consecutive blank lines, and replace them with just two blank lines. Though it would be nice if it also could be tweaked to remove more than one line (2 or more) and/or replace multiple blank lines with just one blank line.

    Case #2: I could also use a script/command that would remove a single blank line between two lines of text, but leave multiple blank lines as is (though removing one of the blank lines would also be acceptable).

  • Rob about 11 years
    +1 for awk instead of sed
  • ChuckCottrill over 10 years
    +1 perl ftw! Awk is (probably) canonical for this, but (DRY) compels me to write scripts for use-cases that are repeated like this.
  • ChuckCottrill over 10 years
    Since this use case is repeated frequently, I would suggest creating a script.
  • Michael Bushe over 2 years
    Slick and it works (assuming no dup lines, like in code and config). Leaves some dups but only if the lines have whitespace chars.