Reduce multiple blank lines to single (Pythonically)

Solution 1

This is a reach, but perhaps some of the lines aren't completely blank (i.e. they contain only whitespace characters, so they merely look blank). You could try removing all whitespace between newlines.

re.sub(r'(\n\s*)+\n+', '\n\n', sourceFileContents)

Edit: realized the second '+' was superfluous, as the \s* will catch newlines between the first and last. We just want to make sure the last character is definitely a newline so we don't remove leading whitespace from a line with other content.

re.sub(r'(\n\s*)+\n', '\n\n', sourceFileContents)

Edit 2

re.sub(r'\n\s*\n', '\n\n', sourceFileContents)

Should be an even simpler solution. We really just want to catch any whitespace (which includes intermediate newlines) between the two anchor newlines that form the single blank line, and collapse it down to just those two newlines.
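
As a quick sanity check of that last pattern, here is a minimal sketch. The helper name collapse_blank_lines and the sample string are made up for illustration and are not part of the original answer.

```python
import re

def collapse_blank_lines(text):
    # Hypothetical helper: any run of whitespace sitting between two
    # newlines (further newlines included) becomes one blank line.
    return re.sub(r'\n\s*\n', '\n\n', text)

# Whitespace-only lines mixed with truly empty ones, then an indented line.
sample = "first\n  \n\t\n\nsecond\n    indented\n"
print(repr(collapse_blank_lines(sample)))
```

The leading spaces on the indented line survive because the match has to end on a newline.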

Solution 2

Your code works for me. Maybe there is a chance that a carriage return (\r) is present.

re.sub(r'[\r\n][\r\n]{2,}', '\n\n', sourceFileContents)
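
One thing to watch: the pattern above only rewrites the runs it matches, so a file with \r\n endings can end up with a mix of \r\n and \n. If converting every line ending to \n is acceptable (that is an assumption on my part), a rough alternative is to normalize first and then collapse. The function name below is hypothetical.

```python
import re

def collapse_after_normalizing(text):
    # Assumption: converting all line endings to '\n' is acceptable.
    normalized = text.replace('\r\n', '\n').replace('\r', '\n')
    # Three or more newlines in a row means two or more blank lines.
    return re.sub(r'\n{3,}', '\n\n', normalized)

print(repr(collapse_after_normalizing("a\r\nb\r\n\r\n\r\nc")))
```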

Solution 3

You can use just the str methods split and join (note that this drops blank lines entirely rather than keeping one):

text = "some text\n\n\n\nanother line\n\n"
print("\n".join(item for item in text.split('\n') if item))
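
The snippet above prints the text with every blank line removed. If one blank line per run should stay, a sketch along the same "no regex" lines could use itertools.groupby. This is a different technique from the answer, and squeeze_blank_lines is a made-up name.

```python
from itertools import groupby

def squeeze_blank_lines(text):
    kept = []
    # keepends=True keeps each line's own terminator, so join('') restores them.
    lines = text.splitlines(keepends=True)
    for is_blank, group in groupby(lines, key=lambda line: line.strip() == ''):
        group = list(group)
        # From a run of blank (or whitespace-only) lines keep only the first.
        kept.extend(group[:1] if is_blank else group)
    return ''.join(kept)

text = "some text\n\n\n\nanother line\n\n"
print(repr(squeeze_blank_lines(text)))
```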

Solution 4

If the lines are completely empty, you can use a regex positive lookahead to replace each run of them with a single blank line:

sourceFileContents = re.sub(r'\n+(?=\n)', '\n', sourceFileContents)
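
To see what the lookahead does, here is a small sketch with made-up input; the wrapper function name is hypothetical. The second call shows why the "completely empty" condition matters: a whitespace-only line is left in place.

```python
import re

def collapse_empty_lines(text):
    # Replace a run of newlines that is still followed by one more newline
    # with a single newline, so at most two newlines remain in a row.
    return re.sub(r'\n+(?=\n)', '\n', text)

print(repr(collapse_empty_lines("a\n\n\n\nb\n\nc\nd\n")))
# A line holding only a space is not treated as empty by this pattern.
print(repr(collapse_empty_lines("a\n \n\n\nb")))
```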

Solution 5

If you replace your read statement with the following, then you don't have to worry about whitespace or carriage returns:

with open(sourceFileName, 'rt') as sourceFile:
    sourceFileContents = ''.join([l.rstrip() + '\n' for l in sourceFile])

After doing this, both of the methods you tried in the question work.

OR

Just write it out in a simple loop.

with open(sourceFileName, 'rt') as sourceFile:
    lines = ['']
    for line in (l.rstrip() for l in sourceFile):
        if line != '' or lines[-1] != '\n':
            lines.append(line + '\n')
    sourceFileContents = "".join(lines)
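
For trying the loop without a real file, the same logic can be wrapped in a function and fed an in-memory file. This is only a sketch: squeeze_lines and the sample text are invented for the example.

```python
import io

def squeeze_lines(line_iter):
    # Same idea as the loop above: strip trailing whitespace from each line
    # and skip a blank line when the previous kept line was already blank.
    lines = ['']
    for line in (raw.rstrip() for raw in line_iter):
        if line != '' or lines[-1] != '\n':
            lines.append(line + '\n')
    return ''.join(lines)

# io.StringIO stands in for the opened source file.
fake_file = io.StringIO("one  \n\n   \n\ntwo\n\n\nthree")
print(repr(squeeze_lines(fake_file)))
```

Note that this also strips trailing whitespace from lines with content and adds a newline to the last line if it lacked one.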

Author by

Mawg says reinstate Monica

Updated on September 16, 2022

Comments

  • Mawg says reinstate Monica over 1 year

    How can I reduce multiple blank lines in a text file to a single line at each occurrence?

    I have read the entire file into a string, because I want to do some replacement across line endings.

    with open(sourceFileName, 'rt') as sourceFile:
        sourceFileContents = sourceFile.read()
    

    This doesn't seem to work

    while '\n\n\n' in sourceFileContents:
        sourceFileContents = sourceFileContents.replace('\n\n\n', '\n\n')
    

    and nor does this

    sourceFileContents = re.sub('\n\n\n+', '\n\n', sourceFileContents)
    

    It's easy enough to strip them all, but I want to reduce multiple blank lines to a single one, each time I encounter them.

    I feel that I'm close, but just can't get it to work.

  • Chris Hagmann about 9 years
    You could do lines[-1] instead of last line. Also, just doing line.rstrip() would strip all whitespace from the end of a line (which is a good thing to do anyway) and return an empty string for a blank line.
  • Mawg says reinstate Monica about 9 years
    Lol - now you have two problems :-) For those who don't recognize the quote, see programmers.stackexchange.com/questions/223634/…
  • Mawg says reinstate Monica about 9 years
    Will that only remove the space, or also reduce the multiple blank lines?
  • Marc Chiesa about 9 years
    Should do both, at least from my simple test. It should not remove whitespace from the beginning of a line with other content.
  • LexyStardust about 9 years
    @cdhagmann - wouldn't work in this case - I always want to see the last line from the file, not the last line I've added to the list.