Avoid writing '\n' on the last line of a file in Python


Solution 1

This should be a simple solution:

for item in record[:-1]:
    output_pass.write("%s\n" % item)
output_pass.write("%s" % record[-1])

Using join is not recommended since you said the file is large: it would build the entire file contents as a single string in memory first.
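
For illustration, here is a minimal, self-contained sketch of the same approach; the record contents and the filename output.txt are made up for the example:

# Hypothetical data and filename, just to demonstrate the pattern above.
record = ['record111111', 'record222222', 'record333333']

with open('output.txt', 'w') as output_pass:
    for item in record[:-1]:
        output_pass.write("%s\n" % item)      # every item but the last gets a newline
    output_pass.write("%s" % record[-1])      # last item, no trailing newline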

Solution 2

This requires constant additional memory:

for i, item in enumerate(record):
    if i > 0:                       # newline before every item except the first
        output_pass.write('\n')
    output_pass.write('%s' % item)
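
Because this version never slices or indexes record, it also works when record is a lazy iterable of unknown length, such as a generator. A minimal sketch, with an illustrative generator and filename that are not from the original question:

# Sketch only: generate_records and output.txt are made-up names.
def generate_records():
    for n in range(3):
        yield 'record%d' % n

with open('output.txt', 'w') as output_pass:
    for i, item in enumerate(generate_records()):
        if i > 0:
            output_pass.write('\n')     # separator goes before each item after the first
        output_pass.write('%s' % item)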

Solution 3

Have you tried using a counter? For example:

record = [str(x) for x in range(10)]
print record                         # Python 2 print statement

import sys
output_pass = sys.stdout

counter = 0

while counter != len(record) - 1:    # all items except the last
    output_pass.write("%s\n" % record[counter])
    counter += 1

output_pass.write("%s" % record[counter])   # last item, without a newline

Solution 4

You can join them first and then write the result, as in:

item = '\n'.join(record)
output_pass.write('%s' % item)

Note

If your list, i.e. record, doesn't contain strings, then as martineau has mentioned you will have to map each item to a str first, i.e. '\n'.join(map(str, record)), before you write it to the file.
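
As a quick illustration of that point (the numeric record below is made up for the example):

record = [1, 2, 3]                 # non-string items

# '\n'.join(record) would raise a TypeError, because join expects strings.
item = '\n'.join(map(str, record))

with open('output.txt', 'w') as output_pass:
    output_pass.write(item)        # writes "1\n2\n3" with no trailing newline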

Solution 5

The following writes all but the last item in record, each followed by a newline, and then the final one without it. It does so quickly and without requiring much additional memory.

(For Python 3 use range instead of xrange)

item = iter(record)                        # iterate over record without copying it
for _ in xrange(len(record) - 1):          # all items except the last
    output_pass.write('%s\n' % next(item))

output_pass.write('%s' % next(item))       # last item, no trailing newline
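
The same idea can also be spelled with itertools.islice instead of the explicit counter. This is just an equivalent sketch with illustrative data, not part of the original answer; like the answer above it still needs len(record), so record must be a real sequence rather than a file object or other plain iterator:

from itertools import islice

record = ['record1', 'record2', 'record3']       # made-up sample data

it = iter(record)
with open('output.txt', 'w') as output_pass:
    for entry in islice(it, len(record) - 1):    # all but the last item
        output_pass.write('%s\n' % entry)
    output_pass.write('%s' % next(it))           # last item, no trailing newline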

Zewei Song

Updated on June 09, 2022

Comments

  • Zewei Song, almost 2 years ago

    I'm writing multiple lines to a new file (could be up to several GB), like this:

    for item in record:
        output_pass.write('%s\n' %item)
    

    However, I got a blank line due to the '\n' after my last record, like this:

    Start of the file

    record111111
    
    record222222
    
    record333333
    
    ---a blank line---
    

    End of the file

    Since my file is large, I do not want to read the file again. So, is there an easy way to prevent this, or an easy way to remove the last '\n' from the file?

    My solution:

    Thanks for all the help!

    I think I will not load the entire file into memory, since it may get very huge.

    I actually solved this by first writing the first record, then writing the rest of the lines in a loop. I put the '\n' at the front so it won't appear on the last line.

    But Jonathan is right. I actually have no problem with the '\n' on the last line; it is mostly my OCD.

    Here is my code:

    rec_first = parser_fastq.next() #This is just an iterator of my file
    output.write('%s' %('>'+rec_first[0].strip('@')))
    output.write('\n%s' %(rec_first[1])) #I put '\n' in the front
    
    count = 1
    
    #Write the rest of lines
    for rec_fastq in parser_fastq:
        output.write('\n%s' %('>'+rec_fastq[0].strip('@')))
        output.write('\n%s' %(rec_fastq[1]))
        count += 1
        print 'Extracting %ith record in %s ...' %(count, fastq_name) + '\b'*100,
    
    output.close()
    

    print '\n%i records were wrote to %s' % (count, fasta_name)

    • Matteo Italia, over 9 years ago
      Are you sure that it's really a problem? Actually, most text-based tools (e.g. most Unix utils) expect to have a newline at the end of the file (i.e. the newline is intended as a line terminator, not as a separator).
    • martineau, over 9 years ago
      Do you really want all those other blank lines between items in your output file? It looks like each item is ending up with two '\n' characters.
    • martineau, over 9 years ago
      Is the file huge because a single record has that much data in it, or are you processing many records that could total up to a size that big? The answer to that will likely affect what answer is truly the best for your needs.
  • Matteo Italia, over 9 years ago
    OP is talking about a multi-gigabyte file, in that case this is definitely a bad idea (it creates the whole string in memory first).
  • Bhargav Rao, over 9 years ago
    @MatteoItalia Thanks. Do inform me if it is completely wrong, so that I can delete it
  • Matteo Italia, over 9 years ago
    It's not wrong per se, but in this case it would become a performance nightmare (not only are you creating the whole string in memory, but the useless '%s' % item is going to create yet another copy of it).
  • Matteo Italia, over 9 years ago
    How is this supposed to work? Even assuming that record is a list (it may not be, from the code it just looks like an enumerable object) your code just prints it backwards skipping the last element, always leaving the newline anyway. ideone.com/UvNCsJ
  • Bhargav Rao, over 9 years ago
    This prints the contents of the file in reverse order. So you will need to write more code to parse it back in the other direction ;)
  • Bhargav Rao, over 9 years ago
    Upvoted for educating as to why join is not recommended
  • yhoyo, over 9 years ago
    @MatteoItalia yes, sorry... I do not know what I was thinking; I took your code and "rearranged" it
  • yhoyo, over 9 years ago
    @BhargavRao Now the code works fine without the last line :P ideone.com/h8mfa5
  • myaut, over 9 years ago
    For lists, a slice expression like [:-1] creates a copy of that list, so you waste memory too.
  • martineau, over 9 years ago
    @Bhargav: Whether avoiding join is necessary or not currently isn't clear from the question, since we don't really know how many items there might be in a record nor how big the string representation of each might be. Like so many questions here, the parameters of the problem are unclear.
  • Bhargav Rao, over 9 years ago
    @martineau Sir, thus using join is also not wrong?
  • martineau, over 9 years ago
    @Bhargav: My point was that it might work OK, but we don't have any way of knowing whether it would or not without additional information from the OP.
  • Bhargav Rao, over 9 years ago
    Thanks for your help Sir, I undeleted my answer hoping some stray programmer might find it useful. Thanks again
  • martineau, over 9 years ago
    Bhargav (& @Matteo): This probably won't work, not because record might be huge (although it could be and that might be an issue), but because it's probably not a sequence of strings (but again we don't know that for sure), which is what the join() method requires its argument to be -- so, as written, the most likely result would be TypeError: sequence item 0: expected string, xxx found. That could be easily fixed by using map(str, record), assuming record isn't prohibitively large (and converting each of its items to strings isn't either).
  • Bhargav Rao, over 9 years ago
    @martineau Thank you Sir. I have edited the answer to include your thoughts. Thanks again
  • martineau, over 9 years ago
    Er, I was thinking more along the lines of '\n'.join(map(str, record)) because, among other reasons you would be permanently clobbering the value of record with something only needed temporarily. Actually output_pass.write('\n'.join(map(str, record))) might be even better. Also, please stop calling me "Sir", such formalities aren't needed here. ;-)
  • martineau, over 9 years ago
    P.S. r'\n' would be wrong, and whether to do this has nothing to do with py2.
  • Bhargav Rao, over 9 years ago
    @martineau Thank you ... (I call you Sir as a mark of respect and not formality ;) ... We have been taught to address everyone with a title since our young age. Apart from that, I am too young to talk directly)
  • Bhargav Rao, over 9 years ago
    That r was a remnant from the previous record
  • loretoparisi, about 5 years ago
    This does not work for io.read or io.write: TypeError: '_io.TextIOWrapper' object has no attribute '__getitem__'