Python append multiple files in given order to one big file

Solution 1

Just using simple file IO:

# tempfiles is a list of file handles to your temp files. Order them however you like
f = open("bigfile.txt", "w")
for tempfile in tempfiles:
    f.write(tempfile.read())

That's about as OS-agnostic as it gets. It's also fairly simple, and as long as each temp file fits comfortably in memory, the performance ought to be about as good as anything else.

Solution 2

I'm not aware of any OS-agnostic shell-level command for appending one file to another. But appending at the Python level is easy enough that I am guessing the Python developers did not think it necessary to add a dedicated function to the library.

The solution depends on the size and structure of the temp files you are appending. If they are all small enough that you don't mind reading each of them into memory, then Rafe Kettler's answer (repeated below) does the job with the least amount of code.

# tempfiles is an ordered list of temp files (open for reading)
f = open("bigfile.txt", "w")
for tempfile in tempfiles:
    f.write(tempfile.read())

If reading the files fully into memory is not possible or not appropriate, you will want to loop through each file and read it piece-wise. If your temp files contain newline-terminated lines that can each be read into memory individually, you might do something like this:

# tempfiles is an ordered list of temp files (open for reading)
f = open("bigfile.txt", "w")
for tempfile in tempfiles:
    for line in tempfile:
        f.write(line)

Alternatively - something which will always work - you may choose a buffer size and just read the file piece-wise, e.g.

# tempfiles is an ordered list of temp files (open for reading)
f = open("bigfile.txt", "w")
for tempfile in tempfiles:
    while True:
        data = tempfile.read(65536)
        if data:
            f.write(data)
        else:
            break

The Python input/output tutorial has a lot of good info.
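One of the comments on this page suggests shutil.copyfileobj for the buffered case. The following is a minimal sketch of that idea, assuming the temp files are given as an ordered list of paths rather than open handles. The function name concatenate_files and the parameter names are hypothetical, chosen only for illustration.

```python
import shutil


def concatenate_files(source_paths, dest_path, buffer_size=65536):
    """Write each file in source_paths, in order, into a freshly created dest_path."""
    # Binary mode copies bytes verbatim: no newline or encoding translation.
    with open(dest_path, "wb") as dest:
        for path in source_paths:
            with open(path, "rb") as src:
                # copyfileobj reads and writes in chunks of buffer_size bytes,
                # so no input file has to fit in memory as a whole.
                shutil.copyfileobj(src, dest, buffer_size)
```

Calling concatenate_files(["a.tmp", "b.tmp"], "bigfile.txt") would overwrite bigfile.txt; opening the destination with "ab" instead of "wb" would append to an existing file.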

Solution 3

Rafe's answer was lacking proper open/close statements, e.g.

# tempfiles is a list of paths to your temp files. Order them however you like
with open("bigfile.txt", "w") as fo:
    for tempfile in tempfiles:
        with open(tempfile, 'r') as fi:
            fo.write(fi.read())

However, be forewarned that if you want to sort the contents of the big file, this method does not catch cases where the last line of one or more temp files has a different EOL format (or no EOL at all), which will cause some strange sort results. In that case, strip the line endings as you read the lines of each temp file, and write each line to the big file with a consistent EOL (i.e. one extra line of code).
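The EOL caveat above could be handled roughly as follows. This is only a sketch under assumptions: the temp files are text files given as an ordered list of paths, and the function name concatenate_with_uniform_eol and its eol parameter are hypothetical.

```python
def concatenate_with_uniform_eol(source_paths, dest_path, eol="\n"):
    """Write every line of every source file to dest_path with the same line ending."""
    # newline="" on the output stops Python from translating eol on write.
    with open(dest_path, "w", newline="") as dest:
        for path in source_paths:
            # Default text mode (universal newlines) turns \r\n and \r into \n on read.
            with open(path, "r") as src:
                for line in src:
                    # Strip the line ending if present, then add a consistent one,
                    # so a final line without an EOL is still terminated.
                    dest.write(line.rstrip("\n") + eol)
```

Because every line ends up terminated the same way, the last line of one temp file can no longer run into the first line of the next.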

Solution 4

I feel a bit stupid to add another answer after 8 years and so many answers, but I arrived here by the "append to file" title, and didn't see the right solution for appending to an existing binary file with buffered read/write.

So here is the basic way to do that:

def append_file_to_file(_from, _to):
    block_size = 1024*1024
    with open(_to, "ab") as outfile, open(_from, "rb") as infile:
        while True:
            input_block = infile.read(block_size)
            if not input_block:
                break
            outfile.write(input_block)

Given this building block, you can use:

for filename in ['a.bin','b.bin','c.bin']:
    append_file_to_file(filename, 'outfile.bin')

Solution 5

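import os

# Append every regular file in the current directory to temp.txt
with open("temp.txt", "a") as out:
    # sorted() gives a repeatable order; os.listdir() returns names in arbitrary order
    for name in sorted(os.listdir("./")):
        # Skip the output file itself and anything that is not a regular file
        if name == "temp.txt" or not os.path.isfile(name):
            continue
        with open(name) as f:
            for line in f:
                out.write(line)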

The code above reads the contents of every file in the current directory and appends them to temp.txt.
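A comment on this page asks for an example using fileinput. Here is a minimal sketch of a line-by-line merge with that module, assuming text files supplied as an explicit ordered list of paths; the function name concatenate_lines is hypothetical.

```python
import fileinput


def concatenate_lines(source_paths, dest_path):
    """Copy the lines of each file in source_paths, in order, into dest_path."""
    source_paths = list(source_paths)
    with open(dest_path, "w") as dest:
        # With an empty file list, fileinput would read standard input instead.
        if not source_paths:
            return
        # fileinput.input() chains the files and yields their lines in list order.
        with fileinput.input(files=source_paths) as lines:
            for line in lines:
                dest.write(line)
```

Passing the paths explicitly, rather than listing a directory, keeps the order under the caller's control, which is what the question asks for.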

Author: Martlark

Lead developer working in TypeScript and workflow automation.

Updated on September 16, 2021

Comments

  • Martlark
    Martlark over 2 years

    I have up to 8 separate Python processes creating temp files in a shared folder. Then I'd like the controlling process to append all the temp files in a certain order into one big file. What's the quickest way of doing this at an OS-agnostic shell level?

  • Vitali
    Vitali almost 10 years
    This would only work if each individual file is small enough to be read into memory. There's also worse performance if you can read & write in parallel (e.g. different disks or architecture allows for such) as you'll be waiting for the file to be read before you start anything. You'll probably be better off using shutil.copyfileobj
  • user1277476
    user1277476 over 9 years
    Perhaps should use binary I/O.
  • baxx
    baxx about 9 years
    would be interesting to know when one solution would be more appropriate, how much memory a file uses before f.write(tempfile.read()) becomes inappropriate
  • Kai Petzke
    Kai Petzke over 3 years
    It would be nice, if you add an example about how to use fileinput