Reading two text files line by line simultaneously


Solution 1

from itertools import izip

with open("textfile1") as textfile1, open("textfile2") as textfile2: 
    for x, y in izip(textfile1, textfile2):
        x = x.strip()
        y = y.strip()
        print("{0}\t{1}".format(x, y))

In Python 3, replace itertools.izip with the built-in zip, which is already lazy.
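For reference, a minimal Python 3 sketch of the same loop. The helper name paste_lines and the in-memory io.StringIO objects are assumptions made for illustration; real code would pass the file objects returned by open(). Because the built-in zip returns an iterator in Python 3, lines are pulled from each file one pair at a time.

```python
import io


def paste_lines(left, right):
    """Yield tab-joined pairs of stripped lines from two file-like objects."""
    # In Python 3 the built-in zip is lazy: it reads one line from each
    # input per iteration instead of building a full list first.
    for x, y in zip(left, right):
        yield "{0}\t{1}".format(x.strip(), y.strip())


# Hypothetical stand-ins for open("textfile1") and open("textfile2").
english = io.StringIO("first line\nsecond line\n")
french = io.StringIO("première ligne\ndeuxième ligne\n")

for row in paste_lines(english, french):
    print(row)
```

Wrapping the loop in a generator keeps the pairing logic separate from how the files are opened, so the same function works with real files or test data.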

Solution 2

# file1 and file2 hold the paths of the two aligned text files
with open(file1) as f1, open(file2) as f2:
    for x, y in zip(f1, f2):
        print("{0}\t{1}".format(x.strip(), y.strip()))

output:

This is a the first line in English C'est la première ligne en Français
This is a the 2nd line in English   C'est la deuxième ligne en Français
This is a the third line in English C'est la troisième ligne en Français

Solution 3

A generator makes opening the files more convenient, and the same approach extends easily to iterating over more than two files simultaneously.

filenames = ['textfile1', 'textfile2']

def gen_line(filename):
    with open(filename) as f:
        for line in f:
            yield line.strip()

gens = [gen_line(n) for n in filenames]

for file1_line, file2_line in zip(*gens):
    print("\t".join([file1_line, file2_line]))

Note:

  1. This is Python 3 code. For Python 2, use itertools.izip, as the other answers note.
  2. zip stops as soon as the shortest file is exhausted; use itertools.zip_longest if that matters.
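The second note can be sketched as follows. This is a minimal Python 3 example under the assumption that a missing line should be padded with an empty string; the function name paste_all and the io.StringIO inputs are hypothetical stand-ins for real files.

```python
import io
from itertools import zip_longest


def paste_all(*files, fill=""):
    """Yield tab-joined lines from any number of files, padding short ones."""
    # Strip each line lazily so no file is read into memory up front.
    stripped = [(line.strip() for line in f) for f in files]
    # zip_longest keeps going until the longest input is exhausted and
    # substitutes fillvalue for inputs that ran out earlier.
    for row in zip_longest(*stripped, fillvalue=fill):
        yield "\t".join(row)


# Hypothetical inputs: the second "file" is one line shorter.
long_file = io.StringIO("one\ntwo\nthree\n")
short_file = io.StringIO("un\ndeux\n")

for row in paste_all(long_file, short_file):
    print(row)
```

For line-aligned parallel text, a length mismatch usually signals a data problem, so a sentinel fill value that is easy to detect may be a better choice than an empty string.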

Solution 4

Python does let you read line by line, and it's even the default behaviour: you just iterate over the file as you would iterate over a list.
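A minimal sketch of that default behaviour, in Python 3. The function name count_lines and the temporary file are assumptions for illustration; the point is that the for loop pulls one line at a time rather than loading the whole file.

```python
import os
import tempfile


def count_lines(path):
    """Count lines by iterating over the file object directly."""
    total = 0
    with open(path, encoding="utf-8") as f:
        # A file object is its own iterator: each step yields the next
        # line, including its trailing newline.
        for line in f:
            total += 1
    return total


# Write a small hypothetical file so the example runs on its own.
with tempfile.NamedTemporaryFile(
    "w", suffix=".txt", delete=False, encoding="utf-8"
) as tmp:
    tmp.write("first\nsecond\nthird\n")

print(count_lines(tmp.name))
os.remove(tmp.name)
```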

As for iterating over two iterables at once, itertools.izip is your friend:

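from itertools import izip  # Python 2 only; in Python 3 use the built-in zip

# Open both files in one with statement so they are closed afterwards.
with open("/path/to/file1") as fileA, open("/path/to/file2") as fileB:
    for lineA, lineB in izip(fileA, fileB):
        # rstrip() drops the trailing newline before joining with a tab
        print("%s\t%s" % (lineA.rstrip(), lineB.rstrip()))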
Author by

alvas

食飽未?

Updated on November 04, 2020

Comments

  • alvas
    alvas over 3 years

    I have two text files in two different languages and they are aligned line by line. I.e. the first line in textfile1 corresponds to the first line in textfile2, and so on and so forth.

    Is there a way to read both files line by line simultaneously?

    Below is a sample of what the files look like; imagine each file has around 1,000,000 lines.

    textfile1:

    This is a the first line in English
    This is a the 2nd line in English
    This is a the third line in English
    

    textfile2:

    C'est la première ligne en Français
    C'est la deuxième ligne en Français
    C'est la troisième ligne en Français
    

    desired output

    This is a the first line in English\tC'est la première ligne en Français
    This is a the 2nd line in English\tC'est la deuxième ligne en Français
    This is a the third line in English\tC'est la troisième ligne en Français
    

    There is a Java version of this question ("Read two textfile line by line simultaneously -java"), but Python has no BufferedReader for reading line by line. So how would it be done in Python?

  • BasedRebel
    BasedRebel almost 12 years
    Be aware that zip() will pull the full contents of both files into memory (in Python 2.x)
  • jmtoung
    jmtoung about 10 years
    I get the issue:

      File "MergeANNOVARResults.py", line 10
        with open(refseq) as refseq_fh, open(gencode) as gencode_fh:
        ^
    SyntaxError: invalid syntax
  • user58925
    user58925 over 5 years
    Would this load all of textfile1 and textfile2 into memory?