Compare two CSV files and print the rows that are different (Python)

11,453

I'm not sure how efficient this is, but I think it does what you want:

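import csv

# Python 3: open CSV files in text mode with newline='' (not 'rb').
with open('English.csv', newline='') as csvfile1:
    with open('Dictionary.csv', newline='') as csvfile2:
        reader1 = csv.reader(csvfile1)
        reader2 = csv.reader(csvfile2)
        # First-column values of English.csv; blank rows are skipped.
        rows1_col_a = [row[0] for row in reader1 if row]
        rows2 = [row for row in reader2 if row]
        only_b = []
        for row in rows2:
            # Keep the whole row when its key is missing from English.csv.
            if row[0] not in rows1_col_a:
                only_b.append(row)
        print(only_b)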

Outputs:

[['d', 'disease'], ['bc', 'breast cancer']]
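
If the files get large, checking membership against a list becomes slow, and the question also wants the result saved as final.csv. A minimal sketch of one way to handle both, assuming Python 3; the helper names rows_missing_from and write_rows are made up for illustration and are not part of the csv module:

```python
import csv


def rows_missing_from(reference_path, candidate_path):
    """Return rows of candidate_path whose first column is absent from reference_path."""
    with open(reference_path, newline="") as ref_file:
        # A set gives fast membership tests; blank rows are skipped.
        known = {row[0] for row in csv.reader(ref_file) if row}
    with open(candidate_path, newline="") as cand_file:
        # Keep the full row (all columns), in the original file order.
        return [row for row in csv.reader(cand_file) if row and row[0] not in known]


def write_rows(rows, out_path):
    """Write the given rows to out_path as CSV."""
    with open(out_path, "w", newline="") as out_file:
        csv.writer(out_file).writerows(rows)
```

Calling rows_missing_from('English.csv', 'Dictionary.csv') on the sample files from the question should return the same two rows as the output above, and write_rows(rows, 'final.csv') then saves them to disk.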
Author by

abn

Updated on June 04, 2022

Comments

  • abn
    abn almost 2 years

    I'm trying to compare two CSV files that look like this:

    English.csv
    i
    am
    is
    was
    were
    
    Dictionary.csv
    i,insomnia
    d,disease
    bc,breast cancer
    

    I want to compare the first columns of the two files and print the rows of Dictionary.csv whose first-column value does not appear in English.csv, like this:

    final.csv
    d,disease
    bc,breast cancer
    

    I tried this code.

    import csv
    with open('English.csv', 'rb') as csvfile1:
        with open ("Dictionary.csv", "rb") as csvfile2:
            reader1 = csv.reader(csvfile1)
            reader2 = csv.reader(csvfile2)
            rows1 = [row for row in reader1]
            rows2 = [row for row in reader2]
            col_a = [row1[0] for row1 in rows1]
            col_b = [row2[0] for row2 in rows2]
            col_c = [row2[1] for row2 in rows2]
            only_b = [text for text in col_b if not text in col_a]
    

    I can get the first-column values that differ, as shown below, but not the matching values from the second column. How can I get the corresponding data from the second column?

    >>> only_b
    ['d','bc']