Compare two files and print unmatched lines

Solution 1

This seems like a perfect opportunity to use comm.

From the GNU coreutils manual page (v8.30):

   With no options, produce three-column output. Column one contains
   lines unique to FILE1, column two contains lines unique to FILE2, and
   column three contains lines common to both files.

   -1     suppress column 1 (lines unique to FILE1)

   -2     suppress column 2 (lines unique to FILE2)

   -3     suppress column 3 (lines that appear in both files)

Using this information, we can suppress the lines unique to file1 as well as the lines present in both files, which leaves only the lines unique to file2.

$ comm -1 -3 <(sort file1) <(sort file2)
12344 Dec 10 15:36 /opt/apache-tomcat-6.0.36/webapps/abc/.../test.txt
22677 Dec 3 15:36 /opt/apache-tomcat-6.0.36/webapps/new/abc.txt

-1 and -3 remove all lines unique to file1 and all lines common to both files.
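To see what -1 and -3 are hiding, it can help to run comm once with no options. The following is a minimal sketch, not part of the original answer: it recreates the question's sample data in a scratch directory (the directory and the .sorted file names are illustrative) and sorts into temporary files instead of using process substitution, so it should also run under a plain POSIX sh. Setting LC_ALL=C is an assumption made here so that sort and comm agree on collation.

```shell
# Work in a scratch directory so nothing in the current directory is touched.
tmp=$(mktemp -d)
export LC_ALL=C   # assumption: byte-order collation for both sort and comm

# Sample data taken from the question.
cat > "$tmp/file1" <<'EOF'
22677 Dec 4 15:36 /opt/apache-tomcat-6.0.36/webapps/new/abc.txt
22372 Dec 4 15:36 /opt/apache-tomcat-6.0.36/webapps/new/new.txt
EOF
cat > "$tmp/file2" <<'EOF'
22372 Dec 4 15:36 /opt/apache-tomcat-6.0.36/webapps/new/new.txt
22677 Dec 3 15:36 /opt/apache-tomcat-6.0.36/webapps/new/abc.txt
12344 Dec 10 15:36 /opt/apache-tomcat-6.0.36/webapps/abc/.../test.txt
EOF

# comm expects sorted input.
sort "$tmp/file1" > "$tmp/file1.sorted"
sort "$tmp/file2" > "$tmp/file2.sorted"

# No options: column 1 has no indent, column 2 is indented by one tab,
# column 3 (common lines) by two tabs.
all_columns=$(comm "$tmp/file1.sorted" "$tmp/file2.sorted")
printf '%s\n' "$all_columns"

# -1 -3 (written here as -13): keep only column 2, the lines unique to file2.
only_in_file2=$(comm -13 "$tmp/file1.sorted" "$tmp/file2.sorted")
printf '%s\n' "$only_in_file2"

rm -rf "$tmp"
```

In the three-column output the shared new.txt line appears behind two tabs, which is the column that -3 drops.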

Because the input is sorted first, the output order differs from the original order in file2, but that doesn't seem to matter based on the question.

If the input is already sorted, you can skip the sorts, yielding:

$ comm -1 -3 file1 file2
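If the original order of file2 does matter, one of the comments below mentions grep -xvFf as another approach that worked. The comment does not show the arguments, so the invocation here is an assumption: file1 is used as the pattern list and file2 as the input. This is only a sketch using the question's sample data in a scratch directory.

```shell
# Work in a scratch directory so nothing in the current directory is touched.
tmp=$(mktemp -d)

# Sample data taken from the question.
cat > "$tmp/file1" <<'EOF'
22677 Dec 4 15:36 /opt/apache-tomcat-6.0.36/webapps/new/abc.txt
22372 Dec 4 15:36 /opt/apache-tomcat-6.0.36/webapps/new/new.txt
EOF
cat > "$tmp/file2" <<'EOF'
22372 Dec 4 15:36 /opt/apache-tomcat-6.0.36/webapps/new/new.txt
22677 Dec 3 15:36 /opt/apache-tomcat-6.0.36/webapps/new/abc.txt
12344 Dec 10 15:36 /opt/apache-tomcat-6.0.36/webapps/abc/.../test.txt
EOF

# -f FILE  read one pattern per line from file1
# -F       treat patterns as fixed strings (the dots in the paths stay literal)
# -x       a pattern must match the whole line
# -v       print the lines of file2 that match none of the patterns
only_in_file2=$(grep -vxFf "$tmp/file1" "$tmp/file2")
printf '%s\n' "$only_in_file2"

rm -rf "$tmp"
```

No sorting is involved, so the surviving lines come out in the order they have in file2. Note that grep exits with status 1 when nothing is printed, which matters if the command runs under set -e.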

Solution 2

Use diff -u file1 file2 | sed -nr 's/^\+([^+].*)/\1/p' (the leading + is escaped so that it is matched literally in extended regex mode).

Output:

22677 Dec 3 15:36 /opt/apache-tomcat-6.0.36/webapps/new/abc.txt
12344 Dec 10 15:36 /opt/apache-tomcat-6.0.36/webapps/abc/.../test.txt

If you need a blank line after each line of output, use
diff -u file1 file2 | sed -nr 's/^\+([^+].*)/\1\n/p'

Output:

22677 Dec 3 15:36 /opt/apache-tomcat-6.0.36/webapps/new/abc.txt

12344 Dec 10 15:36 /opt/apache-tomcat-6.0.36/webapps/abc/.../test.txt
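One caveat with the diff approach is that diff compares lines by position, so a line present in both files but at a different place can still be reported as added (a comment below describes this happening with larger files). The sketch below illustrates this with hypothetical one-letter lines rather than the question's data, and shows that sorting both inputs first avoids it. The sed expression is written in basic regex syntax, where + is an ordinary character, and the file names are illustrative.

```shell
# Work in a scratch directory so nothing in the current directory is touched.
tmp=$(mktemp -d)
export LC_ALL=C   # assumption: byte-order collation for sort

# Hypothetical data: "c" is in both files, but at a different position.
printf '%s\n' a b c   > "$tmp/file1"
printf '%s\n' c a b d > "$tmp/file2"

# Unsorted: diff reports "c" as added at the top of file2, plus the new "d".
unsorted=$(diff -u "$tmp/file1" "$tmp/file2" | sed -n 's/^+\([^+].*\)/\1/p')
printf 'unsorted:\n%s\n' "$unsorted"

# Sorted first: only the line that is really new in file2 remains.
sort "$tmp/file1" > "$tmp/file1.sorted"
sort "$tmp/file2" > "$tmp/file2.sorted"
sorted_result=$(diff -u "$tmp/file1.sorted" "$tmp/file2.sorted" \
  | sed -n 's/^+\([^+].*\)/\1/p')
printf 'sorted:\n%s\n' "$sorted_result"

rm -rf "$tmp"
```

As with Solution 1, sorting changes the order of the output. The [^+] in the pattern is what skips the "+++" header line of the unified diff; it also means an added line that itself begins with + would be skipped.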

Author: Sak

Updated on September 18, 2022

Comments

  • Sak
    Sak over 1 year

    I have two files with the below data; I need the difference between two files.

    I tried diff, but it also shows lines that are common to the two files: (22372 Dec 4 15:36 /opt/apache-tomcat-6.0.36/webapps/new/new.txt).

    First file: (multiple data exists in the same way in file 1)

    22677 Dec 4 15:36 /opt/apache-tomcat-6.0.36/webapps/new/abc.txt
    
    22372 Dec 4 15:36 /opt/apache-tomcat-6.0.36/webapps/new/new.txt

    Second file: (multiple data exists in the same way in file 2).

    22372 Dec 4 15:36 /opt/apache-tomcat-6.0.36/webapps/new/new.txt
    
    22677 Dec 3 15:36 /opt/apache-tomcat-6.0.36/webapps/new/abc.txt
    
    12344 Dec 10 15:36 /opt/apache-tomcat-6.0.36/webapps/abc/.../test.txt

    I need the below output:

    22677 Dec 3 15:36 /opt/apache-tomcat-6.0.36/webapps/new/abc.txt
    
    12344 Dec 10 15:36 /opt/apache-tomcat-6.0.36/webapps/abc/.../test.txt
  • Sak
    Sak over 9 years
    I tried diff --suppress-common-lines file1 file2, but it still shows the common lines.
  • iyrin
    iyrin over 9 years
    Updated and tested. Is this the output you need?
  • Sak
    Sak over 9 years
    When there are 2-3 lines in each file, it works fine. I have many lines in both files (around 4000), and in that case it does not work. The same line exists in both files but not in the same order, and I need to exclude such lines from the output.
  • Sak
    Sak over 9 years
    Thanks... grep -xvFf and comm -23 <(sort file1.txt) <(sort file2.txt) resolves the issue
  • alexises
    alexises over 9 years
    If lines can appear in a different order, sort the two files using sort before applying the above command.
  • Mustapha-Belkacim
    Mustapha-Belkacim over 3 years
    from comm man page: -1 suppress column 1 (lines unique to FILE1) -2 suppress column 2 (lines unique to FILE2) -3 suppress column 3 (lines that appear in both files)