Compare two file columns


Solution 1

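To use join, both input files must be sorted on the join field. The files in the question are not sorted, which explains the errors.

The following command should do the trick. The <(...) process substitution requires a shell such as bash, ksh, or zsh. The header line is treated as ordinary data, so it is dropped from the output because Id appears in both files: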

join -v1 <(sort file1.txt) <(sort file2.txt)
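
If the header line should survive, one option is to sort only the data rows and print the header separately. The following is a minimal sketch rather than a definitive recipe: it assumes each file has exactly one header line, recreates the sample files from the question in the current directory, and uses the hypothetical scratch files file1.sorted and file2.sorted instead of process substitution so that it also runs under plain sh.

```shell
# Recreate the sample files from the question (one header line plus data rows).
cat > file1.txt <<'EOF'
Id    leng  sal   mon
25671 34343 56565 5565
44888 56565 45554 6868
23343 23423 26226 6224
77765 88688 87464 6848
66776 23343 63463 4534
EOF
cat > file2.txt <<'EOF'
Id    number
25671 34343
76767 34234
23343 23423
66776 23343
EOF

# Use one collation order for both sort and join.
export LC_ALL=C

# Sort only the data rows (everything after line 1) on the join field.
tail -n +2 file1.txt | sort -k1,1 > file1.sorted
tail -n +2 file2.txt | sort -k1,1 > file2.sorted

# Print the header first, then the file1 rows whose Id has no partner in file2.
{
  head -n 1 file1.txt
  join -v 1 file1.sorted file2.sorted
} > output.txt

cat output.txt
```

Note that join writes its fields separated by a single space, so any column alignment in the data rows is not preserved.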

Solution 2

You can do it like this, although the output does not include the header line:

$ awk '{print $1}' file2.txt | grep -vf - file1.txt
44888 56565 45554 6868
77765 88688 87464 6848

Note: I adjusted this to match the example output, not your description. If you want it the other way around, just swap file1.txt and file2.txt.

Breaking this down:

  • awk prints just field 1 from file2.txt
  • grep -v inverts the match (prints non-matching lines)
  • -f - tells grep to read the list of match patterns from a file, in this case - (STDIN), which was piped in from awk
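
One caveat with the grep approach: the patterns can match anywhere on a line, so an Id from file2.txt that also occurs in another column of file1.txt (or as part of a longer number) would remove that line as well. A possible alternative, sketched here under the assumption that each file has exactly one header line, is to let awk compare only field 1 and print the header explicitly. The array name seen is arbitrary, and the sample files are recreated in the current directory.

```shell
# Recreate the sample files from the question.
cat > file1.txt <<'EOF'
Id    leng  sal   mon
25671 34343 56565 5565
44888 56565 45554 6868
23343 23423 26226 6224
77765 88688 87464 6848
66776 23343 63463 4534
EOF
cat > file2.txt <<'EOF'
Id    number
25671 34343
76767 34234
23343 23423
66776 23343
EOF

# First pass (file2.txt, where NR == FNR): remember every value of field 1.
# Second pass (file1.txt): print the header line, plus every row whose
# field 1 was not seen in file2.txt.
awk 'NR == FNR { seen[$1]; next }
     FNR == 1 || !($1 in seen)' file2.txt file1.txt > output.txt

cat output.txt
```

Because awk prints the matching lines unchanged, the original spacing and row order of file1.txt are kept, and no sorting is needed.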

Author by

Baraskar Sandeep

Updated on September 18, 2022

Comments

  • Baraskar Sandeep
    Baraskar Sandeep over 1 year

    I have long text files with space-delimited fields:

    cat file1.txt
    Id    leng  sal   mon
    25671 34343 56565 5565
    44888 56565 45554 6868
    23343 23423 26226 6224
    77765 88688 87464 6848
    66776 23343 63463 4534
    
    cat file2.txt
    Id    number
    25671 34343 
    76767 34234 
    23343 23423 
    66776 23343 
    
    cat output.txt
    Id    leng  sal   mon
    44888 56565 45554 6868
    77765 88688 87464 6848
    

    file1.txt has four columns and file2.txt has two columns. I want to compare the 1st column ($1) of both files (file1.txt, file2.txt) and output the lines of file1.txt that have no match in file2.txt.

    I have tried

    join -v1 file1.txt file2.txt >output.txt
    

    But the output has some errors. Any awk/sed command is appreciated.

  • Baraskar Sandeep
    Baraskar Sandeep over 11 years
    Does the join command work for more than two files? For example, I want to compare the 1st column ($1) in all four files (file1.txt, file2.txt, file3.txt & file4.txt) and output the lines that did not match with file1.txt.
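
join itself takes exactly two input files, so more files would need chained join calls or a different tool. The sketch below is one possible reading of the request, not a confirmed answer: it assumes the goal is the lines of file1.txt whose Id appears in none of the other files. The contents of file3.txt and file4.txt are made up for illustration, and the variable name ref is arbitrary.

```shell
# Sample data: file1.txt and file2.txt are from the question,
# file3.txt and file4.txt are invented for this example.
cat > file1.txt <<'EOF'
Id    leng  sal   mon
25671 34343 56565 5565
44888 56565 45554 6868
23343 23423 26226 6224
77765 88688 87464 6848
66776 23343 63463 4534
EOF
cat > file2.txt <<'EOF'
Id    number
25671 34343
76767 34234
23343 23423
66776 23343
EOF
cat > file3.txt <<'EOF'
Id    x
44888 1
EOF
cat > file4.txt <<'EOF'
Id    y
99999 2
EOF

# Read the other files first and remember every value of field 1.
# file1.txt is listed last: print its header, plus each row whose
# field 1 was not seen in any of the other files.
awk -v ref="file1.txt" '
    FILENAME != ref { seen[$1]; next }
    FNR == 1 || !($1 in seen)
' file2.txt file3.txt file4.txt file1.txt > output.txt

cat output.txt
```

If the intent is the reverse (lines of the other files that are missing from file1.txt), the same pattern applies with file1.txt read first.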