Intersection of two files according to the first column

You want join (1), I guess:

For each pair of input lines with identical join fields, write a line to standard output. The default join field is the first, delimited by whitespace. When FILE1 or FILE2 (not both) is -, read standard input.

[0 1075 12:50:10] ~/temp/sx % join A B
1 kfjk 3243424
3 iefjk 21493402
8 kfkdlkf 309834
join: file 1 is not in sorted order

OK, so apparently you need to combine this with sort (1). join expects its input in the order that sort produces by default, which is alphabetical rather than numerical (so 20 sorts before 3).
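To make the ordering difference concrete, here is a small sketch that sorts the sequence numbers from file A both ways. Setting LC_ALL=C is my assumption, used to pin down the collation order; the variable names are only for illustration.

```shell
# Default sort compares character by character, so "20" lands before "3".
lexical=$(printf '%s\n' 1 3 8 9 20 | LC_ALL=C sort | tr '\n' ' ')

# sort -n compares the values as numbers instead.
numeric=$(printf '%s\n' 1 3 8 9 20 | sort -n | tr '\n' ' ')

echo "lexical: $lexical"
echo "numeric: $numeric"
```

The first order is the one join expects, which is why the unsorted file A above triggers the "not in sorted order" complaint once it reaches 20.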

join <(sort A) <(sort B) works for me. It looks weird, but it is process substitution, which is available in bash and zsh, though not in plain POSIX sh. If your shell lacks it, there's no harm in doing

sort A > A.tmp; sort B > B.tmp; join A.tmp B.tmp

(As usual, check the man pages for pitfalls.)
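For anyone who wants to try this end to end, the following is a minimal sketch that recreates the sample files from the question in a scratch directory and runs the temp-file variant. The scratch directory, the exported LC_ALL=C (so that sort and join agree on what "sorted" means) and the variable names are my assumptions, not part of the original commands.

```shell
# Work in a scratch directory so the sample files do not clobber anything.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Sample data copied from the question.
printf '%s\n' 1 3 8 9 20 > A
printf '%s\n' '1 kfjk 3243424' '2 fkdkf 23543592' '3 iefjk 21493402' \
              '7 dlafdl 23435231' '8 kfkdlkf 309834' > B

# Use one collation order for both tools (assumption: plain byte order is fine here).
export LC_ALL=C

# Sort both inputs, then join on the first column (the default join field).
sort A > A.tmp
sort B > B.tmp
result=$(join A.tmp B.tmp)
printf '%s\n' "$result"
```

With the sample data this should print the lines for sequence numbers 1, 3 and 8, the same three lines as in the transcript above, and no complaint about sort order.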

Author: wenzi

Updated on September 18, 2022

Comments

  • wenzi
    wenzi over 1 year

    I have two files. File A contains sequence numbers. File B has many columns, and its first column is a sequence number. I want to get a file with all the lines of B whose sequence number appears in A. How can I achieve this? Thanks.

    For example, file A

    1
    3
    8
    9
    20
    

    file B

    1 kfjk 3243424
    2 fkdkf 23543592
    3 iefjk 21493402
    7 dlafdl 23435231
    8 kfkdlkf 309834
    
    • Gilles Quenot
      Gilles Quenot over 11 years
      Better provide sample input & output.
  • Will Vousden
    Will Vousden about 8 years
    Process substitution (join <(sort A) <(sort B)) works just fine in bash :-)