List the differences and overlap between two plain data sets
Solution 1
Use the comm command.
If your lists are in files listA and listB:
comm listA listB
By default, comm will return 3 columns. Items only in listA, items only in listB, and items common to both lists.
You can suppress individual columns with the -1, -2, or -3 option.
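As a minimal sketch of how the column options answer the question, the script below builds two small sample files (the file names and values are made up for illustration), sorts them because comm expects sorted input, and then suppresses the columns that are not wanted. The variable names only_in_a and in_both are hypothetical.

```shell
#!/bin/sh
# Hypothetical sample data in a scratch directory.
tmp=$(mktemp -d)
printf '%s\n' 12345 23456 67891 > "$tmp/A"
printf '%s\n' 23456 99999 > "$tmp/B"

# comm expects sorted input; -u also drops duplicate lines.
sort -u "$tmp/A" > "$tmp/A.sorted"
sort -u "$tmp/B" > "$tmp/B.sorted"

# -2 and -3 suppress "only in B" and "in both": what is left is "only in A".
only_in_a=$(comm -23 "$tmp/A.sorted" "$tmp/B.sorted")

# -1 and -2 suppress both "only in ..." columns: what is left is "in both".
in_both=$(comm -12 "$tmp/A.sorted" "$tmp/B.sorted")

echo "Only in A:"
echo "$only_in_a"
echo "In both:"
echo "$in_both"

rm -rf "$tmp"
```

Combining two options, as in -23 or -12, leaves a single column, which is usually the easiest form to pipe into other commands.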
Solution 2
This will give you the unique items that exist in A but not in B:
perl -ne 'chomp; $z=$_; $y=`grep -Fx -- "$z" B`; print "$z\n" if $y eq "";' A | sort -u
This will give you the list of common items in both A and B:
cat A | xargs -I {} grep -Fx -- {} B | sort -u
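The same two results can also be sketched with a single grep call per list, which avoids starting one grep process per line. This is an alternative to the commands above, not the answerer's method. It assumes a grep that supports the POSIX options -F (fixed strings), -x (whole-line match), -v (invert), and -f (read patterns from a file); the sample values and the variable names only_in_a and in_both are made up for illustration.

```shell
#!/bin/sh
# Hypothetical sample data; B contains 2345, which is a substring of
# several lines in A and would give false matches without -x.
tmp=$(mktemp -d)
printf '%s\n' 12345 23456 67891 2345900 12345 > "$tmp/A"
printf '%s\n' 23456 2345 > "$tmp/B"

# Lines of A that match no whole line of B.
only_in_a=$(grep -Fxv -f "$tmp/B" "$tmp/A" | sort -u)

# Lines of A that match some whole line of B.
in_both=$(grep -Fx -f "$tmp/B" "$tmp/A" | sort -u)

echo "Only in A:"
echo "$only_in_a"
echo "In both:"
echo "$in_both"

rm -rf "$tmp"
```

Neither input has to be sorted here; sort -u is only used to remove duplicates from the output.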
Author: user1420706
Updated on September 18, 2022
Comments
-
user1420706 over 1 year
Possible Duplicate:
Linux tools to treat files as sets and perform set operations on them
I have two data sets, A and B. Each data set contains one number per line. For instance,
12345
23456
67891
2345900
12345
Some of the data in A are not included in data set B. How can I list all of the data that are only in A, and how can I list all of the data shared by A and B, using Linux/UNIX commands?
-
HongboZhu over 9 years
The answer assumes listA and listB are already sorted. A more general solution:
comm <(sort listA) <(sort listB)
-
Mike almost 9 years
Very simple solution. Is the comm command available in all Linux distros?