Perl: compare two files and print the matching lines


Solution 1

This isn't the cleanest way to do things, but the hard work has been done. Reverse the logic so that it prints every line unless $results{$line} == 1 or, equivalently, if $results{$line} != 1.

To add the count:

print OUTFILE "Count: $results{$line} - $line" if $results{$line} != 1;

Alternatively, you could filter out the unwanted lines with grep, avoiding the if condition entirely:

foreach my $line ( grep { $results{$_} != 1 } keys %results ) {
    print OUTFILE "Count: $results{$line} - $line";
}
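
To see the grep filter in isolation, here is a minimal, self-contained sketch. It assumes two small in-memory lists in place of the input files and prints to STDOUT instead of OUTFILE; the sample data and the helper name count_lines are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical sample data standing in for the two input files.
my @file1 = ("alpha\n", "beta\n", "gamma\n");
my @file2 = ("beta\n", "gamma\n", "delta\n");

# Build the same kind of %results hash as the original script:
# 1 for every line in the first list, incremented for every line in the second.
sub count_lines {
    my ($first, $second) = @_;
    my %results;
    $results{$_} = 1 for @$first;
    $results{$_}++   for @$second;
    return %results;
}

my %results = count_lines(\@file1, \@file2);

# grep keeps only the keys whose count is not 1.
my @matched = grep { $results{$_} != 1 } keys %results;

foreach my $line (sort @matched) {
    print "Count: $results{$line} - $line";
}
```

With this sample data, only the beta and gamma lines pass the filter, each with a count of 2.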

Solution 2

print OUTFILE $line if $results{$line} == 1;

This will print lines that occur only one time.

print OUTFILE $line if $results{$line} > 1;

With one small change (== to >), it now prints lines that occur more than once. Those should be the lines that the two files have in common.

Oh, also if you want the count, simply do:

if ( $results{$line} > 1 ) {
    print OUTFILE "$results{$line}: ", $line;
}

Here is a more concise and more flexible version. It takes optional filenames and prints to STDOUT.

Because the defaults are applied with ||, passing 0 in place of one of the names selects the default path for that argument, so you can compare one of the default files against a file of your choice. Use shell redirection to save the output to a file.

Usage:

$ script.pl file1.txt file2.txt > outfile.txt

Code:

use strict;
use warnings;
use autodie;

my $f1 = shift || "/opt/test.txt";
my $f2 = shift || "/opt/test1.txt";
my %results;
open my $file1, '<', $f1;
while (my $line = <$file1>) { $results{$line} = 1 }
open my $file2, '<', $f2;
while (my $line = <$file2>) { $results{$line}++ }
foreach my $line (sort { $results{$b} <=> $results{$a} } keys %results) {
    print "$results{$line}: ", $line if $results{$line} > 1;
}
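
One caveat with the single-hash approach: a line that appears twice in the second file but never in the first also ends up with a count of 2, so it is reported as a match. If that matters for your data, a possible variation is to count each file in its own hash and report only the lines present in both. This is a minimal sketch, not part of the original answer; the helper name common_lines and the sample data are hypothetical.

```perl
use strict;
use warnings;

# Return a hash of line => [count in first list, count in second list]
# for the lines present in both lists.
sub common_lines {
    my ($first, $second) = @_;
    my (%seen1, %seen2);
    $seen1{$_}++ for @$first;
    $seen2{$_}++ for @$second;
    my %common;
    for my $line (keys %seen1) {
        $common{$line} = [ $seen1{$line}, $seen2{$line} ]
            if exists $seen2{$line};
    }
    return %common;
}

# Sample data: "delta" occurs twice, but only in the second list,
# so it is not treated as a match.
my @file1 = ("alpha\n", "beta\n");
my @file2 = ("beta\n", "delta\n", "delta\n");

my %common = common_lines(\@file1, \@file2);
for my $line (sort keys %common) {
    my ($n1, $n2) = @{ $common{$line} };
    print "$n1/$n2: $line";
}
```

With real files, the two lists could be filled by reading each filehandle in list context, as long as the files fit in memory.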

Solution 3

Try Test::Differences. Its documentation has a code sample and shows what the output looks like:

http://metacpan.org/pod/Test::Differences

Author by

eli

Updated on April 07, 2020

Comments

  • eli
    eli about 4 years

    I have this script, which compares two files and prints out the lines that differ. I now want to change it so that, instead of printing the differing lines, it prints the matching lines, and also counts how many times each line has matched every time the script runs. Could anyone give me a suggestion? Thanks!

    #! /usr/local/bin/perl
    # compare
    my $f1 = "/opt/test.txt";
    my $f2 = "/opt/test1.txt";
    my $outfile = "/opt/final_result.txt";
    my %results = ();
    open FILE1, "$f1" or die "Could not open file: $! \n";
    while (my $line = <FILE1>) {
        $results{$line} = 1;
    }
    close(FILE1);
    open FILE2, "$f2" or die "Could not open file: $! \n";
    while (my $line = <FILE2>) {
        $results{$line}++;
    }
    close(FILE2);
    open (OUTFILE, ">$outfile") or die "Cannot open $outfile for writing \n";
    foreach my $line (keys %results) {
        print OUTFILE $line if $results{$line} == 1;
    }
    close OUTFILE;
    
  • eli
    eli over 12 years
    Thank you very much! My primary objective is met. My secondary objective is that a matched device should be removed from the file, so I want the counter to tell me how many times each device has matched. For example, the script runs every week, so the counter should go up by 1 on every run. After 4 weeks, a '4' next to a line means the device has been out there for 4 weeks; if the second line has matched 3 times, that device has been there for 3 weeks, and so on. In short, my objective is to know for how many weeks each device has matched.
  • eli
    eli over 12 years
    Thanks a lot. Your answer also fulfills my primary objective, but not the second one. I think I wasn't clear enough, sorry about that. I want the counter to tell me how many times each device has matched. For example, the script runs every week, so the counter should go up by 1 on every run. After 4 weeks, a '4' next to a line means the device has been out there for 4 weeks; if the second line has matched 3 times, that device has been there for 3 weeks, and so on. In short, my objective is to know for how many weeks each device has matched.
  • TLP
    TLP over 12 years
    I'm not quite sure what you are asking here and how it is different from what you already have. It is generally speaking better to ask for all the objectives at once here at StackOverflow, rather than trying to puzzle together the solution piece by piece. I think what you are asking for requires a new question, preferably with some sample input/output.
  • eli
    eli over 12 years
    The counter in your solution shows how many items matched; my objective is a counter for how long each item has matched. The counter in your solution shows "2" even though I ran the script 10 times. I expected it to show "10", since the script ran 10 times and matched the current list each time. Sorry about the confusion, but this was my original objective; I have not added a new one. Also, English is my third language, so please take that into consideration!
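
The per-run counter described in the comments above would need to remember counts between runs, which none of the scripts on this page do. The following is a minimal sketch of one possible reading of that request, assuming the counts are kept in a tab-separated state file. The file name match_counts.tsv, the helper names update_counts, load_counts and save_counts, the sample device names, and the choice to drop lines that stop matching are all assumptions, not part of the original answers.

```perl
use strict;
use warnings;

# Given the previous run's counts (hash ref of line => runs matched) and the
# lines matched in this run, return the new counts. Lines that no longer
# match are dropped, so a line that reappears later starts again at 1.
sub update_counts {
    my ($previous, $matched) = @_;
    my %current;
    $current{$_} = ($previous->{$_} // 0) + 1 for @$matched;
    return %current;
}

# Load counts saved by an earlier run; a missing file means a first run.
sub load_counts {
    my ($path) = @_;
    my %counts;
    return %counts unless -e $path;
    open my $in, '<', $path or die "Cannot open $path: $!";
    while (my $record = <$in>) {
        chomp $record;
        my ($count, $line) = split /\t/, $record, 2;
        $counts{$line} = $count;
    }
    close $in;
    return %counts;
}

# Save counts as "count<TAB>line" records, one per line.
sub save_counts {
    my ($path, %counts) = @_;
    open my $out, '>', $path or die "Cannot open $path for writing: $!";
    print {$out} "$counts{$_}\t$_\n" for sort keys %counts;
    close $out;
}

my $state_file = "match_counts.tsv";         # hypothetical state file
my @matched    = ("device-a", "device-b");   # chomped lines matched this run

my %previous = load_counts($state_file);
my %counts   = update_counts(\%previous, \@matched);
save_counts($state_file, %counts);

print "$counts{$_}: $_\n" for sort keys %counts;
```

In a real script, @matched would be filled with the chomped matching lines from the comparison step. Each run then adds 1 to the count of every line that still matches, so a script run weekly would show the number of consecutive weeks each line has matched.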