Compact way of getting file checksum in Perl

14,665

Solution 1

Here are three different ways depending on which modules you have available:

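use Digest::MD5 qw(md5_hex);

# Option 1: File::Slurp (CPAN)
use File::Slurp;
print md5_hex(read_file("filename")), "\n";

# Option 2: IO::All (CPAN)
use IO::All;
print md5_hex(io("filename")->all), "\n";

# Option 3: IO::File (core); local $/ makes getline return the whole file
use IO::File;
print md5_hex(do { local $/; IO::File->new("filename")->getline }), "\n";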

Not quite a one-liner each, but pretty close.

Replace Digest::MD5 with the module for any other hash algorithm you want, e.g. Digest::SHA for SHA-1.

IO::File is in core and should be available everywhere, but that's the solution I personally dislike the most. Anyway, it works.
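As a sketch of the "swap in another algorithm" suggestion using only core modules: Digest::SHA's addfile method accepts a file name directly, and its "b" mode reads the file in binary. The helper name file_sha_hex and the throwaway temp file are made up for illustration; they are not part of the original answer.

```perl
use strict;
use warnings;
use Digest::SHA;
use File::Temp qw(tempfile);

# Hypothetical helper: hex digest of a file, read in binary mode.
# $bits selects the SHA variant (1, 256, 512, ...).
sub file_sha_hex {
    my ($path, $bits) = @_;
    # addfile() takes a file name here; mode "b" means binary mode.
    return Digest::SHA->new($bits)->addfile($path, "b")->hexdigest;
}

# Demo on a throwaway file containing the three bytes "abc".
my ($tmp_fh, $tmp_name) = tempfile(UNLINK => 1);
binmode $tmp_fh;
print {$tmp_fh} "abc";
close $tmp_fh or die "Cannot close $tmp_name: $!";

print file_sha_hex($tmp_name, 1), "\n";
```

Once the helper is defined, the call itself is a one-liner, and no explicit file handle is needed in the calling code.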

Solution 2

I couldn't make any of the above work for me on Windows; I always got an incorrect MD5. I suspected differences in line breaks, but converting the file to DOS or Unix line endings made no difference. The same code with the same file gave me the right answer on Linux and the wrong one on Windows. After reading the documentation, I finally found something that works on both Windows and Linux:

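use Digest::MD5;
open(my $fh, '<', 'myfile.txt') or die "Cannot open myfile.txt: $!";
binmode($fh);    # read raw bytes, with no line-ending translation
print Digest::MD5->new->addfile($fh)->hexdigest, "\n";
close($fh);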

I hope this helps other people having difficulty on Windows. I find it odd that I couldn't find any mention of this problem on Windows.
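The binmode and addfile steps above can be wrapped in a small sub so that each call site stays on one line. This is only a sketch: the name file_md5_hex and the temp-file demo are assumptions for illustration, not part of the original answer.

```perl
use strict;
use warnings;
use Digest::MD5;
use File::Temp qw(tempfile);

# Hypothetical helper wrapping the open + binmode + addfile steps.
sub file_md5_hex {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "Cannot open $path: $!";
    binmode($fh);    # raw bytes, so Windows and Unix hash the same data
    my $digest = Digest::MD5->new->addfile($fh)->hexdigest;
    close($fh);
    return $digest;
}

# Demo on a throwaway file containing the three bytes "abc".
my ($tmp_fh, $tmp_name) = tempfile(UNLINK => 1);
binmode $tmp_fh;
print {$tmp_fh} "abc";
close $tmp_fh or die "Cannot close $tmp_name: $!";

print file_md5_hex($tmp_name), "\n";
```

Because addfile reads from the handle in chunks, this form should also avoid loading a large file into memory all at once, unlike the slurp-based variants.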

Solution 3

This also works:

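use Digest::MD5 qw(md5_base64);
...
            open(my $handle, "<", $dirItemPath) or die "Cannot open $dirItemPath: $!";
            binmode($handle);    # raw bytes, so the digest matches across platforms
            my $cksum = md5_base64(<$handle>);    # list context: reads every line
            close($handle);
            print "\nFile checksum = ".$cksum;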
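Why this works: in list context the readline operator returns every line of the file, and the functional Digest::MD5 interface hashes the concatenation of all its arguments. A minimal sketch of that equivalence, using made-up sample lines in place of a real file:

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_base64);

# Made-up sample data standing in for the lines read from a file.
my @lines = ("first line\n", "second line\n");

# Passing a list hashes the concatenation of its elements...
my $from_list = md5_base64(@lines);
# ...so it matches hashing the joined string.
my $from_string = md5_base64(join '', @lines);

print "digests match\n" if $from_list eq $from_string;
print "length: ", length($from_list), "\n";
```

Note that md5_base64 returns a 22-character string without trailing "=" padding, so it will not compare equal to output from tools that pad their Base64.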
Author by

amphibient

Software Engineer with table manners

Updated on June 04, 2022

Comments

  • amphibient
    amphibient almost 2 years

    I am looking for ways to get file checksums in Perl but not by executing the system command cksum -- would like to do it in Perl itself because the script needs to be portable between UNIX and Windows. cksum <FILENAME> | awk '{ print $1 }' works on UNIX but obviously not in Windows. I have explored MD5 but it seems like getting a file handle is necessary and generally it doesn't seem like a very compact way to get that data (one-liner preferable).

    Is there a better way?