Getting a File's MD5 Checksum in Java

472,382

Solution 1

There's an input stream decorator, java.security.DigestInputStream, so that you can compute the digest while using the input stream as you normally would, instead of having to make an extra pass over the data.

MessageDigest md = MessageDigest.getInstance("MD5");
try (InputStream is = Files.newInputStream(Paths.get("file.txt"));
     DigestInputStream dis = new DigestInputStream(is, md)) 
{
  /* Read decorated stream (dis) to EOF as normal... */
}
byte[] digest = md.digest();

Solution 2

Use DigestUtils from Apache Commons Codec library:

try (InputStream is = Files.newInputStream(Paths.get("file.zip"))) {
    String md5 = org.apache.commons.codec.digest.DigestUtils.md5Hex(is);
}

Solution 3

There's an example at Real's Java-How-to using the MessageDigest class.

Check that page for examples using CRC32 and SHA-1 as well.

import java.io.*;
import java.security.MessageDigest;

public class MD5Checksum {

   public static byte[] createChecksum(String filename) throws Exception {
       InputStream fis =  new FileInputStream(filename);

       byte[] buffer = new byte[1024];
       MessageDigest complete = MessageDigest.getInstance("MD5");
       int numRead;

       do {
           numRead = fis.read(buffer);
           if (numRead > 0) {
               complete.update(buffer, 0, numRead);
           }
       } while (numRead != -1);

       fis.close();
       return complete.digest();
   }

   // see this How-to for a faster way to convert
   // a byte array to a HEX string
   public static String getMD5Checksum(String filename) throws Exception {
       byte[] b = createChecksum(filename);
       String result = "";

       for (int i=0; i < b.length; i++) {
           result += Integer.toString( ( b[i] & 0xff ) + 0x100, 16).substring( 1 );
       }
       return result;
   }

   public static void main(String args[]) {
       try {
           System.out.println(getMD5Checksum("apache-tomcat-5.5.17.exe"));
           // output :
           //  0bb2827c5eacf570b6064e24e0e6653b
           // ref :
           //  http://www.apache.org/dist/
           //          tomcat/tomcat-5/v5.5.17/bin
           //              /apache-tomcat-5.5.17.exe.MD5
           //  0bb2827c5eacf570b6064e24e0e6653b *apache-tomcat-5.5.17.exe
       }
       catch (Exception e) {
           e.printStackTrace();
       }
   }
}

Solution 4

The com.google.common.hash API offers:

  • A unified user-friendly API for all hash functions
  • Seedable 32- and 128-bit implementations of murmur3
  • md5(), sha1(), sha256(), sha512() adapters, change only one line of code to switch between these, and murmur.
  • goodFastHash(int bits), for when you don't care what algorithm you use
  • General utilities for HashCode instances, like combineOrdered / combineUnordered

Read the User Guide (IO Explained, Hashing Explained).

For your use-case Files.hash() computes and returns the digest value for a file.

For example a digest calculation (change SHA-1 to MD5 to get MD5 digest)

HashCode hc = Files.asByteSource(file).hash(Hashing.sha1());
"SHA-1: " + hc.toString();

Note that is much faster than , so use if you do not need a cryptographically secure checksum. Note also that should not be used to store passwords and the like since it is to easy to brute force, for passwords use , or instead.

For long term protection with hashes a Merkle signature scheme adds to the security and The Post Quantum Cryptography Study Group sponsored by the European Commission has recommended use of this cryptography for long term protection against quantum computers (ref).

Note that has a higher collision rate than the others.

Solution 5

Using nio2 (Java 7+) and no external libraries:

byte[] b = Files.readAllBytes(Paths.get("/path/to/file"));
byte[] hash = MessageDigest.getInstance("MD5").digest(b);

To compare the result with an expected checksum:

String expected = "2252290BC44BEAD16AA1BF89948472E8";
String actual = DatatypeConverter.printHexBinary(hash);
System.out.println(expected.equalsIgnoreCase(actual) ? "MATCH" : "NO MATCH");
Share:
472,382
Jack
Author by

Jack

I like emacs and open source stuff.

Updated on July 08, 2022

Comments

  • Jack
    Jack almost 2 years

    I am looking to use Java to get the MD5 checksum of a file. I was really surprised but I haven't been able to find anything that shows how to get the MD5 checksum of a file.

    How is it done?