C# Create a hash for a byte array or image

Solution 1

There are plenty of hashsum providers in .NET that create cryptographic hashes, which satisfies your condition that they be unique (for most purposes, collision-proof). They are all extremely fast, and the hashing definitely won't be the bottleneck in your app unless you're doing it a trillion times over.

Personally I like SHA1:

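using System.Linq;
using System.Security.Cryptography;
// Extension methods must live in a static class; the class name here is arbitrary.
public static class HashExtensions
{
    // Returns the SHA-1 digest of data as an uppercase hex string.
    public static string GetHashSHA1(this byte[] data)
    {
        using (var sha1 = new SHA1CryptoServiceProvider())
        {
            return string.Concat(sha1.ComputeHash(data).Select(x => x.ToString("X2")));
        }
    }
}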

Even when people say one method might be slower than another, it's all in relative terms. A program dealing with images definitely won't notice the microsecond process of generating a hashsum.

And regarding collisions, for most purposes this is also irrelevant. Even "obsolete" methods like MD5 are still highly useful in most situations. I'd only recommend against using it when the security of your system relies on preventing collisions.

Solution 2

The part of Rex M's answer about using SHA1 to generate a hash is a good one (MD5 is also a popular option). zvolkov's suggestion about not constantly creating new crypto providers is also a good one, as is the suggestion about using CRC if speed is more important than virtually guaranteed uniqueness.

However, do not use Encoding.UTF8.GetString() to convert a byte[] into a string (unless, of course, you know from context that it is valid UTF-8). For one, it will reject invalid surrogates. A method guaranteed to always give you a valid string from a byte[] is Convert.ToBase64String().
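As a rough illustration of the point above, the sketch below checks whether a byte array survives a round trip through each conversion. The class name `EncodingDemo` and its method names are made up for this example; only the `Convert` and `Encoding.UTF8` calls are framework APIs.

```csharp
using System;
using System.Linq;
using System.Text;

// Hypothetical helper class, for illustration only.
public static class EncodingDemo
{
    // True when the bytes come back unchanged after a Base64 round trip.
    public static bool RoundTripsViaBase64(byte[] data)
    {
        string text = Convert.ToBase64String(data);
        return Convert.FromBase64String(text).SequenceEqual(data);
    }

    // True when the bytes come back unchanged after decoding them as UTF-8
    // and encoding the resulting string again.
    public static bool RoundTripsViaUtf8(byte[] data)
    {
        string text = Encoding.UTF8.GetString(data);
        return Encoding.UTF8.GetBytes(text).SequenceEqual(data);
    }
}
```

For an input such as `new byte[] { 0xFF, 0xFE }`, which is not valid UTF-8, the Base64 check returns true while the UTF-8 check returns false, because the decoder substitutes replacement characters for the invalid bytes. Hash output is arbitrary binary data, so it can easily contain sequences like this.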

Solution 3

Creating a new instance of SHA1CryptoServiceProvider every time you need to compute a hash is NOT fast at all. Using the same instance is pretty fast.
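One way to reuse a single instance might look like the sketch below. The `ReusableHasher` class is hypothetical, and it uses `SHA1.Create()` rather than constructing `SHA1CryptoServiceProvider` directly, which is an assumption on my part and not something this answer prescribes. The lock is there because instance members of hash algorithm classes are not guaranteed to be thread-safe.

```csharp
using System;
using System.Security.Cryptography;

// Hypothetical helper: keeps one SHA-1 instance alive across many calls.
public sealed class ReusableHasher : IDisposable
{
    private readonly SHA1 _sha1 = SHA1.Create();
    private readonly object _gate = new object();

    // ComputeHash resets the algorithm's state after each call, so the
    // same instance can hash one buffer after another.
    public byte[] Hash(byte[] data)
    {
        // Serialize access so concurrent callers cannot corrupt the state.
        lock (_gate)
        {
            return _sha1.ComputeHash(data);
        }
    }

    public void Dispose()
    {
        _sha1.Dispose();
    }
}
```

Whether this is measurably faster than creating a provider per call depends on the runtime and workload, so it is worth timing in your own application before committing to it.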

Still, I'd rather use one of the many CRC algorithms instead of a cryptographic hash, as hash functions designed for cryptography don't work too well for very small hash sizes (32 bits), which is what you want for your GetHashCode() override (assuming that's what you want).

Check this link out for one example of computing CRC in C#: http://sanity-free.org/134/standard_crc_16_in_csharp.html

P.S. The reason you want your hash to be small (16 or 32 bits) is so you can compare them FAST (that was the whole point of having hashes, remember?). Having a hash represented by a 256-bit value encoded as a string is pretty insane in terms of performance.
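For a sense of what the CRC route could look like, here is a minimal table-driven sketch. Note that it implements the common CRC-32 variant (reflected polynomial 0xEDB88320), not the CRC-16 from the linked page, and the names `Crc32` and `ByteArrayComparer` are invented for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical CRC-32 helper (reflected polynomial 0xEDB88320).
public static class Crc32
{
    private static readonly uint[] Table = BuildTable();

    // Precomputes the CRC of every possible byte value.
    private static uint[] BuildTable()
    {
        var table = new uint[256];
        for (uint i = 0; i < 256; i++)
        {
            uint value = i;
            for (int bit = 0; bit < 8; bit++)
            {
                value = (value & 1) != 0 ? (value >> 1) ^ 0xEDB88320u : value >> 1;
            }
            table[i] = value;
        }
        return table;
    }

    // Returns the 32-bit checksum of the whole buffer.
    public static uint Compute(byte[] data)
    {
        uint crc = 0xFFFFFFFFu;
        foreach (byte b in data)
        {
            crc = (crc >> 8) ^ Table[(crc ^ b) & 0xFF];
        }
        return ~crc;
    }
}

// Hypothetical comparer: the CRC is the cheap 32-bit hash code, and a full
// byte comparison decides equality when hash codes match.
public sealed class ByteArrayComparer : IEqualityComparer<byte[]>
{
    public bool Equals(byte[] x, byte[] y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x == null || y == null) return false;
        return x.SequenceEqual(y);
    }

    public int GetHashCode(byte[] obj)
    {
        return unchecked((int)Crc32.Compute(obj));
    }
}
```

A comparer like this can be passed to `Dictionary<byte[], T>` or `HashSet<byte[]>`, so the collection compares cheap 32-bit codes first and only falls back to the full byte comparison when they match.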

Solution 4

You can use any of the standard hashing algorithms, but hashing can't technically guarantee uniqueness. A hash is designed to be a relatively small token, fast to compute, that lets you check whether one piece of data is likely the same as another. It's entirely possible for different sets of data to produce the same hash, though deliberately producing such collisions is very hard.
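Since a matching hash only means "probably the same", one possible pattern is to use the hash to find candidates and then confirm with a byte-for-byte comparison. The sketch below is an assumption about how that could be wired up: `ImageStore` and `TryAdd` are hypothetical names, and SHA-256 is an arbitrary choice of algorithm.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Security.Cryptography;

// Hypothetical in-memory store: the hash narrows the search, and the byte
// comparison settles whether two images are really identical.
public sealed class ImageStore
{
    private readonly Dictionary<string, List<byte[]>> _byHash =
        new Dictionary<string, List<byte[]>>();

    // Base64 gives a string key that is always valid, whatever the bytes.
    private static string HashOf(byte[] data)
    {
        using (var sha = SHA256.Create())
        {
            return Convert.ToBase64String(sha.ComputeHash(data));
        }
    }

    // Returns false when identical bytes are already stored.
    public bool TryAdd(byte[] image)
    {
        string key = HashOf(image);
        List<byte[]> bucket;
        if (!_byHash.TryGetValue(key, out bucket))
        {
            bucket = new List<byte[]>();
            _byHash[key] = bucket;
        }
        if (bucket.Any(existing => existing.SequenceEqual(image)))
        {
            return false;
        }
        bucket.Add(image);
        return true;
    }
}
```

With a cryptographic hash the bucket will almost always hold a single entry, so the extra comparison costs little and removes the reliance on hashes never colliding.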

All of that aside, for checking likely identity, MD5 is fairly fast. SHA is more reliable (MD5 has been broken, so it shouldn't be used for security), but it's also slower.

Author by

johnc

Updated on July 09, 2022

Comments

  • johnc

    Possible Duplicate:
    How do I generate a hashcode from a byte array in c#

    In C#, I need to create a Hash of an image to ensure it is unique in storage.

    I can easily convert it to a byte array, but I'm unsure how to proceed from there.

    Are there any classes in the .NET framework that can assist me, or is anyone aware of some efficient algorithms to create such a unique hash?