Lightweight 8 byte hash function algorithm

14,500

Solution 1

As AndrewTomazos-Fathomling said, it's impossible to do a secure hash in 64 bits, so if that's your intention then my advice is STOP, pick up a book and read about cryptographically secure hashing.

If you don't plan on using this as a secure hash and you do not care about collisions or attacks, then the answer he gave you works just fine and you can tweak the primes P1 and P2 as necessary. I will give you another alternative which allows you to do tagged hashing and mixes things up more.

// Disclaimer: I make no claims about the quality of this particular hash - it's 
// certainly not a cryptographically secure hash, nor should it *ever* be 
// construed as such. 

unsigned long long quickhash64(const char *str, unsigned long long mix = 0)
{ // set 'mix' to some value other than zero if you want a tagged hash          
    const unsigned long long mulp = 2654435789;

    mix ^= 104395301;

    while(*str)
        mix += (*str++ * mulp) ^ (mix >> 23);

    return mix ^ (mix << 37);
}

Solution 2

There is no chance to do a secure hash in 64 bits. Even SHA-1 at 160 bit is considered theoretically broken. You should use SHA2-256 if you really care about secure digital signing. If you don't care about security and just want a hash function that avoids non-adversarial collisions just use the following, it is fine:

constexpr uint64 P1 = 7;
constexpr uint64 P2 = 31;

uint64 hash = P1;
for (const char* p = s; *p != 0; p++) {
    hash = hash * P2 + *p;
}

Solution 3

I had the exact same requirement, and I settled for FNV-1A, after dismissing SIP hash (implemented by bloomberg here).

I found an FNV implementation here:

https://github.com/foonathan/string_id/blob/master/hash.hpp

which is:

constexpr uint64_t fnv_basis = 14695981039346656037ull;
constexpr uint64_t fnv_prime = 1099511628211ull;

// FNV-1a 64 bit hash of null terminated buffer
uint64_t fnv1a_hash(const char* str, uint64_t hash = fnv_basis)
{
    return *str ? fnv1a_hash(str + 1, (hash ^ *str) * fnv_prime) : hash;
}

It appears he is looping using tail recursion. And stop condition is the null byte.
(boost uses hash_range which is hash_combining each element in chain I guess.)

License is zlib and copyright is Jonathan Müller. Though I'm not convinced a oneliner can be legally licensed if it implements research by other persons (Fowler-Noll-Vo).

Share:
14,500
etuardu
Author by

etuardu

Now, alduce me to introlow myself. I'm sorry. Alself me to myduce introlow. Introme tolose mylow alself. Alme to you introself mylowduce. Excuse me a moment. #SOreadytohelp

Updated on June 08, 2022

Comments

  • etuardu
    etuardu about 2 years

    I need to extract an 8 byte digest from a variable length string so I'm looking for such an algorithm that I will implement in c/c++. That will be part of a digital signature procedure on a microcontroller, so it has to be:

    • writable in few lines of code, since the firmware has to be kept as little as possible;
    • low in resource consumption, expecially ram (preferably less than 100 bytes);
    • strong enough that changing a single character at any point of the string would change the overall digest.

    I took a look at existing algorithms such as crc64 but they seems to be too heavy for my platform.

  • ruakh
    ruakh over 11 years
    +1, though strlen is not a great variable-name in a C program. :-P
  • etuardu
    etuardu over 11 years
    Thank you for your answer, though this doesn't really fullfill my third point: mystring1 => 10000786a32ed, mystring2 => 10000786a32ee. I'd need something that can "propagate" a bit more the single character change through the hash.
  • Andrew Tomazos
    Andrew Tomazos over 11 years
    You want what is called the "avalanche effect", but ask yourself why you want this. It really only makes sense in the context of secure hashing, and with only 64bits it will never be secure against a brute force attack. You can get more flipped bits by using two larger primes for P1 and P2, but as I said there is no point.
  • Nik Bougalis
    Nik Bougalis over 11 years
    Why, exactly, do you need changes to avalanche faster? As @AndrewTomazos-Fathomling said, it is starting to sound as if you want to do "secure hashing" but with only 64-bits it's just not going to work.
  • etuardu
    etuardu over 11 years
    I liked this one better because it keeps the sensitiveness against all chars of the string, other ones that used shifts loses the influence of the first chars of the string after certain lenghts. By the way I see that it could be shortened like, for example, uint64_t mix, mulp = 2654435789; while(*str) mix^=mulp**str++;.