Compile-time (preprocessor) hashing of string

15,446

Solution 1

With C++0x, this is possible as covered by answers in #1 and #2.

In C++03 there was no compile time string processing. With the preprocessor you can't seperate the string into tokens, with templates you can't access single characters. There was however a discussion on the speculated approach using C++0x.

What you could do for C++03 is to pass the string character-wise (possible using multi-character literals):

foo = hash<3DES, str<'a','b','c'> >::result;
// or:
foo = hash<3DES, str<'abc','def'> >::result;

... or simply do it as a pre-build step.

Solution 2

While this is not a proper answer to the question, see this blog entry for an example of a hash function for strings of up to 256 characters implemented purely as a C macro:

http://lolengine.net/blog/2011/12/20/cpp-constant-string-hash

Here is the actual code from the blog:

#include <string.h>
#include <stdint.h>
#include <stdio.h>

#define H1(s,i,x)   (x*65599u+(uint8_t)s[(i)<strlen(s)?strlen(s)-1-(i):strlen(s)])
#define H4(s,i,x)   H1(s,i,H1(s,i+1,H1(s,i+2,H1(s,i+3,x))))
#define H16(s,i,x)  H4(s,i,H4(s,i+4,H4(s,i+8,H4(s,i+12,x))))
#define H64(s,i,x)  H16(s,i,H16(s,i+16,H16(s,i+32,H16(s,i+48,x))))
#define H256(s,i,x) H64(s,i,H64(s,i+64,H64(s,i+128,H64(s,i+192,x))))

#define HASH(s)    ((uint32_t)(H256(s,0,0)^(H256(s,0,0)>>16)))

If you know ahead of time that you will only use it for static strings you could replace strlen() with sizeof().

Solution 3

This can be done with Boost.MPL but it might not be the type of hash you are after.

http://arcticinteractive.com/2009/04/18/compile-time-string-hashing-boost-mpl/

Solution 4

Even if this can't be (reasonably) done with the preprocessor, if you used a string literal or declared it as static const and did not create any lasting references to it the compiler will likely go ahead and do all of the math to generate the result and omit the string in the object file if you compile with optimizations. The hardest part of this is that you can't make the code to initialize a global or static variable too complicated or the compiler will say "hey, you! Don't you know you can't use a for loop outside of a function?".

Solution 5

I stumbled across a solution using the good 'ol C++ standard (I'm not sure what version it's considered, but let's just say this solution works in Visual Studio). Here's the link: link.

Also, here's a short version of a JSHash function using the above technique. The shown one here supports up to 4 characters, though you can add as many as you want.

template<const char A = 0, const char B = 0, const char C = 0, const char D = 0>
struct cHash
{
    template<const char C, size_t hash = 1315423911>
    struct HashCalc
    {
        enum { value = (C == 0) ? hash : hash ^ ((hash << 5) + C + (hash >> 2)) };
    };

    enum { value = HashCalc<D,HashCalc<C,HashCalc<B,HashCalc<A>::value>::value>::value>::value };
};

As noted, because this is a compile time hash, you can do something like this:

namespace Section
{
    enum Enum
    {
        Player = cHash<'p','l','a','y'>::value
    };
}

It's not the most elegant solution, so I plan on doing more research in this area, however as this is the only thing I've gotten to work in VisualStudio2010 I'm a little limited as far as my current project is concerned.

Share:
15,446
Ash
Author by

Ash

Updated on June 12, 2022

Comments

  • Ash
    Ash about 2 years

    Is there any way to create a hash of string at compile time using the C/C++ preprocessor (or even template-metaprogramming)?

    e.g. UNIQUE_SALT("HelloWord", 3DES);

    The idea is that HelloWorld will not be present in the compiled binary, just a hash.

    Edit: There are many of these declarations spread over a large codebase.

  • Ash
    Ash about 14 years
    Thanks, that about as close as it gets however I guess as per Georg's answer current C++ simply isn't capable. I guess other similar questions here on SO should have alerted me to that. If I could accept two answers I'd accept yours too.
  • Ash
    Ash about 14 years
    how does "hash<3DES, str<'abc','def'> >::result" work and why can't you do "hash<3DES, str<'verylong string','even longer string'> >::result"?
  • Georg Fritzsche
    Georg Fritzsche about 14 years
    @Ashirus: These are multicharacter literals, have type int and "implementation defined value". Longer values simply wouldn't fit into an int and you still have to get the seperate character values out manually.
  • Ash
    Ash about 14 years
    That's a good point, however I need to be certain that there are no references as there are many declarations and binary size is very important.
  • Elemental
    Elemental about 14 years
    I really don't think your compiler is going to optimise a hash algorithm out of existence even if all the data is const - they are clever but not even nearly that clever.
  • leetNightshade
    leetNightshade over 12 years
    Sorry for stirring up an old post, but yes it is possible in C++. C++0X makes the following method cleaner, but with meta template programming you can do it, albeit, character by character, as in my example below. While, as you pointed out, in C++0x you can process more characters at a time.
  • Georg Fritzsche
    Georg Fritzsche over 12 years
    @leet: What i already said above is that you can't have compile-time processing of strings or string literals, but that you can process lists of characters. That's exactly what you did and my example shows.
  • leetNightshade
    leetNightshade over 12 years
    @Georg Sorry, I misinterpreted your wording, I wasn't being quite so literal; you're right, as far as you can't process a "string", you can only process individual 'c','h','a','r','s'. At the time I was thinking in a general sense, not as a programmer, that it is possible to process strings, you just have to manually break it up for compile time processing.
  • Ian Ni-Lewis
    Ian Ni-Lewis about 8 years
    This answer is simply incorrect. Compile-time string literals can be parsed by functions declared constexpr. The trick is that you can't allow them to decay into pointers; you have to keep them as references to arrays of a defined size, which usually means making them template functions so the string size can be deduced.
  • Georg Fritzsche
    Georg Fritzsche about 8 years
    @IanNi-Lewis: Have you noticed the date of the answer? Feel free to suggest edits/updates/... instead.
  • Ian Ni-Lewis
    Ian Ni-Lewis about 8 years
    @GeorgFritzsche Not sure why the date would matter. Sure, the answer is two years old, but it's four years younger than this thread: ogre3d.org/forums/viewtopic.php?f=10&t=48471. The links are broken now, but the TL;DR is that you can parse string literals using recursive templates. I posted working code in an answer below.
  • Georg Fritzsche
    Georg Fritzsche about 8 years
    @IanNi-Lewis: This discussion was pre-C++0x and is 6 years old.