Populate An Array Using Constexpr at Compile-time

19,866

Solution 1

Ignoring ALL the issues, indices to the rescue:

template<unsigned... Is> struct seq{};
template<unsigned N, unsigned... Is>
struct gen_seq : gen_seq<N-1, N-1, Is...>{};
template<unsigned... Is>
struct gen_seq<0, Is...> : seq<Is...>{};

template<unsigned... Is>
constexpr Table MagicFunction(seq<Is...>){
  return {{ whichCategory(Is)... }};
}

constexpr Table MagicFunction(){
  return MagicFunction(gen_seq<128>{});
}

Live example.

Solution 2

In C++17 ::std::array has been updated to be more constexpr friendly and you can do the same as in C++14, but without some of the scary looking hacks to get around the lack of constexpr in crucial places. Here is what the code would look like there:

#include <array>

enum Type {
    Alphabet,
    Number,
    Symbol,
    Other,
};

constexpr ::std::array<Type, 128> MagicFunction()
{
   using result_t = ::std::array<Type, 128>;
   result_t result = {Other};
   result[65] = Alphabet;
   //....
   return result;
}

const ::std::array<Type, 128> table = MagicFunction();

Again MagicFunction still needs to obey the rather loose constexpr rules. Mainly, it may not modify any global variables or use new (which implies modifying global state, namely the heap) or other such things.

Solution 3

IMHO the best way to do this is simply write a tiny setup program that will generate table for you. And then you can either throw out the setup program, or check it in alongside the generated source code.

The tricky part of this question is just a duplicate of this other one: Is it possible to create and initialize an array of values using template metaprogramming?

The trick is, it's impossible to write anything like

Type table[256] = some_expression();

at file scope, because global arrays can be initialized only with literal (source-level) initializer-lists. You can't initialize a global array with the result of a constexpr function, even if you could somehow get that function to return a std::initializer_list, which you can't because its constructor isn't declared constexpr.

So what you have to do is get the compiler to generate the array for you, by making it a static const data member of a template class. After one or two levels of metaprogramming that I'm too confused to write out, you'll bottom out in a line that looks something like

template <int... Indices>
Type DummyStruct<Indices...>::table[] = { whichCategory(Indices)... };

where Indices is a parameter-pack that looks like 0,1,2,... 254,255. You construct that parameter-pack using a recursive helper template, or maybe just using something out of Boost. And then you can write

constexpr Type (&table)[] = IndexHelperTemplate<256>::table;

...But why would you do all that, when the table is only 256 entries that will never change unless ASCII itself changes? The right way is the simplest way: precompute all 256 entries and write out the table explicitly, with no templates, constexpr, or any other magic.

Solution 4

The way to do this in C++14 looks like this:

#include <array>

enum Type {
    Alphabet,
    Number,
    Symbol,
    Other,
};

constexpr ::std::array<Type, 128> MagicFunction()
{
   using result_t = ::std::array<Type, 128>;
   result_t result = {Other};
   const result_t &fake_const_result = result;
   const_cast<result_t::reference>(fake_const_result[65]) = Alphabet;
   //....
   return result;
}

const ::std::array<Type, 128> table = MagicFunction();

No clever template hackery required any longer. Though, because C++14 didn't really undergo a thorough enough review of what did and didn't have to be constexpr in the standard library, a horrible hack involving const_cast has to be used.

And, of course, MagicFunction had better not modify any global variables or otherwise violate the constexpr rules. But those rules are pretty liberal nowadays. You can, for example, modify all the local variables you want, though passing them by reference or taking their addresses may not work out so well.

See my other answer for C++17, which allows you to drop some of the ugly-looking hacks.

Share:
19,866

Related videos on Youtube

Jimmy Lu
Author by

Jimmy Lu

Hello, I’m Jimmy Lu, a student studying software engineering at the University of Waterloo. I love reading and learning all sorts of interesting programming books and articles. My two favorite topics are low level optimization in C and template meta-programming in C++. My favorite book is The C++ Programming Language. My projects are all hosted on github and I also maintain a blog. You can also view my resume at my website.

Updated on June 04, 2022

Comments

  • Jimmy Lu
    Jimmy Lu almost 2 years

    I would like to populate an array of enum using constexpr. The content of the array follows a certain pattern.

    I have an enum separating ASCII character set into four categories.

    enum Type {
        Alphabet,
        Number,
        Symbol,
        Other,
    };
    
    constexpr Type table[128] = /* blah blah */;
    

    I would like to have an array of 128 Type. They can be in a structure. The index of the array will be corresponding to the ASCII characters and the value will be the Type of each character.

    So I can query this array to find out which category an ASCII character belongs to. Something like

    char c = RandomFunction();
    if (table[c] == Alphabet) 
        DoSomething();
    

    I would like to know if this is possible without some lengthy macro hacks.

    Currently, I initialize the table by doing the following.

    constexpr bool IsAlphabet (char c) {
        return ((c >= 0x41 && c <= 0x5A) ||
                (c >= 0x61 && c <= 0x7A));
    }
    
    constexpr bool IsNumber (char c) { /* blah blah */ }
    
    constexpr bool IsSymbol (char c) { /* blah blah */ }
    
    constexpr Type whichCategory (char c) { /* blah blah */ }
    
    constexpr Type table[128] = { INITIALIZE };
    

    where INITIALIZE is the entry point of some very lengthy macro hacks. Something like

    #define INITIALIZE INIT(0)
    #define INIT(N) INIT_##N
    #define INIT_0 whichCategory(0), INIT_1
    #define INIT_1 whichCategory(1), INIT_2
    //...
    #define INIT_127 whichCategory(127)
    

    I would like a way to populate this array or a structure containing the array without the need for this macro hack...

    Maybe something like

    struct Table {
        Type _[128];
    };
    
    constexpr Table table = MagicFunction();
    

    So, the question is how to write this MagicFunction?

    Note: I am aware of cctype and likes, this question is more of a Is this possible? rather than Is this the best way to do it?.

    Any help would be appreciated.

    Thanks,

    • Xeo
      Xeo over 11 years
      You do know that ASCII only ranges [0 .. 127]? And that char's signedness is implementation defined? Your current approach is very dangerous. Oh, and last but not least, the C++ standard doesn't demand ASCII encoding at all. It might aswell be EBCDIC.
    • Matthieu M.
      Matthieu M. over 11 years
      The good news is that because arrays can be initialized with pack expansions, what you ask for is indeed feasible. You just need to invoke the function plenty of times :p
    • Happy Green Kid Naps
      Happy Green Kid Naps over 11 years
      Most likely not possible because C++ doesn't require ASCII representation for characters. Also, in the strictest sense, ASCII character set comprises of only 128 characters.
    • Jimmy Lu
      Jimmy Lu over 11 years
      @MatthieuM. can you elaborate a bit? :) thx
    • Admin
      Admin over 11 years
      I know this does not answer your question directly, but what's wrong with using functions from cctype? cplusplus.com/reference/clibrary/cctype
    • dyp
      dyp over 11 years
      There's already something similar (though not entirely equivalent) in the Standard library: the ctype<char> facet of a locale (and its classic_table). It's not guaranteed to be ASCII (as Xeo pointed out) and is not constexpr.
    • dyp
      dyp over 11 years
      @Xeo Why is it significant here that C++ does not require any specific encoding? A char must be at least 8 bits wide (§1.7/1) and therefore can contain ASCII. I see no char literals in the OP's question. (I'm not completely sure about the signed/unsigned comparison, see §5/10 bullet 5 subbullet 3/4.)
    • Xeo
      Xeo over 11 years
      @DyP: ((c >= 0x41 && c <= 0x5A) || (c >= 0x61 && c <= 0x7A)) from IsAlphabet -- this assumes the decimal ordering that is present in ASCII. The signedness is important since OP passes literals > 127, which may map to negative chars.
    • Jimmy Lu
      Jimmy Lu over 11 years
      The content or the uses of the array here really does not matter much here. For as far as I care, it doesn't even need to be a map between categories and ASCII characters. I just would like to see an efficient implementation to populate an array, which can be used as some sort of state machine, at compile time.
    • dyp
      dyp over 11 years
      @Xeo If c has been initialized by c = 0x41; // A why shouldn't this work? I only see problems with c = 'A';.
    • Jimmy Lu
      Jimmy Lu over 11 years
      @Xeo, my questions is more of a is it possible instead of is this the best way to do it. Thanks for your pointing out the range for ASCII though. I always group ASCII and extended ASCII together but I guess not. I was also not aware that C++ does not require ASCII encoding.
    • Matthieu M.
      Matthieu M. over 11 years
      @BeyondSora: I'm sorry, I did not have the time to properly address this question yesterday. Thankfully Xeo did (despite his grumpiness :p). The indices generation combined with pack expansion is a great trick to generate all kind of initialization lists (as you can see here), and thus I was pointing out at the fact that since arrays accept initialization lists, you were golden.
    • Xeo
      Xeo over 11 years
      @DyP: With EBCDIC, 0x41 isn't mapped to any symbol at all, and letters have a strange place in the codepage. Just see here.
  • Xeo
    Xeo over 11 years
    @Steven: Added a link to the Lounge<C++> wiki entry. It basically builds a list [0 .. 127] and expands that, calling whichCategory(0), whichCategory(1), ..., whichCategory(127) and passes that as initializer arguments for list-initialization to Table._ (notice the double {} for initializing the inner array).
  • Jimmy Lu
    Jimmy Lu over 11 years
    I didn't know you can return something of this form { /*...*/ }, is this something new in C++11 or always there in the standard?
  • Xeo
    Xeo over 11 years
    @BeyondSora: New to C++11, called list-initialization (also less formal and more commonly known as uniform initialization).
  • Yola
    Yola over 7 years
    Why do you use double curly brackets?
  • Omnifarious
    Omnifarious almost 7 years
    This answer is obsolete in C++14 because C++14 allows you to build the whole array inside a constexpr function.
  • aschepler
    aschepler almost 7 years
    You can even use a for loop inside MagicFunction.
  • aschepler
    aschepler almost 7 years
    I get an error about result being declared uninitialized, though. std::array<Type, 128> result{}; does work - I guess Type{} is Alphabet.
  • Omnifarious
    Omnifarious almost 7 years
    @aschepler oops, I'll fix that.
  • tuket
    tuket over 6 years
    What does the :: at the beginning?
  • Omnifarious
    Omnifarious over 6 years
    @user1754322 - See this question: stackoverflow.com/questions/1661912/…
  • Tim Rae
    Tim Rae over 6 years
    Ummm... This does not compile with C++14. You need a constexpr reference operator[]( size_type pos ); which only exists in C++17: en.cppreference.com/w/cpp/container/array/operator_at
  • Omnifarious
    Omnifarious over 6 years
    @TimRae - I thought I'd passed the C++14 flag to the compiler when I tested that. :-/ Perhaps the standard library implementation I was using did not properly remove functions when the version was set to C++14. Oh, well. I updated my answer to reflect the need for C++17. One could, of course, do something like *(result.data() + index) = Other;, but that would be horrible.
  • Omnifarious
    Omnifarious over 6 years
    @TimRae - Oh-ho! The .data function isn't constexpr either! sigh Well, I'll update the answer with two versions, a C++14 and a C++17 version.
  • lisyarus
    lisyarus over 6 years
    NB: seq and gen_seq are in the standard library since C++14, called std::index_sequence and std::make_index_sequence.
  • Tim Rae
    Tim Rae over 6 years
    C++17 is official now and supported by gcc 7 and clang 6, so I think you could make it more prominent in your answer with an actual code block.
  • Tim Rae
    Tim Rae over 6 years
    Or even better just make a second answer
  • Omnifarious
    Omnifarious over 6 years
    @TimRae - Done. :-)
  • Ela782
    Ela782 over 5 years
    Why do you need all the const stuff? Are you not done after the line result_t result = {Other};? The lines after that also don't seem used, since what is returned is return result;, and not the fake_const_result?
  • Omnifarious
    Omnifarious over 5 years
    @Ela782 - If you notice, I have a C++17 answer that does exactly that. The problem is that the C++14 standard neglected to mark the non-const version of operator [] for std::array as constexpr. So, I can't use it inside of a constexpr function.
  • Isaac Pascual
    Isaac Pascual about 5 years
    IMHO hardcode it is bad because its important to understand the logic on that values if that was complex (in a more generic case). If there is n pregenerated factorial values who ensure my that there is no copypaste mistake nor error on some value?
  • Quuxplusone
    Quuxplusone about 5 years
    If there are n lines of complicated metaprogramming to compute the values, who ensures that there is no copy-paste mistake or bug in that code? Our goal is always to reduce the possibility of error. For something like a three-line factorial function, maybe you'd minimize the possibility of error by writing code instead of data. For OP's actual problem — a 128-byte classification table — I still feel strongly that writing data instead of code is the best way to minimize the possibility of error. (But note that constexpr programming has gotten MUCH more powerful and natural since 2012.)
  • Isaac Pascual
    Isaac Pascual about 5 years
    I prefer to see what is behind the data than hardcoded numbers.
  • Omnifarious
    Omnifarious almost 5 years
    @lisyarus - Why would you use C++14 specific features to solve a problem in a particular way when there is a much more compile-time (as in time taken to compile) and easier to understand way to solve the same problem that was also introduced in C++14?
  • lisyarus
    lisyarus almost 5 years
    @Omnifarious I'm afraid I don't quite get your question. I'm just saying that part of the proposed solution can be replaced by using analogues from the standard library, decreasing the amount of code and increasing readability.
  • Omnifarious
    Omnifarious almost 5 years
    @lisyarus - Except, if you had a version of C++ new enough to have the features you mention, you wouldn't make a solution that looked like that. Because that version of C++ has other features that make solving this problem easier.
  • lisyarus
    lisyarus almost 5 years
    @Omnifarious Look, the question was asked in 2012, and was given a good 2012-ish answer. If you feel that the answer should be updated/removed/etc, contact the author of the answer. Again, the only thing I am saying is that C++14 has std::make_index_sequence. I am not saying that this answer is still the best solution, that your answer is bad, or anything like that.
  • Omnifarious
    Omnifarious almost 5 years
    @lisyarus - I think the answer is great. It answers the question for C++11 and it's a fine answer. Your update to the answer is pointless. This answer doesn't need a tweak to use C++14 stuff in C++14. It needs to be completely re-written for C++14 to do things in a different way. So, if someone is using C++14s std::make_index_sequence to initialize a ::std::array at compile time, they're doing something wrong, C++14 has a much better way to initialize an ::std::array at compile time. But the answer itself is great. Your suggestion for an improvement is not.
  • lisyarus
    lisyarus almost 5 years
    @Omnifarious Thank you for sharing your opinion; you are absolutely free in your desire to consider my comment bad, useless, pointless, whatever. There is a special "flag" feature for comments: you may use that. However, from my perspective the comment is still a valuable one. If someone has already been using this solution, there may be no reason to rewrite it completely just for the sake of using new C++ features. On the other hand, simplifying it slightly and effortlessly may be of interest. The intent of my comment was just providing information. I now digress from further discussion.
  • Omnifarious
    Omnifarious almost 5 years
    @lisyarus - I only flag comments that are actively abusive of people. You said: If someone has already been using this solution, there may be no reason to rewrite it completely just for the sake of using new C++ features. and I think that's a very valid point. Thank you.
  • mxmlnkn
    mxmlnkn over 3 years
    I came here trying to fill a std::array<std::pair<X,Y>> like this with result[65] = std::make_pair( x,y ). This won't work, but result[65].first = x; result[65].second = y; will.