C/C++: switch for non-integers

39,273

Solution 1

Using some nasty macro and template magic it's possible to get an unrolled binary search at compiletime with pretty syntax -- but the MATCHES ("case") have to be sorted: fastmatch.h

NEWMATCH
MATCH("asd")
  some c++ code
MATCH("bqr")
  ... the buffer for the match is in _buf
MATCH("zzz")
  ...  user.YOURSTUFF 
/*ELSE 
  optional
*/
ENDMATCH(xy_match)

This will generate (roughly) a function bool xy_match(char *&_buf,T &user), so it must be at the outer level. Call it e.g. with:

xy_match("bqr",youruserdata);

And the breaks are implicit, you cannot fall-thru. It's also not heavily documented, sorry. But you'll find, that there are some more usage-possibilities, have a look. NOTE: Only tested with g++.

Update C++11:

Lambdas and initializer list make things much prettier (no macros involved!):

#include <utility>
#include <algorithm>
#include <initializer_list>

template <typename KeyType,typename FunPtrType,typename Comp>
void Switch(const KeyType &value,std::initializer_list<std::pair<const KeyType,FunPtrType>> sws,Comp comp) {
  typedef std::pair<const KeyType &,FunPtrType> KVT;
  auto cmp=[&comp](const KVT &a,const KVT &b){ return comp(a.first,b.first); };
  auto val=KVT(value,FunPtrType());
  auto r=std::lower_bound(sws.begin(),sws.end(),val,cmp);
  if ( (r!=sws.end())&&(!cmp(val,*r)) ) {
    r->second();
  } // else: not found
}

#include <string.h>
#include <stdio.h>
int main()
{
  Switch<const char *,void (*)()>("ger",{ // sorted:                      
    {"asdf",[]{ printf("0\n"); }},
    {"bde",[]{ printf("1\n"); }},
    {"ger",[]{ printf("2\n"); }}
  },[](const char *a,const char *b){ return strcmp(a,b)<0;});           
  return 0;
}

That's the idea. A more complete implementation can be found here: switch.hpp.

Update 2016: Compile time trie

My newest take on this problem uses advanced c++11 metaprogramming to generate a search-trie at compile time. Unlike the previous approaches, this will handle unsorted case-branches/strings just fine; they only have to be string-literals. G++ also allows constexpr for them, but not clang (as of HEAD 3.9.0 / trunk 274233).

In each trie node a switch-statement is utilized to harness the compiler's advanced code generator.

The full implementation is available at github: smilingthax/cttrie.

Solution 2

In C++, you can obtain O(lg n) performance by having a std::map<std::string, functionPointerType>. (In C you could implement what was essentially the same but it would be more difficult) Pull out the right function pointer using std::map<k, v>::find, and call that pointer. Of course, that's not going to be nearly as simple as a language supported switch statement. On the other hand, if you have enough items that there's going to be a huge difference between O(n) and O(lg n), that's probably an indication that you should be going for a different design in the first place.

Personally, I've always found the ELSEIF chain more readable anyway.

Solution 3

You can achieve it without using any map or unordered_map like below. Compare first character alone to identify which string. If more than one match, then you can fallback to if/else chain within that case statement. Number of comparisons will be greatly reduced if not many strings starting with same letter.

char *str = "foo";
switch(*str)
{
case 'f':
    //do something for foo
    cout<<"Foo";
    break;
case 'b':
    //do something for bar
    break;
case 'c':
    if(strcmp(str, "cat") == 0)
    {
        //do something for cat
    }
    else if(strcmp(str, "camel") == 0)
    {
        //do something for camel
    }
}

This looks to be optimal solution without any cost, even though its not standard.

Solution 4

Use an if...else block. You don't really have a compelling reason not to, aside from it not being pretty to look at, and the if...else block is the mostr straightforward solution.

Everything else requires additional code which as say say increases complexity. And it just moves the ugliness to elsewhere. But at some level, a string compare still has to happen. Now you've just covered it up with more code.

You might gain some performance increases by using a map or a hash map, but youcan also gain similar if not better gains by simply choosing a smart order to evaluate your if...else blocks. And switching to a map for performance reasons is really just premature micro-optimization.

Solution 5

In C, there are two common solutions. The first is to keep your keywords in a sorted array, say

typedef struct Keyword {
    const char *word;
    int         sub;
    int         type;
} Keyword;

Keyword keywords[] ={   /* keep sorted: binary searched */
    { "BEGIN", XBEGIN, XBEGIN },
    { "END",   XEND,   XEND },
    { "NF",    VARNF,  VARNF },
    { "atan2", FATAN,  BLTIN },
    ...
};

and do a binary search on them. The previous is straight from the source code of awk by C grandmaster Brian W. Kernighan.

The other solution, which is O(min(m, n)) if n is the length of your input string and m the length of the longest keyword, is to use a finite-state solution such as a Lex program.

Share:
39,273
peoro
Author by

peoro

Apparently, this user has nothing to say.

Updated on July 09, 2022

Comments

  • peoro
    peoro almost 2 years

    Often I need to choose what to do according to the value of a non-POD constant element, something like this:

    switch( str ) {
      case "foo": ...
      case "bar": ...
      default:    ...
    }
    

    Sadly switch can only be used with integers: error: switch quantity not an integer.

    The most trivial way to implement such thing is then to have is a sequence of ifs:

    if( str == "foo" )      ...
    else if( str == "bar" ) ...
    else                    ...
    

    But this solution looks dirty and should cost O(n), where n is the number of cases, while that piece of code could cost O(log n) at the worst case with a binary search.

    Using some data structs (like Maps) it could be possible to obtain an integer representing the string ( O(log n) ), and then use an O(1) switch, or one could implement a static binary sort by nesting ifs in the right way, but still these hacks would require a lot of coding, making everything more complex and harder to maintain.

    What's the best way to do this? (fast, clean and simple, as the switch statement is)

  • BЈовић
    BЈовић over 13 years
    +1 for suggesting std::map, but elseif is IMO bad for long list of items
  • Billy ONeal
    Billy ONeal over 13 years
    @VJo: If your list of items is long enough you should replace it with some form of polymorphic behavior anyway.
  • Billy ONeal
    Billy ONeal over 13 years
    +1 for "switching to a map for performance reasons is really just premature micro-optimization". The main reason I suggest using something like a std::map in places like this is that it helps you maintain OCP.
  • Billy ONeal
    Billy ONeal over 13 years
    +1 for codifying what I said earlier. Note that you're missing the return type on main (which is required in C++, or at least will be in C++0x, can't remember exactly atm)
  • Steve Jessop
    Steve Jessop over 13 years
    "at some level, a string compare still has to happen" - apparently the questioner is only comparing pointers, which is odd.
  • Fred Foo
    Fred Foo over 13 years
    In C you can use a sorted list and bsearch. The original Awk implementation does this for keywords. In C++, an unordered_map does it in O(n).
  • Billy ONeal
    Billy ONeal over 13 years
    I believe the OP understands all of that. The compiler could still make the switch O(lg n) w.r.t. the number of strings.
  • John Dibling
    John Dibling over 13 years
    @Steve: Really? If so, then he should be able to use a switch.
  • Billy ONeal
    Billy ONeal over 13 years
    @larsmans: That would be a good C solution. Perhaps an answer we can upvote? :) Though : unordered_map is not yet standard, which prevents me from using it in a recommendation (If I won't use it, I don't see how I could ask others to use it), and of course using bsearch requires that your strings already be sorted.
  • Steve Jessop
    Steve Jessop over 13 years
    @John: by casting to uintptr_t or some such, you mean? I'm pretty sure it's an error on the part of the questioner, actually, since there's no guarantee the "same" string literal in different places will have the same address. Just thought I'd mention that the "if/else" code in the question doesn't do string comparisons.
  • BЈовић
    BЈовић over 13 years
    @Billy Not necessarily. For example, if you want to map strings to some constant integer values. You can initialize the map using boost::assign::map_list_of
  • stefaanv
    stefaanv over 13 years
    besides, with if-else, you can still optimize a bit by starting with the most expected possibility.
  • Steve Jessop
    Steve Jessop over 13 years
    @John: oh, no, I take it all back. I missed that the question is about "C/C++", but actually not C at all. So str is a std::string, and you were right.
  • Billy ONeal
    Billy ONeal over 13 years
    +1 for an interesting library solution that closely approximates what the OP wants.
  • Steve Jessop
    Steve Jessop over 13 years
    I initially thought what you thought, but the question specifies that str is non-POD. So it's a std::string, not a pointer. In this case "C/C++" actually means "C++", not either of the more-common meanings "C or C++", and "C that also compiles as C++".
  • John Dibling
    John Dibling over 13 years
    @stefaanv: Right, that's what I meant by "youcan also gain similar if not better gains by simply choosing a smart order to evaluate your if...else blocks"
  • John Dibling
    John Dibling over 13 years
    Interestng. I wonder how this can be developed to be a more robust slution not using macros. Good starting point to a new approach.
  • smilingthax
    smilingthax over 13 years
    @John: I don't think it's possible w/o macros, as I have to use two templates (template-specializations) per MATCH to calculate the binary search at compile time. It was the closest I could come to a 'switch'. Even the unrolled-ness is unavoidable, dictated by the way template-metaprogramming works. I didn't even believe this would be possible in C++ -- until I got it to compile and saw it work.
  • Fred Foo
    Fred Foo over 13 years
    Ugh. Removing the extraneous {} would make this is a bit more readable, but I'd switch to something like Lex for this.
  • Billy ONeal
    Billy ONeal over 13 years
    @larsmans: Now if only (f)lex would support Unicode, then all would be good ;)
  • Ziffusion
    Ziffusion over 13 years
    +1 This mimics for non-integer values what the compiler does for integer values. The compiler resorts to jump tables or static binary trees to optimize case lookup (and subsequent branch). But all said, if we are going to do things at runtime to enable this abstraction, then a dictionary, would seem like, is the answer. I mean, OP should not get hung up on making the syntax look similar to a switch statement. A dictionary gives him exactly the same thing at runtime what a switch does for integers.
  • Alexandre C.
    Alexandre C. over 13 years
    Why do you say it is not standard ?
  • bjskishore123
    bjskishore123 over 13 years
    @Alexandre: I mean, I used strcmp with c-style string. If we use, std::string instead, then it would be optimal c++ soluiton.
  • Nim
    Nim over 13 years
    @larsmans, I did say it was nasty! ;) anyways, it is pseudocode, so needs to be done properly...
  • rubenvb
    rubenvb over 13 years
    This is actually a pretty nice solution... :)
  • peoro
    peoro over 13 years
    Well, this will cost O(n). Of course creating a map will cost so much more (especially if you're using it for three cases like in your example), but it will be worth it, if you've got many cases and run the switch tons of times.
  • Wade
    Wade almost 12 years
    There is a much better way without macros, in fact: generate a perfect hash function using gperf. gnu.org/software/gperf
  • Jarosław Bielawski
    Jarosław Bielawski over 10 years
    Not portable. The byte order of multi-char constants is undefined and has no link to the endianess of the platform .
  • Jarosław Bielawski
    Jarosław Bielawski over 10 years
    on x86 on windows on this implementation of the compiler. I had the case that it changed between two version of the same compiler (gcc v3 vs gcc v4 on Solaris-SPARC).
  • Billy ONeal
    Billy ONeal over 9 years
    Have you benchmarked this? It looks like this does nothing but double the number of times the first character needs to be fetched and blow your code size up like crazy.
  • Walter
    Walter almost 9 years
    Easily the best answer here.
  • blabla999
    blabla999 over 8 years
    @Billy: first: the 1st character will be in a cacheline anyway, so that fetch comes almost for free. second: how about: case 'c': if (strcmp(str+1, "at"))... if (strcmp(str+1, "amel")" if you really do care for that single fetch?
  • Billy ONeal
    Billy ONeal over 8 years
    @blabla999: No, I don't care about the single fetch. I care that you've made the code much more complex and much more likely to be buggy for little/no performance gain.
  • blabla999
    blabla999 over 8 years
    @Billy: for one: it was you who brought up the "have you benchmarked" issue. Second, it may make a difference if called many times, as it avoids the call to strcmp (for example in a heavily called inner loop). 3rd, this could be wrapped into a macro to make the code more maintainable/readable; 4th: this was more of a "sarcastic" answer to your "doubles the number of fetches" argument, than a suggestion for a production program. But fair enough - no flame war here.
  • Billy ONeal
    Billy ONeal over 8 years
    @bla: Yes, I brought up the "have you benchmarked" point because, as I said, this code is now much more difficult to read and much more likely to be buggy, and I question the claimed performance gain. 2. I don't believe you without showing this is actually true. The optimizer knows what strcmp is. 3. I doubt it. 4. I never made a "double the number of fetches" argument.
  • bjskishore123
    bjskishore123 over 8 years
    @Billy: If there are many words that start with same character, readability can be improved by comparing words in a separate function. I didn't benchmark performance aspect. In real world, few utility functions get called several million times. If the data set is huge, if many words start with same character, and if huge number of calls are required, there will be improvement in performance with this approach. However, this kind of code is not required, if performance is not critical.
  • Spongman
    Spongman about 3 years
    "premature micro-optimization" isn't the whole fact that switch is const-only a premature micro-optimization?
  • John Dibling
    John Dibling almost 3 years
    @Spongman: Not if you consider that one of the primary benefits of using a switch is code readability.