C/C++ switch case on char arrays


Solution 1

Disclaimer: Don't use this except for fun or learning purposes. For serious code, use common idioms and don't rely on compiler-specific behaviour in general; if you do so anyway, incompatible platforms should either trigger a compile-time error or fall back to the good, general code.


It seems the standard allows multi-character character constants as per the grammar. I haven't checked yet whether the following is really legal, though.

~/$ cat main.cc

#include <iostream>

#ifdef I_AM_CERTAIN_THAT_MY_PLATFORM_SUPPORTS_THIS_CRAP
int main () {
    const char *foo = "fooo";
    switch ((foo[0]<<24) | (foo[1]<<16) | (foo[2]<<8) | (foo[3]<<0)) {
    case 'fooo': std::cout << "fooo!\n";  break;
    default:     std::cout << "bwaah!\n"; break;
    };
}
#else
#error oh oh oh
#endif

~/$ g++ -Wall -Wextra main.cc  &&  ./a.out
main.cc:5:10: warning: multi-character character constant
fooo!

edit: Oh look, directly below the grammar excerpt there is 2.13.2 Character Literals, Bullet 1:

[...] An ordinary character literal that contains more than one c-char is a multicharacter literal. A multicharacter literal has type int and implementation-defined value.

But in the second bullet:

[...] The value of a wide-character literal containing multiple c-chars is implementation-defined.

So be careful.

Solution 2

Follow the exact method employed in video encoding with FourCC codes:

Set a FourCC value in C++

#include <stdint.h>  /* uint32_t */
#define FOURCC(a,b,c,d) ( ((uint32_t)(d)<<24) | ((uint32_t)(c)<<16) | ((uint32_t)(b)<<8) | (uint32_t)(a) )

Probably a good idea to use enumerated types or macros for each identifier:

enum {
    ID_SQRT = FOURCC( 's', 'q', 'r', 't'),
    ID_LOG2 = FOURCC( 'l', 'o', 'g', '2')
};

int structure_id = FOURCC( structure->id[0], 
                           structure->id[1],
                           structure->id[2],
                           structure->id[3] );
switch (structure_id) {
case ID_SQRT: ...
case ID_LOG2: ...
}
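
For reference, here is a minimal self-contained sketch of the same approach in C. It assumes `<stdint.h>` is available; the macro name `FOURCC_U32`, the `describe` function, and its return strings are made up for illustration and are not part of the answer above.

```c
#include <stdint.h>

/* Pack four characters into one 32-bit value, first character in the low byte.
   Each operand is converted to an unsigned type before shifting, which avoids
   signed overflow when a byte has its high bit set. */
#define FOURCC_U32(a, b, c, d) \
    ( ((uint32_t)(unsigned char)(d) << 24) | \
      ((uint32_t)(unsigned char)(c) << 16) | \
      ((uint32_t)(unsigned char)(b) <<  8) | \
       (uint32_t)(unsigned char)(a) )

/* Hypothetical helper: maps a 4-byte id to a description. */
static const char *describe(const char id[4]) {
    switch (FOURCC_U32(id[0], id[1], id[2], id[3])) {
    case FOURCC_U32('s', 'q', 'r', 't'): return "sqrt";
    case FOURCC_U32('l', 'o', 'g', '2'): return "log2";
    default:                             return "unknown";
    }
}
```

The case labels contain only character constants, casts to integer types, shifts and bitwise ORs, so they are integer constant expressions. The controlling expression is built from the individual bytes at run time, so the result does not depend on the host's byte order.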

Solution 3

The issue is that the case labels of a switch must be constants known at compile time. The address of a string literal isn't known at compile time: the linker knows an address, but not even the final one. I think the final, relocated address is only available at run time.

You can simplify your problem to

void f() {
    int x[*(int*)"x"];
}

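Where a constant array size is required (for example in C++ or C89; C99 would treat this as a variable-length array), this yields the same kind of error, since the value read through the address of the "x" literal is not known at compile time. This is different from e.g.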

void f() {
    int x[sizeof("x")];
}

Since the compiler knows the size of the array that the literal denotes (2 bytes: the character 'x' plus the terminating null character).
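
A small sketch of the distinction, assuming a C11 compiler for `_Static_assert`; the function name `label_for` is invented for the example. Applied to a string literal, `sizeof` gives the size of the array including the terminator, and since the result is an integer constant expression it can even serve as a case label:

```c
#include <stddef.h>

/* sizeof applied to a string literal yields the array size, terminator included. */
_Static_assert(sizeof("x") == 2, "one character plus the terminating null");

/* Hypothetical helper: sizeof expressions are valid case labels. */
static int label_for(size_t n) {
    switch (n) {
    case sizeof("x"):    return 1;  /* 2 bytes */
    case sizeof("sqrt"): return 4;  /* 5 bytes */
    default:             return 0;
    }
}
```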

Now, how to fix your problem? Two things come to my mind:

  1. Don't make the id field a string but an integer and then use a list of constants in your case statements.

  2. I suspect that you will need to do a switch like this in multiple places, so my other suggestion is: don't use a switch in the first place to execute code depending on the type of the structure. Instead, the structure could offer a function pointer which can be called to do the right printf call. At the time the struct is created, the function pointer is set to the correct function.

Here's a code sketch illustrating the second idea:

#include <stdio.h>   /* printf */
#include <string.h>  /* strcmp */

struct MyStructure {
   const char *id;
   void (*printType)(struct MyStructure *);
   void (*doThat)(struct MyStructure *, int arg1, int arg2);
   /* ... */
};

static void printSqrtType( struct MyStructure *s ) {
   (void)s; /* unused */
   printf( "its a sqrt\n" );
}

static void printLog2Type( struct MyStructure *s ) {
   (void)s; /* unused */
   printf( "its a log2\n" );
}

static void printUnreadableType( struct MyStructure *s ) {
   (void)s; /* unused */
   printf( "works somehow, but unreadable\n" );
}

/* Initializes the function pointers in the structure depending on the id. */
void setupVTable( struct MyStructure *s ) {
  if ( !strcmp( s->id, "sqrt" ) ) {
    s->printType = printSqrtType;
  } else if ( !strcmp( s->id, "log2" ) ) {
    s->printType = printLog2Type;
  } else {
    s->printType = printUnreadableType;
  }
}

With this in place, your original code can just do:

void f( struct MyStructure *s ) {
    s->printType( s );
}

That way, you centralize the type check in a single place instead of cluttering your code with a lot of switch statements.

Solution 4

I believe that the issue here is that in C, each case label in a switch statement must be an integer constant expression. From the C ISO spec, §6.8.4.2/3:

The expression of each case label shall be an integer constant expression [...]

(my emphasis)

The C spec then defines an "integer constant expression" as a constant expression where (§6.6/6):

An integer constant expression shall have integer type and shall only have operands that are integer constants, enumeration constants, character constants, sizeof expressions whose results are integer constants, and floating constants that are the immediate operands of casts. Cast operators in an integer constant expression shall only convert arithmetic types to integer types, except as part of an operand to the sizeof operator.

(my emphasis again). This suggests that you cannot typecast a string literal (a pointer) to an integer in a case label, since that cast isn't allowed in an integer constant expression.

Intuitively, the reason for this might be that on some implementations the actual location of the strings in the generated executable isn't necessarily specified until linking. Consequently, the compiler might not be able to emit very good code for the switch statement if the labels depended on a constant expression that depends indirectly on the address of those strings, since it might miss opportunities to compile jump tables, for example. This is just an example, but the more rigorous language of the spec explicitly forbids you from doing what you've described above.
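
By contrast, character constants combined with shifts and bitwise operators do form an integer constant expression, so they are allowed in case labels. The sketch below illustrates this with a hypothetical two-character tag; the function name `tag_kind` is invented for the example. Note that `+` binds more tightly than `<<`, so an expression like `'A' << 8 + 'B'` needs explicit parentheses to mean what it appears to mean.

```c
/* Hypothetical helper: classify a two-character tag. */
static int tag_kind(char hi, char lo) {
    /* Build the value from the two bytes at run time ... */
    int tag = ((unsigned char)hi << 8) | (unsigned char)lo;

    switch (tag) {
    /* ... and compare it against labels built only from character constants. */
    case ('A' << 8) | 'B': return 1;
    case ('O' << 8) | 'K': return 2;
    default:               return 0;
    }
}
```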

Hope this helps!

Solution 5

I just ended up using this macro, similar to case #3 in the question or phresnel's answer.

#define CHAR4_TO_INT32(a, b, c, d) ((((int32_t)(a))<<24) + (((int32_t)(b))<<16) + (((int32_t)(c))<<8) + (((int32_t)(d))<<0))

switch (* ((int*) &structure->id)) {
   case (CHAR4_TO_INT32('S','Q','R','T')): printf("its a sqrt!"); break;
}
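
One caveat with the cast: `*(int*)&structure->id` reads the bytes in the host's byte order, so on a little-endian machine the first character lands in the low byte, while the macro puts it in the high byte; the cast may also run into alignment or aliasing issues. The sketch below sidesteps this by assembling the value from the individual bytes. `CHAR4_TO_U32`, `load_be32` and `is_sqrt` are hypothetical names, not part of the answer above.

```c
#include <stdint.h>

/* Same idea as the macro above, with each argument parenthesized and
   converted to an unsigned type before shifting. */
#define CHAR4_TO_U32(a, b, c, d) \
    ( ((uint32_t)(unsigned char)(a) << 24) | \
      ((uint32_t)(unsigned char)(b) << 16) | \
      ((uint32_t)(unsigned char)(c) <<  8) | \
       (uint32_t)(unsigned char)(d) )

/* Hypothetical helper: the first byte ends up in the high byte on any host. */
static uint32_t load_be32(const char id[4]) {
    return CHAR4_TO_U32(id[0], id[1], id[2], id[3]);
}

/* Hypothetical helper: returns 1 if the id is "SQRT", 0 otherwise. */
static int is_sqrt(const char id[4]) {
    switch (load_be32(id)) {
    case CHAR4_TO_U32('S', 'Q', 'R', 'T'): return 1;
    default:                               return 0;
    }
}
```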

Author: i_want_to_learn

Updated on June 04, 2022

Comments

  • i_want_to_learn almost 2 years

    I have several data structures, each having a field of 4 bytes.

    Since 4 bytes equal 1 int on my platform, I want to use them in case labels:

    switch (* ((int*) &structure->id)) {
       case (* ((int*) "sqrt")): printf("its a sqrt!"); break;
       case (* ((int*) "log2")): printf("its a log2!"); break;
       case (((int) 'A')<<8 + (int) 'B'): printf("works somehow, but unreadable"); break;
       default: printf("unknown id");
    }
    

    This results in a compile error, telling me the case expression does not reduce to an int.

    How can I use char arrays of limited size, and cast them into numerical types to use in switch/case?

    • hmakholm left over Monica over 12 years
      Is this for C++ (as in the question title) or for C99 (as in the tags)? I'm not sure the answer is different between the two, but seeing two different languages in the question with no clear reason is confusing.
    • hmakholm left over Monica over 12 years
      Now there are two languages in the title, one of which is repeated in the tags. Are we to infer from this that you're not speaking about C++ after all? Why is it still in the title then?
  • jwodder over 12 years
    In C99, the exactly-32-bit signed integer type is named int32_t, not int_32.
  • templatetypedef over 12 years
    I don't think this addresses the OP's original question. Can you elaborate on this?
  • i_want_to_learn over 12 years
    how do you use unions within case expressions? so that they reduce to constant expressions? (e.g. translating "sqrt" to the corresponding int?)
  • Sebastian Mach over 12 years
    Legally, you can only read the one union value that you have written to most recently. I.e., if you write to y[...], reading from x yields undefined behaviour.
  • Sebastian Mach over 12 years
    Also: The syntax for declaring arrays in C is type name [length], not type [length] name. Further the confusion int_32 / int32_t, this is enough for a downvote from my side (I wish everyone would justify his downvote like this, btw)
  • Peter Varga over 12 years
    An example may help to understand what I meant: union { uint32_t f; char fc[4]; } x; x.f = 1; memcpy(x.fc, "abcd", 4); switch(x.f) { case 1: printf("found case %.*s\n", 4, x.fc); break; case 1684234849: printf("found by number %.*s\n", 4, x.fc); break; default: printf("nothing matched\n"); break; }
  • Sebastian Mach over 12 years
    That's undefined behaviour, @Peter.