Protocol Buffers and enum combinations?

20,890

Solution 1

In Protobufs, an enum-typed field is only allowed to have one of the exact numeric values specified in the enum. That is to say, you cannot use an enum-typed field as a bitfield. If you want a bitfield, you need to use an integer type like int32. This rule actually applies even in languages that have numeric enum types, like C++ -- if an enum-typed protobuf field read from the wire has an invalid value, it will be treated like an unknown field and thus hidden.
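To see why the OR'd value is rejected, here is a plain-Python analogy. It uses the standard `enum` module rather than generated Protobuf code, and the `MsgCodes` class and `is_declared_value` helper are written here only to mirror the values from the question. The combination ACK | APPROVE is 12, which is not one of the declared values, so a strict enum lookup fails in much the same way a parser declines to accept it as a known value:

```python
from enum import IntEnum


class MsgCodes(IntEnum):
    """Mirror of the enum values declared in the question's .proto file."""
    MSG = 1
    FILE = 2
    APPROVE = 4
    ACK = 8
    ERROR_SENDING = 16
    WORLD = 32


def is_declared_value(number: int) -> bool:
    """Return True only if `number` is exactly one of the declared values."""
    try:
        MsgCodes(number)
        return True
    except ValueError:
        return False


# OR-ing two members gives a plain int (12) that matches no declared value.
combined = MsgCodes.ACK | MsgCodes.APPROVE

print(is_declared_value(MsgCodes.ACK))  # a single declared value
print(is_declared_value(combined))      # the combination
```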

If you switch to integers, you of course now have the problem of how to declare flag values. Unfortunately Protobufs provides no good way to define constants. As you suggested in your self-answer, you can use a dummy enum definition as a hack, but note that the numeric value won't necessarily be available in all languages. It works in C++ and Python since they use numeric enums (and apparently C# too?). In Java, Protobuf enums have a .getNumber() method which you can use to get the numeric value; otherwise, normal Java enums are not numeric.
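A minimal Python sketch of the integer-plus-constants approach, assuming the field has been changed to an integer type. It uses the standard `enum.IntFlag` as a stand-in for the constants from the dummy enum; `MsgFlags`, `set_flag` and `has_flag` are names chosen here for illustration and are not part of any Protobuf API.

```python
from enum import IntFlag


class MsgFlags(IntFlag):
    """Stand-in for the dummy enum; every value is a distinct bit."""
    MSG = 1
    FILE = 2
    APPROVE = 4
    ACK = 8
    ERROR_SENDING = 16
    WORLD = 32


def set_flag(field_value: int, flag: MsgFlags) -> int:
    """Return the integer field value with `flag` switched on."""
    return int(field_value | flag)


def has_flag(field_value: int, flag: MsgFlags) -> bool:
    """Check whether `flag` is switched on in the integer field value."""
    return (field_value & flag) == flag


# The value that would be stored in the int32/uint32 field.
msg_code = 0
msg_code = set_flag(msg_code, MsgFlags.ACK)
msg_code = set_flag(msg_code, MsgFlags.APPROVE)
```

The point of the design is that the wire only ever carries an ordinary integer, so any combination of bits is valid; the enum exists purely to give the bits names.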

(Aside: I'm the author of most of Google's open source Protobuf code. I'm also the author of Cap'n Proto, a newer non-Google project aimed at replacing Protobufs. Among other advantages, Cap'n Proto supports defining constants in schema files. But, as of this writing C# support is not ready yet (though being worked on!).)

Solution 2

If you don't need to squeeze out every last inch of efficiency (hint: you probably don't), then just use an array of enum values.

message Msg {
    // ...
    enum Code
    {
        MSG = 0;
        FILE = 1;
        APPROVE = 2;
        ACK = 3;
        ERROR_SENDING = 4;
        WORLD = 5;
    }
    repeated Code codes = 5;
}

Much later edit: The official protobuf docs recommend you reserve an enum entry equal to 0 to mean something like "unknown". This is really targeted at enums that are used as non-repeated values (because in proto3 there's no difference between the 0 enum value and unset), but it is worth following for all enums. In this case, that means you'd replace the above with UNKNOWN = 0, MSG = 1, etc.
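With a repeated field, a combination is just a list of values on the receiving side. Here is a plain-Python sketch of how that list might be handled, assuming it has already been read from the message. The `Code` class mirrors the enum above with the `UNKNOWN = 0` convention applied, and `normalize_codes` is a hypothetical helper, not part of any Protobuf API. A repeated field can carry duplicates, so converting to a set is one way to get flag-like semantics:

```python
from enum import IntEnum


class Code(IntEnum):
    """Mirror of the enum above, renumbered so that 0 means UNKNOWN."""
    UNKNOWN = 0
    MSG = 1
    FILE = 2
    APPROVE = 3
    ACK = 4
    ERROR_SENDING = 5
    WORLD = 6


def normalize_codes(raw_codes):
    """Collapse a repeated enum field into a set, dropping UNKNOWN and duplicates."""
    return {Code(number) for number in raw_codes if number != Code.UNKNOWN}


# Example contents of the repeated field: ACK twice, APPROVE once, one UNKNOWN.
codes = normalize_codes([4, 3, 4, 0])
is_acked = Code.ACK in codes
```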

Solution 3

You can use a message instead of an enum, with one bool field for each flag you need.

Here's an example for a simple Alarm Clock schema where it can be set for multiple days in the week:

message Alarm {
    uint32 hour = 1;
    uint32 minute = 2;
    bool repeat = 3;
    DaysOfWeek daysOfWeek = 4;
    message DaysOfWeek {
        bool sunday = 1;
        bool monday = 2;
        bool tuesday = 3;
        bool wednesday = 4;
        bool thursday = 5;
        bool friday = 6;
        bool saturday = 7;
    }
}
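To get a feel for the size cost of this layout, the sketch below hand-encodes the fields using the varint rules from the Protobuf encoding documentation. It assumes proto3 semantics (fields that are false are left out of the output) and compares the `DaysOfWeek` message body with a hypothetical single `uint32` bitmask field. `encode_varint` and `encode_varint_field` are illustrative helpers written for this example, not library functions.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a Protobuf base-128 varint."""
    out = bytearray()
    while True:
        low_bits = value & 0x7F
        value >>= 7
        if value:
            out.append(low_bits | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(low_bits)
            return bytes(out)


def encode_varint_field(field_number: int, value: int) -> bytes:
    """Encode one varint-typed field (bool, uint32, ...) as tag + value."""
    if value == 0:
        return b""  # assumption: proto3-style, default values are not written
    tag = (field_number << 3) | 0  # wire type 0 = varint
    return encode_varint(tag) + encode_varint(value)


# Monday to Friday set, keyed by the field numbers from DaysOfWeek above.
days = {1: False, 2: True, 3: True, 4: True, 5: True, 6: True, 7: False}
bools_body = b"".join(encode_varint_field(n, int(on)) for n, on in days.items())

# Hypothetical alternative: one uint32 field (number 1), bit n-1 per day.
bitmask = sum(1 << (n - 1) for n, on in days.items() if on)
bitmask_body = encode_varint_field(1, bitmask)

print(len(bools_body), len(bitmask_body))
```

Each bool that is true costs two bytes (one tag byte plus one value byte), so five days come to 10 bytes against 2 for the bitmask, before counting the tag and length prefix that the nested message adds inside `Alarm`. Whether that difference matters depends on message volume; the bool layout is arguably easier to read and evolve.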

Solution 4

I found a solution (sort of): you need an integer field to hold the flags.

message Foo {
  enum Flags {
    FLAG1 = 0x01;
    FLAG2 = 0x02;
    FLAG3 = 0x04;
  }

  // Bitwise-OR of Flags.
  optional uint32 flags = 1;
}

  • Mmm, is it the only solution?

Solution 5

Define the field as an integer:

required int32 MsgCode = 1;

Define the enum as in your question, even though nothing in the .proto file will reference it.

Use the enum fields in your code. In C#, it's like your example (although it depends on which library you use, e.g. protobuf-net is excellent and has a lightweight Enum.Field syntax). In Java, use the fields with the _VALUE suffix, e.g. MsgCodes.APPROVE_VALUE.

Author: Royi Namir

Updated on July 09, 2022

Comments

  • Royi Namir
    Royi Namir almost 2 years

    This is my proto file:

    message MSG {
    
      required MsgCodes MsgCode = 1;
      optional int64 Serial = 2;        // Unique ID number for this person.
      required int32 From = 3;  
      required int32 To = 4;  
      //bla bla...
            enum MsgCodes
            {
                MSG = 1;
                FILE = 2;
                APPROVE=4;
                ACK=8;
                ERROR_SENDING=16;
                WORLD=32;
            }
    }
    

    In my C# code I'm trying to do this:

     msg = msg.ToBuilder().SetMsgCode(msg.MsgCode | MSG.Types.MsgCodes.ACK | MSG.Types.MsgCodes.APPROVE).Build();
     SendToJava(msg);
    

    But the Java side tells me: missing MsgCode (which is a required field).

    Removing the combination does solve it.

    But I need to specify combinations.

    Question

    How can I solve it?

    N.B.:

    The weird thing is that if I create a msg, set multiple enum values, and then read it back in C#, it does work... :-(

  • thegreendroid
    thegreendroid about 4 years
    This is pretty inefficient, however, as each field will have a tag ID present.
  • Louis CAD
    Louis CAD about 4 years
    @thegreendroid Are you sure? I think it's positional and tagless when encoded in binary form.
  • thegreendroid
    thegreendroid about 4 years
    Pretty sure. According to developers.google.com/protocol-buffers/docs/encoding, bool fields are encoded with a tag that includes the wire type.
  • Arthur Tacca
    Arthur Tacca over 3 years
    Isn't this the same as the self-answer by Royi that was posted almost a year earlier than this?
  • Edward Brey
    Edward Brey over 3 years
    @ArthurTacca His code sample shows the same general approach. His has the enum nested within the message, whereas my example (unqualified MsgCodes) has it at the same level. His answer is almost entirely a code sample; mine describes how to use it, including the nuance of using the _VALUE suffix in Java. It seemed like there was too much to add to either edit into his answer or place in a comment, so I added my own answer.
  • Jason Doucette
    Jason Doucette about 2 years
    The official doc for proto 3 ( developers.google.com/protocol-buffers/docs/proto3#enum ) says "During deserialization, unrecognized enum values will be preserved in the message ... if the message is serialized the unrecognized value will still be serialized with the message."
  • Kenton Varda
    Kenton Varda about 2 years
    @JasonDoucette That's correct. Unknown enum values are treated the same way as unknown field tags. In both cases the value is saved off to the side, in the message's UnknownFieldSet. If the message is serialized again, the unknown fields are merged back into the output.