How to model HashMap/Dictionary in the ProtoBuf efficiently

31,704

Well, maps are already supported in "protobuf proper" as of v3.0. For example, your proto is effectively:

message Dictionary {
    map<string, string> pairs = 1;
}

The good news is that with the key and value fields you've defined, that's fully backward-compatible with your existing data :)
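The compatibility follows from the wire format: a map field is encoded as a repeated, length-delimited entry message with the key in field 1 and the value in field 2, which is exactly the shape of `repeated Pair pairs = 1`. As a rough illustration only, the hand-written decoder below reads such bytes straight into a Java map. The class name `MapWireSketch` is made up for this sketch, it assumes every field is a length-delimited string, and it is not a substitute for the real protobuf runtime.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of any protobuf library.
class MapWireSketch {
    private final byte[] buf;
    private int pos;

    MapWireSketch(byte[] buf) {
        this.buf = buf;
    }

    // Reads a base-128 varint (used for both tags and lengths).
    private int readVarint() {
        int result = 0;
        int shift = 0;
        while (true) {
            int b = buf[pos++] & 0xFF;
            result |= (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                return result;
            }
            shift += 7;
        }
    }

    // Reads a length prefix followed by that many bytes.
    private byte[] readLengthDelimited() {
        int len = readVarint();
        byte[] out = Arrays.copyOfRange(buf, pos, pos + len);
        pos += len;
        return out;
    }

    // Decodes a Dictionary whose field 1 holds entries with key = 1 and value = 2.
    // Simplification: only length-delimited (wire type 2) fields are handled.
    static Map<String, String> decodeDictionary(byte[] bytes) {
        final int entryTag = (1 << 3) | 2; // field 1, length-delimited
        final int keyTag = (1 << 3) | 2;   // field 1 inside the entry
        final int valueTag = (2 << 3) | 2; // field 2 inside the entry

        Map<String, String> result = new LinkedHashMap<>();
        MapWireSketch outer = new MapWireSketch(bytes);
        while (outer.pos < bytes.length) {
            int tag = outer.readVarint();
            if (tag != entryTag) {
                throw new IllegalArgumentException("unexpected tag " + tag);
            }
            byte[] entry = outer.readLengthDelimited();
            MapWireSketch inner = new MapWireSketch(entry);
            String key = "";
            String value = "";
            while (inner.pos < entry.length) {
                int fieldTag = inner.readVarint();
                String s = new String(inner.readLengthDelimited(), StandardCharsets.UTF_8);
                if (fieldTag == keyTag) {
                    key = s;
                } else if (fieldTag == valueTag) {
                    value = s;
                }
            }
            result.put(key, value);
        }
        return result;
    }
}
```

The same bytes parse either as a list of `Pair` messages or as map entries, which is why switching the field declaration does not require re-serializing existing data.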

The bad news is that I don't know whether or not protobuf-net supports it. If you're not actually using the .proto file on the .NET side, and doing everything declaratively, you may just be able to modify your .proto file, regenerate the Java code, and go...

The remaining bad news is that maps were introduced in v3.0 which is still in alpha/beta at the time of this writing. Now, depending on when you need to ship, you may decide to bet on v3.0 being released by the time you need it - the benefits of having nice map syntax are pretty significant, in my view. Most of the changes being made at the moment are around the new proto3 features - whereas maps are allowed within proto2 syntax files too... it's just that you need the v3.0 compiler and runtime to use them.

Author by

Lan

Enterprise Architect with 13 years of experience, interested in Middleware, JEE development, SOA, Cloud Computing and Big Data

Updated on October 27, 2020

Comments

  • Lan
    Lan over 3 years

    I have a protobuf file serialized by .NET code and I would like to consume it in Java. The .NET code uses a Dictionary data type, and the proto schema looks like this:

    message Pair {
       optional string key = 1;
       optional string value = 2;
    }
    
    message Dictionary {
       repeated Pair pairs = 1;
    }
    

    This is just as described in the Stack Overflow post "Dictionary in protocol buffers".

    I can use protoc to compile the proto file into Java classes fine. I can deserialize the protobuf file into Java objects successfully. The only problem is that it translates to a List of Pair objects in Java instead of HashMap. Of course, I still have all the data, but I cannot access the data as efficiently as I prefer. If I have the value of the key, I have to loop through the whole list to get its corresponding value. This does not seem to be optimal.
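One way to avoid the repeated linear scan, independent of any protobuf version, is to index the list into a `HashMap` once after deserialization. The sketch below is only an illustration: `PairIndexSketch` and its nested `Pair` class are hypothetical stand-ins for the protoc-generated classes, and with generated code the list would presumably come from an accessor such as `getPairsList()`.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper class for this sketch.
class PairIndexSketch {
    // Stand-in for the protoc-generated Pair message (assumption).
    static final class Pair {
        private final String key;
        private final String value;

        Pair(String key, String value) {
            this.key = key;
            this.value = value;
        }

        String getKey() {
            return key;
        }

        String getValue() {
            return value;
        }
    }

    // Builds the index in one pass; later lookups no longer scan the list.
    static Map<String, String> toMap(List<Pair> pairs) {
        Map<String, String> index = new HashMap<>(pairs.size() * 2);
        for (Pair p : pairs) {
            // If a key occurs more than once, the last entry wins.
            index.put(p.getKey(), p.getValue());
        }
        return index;
    }

    public static void main(String[] args) {
        List<Pair> pairs = Arrays.asList(
                new Pair("host", "example.org"),
                new Pair("port", "8080"));
        Map<String, String> index = toMap(pairs);
        System.out.println(index.get("port"));
    }
}
```

The conversion costs one extra pass and a second copy of the references, which only pays off when the same dictionary is queried more than a few times.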

    I am wondering if there is a better way to model a Dictionary/Map data type in protobuf.

    Thanks

    Update:

    I tried Jon Skeet's suggestion and added a map-type field to the addressbook example, but still ran into an issue.

    message Person {
      required string name = 1;
      required int32 id = 2;        // Unique ID number for this person.
      optional string email = 3;
      enum PhoneType {
        MOBILE = 0;
        HOME = 1;
        WORK = 2;
      }
      message PhoneNumber {
        required string number = 1;
        optional PhoneType type = 2 [default = HOME];
      }
      repeated PhoneNumber phone = 4;
      map<string, string> mapdata = 5;
    }
    

    protoc throws an error when compiling:

    addressbook.proto:25:3: Expected "required", "optional", or "repeated".
    addressbook.proto:25:6: Expected field name.
    

    According to the Google protobuf documentation, proto2 does support the map type: https://developers.google.com/protocol-buffers/docs/proto#maps . To quote:

    Maps cannot be repeated, optional, or required.

    So I don't really know why protoc cannot compile it. There is another discussion in the post "have to create java pojo for the existing proto includes Map". The answer there suggests that map is only a proto3 feature, which contradicts Google's documentation.

  • Lan
    Lan over 8 years
    Thanks, Jon. Yes, I don't think protobuf-net supports it. If there is a dictionary type, it generates the repeated key-value pair type instead of a map. As you guessed, the protobuf was defined declaratively on the .NET side using annotations. We have to use protobuf-net to generate the .proto file from code and then modify it manually (a lot). I will verify the backward compatibility part and report back.
  • Lan
    Lan over 8 years
    I tried your suggestion, adding a map field to the addressbook example, but protoc throws an error: addressbook.proto:31:5: Expected "required", "optional", or "repeated". addressbook.proto:31:8: Expected field name. My protoc version is 2.6.1. See stackoverflow.com/questions/29407123/…, which reports the same issue.
  • Jon Skeet
    Jon Skeet over 8 years
    @Lan: You'll need a more recent version of protoc. Will edit to clarify that.
  • Lan
    Lan over 8 years
    I reached the same conclusion after reading some posts on the mailing list: groups.google.com/forum/#!topic/protobuf/p4WxcplrlA4. After downloading protobuf 3 alpha 3, I got the map working, and it does seem to be backward compatible. Thanks for the help.