Java: Composite key in hashmaps

50,934

Solution 1

You could have a custom object containing the two strings:

class StringKey {
    private String str1;
    private String str2;
}

Problem is, you need to determine the equality test and the hash code for two such objects.

Equality could be the match on both strings and the hashcode could be the hashcode of the concatenated members (this is debatable):

class StringKey {
    private String str1;
    private String str2;

    @Override
    public boolean equals(Object obj) {
        if(obj != null && obj instanceof StringKey) {
            StringKey s = (StringKey)obj;
            return str1.equals(s.str1) && str2.equals(s.str2);
        }
        return false;
    }

    @Override
    public int hashCode() {
        return (str1 + str2).hashCode();
    }
}

Solution 2

You don't need to reinvent the wheel. Simply use the Guava's HashBasedTable<R,C,V> implementation of Table<R,C,V> interface, for your need. Here is an example

Table<String, String, Integer> table = HashBasedTable.create();

table.put("key-1", "lock-1", 50);
table.put("lock-1", "key-1", 100);

System.out.println(table.get("key-1", "lock-1")); //prints 50
System.out.println(table.get("lock-1", "key-1")); //prints 100

table.put("key-1", "lock-1", 150); //replaces 50 with 150

Happy coding!

Solution 3

public int hashCode() {
    return (str1 + str2).hashCode();
}

This seems to be a terrible way to generate the hashCode: Creating a new string instance every time the hash code is computed is terrible! (Even generating the string instance once and caching the result is poor practice.)

There are a lot of suggestions here:

How do I calculate a good hash code for a list of strings?

public int hashCode() {
    final int prime = 31;
    int result = 1;
    for ( String s : strings ) {
        result = result * prime + s.hashCode();
    }
    return result;
}

For a pair of strings, that becomes:

return string1.hashCode() * 31 + string2.hashCode();

That is a very basic implementation. Lots of advice through the link to suggest better tuned strategies.

Solution 4

Why not create a (say) Pair object, which contains the two strings as members, and then use this as the key ?

e.g.

public class Pair {
   private final String str1;
   private final String str2;

   // this object should be immutable to reliably perform subsequent lookups
}

Don't forget about equals() and hashCode(). See this blog entry for more on HashMaps and keys, including a background on the immutability requirements. If your key isn't immutable, then you can change its components and a subsequent lookup will fail to locate it (this is why immutable objects such as String are good candidates for a key)

You're right that concatenation isn't ideal. For some circumstances it'll work, but it's often an unreliable and fragile solution (e.g. is AB/C a different key from A/BC ?).

Solution 5

I have a similar case. All I do is concatenate the two strings separated by a tilde ( ~ ).

So when the client calls the service function to get the object from the map, it looks like this:

MyObject getMyObject(String key1, String key2) {
    String cacheKey = key1 + "~" + key2;
    return map.get(cachekey);
}

It is simple, but it works.

Share:
50,934
user1203861
Author by

user1203861

Updated on December 10, 2020

Comments

  • user1203861
    user1203861 over 3 years

    I would like to store a group of objects in a hashmap , where the key shall be a composite of two string values. is there a way to achieve this?

    i can simply concatenate the two strings , but im sure there is a better way to do this.

  • Scott McIntyre
    Scott McIntyre over 11 years
    Are there any problems due to the hashcodes for ABC,D and AB,CD being the same? Or does the equals being different resolve that?
  • Tudor
    Tudor over 11 years
    @smackfu: That depends. It would only be a problem if you have many such pairs of strings, because they will hash to the same slot in the table and make lookups less efficient.
  • Zak
    Zak about 11 years
    @Tudor can you think of any advantage that this solution has over the solution presented by EdgeCase which basically just concatenates the two strings separated by a tilde character?
  • Tudor
    Tudor about 11 years
    @Zak: Well the only difference is that I'm using the two strings concatenated and he's using a tilde between them. His version should do a better job at spreading around pairs of strings that would otherwise give the same result when concatenated. Anyway, I was just giving an example of how to produce the hashcode, I did not aim to make it efficient.
  • bgs
    bgs over 10 years
    @Tudor, you don't need obj != null with instanceOf operator, Cheers!
  • mike rodent
    mike rodent over 7 years
    "a new string instance every time the hash code is computed" - hahaha, well spotted!
  • lolo
    lolo almost 7 years
    if we have to many entries (~77,500) it can we can found ourselves with hash collision?
  • SOFe
    SOFe almost 6 years
    does it make any difference if the hashcode is calculated by XOR instead of * and +, since the String hashCode uses most bits in the hashcode anyway?
  • Thomas Bitonti
    Thomas Bitonti almost 6 years
    XOR and other bitwise operators are faster than multiplication and addition. Whether the difference is significant in the example depends on the implementation of String.hashCode(). Also, one should be extra careful to make sure the new implementation has good properties as a hash function.
  • Thomas Bitonti
    Thomas Bitonti almost 6 years
    The requirement on hashCode is that it is stable (multiple calls obtain the same value in the lifetime of the receiver), and that equal objects have the same hashCode value. Having the same hash values for unequal objects (for example, hash( {"a", "aa"} ) == hash( {"aa", "a"} ) ) impacts the efficiency of the hash tables. Operations on unequal objects that have the same hash value lose the performance benefits of hash tables. What will matter is the proportion of such objects.
  • Thomas Bitonti
    Thomas Bitonti almost 6 years
    Collisions will occur in proportion to the range of the hash function, the number of entries being hashed, the distribution of the values of the entries, how well behaved is the hash function, and the available space for the hash map. Without know more of these details, whether 77,500 entries will have many or few (or no) collisions can't be determined. Note that even with very good hash functions, collisions will likely occur. For a typical hash map implementation, what will matter is not whether any collisions occur, but how many in proportion to the total entries.
  • mike rodent
    mike rodent about 5 years
    Yes. The OP stipulated "two strings" explicitly. Composite keys of another kind may require something far more sophisticated. But this is the right answer for the use case, IMHO. However, the "join" character needs more work: the OP did not say that these strings are "only alphanumeric" for example, so they could potentially contain the tilde character. Something very exotic from the wilder planes of Unicode might do the trick otherwise: 𒀛 or ♄ maybe.
  • unknownerror
    unknownerror almost 5 years
    i would also prefer defining the variables as final. Ensures immutability in case of strings.
  • Moshi
    Moshi almost 3 years
    This implementation is better than the accepted answer.