GetHashCode() with string keys


Solution 1

You can call GetHashCode() on the non-numeric values that you use in your object.

private string m_foo;

// Reuse the hash code of the string key; fall back to 0 if the key is null.
public override int GetHashCode()
{
    return m_foo != null ? m_foo.GetHashCode() : 0;
}

Solution 2

The counter-based pattern in the question is not a good way to generate hash codes for an object.

It's important to understand the purpose of GetHashCode(): it's a way to generate a numeric representation of the identifying properties of an object. Hash codes allow an object to serve as a key in a dictionary and, in some cases, accelerate comparisons between complex types.

If you simply generate a random value and call it a hash code, you have no repeatability. Another instance with the same key fields will have a different hash code, and will violate the behavior expected by classes like HashSet, Dictionary, etc.

If you already have an identifying string member in your object, just return its hash code.
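A minimal sketch of that advice, assuming a hypothetical Part class whose identity is an alphanumeric key string (the class and member names are illustrative, not taken from the question). Equals and GetHashCode are both derived from the same field, so two instances with the same key always hash alike:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical type: "Part" and "m_code" are illustrative names only.
public sealed class Part : IEquatable<Part>
{
    private readonly string m_code; // the alphanumeric key; never reassigned

    public Part(string code)
    {
        m_code = code ?? throw new ArgumentNullException(nameof(code));
    }

    // Two parts are equal when their keys match character for character.
    public bool Equals(Part other) =>
        other != null && string.Equals(m_code, other.m_code, StringComparison.Ordinal);

    public override bool Equals(object obj) => Equals(obj as Part);

    // Delegate to the key's own hash so equal keys yield equal hash codes.
    public override int GetHashCode() => m_code.GetHashCode();
}
```

Keeping the key field readonly matters here: if the key could change while the object sits in a Dictionary or HashSet, its hash code would change too and the collection could no longer find it.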

The documentation on MSDN for implementers of GetHashCode() is a must-read for anyone who plans on overriding that method:

Notes to Implementers

A hash function is used to quickly generate a number (hash code) that corresponds to the value of an object. Hash functions are usually specific to each Type and, for uniqueness, must use at least one of the instance fields as input.

A hash function must have the following properties:

If two objects compare as equal, the GetHashCode method for each object must return the same value. However, if two objects do not compare as equal, the GetHashCode methods for the two objects do not have to return different values.

The GetHashCode method for an object must consistently return the same hash code as long as there is no modification to the object state that determines the return value of the object's Equals method. Note that this is true only for the current execution of an application, and that a different hash code can be returned if the application is run again.

For the best performance, a hash function must generate a random distribution for all input.

For example, the implementation of the GetHashCode method provided by the String class returns identical hash codes for identical string values. Therefore, two String objects return the same hash code if they represent the same string value. Also, the method uses all the characters in the string to generate reasonably randomly distributed output, even when the input is clustered in certain ranges (for example, many users might have strings that contain only the lower 128 ASCII characters, even though a string can contain any of the 65,535 Unicode characters).
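The String behavior described in the quoted notes can be checked with a small sketch. StringHashDemo and its method are hypothetical names used only for illustration. The example compares hash codes rather than printing them, because the numeric value is only guaranteed to be stable within a single run of the application:

```csharp
using System;

// Hypothetical helper; the names are illustrative only.
public static class StringHashDemo
{
    // Builds two separate String objects with the same characters and
    // reports whether they are distinct objects with matching hash codes.
    public static bool SameValueSameHash()
    {
        string first = new string(new[] { 'k', 'e', 'y', '1' });
        string second = new string(new[] { 'k', 'e', 'y', '1' });

        // Distinct objects in memory, identical string value.
        bool distinctObjects = !ReferenceEquals(first, second);

        return distinctObjects
            && first == second
            && first.GetHashCode() == second.GetHashCode();
    }
}
```

Because the value may differ from one run to the next, a string's hash code is suitable for in-memory lookups but should not be stored or used as a persistent identifier.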

Solution 3

Hash codes don't have to be unique. Provided your Equals implementation is correct, it's OK to return the same hash code for two instances. The m_next_hash_id logic is broken, since it allows two objects to have different hash codes even if they compare equal.
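To make the failure concrete, here is a sketch that reproduces the counter-based hash from the question inside a hypothetical CounterHashed class. The class, its key field, and its Equals override are assumptions added for illustration; only the GetHashCode logic mirrors the question:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical type reproducing the counter-based hash from the question.
public sealed class CounterHashed
{
    private static int s_nextHashId = 1;
    private int m_hashCode;
    private readonly string m_key;

    public CounterHashed(string key) { m_key = key; }

    // Equality is based on the string key (an assumption for this sketch).
    public override bool Equals(object obj) =>
        obj is CounterHashed other && m_key == other.m_key;

    // Broken on purpose: the hash ignores m_key entirely.
    public override int GetHashCode()
    {
        if (m_hashCode == 0)
            m_hashCode = s_nextHashId++;
        return m_hashCode;
    }
}

public static class CounterHashDemo
{
    // Stores one instance, then looks it up with an equal instance.
    // Returns false, because the two instances get different hash codes.
    public static bool LookupSucceeds()
    {
        var set = new HashSet<CounterHashed> { new CounterHashed("AB-12X") };
        return set.Contains(new CounterHashed("AB-12X"));
    }
}
```

The lookup fails even though Equals would report the two instances as equal: the set compares hash codes first, finds a mismatch, and never calls Equals.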

MSDN gives a good set of instructions on how to implement Equals and GetHashCode. Several of the examples there implement GetHashCode in terms of the hash codes of an object's fields.
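When identity depends on more than one field, one common way to build the hash from the fields' own hash codes looks like the sketch below. WarehouseSlot and its members are hypothetical, and the constants 17 and 31 are a widely used convention rather than anything prescribed by MSDN:

```csharp
using System;

// Hypothetical type with a two-part key; all names are illustrative.
public sealed class WarehouseSlot : IEquatable<WarehouseSlot>
{
    private readonly string m_aisle;
    private readonly int m_shelf;

    public WarehouseSlot(string aisle, int shelf)
    {
        m_aisle = aisle ?? throw new ArgumentNullException(nameof(aisle));
        m_shelf = shelf;
    }

    public bool Equals(WarehouseSlot other) =>
        other != null && m_aisle == other.m_aisle && m_shelf == other.m_shelf;

    public override bool Equals(object obj) => Equals(obj as WarehouseSlot);

    // Combine the hash codes of exactly the fields that Equals compares.
    public override int GetHashCode()
    {
        unchecked // let the arithmetic wrap instead of throwing on overflow
        {
            int hash = 17;
            hash = hash * 31 + m_aisle.GetHashCode();
            hash = hash * 31 + m_shelf.GetHashCode();
            return hash;
        }
    }
}
```

On newer runtimes, System.HashCode.Combine can replace the manual arithmetic; the sketch uses the manual form so that it also compiles on older frameworks.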

Author by

King Skippus

Updated on June 01, 2022

Comments

  • King Skippus (almost 2 years ago)

    Hey all, I've been reading up on the best way to implement the GetHashCode() override for objects in .NET, and most answers I run across involve somehow munging numbers together from members that are numeric types to come up with a method. Problem is, I have an object that uses an alphanumeric string as its key, and I'm wondering if there's something fundamentally wrong with just using an internal ID for objects with strings as keys, something like the following?

    
    // Override GetHashCode() to return a permanent, unique identifier for
    // this object.
    static private int m_next_hash_id = 1;
    private int m_hash_code = 0;
    public override int GetHashCode() {
      if (this.m_hash_code == 0)
        this.m_hash_code = <type>.m_next_hash_id++;
      return this.m_hash_code;
    }
    

    Is there a better way to come up with a unique hash code for an object that uses an alphanumeric string as its key? (And no, the numeric parts of the alphanumeric strings aren't unique; some of these strings don't actually have numbers in them at all.) Any thoughts would be appreciated!