Why aren't Python sets hashable?

44,021

Solution 1

Generally, only immutable objects are hashable in Python. The immutable variant of set() -- frozenset() -- is hashable.

Solution 2

Because they're mutable.

If they were hashable, a hash could silently become "invalid", and that would pretty much make hashing pointless.

Solution 3

From the Python docs:

hashable
An object is hashable if it has a hash value which never changes during its lifetime (it needs a hash() method), and can be compared to other objects (it needs an eq() or cmp() method). Hashable objects which compare equal must have the same hash value.

Hashability makes an object usable as a dictionary key and a set member, because these data structures use the hash value internally.

All of Python’s immutable built-in objects are hashable, while no mutable containers (such as lists or dictionaries) are. Objects which are instances of user-defined classes are hashable by default; they all compare unequal, and their hash value is their id().

Solution 4

In case this helps... if you really need to convert unhashable things into hashable equivalents for some reason you might do something like this:

from collections import Hashable, MutableSet, MutableSequence, MutableMapping

def make_hashdict(value):
    """
    Inspired by https://stackoverflow.com/questions/1151658/python-hashable-dicts
     - with the added bonus that it inherits from the dict type of value
       so OrderedDict's maintain their order and other subclasses of dict() maintain their attributes
    """
    map_type = type(value)

    class HashableDict(map_type):
        def __init__(self, *args, **kwargs):
            super(HashableDict, self).__init__(*args, **kwargs)
        def __hash__(self):
            return hash(tuple(sorted(self.items())))

    hashDict = HashableDict(value)

    return hashDict


def make_hashable(value):
    if not isinstance(value, Hashable):
        if isinstance(value, MutableSet):
            value = frozenset(value)
        elif isinstance(value, MutableSequence):
            value = tuple(value)
        elif isinstance(value, MutableMapping):
            value = make_hashdict(value)

        return value

my_set = set()
my_set.add(make_hashable(['a', 'list']))
my_set.add(make_hashable({'a': 1, 'dict': 2}))
my_set.add(make_hashable({'a', 'new', 'set'}))

print my_set

My HashableDict implementation is the simplest and least rigorous example from here. If you need a more advanced HashableDict that supports pickling and other things, check the many other implementations. In my version above I wanted to preserve the original dict class, thus preserving the order of OrderedDicts. I also use AttrDict from here for attribute-like access.

My example above is not in any way authoritative, just my solution to a similar problem where I needed to store some things in a set and needed to "hashify" them first.

Share:
44,021

Related videos on Youtube

Dan Burton
Author by

Dan Burton

Updated on February 25, 2020

Comments

  • Dan Burton
    Dan Burton over 4 years

    I stumbled across a blog post detailing how to implement a powerset function in Python. So I went about trying my own way of doing it, and discovered that Python apparently cannot have a set of sets, since set is not hashable. This is irksome, since the definition of a powerset is that it is a set of sets, and I wanted to implement it using actual set operations.

    >>> set([ set() ])
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: unhashable type: 'set'
    

    Is there a good reason Python sets are not hashable?

    • Chris Eberle
      Chris Eberle about 13 years
      Anything that isn't immutable generally makes for a bad key. You can use tuples if you have to.
  • abarnert
    abarnert about 11 years
    And see the Python FAQ entry Why must dictionary keys be immutable?.
  • Albert
    Albert over 2 years
    Due to the sorted, this requires an order on the values. There are objects which define __eq__ and __hash__ but not an order (__le__ etc). You could use hash(frozenzet(self.items())) instead.