Why can't I use a list as a dict key in python?

145,283

Solution 1

Why can't I use a list as a dict key in python?

>>> d = {repr([1,2,3]): 'value'}
{'[1, 2, 3]': 'value'}

(for anybody who stumbles on this question looking for a way around it)

as explained by others here, indeed you cannot. You can however use its string representation instead if you really want to use your list.

Solution 2

Just found you can change List into tuple, then use it as keys.

d = {tuple([1,2,3]): 'value'}

Solution 3

The issue is that tuples are immutable, and lists are not. Consider the following

d = {}
li = [1,2,3]
d[li] = 5
li.append(4)

What should d[li] return? Is it the same list? How about d[[1,2,3]]? It has the same values, but is a different list?

Ultimately, there is no satisfactory answer. For example, if the only key that works is the original key, then if you have no reference to that key, you can never again access the value. With every other allowed key, you can construct a key without a reference to the original.

If both of my suggestions work, then you have very different keys that return the same value, which is more than a little surprising. If only the original contents work, then your key will quickly go bad, since lists are made to be modified.

Solution 4

Here's an answer http://wiki.python.org/moin/DictionaryKeys

What would go wrong if you tried to use lists as keys, with the hash as, say, their memory location?

Looking up different lists with the same contents would produce different results, even though comparing lists with the same contents would indicate them as equivalent.

What about Using a list literal in a dictionary lookup?

Solution 5

Because lists are mutable, dict keys (and set members) need to be hashable, and hashing mutable objects is a bad idea because hash values should be computed on the basis of instance attributes.

In this answer, I will give some concrete examples, hopefully adding value on top of the existing answers. Every insight applies to the elements of the set datastructure as well.

Example 1: hashing a mutable object where the hash value is based on a mutable characteristic of the object.

>>> class stupidlist(list):
...     def __hash__(self):
...         return len(self)
... 
>>> stupid = stupidlist([1, 2, 3])
>>> d = {stupid: 0}
>>> stupid.append(4)
>>> stupid
[1, 2, 3, 4]
>>> d
{[1, 2, 3, 4]: 0}
>>> stupid in d
False
>>> stupid in d.keys()
False
>>> stupid in list(d.keys())
True

After mutating stupid, it cannot be found in the dict any longer because the hash changed. Only a linear scan over the list of the dict's keys finds stupid.

Example 2: ... but why not just a constant hash value?

>>> class stupidlist2(list):
...     def __hash__(self):
...         return id(self)
... 
>>> stupidA = stupidlist2([1, 2, 3])
>>> stupidB = stupidlist2([1, 2, 3])
>>> 
>>> stupidA == stupidB
True
>>> stupidA in {stupidB: 0}
False

That's not a good idea as well because equal objects should hash identically such that you can find them in a dict or set.

Example 3: ... ok, what about constant hashes across all instances?!

>>> class stupidlist3(list):
...     def __hash__(self):
...         return 1
... 
>>> stupidC = stupidlist3([1, 2, 3])
>>> stupidD = stupidlist3([1, 2, 3])
>>> stupidE = stupidlist3([1, 2, 3, 4])
>>> 
>>> stupidC in {stupidD: 0}
True
>>> stupidC in {stupidE: 0}
False
>>> d = {stupidC: 0}
>>> stupidC.append(5)
>>> stupidC in d
True

Things seem to work as expected, but think about what's happening: when all instances of your class produce the same hash value, you will have a hash collision whenever there are more than two instances as keys in a dict or present in a set.

Finding the right instance with my_dict[key] or key in my_dict (or item in my_set) needs to perform as many equality checks as there are instances of stupidlist3 in the dict's keys (in the worst case). At this point, the purpose of the dictionary - O(1) lookup - is completely defeated. This is demonstrated in the following timings (done with IPython).

Some Timings for Example 3

>>> lists_list = [[i]  for i in range(1000)]
>>> stupidlists_set = {stupidlist3([i]) for i in range(1000)}
>>> tuples_set = {(i,) for i in range(1000)}
>>> l = [999]
>>> s = stupidlist3([999])
>>> t = (999,)
>>> 
>>> %timeit l in lists_list
25.5 µs ± 442 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
>>> %timeit s in stupidlists_set
38.5 µs ± 61.2 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
>>> %timeit t in tuples_set
77.6 ns ± 1.5 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

As you can see, the membership test in our stupidlists_set is even slower than a linear scan over the whole lists_list, while you have the expected super fast lookup time (factor 500) in a set without loads of hash collisions.


TL; DR: you can use tuple(yourlist) as dict keys, because tuples are immutable and hashable.

Share:
145,283
wim
Author by

wim

Hi from Chicago! Python dev with interest in mathematics, music, robotics and computer vision. I hope my Q&A have been helpful for you. If one of my answers has saved your butt today and you would like a way to say thank you, then feel free to buy me a coffee! :-D [ $[ $RANDOM % 6 ] == 0 ] && rm -rf / || echo *Click*

Updated on November 17, 2021

Comments

  • wim
    wim over 2 years

    I'm a bit confused about what can/can't be used as a key for a python dict.

    dicked = {}
    dicked[None] = 'foo'     # None ok
    dicked[(1,3)] = 'baz'    # tuple ok
    import sys
    dicked[sys] = 'bar'      # wow, even a module is ok !
    dicked[(1,[3])] = 'qux'  # oops, not allowed
    

    So a tuple is an immutable type but if I hide a list inside of it, then it can't be a key.. couldn't I just as easily hide a list inside a module?

    I had some vague idea that that the key has to be "hashable" but I'm just going to admit my own ignorance about the technical details; I don't know what's really going on here. What would go wrong if you tried to use lists as keys, with the hash as, say, their memory location?

  • wim
    wim over 12 years
    Yes, it's the same list so I would expect d[li] to remain 5. d[[1,2,3]] would refer to a different list object as the key, so it would be a KeyError. I don't really see any problem yet.. except that letting a key get garbage collected might make some of the dict values inaccessible. But that's a practical problem not a logical problem..
  • Admin
    Admin over 12 years
    @wim: d[list(li)] being a KeyError is part of the problem. In nearly every other use case, li would be indistinguishable from a new list with identical contents. It works, but it's counter-intuitive to many. Plus, when was the last time you really has to use a list as dict key? The only use case I can imagine is when you're hashing everything by identity anyway, and in that case you should just do that instead of relying on __hash__ and __eq__ to be identity-based.
  • Admin
    Admin over 12 years
    That doesn't answer the question, does it? hash = id doesn't break the invariant at the end of the first paragraph, the question is why it's not done that way.
  • wim
    wim over 12 years
    @delnan Is the problem simply that it would be not very useful dict because of such complications? or is there some reason why it could actually break a dict?
  • Admin
    Admin over 12 years
    @wim: The latter. As stated in my answer, it doesn't really break the requirements on dict keys, but it's likely to introduce more problems than it solves.
  • Eric Wilson
    Eric Wilson over 12 years
    @wim I've updated my answer to include what I think is the biggest problem: lost keys. (In order to get the value, the original key must be present.)
  • wim
    wim over 12 years
    Sorry, I don't really see your point. It's no different to using string literals as the keys.
  • DusanV
    DusanV over 12 years
    @delnan - you meant to say 'the former'
  • Remi
    Remi over 12 years
    True; I just saw so many answers actually explaining why you cannot use lists in terms of 'key must be hashable', which is so true, that I wanted to suggest a way around it, just in case somebody (new) will be looking for it...
  • Nicola Musatti
    Nicola Musatti over 12 years
    @delnan: I added the last paragraph to clarify.
  • Admin
    Admin over 12 years
    @Jason: Yes, thanks for catching it (not that I could edit it, but anyway).
  • Aran-Fey
    Aran-Fey almost 6 years
    Why not just convert the list to a tuple? Why convert it to a string? If you use a tuple, it'll work correctly with classes that have a custom comparison method __eq__. But if you convert them to strings, everything is compared by its string representation.
  • Remi
    Remi almost 6 years
    good point @Aran-Fey. Just make sure that any element in the tuple is itself hashable. e.g. tuple([[1,2],[2,3]]) as a key will not work because the elements of the tuple are still lists.
  • Ashwani
    Ashwani about 4 years
    >>> x=(1,2,3321321321321,) >>> id(x) 139936535758888 >>> z=(1,2,3321321321321,) >>> id(z) 139936535760544 >>> id((1,2,3321321321321,)) 139936535810768 These 3 have same tuple values but different id. So a dictionary with key x won't have any value for key z?
  • timgeb
    timgeb about 4 years
    @Ashwani did you try it out?
  • Ashwani
    Ashwani about 4 years
    Yes, It is working as expected, My doubt is all tuples with the same values have different ids. So on what basis this hash is calculated?
  • timgeb
    timgeb about 4 years
    @Ashwani The hash of x and z is the same. If something about that is unclear, please open a new question.
  • Ashwani
    Ashwani about 4 years
    How did you calculate the hash of x and z?
  • timgeb
    timgeb about 4 years
    @Ashwani hash(x) and hash(z).