Theoretical limit for number of keys (objects) that can be stored in a HashMap?

34,359

Solution 1

Is there a theoretical limit for the number of key entries that can be stored in a HashMap or does it purely depend on the heapmemory available ?

Looking at the documentation of that class, I would say that the theoretical limit is Integer.MAX_VALUE (231-1 = 2147483647) elements.

This is because to properly implement this class, the size() method is obliged to return an int representing the number of key/value pairs.

From the documentation of HashMap.size()

Returns: the number of key-value mappings in this map

Note: This question is very similar to How many data a list can hold at the maximum.


which data structure is the best to store a very large number of objects(say several hundred thousand objects)?

I would say it depends on what you need to store and what type of access you require. All built in collections are probably well optimized for large quantities.

Solution 2

HashMap holds the values in an array, which can hold up to Integer.MAX_VALUE. But this does not count collisions. Each Entry has a next field, which is also an entry. This is how collisions (two or more objects with the same hashcode) are resolved. So I wouldn't say there is any limit (apart from the available memory)

Note that if you exceed Integer.MAX_VALUE, you'll get unexpected behaviour from some methods, like size(), but get() and put() will still work. And they will work, because the hashCode() of any object will return an int, hence by definition each object will fit in the map. And then each object will collide with an existing one.

Share:
34,359
Ebbu Abraham
Author by

Ebbu Abraham

Wondering what to do with this life !!

Updated on July 15, 2022

Comments

  • Ebbu Abraham
    Ebbu Abraham almost 2 years

    Is there a theoretical limit for the number of key entries that can be stored in a HashMap or does the maximum purely depend on the heap memory available?

    Also, which data structure is the best to store a very large number of objects (say several hundred thousand objects)?

  • jarnbjo
    jarnbjo over 13 years
    The documentation of HashMap#size() actually limits the number of key-value mappings to a number representable by an int. If a HashMap would allow more mappings, the documentation of the size() method would have been inherited from Map#size(), which defines the method to return Integer.MAX_VALUE if the size is > Integer.MAX_VALUE.
  • pgras
    pgras over 13 years
    And I just noticed the size():int method :)
  • Bozho
    Bozho over 13 years
    hashCode() returns int. Hence after the table is full, there will be only collisions.
  • gustafc
    gustafc over 13 years
    Actually, size() is not really a problem: If the map contains more than Integer.MAX_VALUE elements, returns Integer.MAX_VALUE. - download.oracle.com/javase/6/docs/api/java/util/Map.html#siz‌​e() Only client code not recognizing the true meaning of size() should be a problem.
  • aioobe
    aioobe over 13 years
    @gustafc, ah, interesting point. However, the documentation of HashMap.size() refines this specification: download.oracle.com/javase/6/docs/api/java/util/… It's questionable though, whether or not that's probably a "bug" in the specification :-)
  • gustafc
    gustafc over 13 years
    @aioobe - well, looking at the Sunoracle implementation of HashMap, it doesn't really seem to grok that the size field can overflow. Spec compliance? :) So maybe I should've phrased that "size() should not really be a problem"...
  • aioobe
    aioobe over 13 years
    @gustafc, provided there is no one already, are you filing a bug-report, or shall I do it? If you do it, I'd be interested in a link to the report.
  • jarnbjo
    jarnbjo over 13 years
    @gustafc: Even if the definition of Map#size() allows maps to contain more than Integer.MAX_VALUE elements, a concrete implementation like HashMap does of course not have to support more than Integer.MAX_VALUE elements and it's IMHO obviously clear that HashMap does not.
  • aioobe
    aioobe over 13 years
    @jarnbjo, I've read through the source now, and I can't see there is a problem. If they just keep track of the size-overflow, and revise the documentation for size() I'd say that there wouldn't be a limitation to Integer.MAX_VALUE...
  • gustafc
    gustafc over 13 years
    @aioobe, bug filed (but not available at the time of writing): bugs.sun.com/bugdatabase/view_bug.do?bug_id=6998213
  • aioobe
    aioobe over 13 years
    @gustafc, oh crap. I filed one too hehe :-) well that should give the issue some attention :-)
  • jarnbjo
    jarnbjo over 13 years
    @gustafc: They can't change the implementation now, since it would not be backwards compatible. Existing code relies on that if HashMap#size() returns Interger.MAX_VALUE, the map contains Integer.MAX_VALUE elements and not more.
  • aioobe
    aioobe over 13 years
    @jarnbjo, reading the code it looks like the size-variable would overflow... and the HashMap#size() returns the value of the size-variable.
  • gustafc
    gustafc over 13 years
    @jarnbjo, I'm not entirely convinced this won't be fixed - we can just as well argue that there may be code relying on HashMap correctly implementing Map. We'll see who's right when my or aioobe's bug reports are responded to.
  • aioobe
    aioobe over 9 years
    For the record, the bug has been re-filed here.
  • Sudhir Dhumal
    Sudhir Dhumal about 5 years
    2<sup>30 is the max size of the HashMap but you can store more elements than the 2<sup>30. If you look at the HashMap implementation then you can find the variable declaration static final int MAXIMUM_CAPACITY = 1 << 30; After 2<sup>30 hash map will start behaving weirdly but it does allow more numbers than the 2<sup>30.
  • Sudhir Dhumal
    Sudhir Dhumal about 5 years
    This thread is putting it in better way: stackoverflow.com/questions/19886017/…
  • aioobe
    aioobe about 5 years
    @SudhirDhumal, from what I can tell, you're wrong in 2 regards. 1) The question is about the general contract, and you're talking about details of a specific implementation (OpenJDK I assume?). The reasoning does not necessarily hold true for other implementations (GNU Classpath being one example) or even possible future versions of OpenJDK. The only authoritative source here is the contract of the class, which is spelled out in the signature of the methods and the Javadoc. 2) The capacity is the number of buckets. Each bucket can contain many items.
  • vijaya kumar
    vijaya kumar almost 3 years
    If 2 power 31-1 is the limit of hashmap.. what will happen if i add one more entry. would i get exception?
  • aioobe
    aioobe almost 3 years
    @vijayakumar I don't think that's defined.