What is the maximum size of the map object in c++ and java?
Solution 1
In Java, the size() of a HashMap is of type int, so there's an upper bound of 2^31 - 1 elements in the map.
In C++, map::max_size returns the maximum number of elements. In a vanilla map there's an upper bound of at most SIZE_MAX elements (the largest value a std::size_t can hold), which is 2^64 - 1 on modern 64-bit hardware.
Solution 2
std::map and HashMap are dynamic structures: they grow as elements are added, for as long as the system can provide memory for them.
The max_size() member function gives the upper limit the class implementation is able to sustain, but that limit is normally far larger than what the system the code actually runs on can provide.
The memory actually available also depends on what else the system is doing besides running your application.
You can arrive at a reasonable number empirically by querying the OS for the amount of free memory it can give your process and dividing that by the size of one element: key plus value plus some per-node overhead (usually 20 to 24 bytes).
Solution 3
In C++, std::map has a max_size() member function, which reports the maximum number of elements the container can hold.
sizeof(std::map<...>) gives you the size of the map object itself, not of the data it holds.
Solution 4
For Java:
HashMap's underlying store is an array whose length is always a power of two; the largest it can be is 2^30. With the default load factor of 0.75, it will try to grow, and fail, at around 750 million entries.
TreeMap is not limited and can hold more than 2^31 entries (although size() will then return Integer.MAX_VALUE). The same applies to ConcurrentSkipListMap and ConcurrentHashMap.
Solution 5
Some information to keep in mind (the big picture):
If your data is huge you can't hold it in memory. You have to go to secondary storage: HDD. When you go to HDD you lose the speed optimizations of a hashmap. Every time you go to the HDD you incur a delay (seek time and such). Searching a hashmap stored on disk becomes linear time.
What I'm trying to say is that a map is useless if your data can't fit in memory.
A better solution is to index your data. Store the indices in memory, and have a pointer to where on disk that data you're looking for is. Retrieve the data from disk.
Improve this model further by using RAID for the storage layer. Note that going to a DB incurs the same kind of delay as going to the HDD.
I suggest you store all the values in a DB, and keep an in-memory dictionary with hashes as keys.
user1061293
Updated on June 04, 2022
Comments
- user1061293 almost 2 years: What is the maximum size of a hashmap/map object in C++ and Java? I want to use a hashmap, but I am working with huge data. I am worried that if I use it on large data, it may crash because of a capacity limit. Is that so? If so, what is an alternative?
- Marcelo over 12 years: Have you considered using a database?
- Fred Foo over 12 years: ... but the "size of the actual object" doesn't really mean anything; it's a very minimal lower bound on the actual memory use, only to be used by allocators.
- Drew Dormann over 12 years: Neither of those expressions will report the actual memory used by the entire map.
- Prasanth Kumar over 12 years: @Drew, no, but the first one answers precisely what the OP was asking.
- Fred Foo over 12 years: There is a hard limit: the maximum value of int, since that's the return type of size().