What is the maximum size of the map object in C++ and Java?


Solution 1

In Java, HashMap.size() returns an int, so there's an upper bound of 2^31-1 elements in the map.

In C++, map::max_size() returns the maximum number of elements. In a vanilla map, there's an upper bound of at most SIZE_MAX elements, which is 2^64-1 on modern 64-bit hardware.
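For illustration, a minimal C++ sketch (the exact figures are implementation- and allocator-specific):

```cpp
#include <cstdint>
#include <iostream>
#include <map>
#include <unordered_map>

int main() {
    std::map<int, int> m;
    std::unordered_map<int, int> um;

    // max_size() is the theoretical limit the implementation reports,
    // not the number of elements you can actually allocate in practice.
    std::cout << "std::map::max_size()           = " << m.max_size() << '\n';
    std::cout << "std::unordered_map::max_size() = " << um.max_size() << '\n';
    std::cout << "SIZE_MAX                       = " << SIZE_MAX << '\n';
}
```

On typical 64-bit implementations max_size() comes out well below SIZE_MAX, because it accounts for the size of each node.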

Solution 2

std::map and hashmap are dynamic structures. They grow as elements are added, for as long as the system can provide memory for them.

The max_size() member function gives the upper limit that the class implementation is able to sustain, but that limit is normally far larger than the memory the system running the code can actually provide.

The memory actually available also depends on what else the system is doing besides running your application.

You can empirically arrive at a reasonable number by querying the OS for the amount of free memory it can give your process and dividing it by the size of an element: key plus value plus some overhead (usually 20-24 bytes).
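A rough sketch of that calculation (Linux-specific, via sysconf; the example key/value sizes and the 24-byte node overhead are assumptions, not measured values):

```cpp
#include <cstddef>
#include <iostream>
#include <unistd.h>   // sysconf (POSIX)

int main() {
    // Free physical memory the OS could currently hand to this process
    // (_SC_AVPHYS_PAGES is a Linux/glibc extension).
    long pages     = sysconf(_SC_AVPHYS_PAGES);
    long page_size = sysconf(_SC_PAGESIZE);
    unsigned long long free_bytes =
        static_cast<unsigned long long>(pages) *
        static_cast<unsigned long long>(page_size);

    // Assumed per-element cost: key + value + node/bookkeeping overhead.
    const std::size_t key_bytes      = sizeof(long); // example key type
    const std::size_t value_bytes    = 64;           // example payload size
    const std::size_t overhead_bytes = 24;           // assumed node overhead

    unsigned long long per_element = key_bytes + value_bytes + overhead_bytes;
    std::cout << "Rough upper bound on elements: "
              << free_bytes / per_element << '\n';
}
```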

Solution 3

In C++, std::map has a max_size() member function (corresponding to the maximum number of elements it can hold).

sizeof(std::map<...>) will give you the size of the map object itself, not the data it holds.
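A small illustration of the difference (the printed values vary by implementation):

```cpp
#include <iostream>
#include <map>
#include <string>

int main() {
    std::map<int, std::string> m;
    m[1] = "one";

    // Size of the map object itself: a fixed handful of pointers/counters,
    // regardless of how many elements it currently holds.
    std::cout << "sizeof(m)    = " << sizeof(m) << '\n';

    // Theoretical maximum number of elements the implementation allows.
    std::cout << "m.max_size() = " << m.max_size() << '\n';

    // Neither value reports the real memory consumed by the stored elements.
    std::cout << "m.size()     = " << m.size() << '\n';
}
```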

Solution 4

For Java:

HashMap's underlying store is an array whose length is always a power of 2, and the largest it can be is 2^30. With the default load factor of 0.75 it will try to grow, and fail, at around 800 million entries (0.75 × 2^30).

TreeMap is not limited in this way and can have more than 2^31 entries (however, size() will then return Integer.MAX_VALUE). The same applies to ConcurrentSkipListMap and ConcurrentHashMap.

Solution 5

Some information to keep in mind (the big picture):

If your data is huge, you can't hold it all in memory; you have to go to secondary storage: the HDD. Every time you go to the HDD you incur a delay (seek time and such), so you lose the speed advantages of a hashmap, and searching a hashmap stored on disk effectively degrades toward linear time.

What I'm trying to say is that a map is useless if your data can't fit in memory.

A better solution is to index your data: keep the index in memory, with each entry pointing to where on disk the data you're looking for lives, and retrieve the data from disk when you need it.

You can improve this model further by using RAID for storage. Note that going to a DB incurs the same kind of delay as going to the HDD.

I suggest you store all the values in a DB, and keep an in-memory dictionary with hashes as keys.
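A minimal C++ sketch of that layout, assuming a hypothetical length-prefixed data file named values.dat (a real system would use a database or an existing key-value store instead):

```cpp
#include <cstdint>
#include <fstream>
#include <iostream>
#include <string>
#include <unordered_map>

// In-memory index: key hash -> offset of the value record in a file on disk.
struct DiskBackedMap {
    std::unordered_map<std::uint64_t, std::streamoff> index;   // stays in RAM
    std::string data_file = "values.dat";                      // hypothetical file

    // One in-memory lookup, then a single disk read for the value.
    bool lookup(std::uint64_t key_hash, std::string& value) const {
        auto it = index.find(key_hash);
        if (it == index.end()) return false;

        std::ifstream in(data_file, std::ios::binary);
        in.seekg(it->second);

        std::uint32_t len = 0;                         // length-prefixed record
        in.read(reinterpret_cast<char*>(&len), sizeof(len));
        value.resize(len);
        in.read(&value[0], len);
        return static_cast<bool>(in);
    }
};

int main() {
    DiskBackedMap db;      // the index would be filled in when values.dat is written
    std::string v;
    if (db.lookup(42, v)) std::cout << v << '\n';
}
```

Only the index (a few tens of bytes per entry) has to fit in memory; the values themselves stay on disk or in the database.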



Comments

  • user1061293
    user1061293 almost 2 years

    What is the maximum size of a hashmap/map object in C++ and Java? I want to use a hashmap, but I am working with huge data. I am worried that if I use it on large data, it may crash because of its capacity limit. Is that so? If so, what could be an alternative way?

    • Marcelo
      Marcelo over 12 years
      Have you considered using a database?
  • Fred Foo
    Fred Foo over 12 years
    ... but the "size of the actual object" doesn't really mean anything; it's a very minimal lower bound on the actual memory use, only to be used by allocators.
  • Drew Dormann
    Drew Dormann over 12 years
    Neither of those expressions will report the actual memory used by the entire map.
  • Prasanth Kumar
    Prasanth Kumar over 12 years
    @Drew, no, but the first one answers precisely what the OP was asking.
  • Fred Foo
    Fred Foo over 12 years
    There is a hard limit: the maximum value of int, since that's the return type of size().