Why does std::unordered_map have a reserve method?
The unordered_map
container has a reserve
method because it is implemented using buckets, and not a tree as in map
.
A bucket is:
a slot in the container's internal hash table to which elements are assigned based on the hash value of their key. Buckets are numbered from 0 to (bucket_count-1). (source)
A single bucket holds a variable number of items. This number is based on the load_factor
. When the load_factor
reaches a certain threshold, the container increases the number of buckets and rehashes the map.
When you call reserve(n)
, the container creates enough buckets to hold at least n
items.
This is in contrast to rehash(n)
which directly sets the number of buckets to n
and triggers a rebuild of the entire hash table.
See also: Pre-allocating buckets in a C++ unordered_map
Edit in Response to Comments
As I do not know the exact answer to the question posed in the comments, and as my preliminary research did not prove fruitful, I decided to test it experimentally.
For reference, the question boils down to:
Could you please explain if reserving buckets for n elements is the same as allocating memory for n elements?
According to this answer, accurately retrieving the size of the allocated space in an unordered_map
is tricky and unreliable. So I decided to make use of Visual Studio 2015's diagnostic tools.
First, my test case is as follows:
#include <unordered_map>
#include <cstdint>
struct Foo
{
Foo() : x(0.0f), y(0.0f), z(0.0f) { }
float x;
float y;
float z;
};
int32_t main(int32_t argc, char** argv)
{
std::unordered_map<uint32_t, Foo> mapNoReserve;
std::unordered_map<uint32_t, Foo> mapReserve;
// --> Snapshot A
mapReserve.reserve(1000);
// --> Snapshot B
for(uint32_t i = 0; i < 1000; ++i)
{
mapNoReserve.insert(std::make_pair(i, Foo()));
mapReserve.insert(std::make_pair(i, Foo()));
}
// -> Snapshot C
return 0;
}
Where the comments indicate, I took a memory snapshot.
The results were as follows:
Snapshot A:
┌──────────────┬──────────────┬──────────────┐
| Map | Size (Bytes) | Bucket Count |
|--------------|--------------|--------------|
| mapNoReserve | 64 | 8 |
| mapReserve | 64 | 8 |
└──────────────┴──────────────┴──────────────┚
Snapshot B:
┌──────────────┬──────────────┬──────────────┐
| Map | Size (Bytes) | Bucket Count |
|--------------|--------------|--------------|
| mapNoReserve | 64 | 8 |
| mapReserve | 8231 | 1024 |
└──────────────┴──────────────┴──────────────┚
Snapshot C:
┌──────────────┬──────────────┬──────────────┐
| Map | Size (Bytes) | Bucket Count |
|--------------|--------------|--------------|
| mapNoReserve | 24024 | 1024 |
| mapReserve | 24024 | 1024 |
└──────────────┴──────────────┴──────────────┚
Interpretation:
As you can see from the snapshot, it appears that both maps grow in size once we start adding elements to them, even the one that had called reserve
.
So does reserve
offer a benefit even though memory is still allocated? I would say yes for two reasons: (1) It pre-allocates the memory for the buckets, and (2) it can prevent the need of a rehash
, which as discussed earlier completely rebuilds the map.
Related videos on Youtube
Mikubyte
Updated on September 15, 2022Comments
-
Mikubyte over 1 year
According to this you cannot reserve space for
std::map
:No, the members of the map are internally stored in a tree structure. There is no way to build the tree until you know the keys and values that are to be stored.
From this it is obvious why
std::map
would lack areserve()
method, which it does on cppreference.com. However,std::unordered_map
does have areserve()
method, but when I try to use it withoperator[]
,insert()
oremplace()
they all go to allocate memory despite me having calledreserve()
first.What's up with this? Why won't
reserve()
properly reserve the required space? And if it's as with the map that you cannot allocate memory beforehand, then why doesstd::unordered_map
even have areserve()
method in the first place?-
nos about 7 yearsCan you desccribe the method you used to determine that operator[], insert() or emplace() allocate memory instead of using the memory set aside from .reserve() ?
-
Mikubyte about 7 years@nos I stepped through the assembly in every call and they all ended up in some sort of
List_buy()
orBuyNode()
call which eventually calledoperator new()
and thenmalloc()
.
-
-
Mikubyte about 7 yearsIn the linked question it's said that both
rehash()
andreserve()
pre-allocate buckets, but it is also said thatreserve(pow(2, x))
: but now pow(2, x) is the number of elements you are planning on inserting. What does that mean? Does it mean thatreserve()
will know that if you give it 1 million as argument it'll need X buckets? Regardless, if you reserve the buckets you need, then why do you still need to allocate memory? Is the reservation just an optimization so that you won't have to create new buckets later on, but actual elements will still need memory allocated for them? -
ssell about 7 yearsYes, when you call
reserve(n)
, enough buckets are created to hold at leastn
items. If you then add> n
items to the map, a rehash may be triggered depending on the load factor. -
Mikubyte about 7 yearsI still don't see how this differs from allocated memory, though, sorry. Could you please explain if reserving buckets for
n
elements is the same as allocating memory forn
elements? Because clearly it callsnew()
internally when I insert despite reserving. -
Mikubyte about 7 yearsThank you for the detailed answer, it cleared up some misconceptions even though I ended up not using a map at all (for reasons unrelated to the question).
-
ssell about 7 yearsNo worries, thank you for the good question. I learned a decent bit about the internal workings of
unordered_map
along the way myself. -
ローウ over 5 yearsCould you please explain how you obtained the size (in bytes) of those
std::unordered_map
objects? -
ssell over 5 years@ネロクThe size was determined by taking snapshots of memory usage in Visual Studio's Diagnostic Tools. For more information, see MSDN - Memory Usage