Choosing between std::map and std::unordered_map
Solution 1
As already mentioned, map
allows to iterate over the elements in a sorted way, but unordered_map
does not. This is very important in many situations, for example displaying a collection (e.g. address book). This also manifests in other indirect ways like: (1) Start iterating from the iterator returned by find()
, or (2) existence of member functions like lower_bound()
.
Also, I think there is some difference in the worst case search complexity.
For
map
, it is O( lg N )For
unordered_map
, it is O( N ) [This may happen when the hash function is not good leading to too many hash collisions.]
The same is applicable for worst case deletion complexity.
Solution 2
In addition to the answers above you should also note that just because unordered_map
is constant speed (O(1)
) doesn't mean that it's faster than map
(of order log(N)
). The constant may be bigger than log(N)
especially since N
is limited by 232 (or 264).
So in addition to the other answers (map
maintains order and hash functions may be difficult) it may be that map
is more performant.
For example in a program I ran for a blog post I saw that for VS10 std::unordered_map
was slower than std::map
(although boost::unordered_map
was faster than both).
Note 3rd through 5th bars.
Solution 3
This is due to Google's Chandler Carruth in his CppCon 2014 lecture
std::map
is (considered by many to be) not useful for performance-oriented work: If you want O(1)-amortized access, use a proper associative array (or for lack of one, std::unorderded_map
); if you want sorted sequential access, use something based on a vector.
Also, std::map
is a balanced tree; and you have to traverse it, or re-balance it, incredibly often. These are cache-killer and cache-apocalypse operations respectively... so just say NO to std::map
.
You might be interested in this SO question on efficient hash map implementations.
(PS - std::unordered_map
is cache-unfriendly because it uses linked lists as buckets.)
Solution 4
I think it's obvious that you'd use the std::map
you need to iterate across items in the map in sorted order.
You might also use it when you'd prefer to write a comparison operator (which is intuitive) instead of a hash function (which is generally very unintuitive).
Solution 5
Say you have very large keys, perhaps large strings. To create a hash value for a large string you need to go through the whole string from beginning to end. It will take at least linear time to the length of the key. However, when you only search a binary tree using the >
operator of the key each string comparison can return when the first mismatch is found. This is typically very early for large strings.
This reasoning can be applied to the find
function of std::unordered_map
and std::map
. If the nature of the key is such that it takes longer to produce a hash (in the case of std::unordered_map
) than it takes to find the location of an element using binary search (in the case of std::map
), it should be faster to lookup a key in the std::map
. It's quite easy to think of scenarios where this would be the case, but they would be quite rare in practice i believe.
Johann Gerell
Updated on August 20, 2020Comments
-
Johann Gerell over 3 years
Now that
std
has a real hash map inunordered_map
, why (or when) would I still want to use the good oldmap
overunordered_map
on systems where it actually exists? Are there any obvious situations that I cannot immediately see? -
paulm over 9 yearswhat is the value of N in this graph?
-
Motti over 9 years@paulm, as I stated in the blog post
N=10,000,000
. -
Tony Delroy over 8 yearsThe blog link has gone the way of the dodo, and the results presented here are of little value without that context, as the time needed to hash vs. compare things varies hugely with the exact hash function, data type, length, and values. That's particularly important with the VC++ Standard Library, as hash functions are fast but collision prone: numbers passed through unaltered, only 10 characters spaced along a string of any length are combined in the hash value, bucket counts aren't prime. (GNU is at the opposite end of the spectrum).
-
Motti over 8 years@TonyD, the blog post link still works for me.
-
Maxim Galushka over 8 yearsyou are right about worst case, but this post is somehow misleading - as on average std::unordered_map is O(1) for search complexity which is much better then std::map
-
n. m. about 8 yearsVS10, that's your problem right here.
-
Motti about 8 years@n.m. Sadly my time machine wasn't working on Oct 11 2010.
-
Erik Alapää over 6 yearsIt is very important to understand that for some applications, worst case performance is crucial to know and is the deciding factor. For some hard real-time systems, having a linear worst case like the hashtable is not acceptable. std::map is always O(lg N), which is a very nice property to have.
-
Sirmabus over 2 yearsThis is a perfectly valid post. I have found this myself. It's a good example of theory vs reality. In theory a hash map should be faster, but they might actually be slower. Never take for granted what textbooks nor what the "experts" say. Especially for performance desktop CPUs caching can be a big factor in container algorithms. Theory can be way off from reality. For when performance really matters profile things yourself and experiment trying the various options. You might be surprised what you find. Plus there are down to clock cycle profiling tools for most platforms to tune things.