Choosing between std::map and std::unordered_map


Solution 1

As already mentioned, map lets you iterate over the elements in sorted order, but unordered_map does not. This matters in many situations, for example when displaying a collection (e.g. an address book). It also shows up in indirect ways, such as (1) iterating onward from the iterator returned by find(), or (2) the existence of member functions like lower_bound().
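A minimal sketch of the three uses just mentioned: sorted iteration, iterating onward from find(), and lower_bound(). The AddressBook alias and the helper function names are made up for this illustration.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical address book for illustration: name -> phone number.
using AddressBook = std::map<std::string, std::string>;

// Collect every name; std::map iterates in ascending key order.
std::vector<std::string> all_names(const AddressBook& book) {
    std::vector<std::string> names;
    for (const auto& entry : book) names.push_back(entry.first);
    return names;
}

// Collect names starting at the first key that is not less than `from`.
std::vector<std::string> names_from(const AddressBook& book, const std::string& from) {
    std::vector<std::string> names;
    for (auto it = book.lower_bound(from); it != book.end(); ++it)
        names.push_back(it->first);
    return names;
}

// Collect the names that sort after an exact match located with find().
std::vector<std::string> names_after(const AddressBook& book, const std::string& key) {
    std::vector<std::string> names;
    auto it = book.find(key);
    if (it == book.end()) return names;  // key not present
    for (++it; it != book.end(); ++it) names.push_back(it->first);
    return names;
}
```

With unordered_map, find() still returns an iterator, but continuing from it visits elements in an unspecified order, and there is no lower_bound() member at all.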

Also, the worst-case search complexity differs:

  • For map, it is O(lg N).

  • For unordered_map, it is O(N). [This can happen when a poor hash function causes too many collisions.]

The same applies to worst-case deletion complexity.
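To make the O(N) worst case concrete, here is a sketch that uses a deliberately bad hash. ConstantHash and both helper functions are invented for this example. Because every key hashes to the same value, every element lands in one bucket and a lookup has to walk through all of them.

```cpp
#include <cstddef>
#include <unordered_map>

// Deliberately bad hash, for illustration only: every key gets the same value.
struct ConstantHash {
    std::size_t operator()(int) const noexcept { return 0; }
};

// Size of the fullest bucket, using the standard bucket interface.
template <class Map>
std::size_t largest_bucket(const Map& m) {
    std::size_t largest = 0;
    for (std::size_t b = 0; b < m.bucket_count(); ++b)
        if (m.bucket_size(b) > largest) largest = m.bucket_size(b);
    return largest;
}

// Insert keys 0..n-1 under the constant hash and report the fullest bucket.
std::size_t largest_bucket_with_constant_hash(int n) {
    std::unordered_map<int, int, ConstantHash> m;
    for (int i = 0; i < n; ++i) m[i] = i;
    return largest_bucket(m);
}
```

Calling largest_bucket_with_constant_hash(1000) returns 1000: the table has degenerated into a single linear chain.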

Solution 2

In addition to the answers above, note that just because unordered_map lookup is constant time (O(1)) doesn't mean that it's faster than map (O(log N)). The constant may be bigger than log(N), especially since N is limited by 2^32 (or 2^64), so log2(N) is at most 32 (or 64).

So in addition to the other answers (map maintains order and hash functions may be difficult) it may be that map is more performant.
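A rough way to check this on your own compiler and data is a small timing harness like the one below. This is only a sketch: time_insert_and_find is a made-up helper, the int keys are an assumption, and a serious benchmark would need warm-up, repetition and realistic keys.

```cpp
#include <chrono>
#include <cstdint>
#include <map>
#include <unordered_map>

// Fill a container with keys 0..n-1, then look every key up once.
// Returns the elapsed time in milliseconds. The sum of the looked-up values
// is written to `checksum` so the compiler cannot discard the lookups.
template <class Map>
double time_insert_and_find(int n, std::uint64_t& checksum) {
    const auto start = std::chrono::steady_clock::now();
    Map m;
    for (int i = 0; i < n; ++i) m.emplace(i, i);
    checksum = 0;
    for (int i = 0; i < n; ++i)
        checksum += static_cast<std::uint64_t>(m.find(i)->second);
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

For example, compare time_insert_and_find<std::map<int, int>>(n, sum) against time_insert_and_find<std::unordered_map<int, int>>(n, sum) for the sizes you actually expect. The results depend on the standard library, the key type and the hash function.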

For example in a program I ran for a blog post I saw that for VS10 std::unordered_map was slower than std::map (although boost::unordered_map was faster than both).

[Figure: performance graph]

Note 3rd through 5th bars.

Solution 3

This advice comes from Google's Chandler Carruth, in his CppCon 2014 lecture:

std::map is (considered by many to be) not useful for performance-oriented work: if you want O(1)-amortized access, use a proper associative array (or, for lack of one, std::unordered_map); if you want sorted sequential access, use something based on a vector.

Also, std::map is a balanced tree; and you have to traverse it, or re-balance it, incredibly often. These are cache-killer and cache-apocalypse operations respectively... so just say NO to std::map.
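As an illustration of "something based on a vector", here is a minimal sketch of a sorted vector searched with std::lower_bound. FlatMap is a hypothetical class written for this example, not a standard container and not code from the lecture. Keys and values are int to keep it short.

```cpp
#include <algorithm>
#include <utility>
#include <vector>

// Minimal "flat map" sketch: key/value pairs kept sorted by key in a vector.
class FlatMap {
public:
    // Insert or overwrite. O(N) in the worst case because later elements shift.
    void insert_or_assign(int key, int value) {
        auto it = std::lower_bound(items_.begin(), items_.end(), key, less_key);
        if (it != items_.end() && it->first == key) it->second = value;
        else items_.emplace(it, key, value);
    }

    // Binary search over contiguous memory, O(log N).
    // Returns a pointer to the value, or nullptr if the key is absent.
    const int* find(int key) const {
        auto it = std::lower_bound(items_.begin(), items_.end(), key, less_key);
        return (it != items_.end() && it->first == key) ? &it->second : nullptr;
    }

    // Sorted sequential access is plain iteration over the vector.
    const std::vector<std::pair<int, int>>& items() const { return items_; }

private:
    static bool less_key(const std::pair<int, int>& item, int key) {
        return item.first < key;
    }
    std::vector<std::pair<int, int>> items_;
};
```

The trade-off is that insertion in the middle is linear, so this layout suits data that is built once (or rarely changed) and then read many times.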

You might be interested in this SO question on efficient hash map implementations.

(PS - std::unordered_map is cache-unfriendly because it uses linked lists as buckets.)

Solution 4

I think it's obvious that you'd use std::map when you need to iterate across the items in the map in sorted order.

You might also use it when you'd prefer to write a comparison operator (which is intuitive) instead of a hash function (which is generally very unintuitive).
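The difference in effort can be seen with a small user-defined key. In this sketch, Name, NameHash and the two aliases are hypothetical, and the hash combiner is just one common pattern, not a recommendation. std::map needs only an ordering, while std::unordered_map needs both equality and a hash.

```cpp
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <tuple>
#include <unordered_map>

// Hypothetical key type for illustration.
struct Name {
    std::string first;
    std::string last;
};

// All std::map needs: an ordering (here by last name, then first name).
inline bool operator<(const Name& a, const Name& b) {
    return std::tie(a.last, a.first) < std::tie(b.last, b.first);
}

// std::unordered_map needs equality...
inline bool operator==(const Name& a, const Name& b) {
    return a.first == b.first && a.last == b.last;
}

// ...and a hash. Combining the member hashes well is the unintuitive part;
// this mixing step is an assumption of the sketch, not a vetted hash.
struct NameHash {
    std::size_t operator()(const Name& n) const {
        const std::size_t h1 = std::hash<std::string>{}(n.first);
        const std::size_t h2 = std::hash<std::string>{}(n.last);
        return h1 ^ (h2 + 0x9e3779b9u + (h1 << 6) + (h1 >> 2));
    }
};

using OrderedAges = std::map<Name, int>;
using HashedAges = std::unordered_map<Name, int, NameHash>;
```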

Solution 5

Say you have very large keys, perhaps large strings. To create a hash value for a large string you need to go through the whole string from beginning to end, which takes time at least linear in the length of the key. However, when you search a binary tree using the key's < operator (the default comparison for std::map), each string comparison can return as soon as the first mismatch is found. This is typically very early for large strings.

This reasoning can be applied to the find function of std::unordered_map and std::map. If the nature of the key is such that it takes longer to produce a hash (in the case of std::unordered_map) than it takes to find the location of an element using binary search (in the case of std::map), it should be faster to look up a key in the std::map. It's quite easy to think of scenarios where this would be the case, but I believe they would be quite rare in practice.
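The argument can be made concrete by counting how many characters each operation inspects. Both functions below are illustrative helpers written for this sketch. FNV-1a stands in for "a hash that reads the whole string"; an actual std::hash implementation may behave differently.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Lexicographic "less than" that reports how many characters it inspected.
// It stops at the first mismatch, as an ordinary string comparison can.
bool less_counting(const std::string& a, const std::string& b, std::size_t& inspected) {
    inspected = 0;
    const std::size_t n = a.size() < b.size() ? a.size() : b.size();
    for (std::size_t i = 0; i < n; ++i) {
        ++inspected;
        if (a[i] != b[i])
            return static_cast<unsigned char>(a[i]) < static_cast<unsigned char>(b[i]);
    }
    return a.size() < b.size();  // one string is a prefix of the other
}

// 64-bit FNV-1a hash that reports how many characters it inspected,
// which is always the full length of the string.
std::uint64_t fnv1a_counting(const std::string& s, std::size_t& inspected) {
    std::uint64_t h = 14695981039346656037ull;  // FNV offset basis
    inspected = 0;
    for (unsigned char c : s) {
        h ^= c;
        h *= 1099511628211ull;  // FNV prime
        ++inspected;
    }
    return h;
}
```

For two 1000-character strings that differ at the third character, the comparison inspects 3 characters while the hash inspects all 1000. Keep in mind that a map lookup performs about log2(N) such comparisons, and keys that share long common prefixes remove most of the early-exit advantage.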

Author: Johann Gerell

Updated on August 20, 2020

Comments

  • Johann Gerell over 3 years

    Now that std has a real hash map in unordered_map, why (or when) would I still want to use the good old map over unordered_map on systems where it actually exists? Are there any obvious situations that I cannot immediately see?

  • paulm over 9 years
    What is the value of N in this graph?
  • Motti over 9 years
    @paulm, as I stated in the blog post N=10,000,000.
  • Tony Delroy over 8 years
    The blog link has gone the way of the dodo, and the results presented here are of little value without that context, as the time needed to hash vs. compare things varies hugely with the exact hash function, data type, length, and values. That's particularly important with the VC++ Standard Library, as hash functions are fast but collision prone: numbers passed through unaltered, only 10 characters spaced along a string of any length are combined in the hash value, bucket counts aren't prime. (GNU is at the opposite end of the spectrum).
  • Motti over 8 years
    @TonyD, the blog post link still works for me.
  • Maxim Galushka over 8 years
    You are right about the worst case, but this post is somewhat misleading, as on average std::unordered_map is O(1) for search complexity, which is much better than std::map.
  • n. m. about 8 years
    VS10, that's your problem right here.
  • Motti about 8 years
    @n.m. Sadly my time machine wasn't working on Oct 11 2010.
  • Erik Alapää over 6 years
    It is very important to understand that for some applications, worst case performance is crucial to know and is the deciding factor. For some hard real-time systems, having a linear worst case like the hashtable is not acceptable. std::map is always O(lg N), which is a very nice property to have.
  • Sirmabus over 2 years
    This is a perfectly valid post. I have found this myself. It's a good example of theory vs. reality. In theory a hash map should be faster, but it might actually be slower. Never take for granted what textbooks or the "experts" say. Especially on performance desktop CPUs, caching can be a big factor in container algorithms. Theory can be way off from reality. When performance really matters, profile things yourself and experiment with the various options. You might be surprised by what you find. Plus, there are clock-cycle-level profiling tools for most platforms to tune things.