std::hash_set vs std::unordered_set, are they the same thing?

Solution 1

The complexity requirements that the C++ standard sets out for the unordered_ containers leave little room for the implementation: it has to be some sort of hash table. The standard was written in full awareness that most vendors had already shipped such data structures as extensions.
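
As a rough illustration of why the implementation has to be a hash table, the standard interface of std::unordered_set exposes the buckets directly. The sketch below uses only standard members (bucket_count, bucket_size, bucket); the helper names count_via_buckets and example are made up for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_set>

// Hypothetical helper: counts elements by walking every bucket.
// Each element lives in exactly one bucket, so the sum equals size().
std::size_t count_via_buckets(const std::unordered_set<int>& s) {
    std::size_t total = 0;
    for (std::size_t b = 0; b < s.bucket_count(); ++b) {
        total += s.bucket_size(b);
    }
    return total;
}

// Hypothetical demo; call it from main() to run the checks.
void example() {
    std::unordered_set<int> s;
    for (int i = 0; i < 1000; ++i) {
        s.insert(i);
    }
    assert(count_via_buckets(s) == s.size());
    // bucket(key) names the bucket a key hashes to; the bucket of a stored
    // key holds at least that key.
    assert(s.bucket_size(s.bucket(42)) >= 1);
}
```

How many buckets there are, and how elements are spread across them, is left to the implementation, which is why no specific numbers appear above.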

Compiler vendors typically called those containers "hash map" or "hash set", which is probably what you're referring to. There is no literal std::hash_set in the standard; GCC ships one in a separate namespace (__gnu_cxx), and other compilers have similar extensions in namespaces of their own.

When the new standard was written, the authors wanted to avoid possible confusion with existing extension libraries, so they went for a name that reflects the typical C++ mindset: say what it is, not how it's implemented. The unordered containers are, well, unordered. That means you get less from them than from the ordered containers (no sorted iteration, no range queries), but in exchange lookups take constant time on average instead of logarithmic time.
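
To make the trade-off concrete, here is a minimal sketch comparing std::set and std::unordered_set. The helpers first_at_least, contains and example are hypothetical names for this illustration; the container calls themselves (lower_bound, find) are standard.

```cpp
#include <cassert>
#include <set>
#include <unordered_set>
#include <vector>

// Ordered container: supports range queries such as "first key >= x".
// Returns false if no such key exists; otherwise stores it in `out`.
bool first_at_least(const std::set<int>& s, int x, int& out) {
    auto it = s.lower_bound(x);  // O(log n)
    if (it == s.end()) {
        return false;
    }
    out = *it;
    return true;
}

// Unordered container: membership test only, O(1) on average.
bool contains(const std::unordered_set<int>& s, int x) {
    return s.find(x) != s.end();
}

// Hypothetical demo; call it from main() to run the checks.
void example() {
    const std::set<int> ordered = {5, 1, 9};
    const std::unordered_set<int> unordered = {5, 1, 9};

    // Iterating a std::set always visits keys in ascending order.
    const std::vector<int> visited(ordered.begin(), ordered.end());
    const std::vector<int> ascending = {1, 5, 9};
    assert(visited == ascending);

    int found = 0;
    assert(first_at_least(ordered, 4, found) && found == 5);
    assert(!first_at_least(ordered, 10, found));

    assert(contains(unordered, 9));
    assert(!contains(unordered, 4));
}
```

std::unordered_set has no lower_bound member at all, so the range query simply cannot be written against it.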

Implementation-wise, hash_set, Boost-unordered, TR1-unordered and C++11-unordered will be very similar, if not identical.

Solution 2

Regarding the question "are they the same thing" from the subject line: based on my experience of upgrading code from __gnu_cxx::hash_set to std::unordered_set, they are almost, but not exactly, the same thing.

The difference that I ran into is that iterating through a __gnu_cxx::hash_set returned the items in what appeared to be the original order of insertion, whereas std::unordered_set did not. So, as the name implies, one cannot rely on an iterator to return the items in any particular order when iterating through an entire std::unordered_set.
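
If migrated code happened to depend on the apparent insertion order, one possible workaround (a sketch, not the only option) is to keep the order in a separate std::vector and use the set only for membership, or to sort a snapshot of the keys when any deterministic order will do. The helper names below are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical helper: drops duplicates while preserving first-seen order.
// The set answers "seen before?"; the vector carries the order.
std::vector<std::string> unique_in_insertion_order(
        const std::vector<std::string>& in) {
    std::unordered_set<std::string> seen;
    std::vector<std::string> out;
    for (const auto& item : in) {
        if (seen.insert(item).second) {  // .second is true for a new key
            out.push_back(item);
        }
    }
    return out;
}

// Hypothetical helper: a sorted snapshot, for when a deterministic
// (but not insertion) order is enough.
std::vector<std::string> sorted_keys(
        const std::unordered_set<std::string>& s) {
    std::vector<std::string> keys(s.begin(), s.end());
    std::sort(keys.begin(), keys.end());
    return keys;
}

// Hypothetical demo; call it from main() to run the checks.
void example() {
    const std::vector<std::string> input = {"b", "a", "b", "c", "a"};
    const std::vector<std::string> expected = {"b", "a", "c"};
    assert(unique_in_insertion_order(input) == expected);

    const std::unordered_set<std::string> fruit = {"pear", "apple", "fig"};
    const std::vector<std::string> sorted = {"apple", "fig", "pear"};
    assert(sorted_keys(fruit) == sorted);
}
```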

Solution 3

Visual Studio 2010, for example, has both hash_xxx and unordered_xxx, and if you look through the headers, at least their implementation is the same for all of those (same base/"policy" classes). For other compilers, I don't know, but given how hash containers usually have to be implemented, I guess there won't be many differences, if any at all.

Solution 4

They are pretty much the same thing. The standard (C++0x, now C++11) name is unordered_set. hash_set was the earlier name used by the SGI STL and by vendor extensions.
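
For code bases that still spell the old name, one possible stopgap during a migration is an alias template. This is only a sketch: the namespace compat is hypothetical, and the parameter order (key, hash, equality) assumes the GCC/SGI-style hash_set; a vendor whose hash_set takes different template parameters would need a different mapping.

```cpp
#include <cassert>
#include <functional>
#include <unordered_set>

namespace compat {  // hypothetical namespace, not part of any library

// Maps the old spelling onto the standard container (requires C++11).
template <class Key,
          class Hash = std::hash<Key>,
          class Equal = std::equal_to<Key>>
using hash_set = std::unordered_set<Key, Hash, Equal>;

}  // namespace compat

// Hypothetical demo; call it from main() to run the checks.
void example() {
    compat::hash_set<int> ids;
    ids.insert(7);
    ids.insert(7);  // duplicate key, ignored
    assert(ids.size() == 1);
    assert(ids.count(7) == 1);
    assert(ids.find(8) == ids.end());
}
```

New code should simply use std::unordered_set directly.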

Author by

unixman83

I am a computer programmer. My favorite programming languages are C++ and Perl.

Updated on August 22, 2020

Comments

  • unixman83
    unixman83 over 3 years

    I know hash_set is non-standard and unordered_set is standard. However, I am wondering: performance-wise, what is the difference between the two? Why do they exist separately?

    • Jonathan Grynspan
      Jonathan Grynspan over 12 years
      They exist separately because one was created, and then the other was made part of the draft standard. They weren't created at the same time.
    • Nicol Bolas
      Nicol Bolas over 12 years
      @JonathanGrynspan: Why don't you make that an answer? Since it, you know, answers the question ;)
    • unixman83
      unixman83 over 12 years
      Do they both use the same algorithm?
    • Jonathan Grynspan
      Jonathan Grynspan over 12 years
      @Nicol It only answers part of the question. :) I don't know anything about the performance characteristics of one class vs. the other, so I don't have a complete answer.
    • Nicol Bolas
      Nicol Bolas over 12 years
      @JonathanGrynspan: They're hash tables. They have the performance characteristics of hash tables. If they didn't, then they wouldn't be hash tables anymore. Now, whether they're good implementations of hash tables depends on the particular implementation of the class, which is not something that can be answered in general.
    • Jonathan Grynspan
      Jonathan Grynspan over 12 years
      @Nicol Which would be the reason I didn't answer that part of the question. As far as anyone knows, I don't know anything about implementation details, not even the stuff I do know. :P
  • unixman83
    unixman83 over 12 years
    By "pretty much", do you mean that only the name differs? MSVC includes them both, that's why I am curious.
  • David Nehme
    David Nehme over 12 years
    MSVC had a hash_set in an earlier implementation. They are likely keeping it for a while to make things easier for developers who used hash_set. MS moved hash_set out of std and into the stdext namespace. You should use unordered_set for any new code. The specific algorithm used by either will be compiler-dependent.
  • Christian Rau
    Christian Rau over 12 years
    And also the interface of MSVC's hash_set is slightly different from that of unordered_set, whereas GCC's hash_set interface is rather similar to unordered_set, if I remember correctly.
  • h9uest
    h9uest about 9 years
    I think the namespace for hash_set you referred to is __gnu_cxx.