std::vector faster than std::unordered_set?

10,944

Solution 1

There are two possibilities I can think of:

  1. You have a small enough number of data elements that a linear search is faster than a hash-plus compare lookup.
  2. You're using the same contains function to find an element in the unordered_set, instead of using the member function find.

Solution 2

If the number of duplicate bodies isn't that high compared to the others, one option might be to just push all your bodies into the vector and remove the duplicates afterwards. But this will require a std::sort followed by an erase(std::unique, end).

But it may be worth a try, considering that your vector seems to outplay a std::unordered_set anyway, which doesn't have the same memory locality and trivial access like a std::vector.

Share:
10,944
Vittorio Romeo
Author by

Vittorio Romeo

I write code, lift weights and play games. I also like everything sci-fi.

Updated on July 20, 2022

Comments

  • Vittorio Romeo
    Vittorio Romeo almost 2 years

    In my custom physics engine, the biggest bottleneck is a method that gets all bodies from the spatial partitioning (a 2D grid), and returns a collection containing only unique pointers to body.

    template<typename T, typename V> bool contains(const T& mContainer, const V& mValue)
    {
        return std::find(std::begin(mContainer), 
                         std::end(mContainer), mValue) != std::end(mContainer);
    }
    
    const vector<Body*>& GridInfo::getBodiesToCheck()
    {
        bodiesToCheck.clear();
        for(auto& query : queries)
            for(auto& body : *query)
                if(!contains(bodiesToCheck, body)) bodiesToCheck.push_back(body);
        return bodiesToCheck;
    }
    

    Using a profiler shows that the bottleneck is in the "contains" method.

    Obviously, an std::unordered_set would be the "ideal" solution here. However, it is a lot slower than the current solution. I've also tried google::dense_hash_set, which is faster than std::unordered_set, but still slower than the current solution.

    const unordered_set<Body*>& GridInfo::getBodiesToCheck()
    {
        bodiesToCheck.clear();
        for(auto& query : queries)
            for(auto& body : *query)
                /*if(!contains(bodiesToCheck, body))*/ bodiesToCheck.insert(body);
        return bodiesToCheck;
    }
    

    Why are the "correct" containers slower than a std::vector?

    Is there any way I can speed up this method further?

  • Vittorio Romeo
    Vittorio Romeo about 11 years
    As I only care about returning a collection of unique Body*, I didn't use "contains" or "find" on unordered_set. I just used insert expecting it to only be filled with unique elements.
  • Christian Rau
    Christian Rau about 11 years
    But then again when using unordered_map there is absolutely no need for std::find anyway (and the OP confirmed to not have done such stupid mistake).
  • Vittorio Romeo
    Vittorio Romeo about 11 years
    I tried it, but the performance is slower than what I currently have.
  • Christian Rau
    Christian Rau about 11 years
    "If you really need better performance, the only data container I can think of would be some kind of hash table" - Uh...you mean...like a std::unordered_set?
  • Mppl
    Mppl about 11 years
    Yeah, you're right... Unordered set is indeed an Hash table... My bad.
  • Christian Rau
    Christian Rau about 11 years
    I guess that std::unique_set should be a std::unordered_set? Other than that, I don't think there is any need for him to iterate over the std::unordered_set, at least not in the code snippet in question (and the one he profiled and wants to speed up). It's just std::vector+std::find vs std::unordered_set::insert, so in your case he would have the same overhead like his existing std::unordered_set solution and the overhead of a vector insert.
  • David Rodríguez - dribeas
    David Rodríguez - dribeas about 11 years
    @ChristianRau: Yes, unordered (need to get caffeine infused immediately!)