Which STL Container to use?

Solution 1

I think you should check this SO post: "In which scenario do I use a particular STL container?". For small sizes, a vector will suit most scenarios irrespective of what you intend to do.

The chart is only a guide, though. The fact that the container is accessed regularly does not affect the choice of container, and the fact that you are storing int is unimportant unless you care about the memory footprint of the container. In that case the question is: does the per-element pointer overhead of a list or map matter to you?

Sorting is done automatically by a map, but sorting a vector or a list can also be very fast if the container is small enough to fit in memory.

Insertion anywhere in the container is optimised for lists and maps. With a map you get the added benefit that it keeps itself sorted, but again, if the size is small enough, constructing a new vector with the new entry could still be very fast.

You may also want to consider hash maps. Either way, you would still be best off profiling your code: what is optimal depends on your usage, so rather than trying to second-guess it you really need to measure.
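One caveat if you try a hash container for the problem in the question: it gives fast membership tests but keeps no order, so a "closest number" lookup has to look at every element. A minimal sketch, assuming std::unordered_set<int> as the hash container (the helper names has_value and closest_by_scan are made up for illustration):

```cpp
#include <cassert>
#include <cstdlib>
#include <unordered_set>

// Membership test: average O(1) in a hash container.
bool has_value(const std::unordered_set<int>& s, int x) {
    return s.find(x) != s.end();
}

// Closest element to `target`. A hash container has no ordering and no
// lower_bound, so this is a full O(n) scan. Ties go to the smaller value.
// Precondition: `s` is not empty.
int closest_by_scan(const std::unordered_set<int>& s, int target) {
    assert(!s.empty());
    int best = *s.begin();
    long long best_dist = std::llabs(static_cast<long long>(best) - target);
    for (int x : s) {
        long long dist = std::llabs(static_cast<long long>(x) - target);
        if (dist < best_dist || (dist == best_dist && x < best)) {
            best = x;
            best_dist = dist;
        }
    }
    return best;
}
```

With 100k elements that scan on every lookup is likely to dominate, which is a reason to prefer a sorted container for this particular use case.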

You could also just decide that a std::map or a std::set is a fine enough balance and use those. They sort automatically on insertion and deletion, and lookup is fast, but maintaining the pointers in each entry increases the memory used compared to a vector. If you don't care about that, these containers are worth considering.

Still, if it matters, then test, profile and compare the performance of each container. You may be surprised by how the code performs against your assumptions.

Solution 2

If the requirement is just performance, the choice should basically always be a std::vector.

It avoids the many memory allocations of node-based data structures (trees and lists), and it exploits spatial locality for much more efficient traversal.

Of course, insertions/removals at the middle of the vector require elements to be moved, but even that is rarely enough to make the vector slower than other data structures.
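To make the cost concrete, here is a rough sketch of keeping a std::vector<int> sorted by hand. The helper names sorted_insert and sorted_erase are made up for this example. Finding the position is a binary search; the insert or erase then shifts the tail of the vector, which is O(n) but works on contiguous memory.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Insert `value` into an already sorted vector, keeping it sorted.
void sorted_insert(std::vector<int>& v, int value) {
    // Debug-only sanity check (itself O(n); compiled out with NDEBUG).
    assert(std::is_sorted(v.begin(), v.end()));
    auto pos = std::lower_bound(v.begin(), v.end(), value);  // O(log n)
    v.insert(pos, value);  // shifts every element after `pos`
}

// Remove one occurrence of `value`. Returns false if it is not present.
bool sorted_erase(std::vector<int>& v, int value) {
    auto pos = std::lower_bound(v.begin(), v.end(), value);
    if (pos == v.end() || *pos != value) return false;
    v.erase(pos);  // shifts every element after `pos`
    return true;
}
```

Whether the shifting actually hurts at 100k ints is exactly the kind of thing to measure rather than assume.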

The only real reasons I see for using other data structures are these:

  • std::map/std::set: those are great for convenience. Nice and easy to use, so if optimal performance isn't required, I use those when I need a sorted container or a key/value map. (For best performance, a sorted vector may very well be preferable.)
  • all other containers: may be useful for the correctness guarantees they offer in the face of modifications. The vector frequently reallocates and moves its contents, which invalidates both pointers and iterators into the vector. The other data structures offer stronger guarantees there. (For a deque, pointers are guaranteed to stay valid after insertion/removal at the ends, but iterators may still be invalidated. For list, set and map, both pointers and iterators to the remaining elements are guaranteed to stay valid during insertion/removal.)
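A small sketch of the invalidation point, with made-up function names. The first function counts how often a growing vector changes capacity (each change is a reallocation that invalidates every pointer and iterator into it); the second keeps a pointer into a std::list across many insertions.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <list>
#include <vector>

// Count how often a growing vector changes capacity while n elements
// are appended. Every change means the contents were moved elsewhere.
std::size_t count_reallocations(std::size_t n) {
    std::vector<int> v;
    std::size_t reallocations = 0;
    std::size_t capacity = v.capacity();
    for (std::size_t i = 0; i < n; ++i) {
        v.push_back(static_cast<int>(i));
        if (v.capacity() != capacity) {
            ++reallocations;
            capacity = v.capacity();
        }
    }
    return reallocations;
}

// A pointer to a list element stays valid while other elements are
// inserted around it, because list nodes never move.
bool list_pointer_survives_inserts() {
    std::list<int> values{4, 10, 15};
    const int* ten = &*std::next(values.begin());
    assert(*ten == 10);
    for (int i = 0; i < 1000; ++i) {
        values.push_front(i);
        values.push_back(i);
    }
    return *ten == 10;  // still well-defined to dereference
}
```

The exact number of reallocations depends on the standard library's growth policy, so treat the count as illustrative only.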

Of course, these are just rules of thumb.

The only universally true rule when performance is involved is "benchmark it yourself". I can tell you how a vector typically performs in many common scenarios, but I can't tell you how it performs in your code, with your compiler and your standard library. So if you worry about performance, measure it. Try out the different alternatives, and see which is faster.
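As a starting point for that kind of measurement, here is a minimal timing sketch using std::chrono. Everything in it (time_ms, make_input, the fixed seed, the value range) is an assumption chosen for illustration, not a proper benchmarking framework: it times a single run and does nothing about warm-up or noise.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <random>
#include <set>
#include <vector>

// Run `work` once and return the elapsed wall-clock time in milliseconds.
template <typename F>
double time_ms(F&& work) {
    auto start = std::chrono::steady_clock::now();
    work();
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}

// Deterministic pseudo-random input so both containers see the same data.
std::vector<int> make_input(std::size_t n, unsigned seed = 42) {
    std::mt19937 gen(seed);
    std::uniform_int_distribution<int> dist(0, 1000000);
    std::vector<int> out(n);
    for (int& x : out) x = dist(gen);
    return out;
}

// Build a sorted, duplicate-free vector one element at a time.
std::vector<int> build_sorted_vector(const std::vector<int>& input) {
    std::vector<int> v;
    for (int x : input) {
        auto pos = std::lower_bound(v.begin(), v.end(), x);
        if (pos == v.end() || *pos != x) v.insert(pos, x);  // skip duplicates, like std::set
    }
    assert(std::is_sorted(v.begin(), v.end()));
    return v;
}

// Build a std::set one element at a time from the same input.
std::set<int> build_set(const std::vector<int>& input) {
    std::set<int> s;
    for (int x : input) s.insert(x);
    return s;
}
```

A call such as `double t = time_ms([&] { auto v = build_sorted_vector(input); });` gives one sample. Build with optimisations on, repeat the run several times, and make sure the result is actually used so the compiler cannot discard the work.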

Solution 3

A set is efficient enough for insert/remove/access and it is always sorted. The only thing to consider is that entries in a set are const (so that the ordering is not broken), so to change a value you have to remove the old entry and insert the updated one.
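For the "closest number" lookup from the question, a rough sketch with std::set<int> could look like this. The function names closest and replace_value are made up, and breaking ties towards the smaller value is my own assumption since the question doesn't say.

```cpp
#include <cassert>
#include <iterator>
#include <set>

// Return the element of `s` closest to `target`; ties go to the smaller one.
// Precondition: `s` is not empty. The lookup itself is O(log n).
int closest(const std::set<int>& s, int target) {
    assert(!s.empty());
    auto hi = s.lower_bound(target);            // first element >= target
    if (hi == s.begin()) return *hi;            // everything is >= target
    if (hi == s.end()) return *std::prev(hi);   // everything is < target
    auto lo = std::prev(hi);                    // last element < target
    // Compare distances in long long to avoid int overflow.
    long long below = static_cast<long long>(target) - *lo;
    long long above = static_cast<long long>(*hi) - target;
    return below <= above ? *lo : *hi;
}

// Set elements are const, so "changing" a value means erase plus insert.
// Returns false if `old_value` was not in the set.
bool replace_value(std::set<int>& s, int old_value, int new_value) {
    if (s.erase(old_value) == 0) return false;
    s.insert(new_value);
    return true;
}
```

With the set {4, 10, 15}, closest(s, 9) returns 10, which matches the example in the question.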

Solution 4

The answer to your question depends entirely on your data set size. As a list grows to huge sizes, the time it takes to do the linear traversal to reach the element you need to remove or insert at far outweighs the time it takes for a vector to do the removal or insertion. So if your data set is small, go with a list; if it's huge, go with a vector.

Solution 5

If it needs to be sorted, use a binary search tree.

Author by

mister

Research & Development. Hack Stuff.

Updated on September 17, 2020

Comments

  • mister
    mister almost 4 years

    Which STL container should I use if:

    1. Data is inserted and removed regularly.
    2. Data is accessed regularly at random.

    E.g. for the dataset (4, 10, 15), if I want to find the closest number to 9, it should return 10.

    1. I am only storing integers.
    2. It needs to be sorted.
    3. The dataset can grow to 100k elements.

    I thought of using vector, but vector insertion and removal are expensive.

       vector<int>
    

    If I were to use list, I would have to traverse O(n) elements before reaching the data.

       list<int>
    

    I was thinking of using set, since it keeps the data sorted, but I'm not very sure about the efficiency of set.

    So I hope someone can give a good solution!