std::map insert or std::map find?

75,360

Solution 1

The answer is you do neither. Instead you want to do something suggested by Item 24 of Effective STL by Scott Meyers:

typedef map<int, int> MapType;    // Your map type may vary, just change the typedef

MapType mymap;
// Add elements to map here
int k = 4;   // assume we're searching for keys equal to 4
int v = 0;   // assume we want the value 0 associated with the key of 4

MapType::iterator lb = mymap.lower_bound(k);

if(lb != mymap.end() && !(mymap.key_comp()(k, lb->first)))
{
    // key already exists
    // update lb->second if you care to
}
else
{
    // the key does not exist in the map
    // add it to the map
    mymap.insert(lb, MapType::value_type(k, v));    // Use lb as a hint to insert,
                                                    // so it can avoid another lookup
}

Solution 2

The answer to this question also depends on how expensive it is to create the value type you're storing in the map:

typedef std::map <int, int> MapOfInts;
typedef std::pair <MapOfInts::iterator, bool> IResult;

void foo (MapOfInts & m, int k, int v) {
  IResult ir = m.insert (std::make_pair (k, v));
  if (ir.second) {
    // insertion took place (ie. new entry)
  }
  else if ( replaceEntry ( ir.first->first ) ) {
    ir.first->second = v;
  }
}

For a value type such as an int, the above will more efficient than a find followed by an insert (in the absence of compiler optimizations). As stated above, this is because the search through the map only takes place once.

However, the call to insert requires that you already have the new "value" constructed:

class LargeDataType { /* ... */ };
typedef std::map <int, LargeDataType> MapOfLargeDataType;
typedef std::pair <MapOfLargeDataType::iterator, bool> IResult;

void foo (MapOfLargeDataType & m, int k) {

  // This call is more expensive than a find through the map:
  LargeDataType const & v = VeryExpensiveCall ( /* ... */ );

  IResult ir = m.insert (std::make_pair (k, v));
  if (ir.second) {
    // insertion took place (ie. new entry)
  }
  else if ( replaceEntry ( ir.first->first ) ) {
    ir.first->second = v;
  }
}

In order to call 'insert' we are paying for the expensive call to construct our value type - and from what you said in the question you won't use this new value 20% of the time. In the above case, if changing the map value type is not an option then it is more efficient to first perform the 'find' to check if we need to construct the element.

Alternatively, the value type of the map can be changed to store handles to the data using your favourite smart pointer type. The call to insert uses a null pointer (very cheap to construct) and only if necessary is the new data type constructed.

Solution 3

There will be barely any difference in speed between the 2, find will return an iterator, insert does the same and will search the map anyway to determine if the entry already exists.

So.. its down to personal preference. I always try insert and then update if necessary, but some people don't like handling the pair that is returned.

Solution 4

I would think if you do a find then insert, the extra cost would be when you don't find the key and performing the insert after. It's sort of like looking through books in alphabetical order and not finding the book, then looking through the books again to see where to insert it. It boils down to how you will be handling the keys and if they are constantly changing. Now there is some flexibility in that if you don't find it, you can log, exception, do whatever you want...

Solution 5

If you are concerned about efficiency, you may want to check out hash_map<>.

Typically map<> is implemented as a binary tree. Depending on your needs, a hash_map may be more efficient.

Share:
75,360

Related videos on Youtube

Superpolock
Author by

Superpolock

I am a software developer working in the finance industry, primarily developing using C++. I am also an investor in a couple of bars.

Updated on October 17, 2020

Comments

  • Superpolock
    Superpolock over 3 years

    Assuming a map where you want to preserve existing entries. 20% of the time, the entry you are inserting is new data. Is there an advantage to doing std::map::find then std::map::insert using that returned iterator? Or is it quicker to attempt the insert and then act based on whether or not the iterator indicates the record was or was not inserted?

    • Superpolock
      Superpolock almost 15 years
      I was corrected and had intended to use std::map::lower_bound instead of std::map::find.
  • Kris Kumler
    Kris Kumler over 15 years
    No, if the entry exists, it returns a reference to the existing entry.
  • Superpolock
    Superpolock over 15 years
    Would've loved to. But there is no hash_map in the C++ standard library, and PHB's don't allow code outside of that.
  • puetzk
    puetzk over 15 years
    This is indeed how find works, the trick is that this combines the search needed by find and insert. Of course, so does just using insert and then looking at the second return value.
  • Richard Corden
    Richard Corden over 15 years
    Two questions: 1) How is using lower_bound different to using find for a map? 2) For a 'map', is it not the case that the right hand op of && is always true when 'lb != mymap.end()'?
  • netjeff
    netjeff over 15 years
    -1 for this answer. As Kris K said, using map[key]=value will overwrite the existing entry, not "preserve" it as required in the question. You can't test for existence using map[key], because it will return a default constructed object if key does not exist, and create that as the entry for the key
  • peterchen
    peterchen over 14 years
    @Richard: find() returns end() if the key does not exist, lower_bound returns the position where the item should be (which in turn can be used as insertion hint). @puetzek: Wouldn't "just insert" overwrite the referent value for existing keys? It's not sure if the OP desires that.
  • beldaz
    beldaz about 14 years
    std::tr1::unordered_map is the hash map that is proposed to be added to the next standard, and should be available within most current implementations of the STL.
  • Giovanni Funchal
    Giovanni Funchal almost 14 years
    anyone knows if there is something similar for unordered_map ?
  • beldaz
    beldaz over 13 years
    @Helltone: The MS (msdn.microsoft.com/en-us/library/bb982322.aspx) and IBM (publib.boulder.ibm.com/infocenter/comphelp/v9v111/topic/…) implementations appear to have hinted inserts, but I can't guess as to how they would work for unsorted data - how does one guess the appropriate insertion point?
  • srm
    srm over 10 years
    The point is to test if the map is already populated and only add/overwrite if it is not there. Using map[key] assumes the value is already there always.
  • srm
    srm over 10 years
    This is not exactly true. The STL is unlike most other libraries in that it provides explicit big-O requirements for most of its operations. There is a guaranteed difference between 2 * O(log n) and 1 * O(log n), regardless of what implementation the functions use to achieve that O(log n) behavior. Whether or not that difference is significant on your platform is a different question. But the difference will always be there.
  • Mark Ransom
    Mark Ransom over 10 years
    @srm defining big-O requirements still doesn't tell you how long an operation will take in absolute terms. The guaranteed difference you speak of doesn't exist.
  • default
    default over 10 years
    They don't. The hint is not used, its only for compatability. See here
  • KillianDS
    KillianDS over 10 years
    Because (certainly pre-C++11) using insert means you still have to create a std::map::value_type object, the accepted answer avoids even that.
  • Chris Drew
    Chris Drew over 9 years
    @puetzk "just using insert" is fine if the thing being inserted is cheap to create, see Richard Corden's answer. Scott Meyer's solution gets around this.
  • Chris Drew
    Chris Drew over 9 years
    @peterchen map::insert does not overwrite the existing value if it exists, see cplusplus.com/reference/map/map/insert.
  • Tyler
    Tyler almost 8 years
    Doesn't this change to upper_bound in C++11? See the section on complexity. en.cppreference.com/w/cpp/container/map/insert
  • unlut
    unlut over 3 years
    @Tyler lower_bound returns first key that is greater than or equal to (>=) searched key, and upper_bound returns first key that is greater (>) than searched key. If the key being searched is not in the map, lower_bound and upper_bound returns same iterator, so answer is still valid. Name "lower_bound" is inaccurate for the function.
  • Justme0
    Justme0 over 2 years
    We can use map::lower_bound instead of map::find but more tedious.