Deleting elements from std::set while iterating

114,686

Solution 1

This is implementation dependent:

Standard 23.1.2.8:

The insert members shall not affect the validity of iterators and references to the container, and the erase members shall invalidate only iterators and references to the erased elements.

Maybe you could try this -- this is standard conforming:

for (auto it = numbers.begin(); it != numbers.end(); ) {
    if (*it % 2 == 0) {
        numbers.erase(it++);
    }
    else {
        ++it;
    }
}

Note that it++ is postfix, hence it passes the old position to erase, but first jumps to a newer one due to the operator.

2015.10.27 update: C++11 has resolved the defect. iterator erase (const_iterator position); return an iterator to the element that follows the last element removed (or set::end, if the last element was removed). So C++11 style is:

for (auto it = numbers.begin(); it != numbers.end(); ) {
    if (*it % 2 == 0) {
        it = numbers.erase(it);
    }
    else {
        ++it;
    }
}

Solution 2

If you run your program through valgrind, you'll see a bunch of read errors. In other words, yes, the iterators are being invalidated, but you're getting lucky in your example (or really unlucky, as you're not seeing the negative effects of undefined behavior). One solution to this is to create a temporary iterator, increment the temp, delete the target iterator, then set the target to the temp. For example, re-write your loop as follows:

std::set<int>::iterator it = numbers.begin();                               
std::set<int>::iterator tmp;                                                

// iterate through the set and erase all even numbers                       
for ( ; it != numbers.end(); )                                              
{                                                                           
    int n = *it;                                                            
    if (n % 2 == 0)                                                         
    {                                                                       
        tmp = it;                                                           
        ++tmp;                                                              
        numbers.erase(it);                                                  
        it = tmp;                                                           
    }                                                                       
    else                                                                    
    {                                                                       
        ++it;                                                               
    }                                                                       
} 

Solution 3

You misunderstand what "undefined behavior" means. Undefined behavior does not mean "if you do this, your program will crash or produce unexpected results." It means "if you do this, your program could crash or produce unexpected results", or do anything else, depending on your compiler, your operating system, the phase of the moon, etc.

If something executes without crashing and behaves as you expect it to, that is not proof that it is not undefined behavior. All it proves is that its behavior happened to be as observed for that particular run after compiling with that particular compiler on that particular operating system.

Erasing an element from a set invalidates the iterator to the erased element. Using an invalidated iterator is undefined behavior. It just so happened that the observed behavior was what you intended in this particular instance; it does not mean that the code is correct.

Solution 4

C++20 will have "uniform container erasure", and you'll be able to write:

std::erase_if(numbers, [](int n){ return n % 2 == 0 });

And that will work for vector, set, deque, etc. See cppReference for more info.

Solution 5

Just to warn, that in case of a deque container, all solutions that check for the deque iterator equality to numbers.end() will likely fail on gcc 4.8.4. Namely, erasing an element of the deque generally invalidates pointer to numbers.end():

#include <iostream>
#include <deque>

using namespace std;
int main() 
{

  deque<int> numbers;

  numbers.push_back(0);
  numbers.push_back(1);
  numbers.push_back(2);
  numbers.push_back(3);
  //numbers.push_back(4);

  deque<int>::iterator  it_end = numbers.end();

  for (deque<int>::iterator it = numbers.begin(); it != numbers.end(); ) {
    if (*it % 2 == 0) {
      cout << "Erasing element: " << *it << "\n";
      numbers.erase(it++);
      if (it_end == numbers.end()) {
    cout << "it_end is still pointing to numbers.end()\n";
      } else {
    cout << "it_end is not anymore pointing to numbers.end()\n";
      }
    }
    else {
      cout << "Skipping element: " << *it << "\n";
      ++it;
    }
  }
}

Output:

Erasing element: 0
it_end is still pointing to numbers.end()
Skipping element: 1
Erasing element: 2
it_end is not anymore pointing to numbers.end()

Note that while the deque transformation is correct in this particular case, the end pointer has been invalidated along the way. With the deque of a different size the error is more apparent:

int main() 
{

  deque<int> numbers;

  numbers.push_back(0);
  numbers.push_back(1);
  numbers.push_back(2);
  numbers.push_back(3);
  numbers.push_back(4);

  deque<int>::iterator  it_end = numbers.end();

  for (deque<int>::iterator it = numbers.begin(); it != numbers.end(); ) {
    if (*it % 2 == 0) {
      cout << "Erasing element: " << *it << "\n";
      numbers.erase(it++);
      if (it_end == numbers.end()) {
    cout << "it_end is still pointing to numbers.end()\n";
      } else {
    cout << "it_end is not anymore pointing to numbers.end()\n";
      }
    }
    else {
      cout << "Skipping element: " << *it << "\n";
      ++it;
    }
  }
}

Output:

Erasing element: 0
it_end is still pointing to numbers.end()
Skipping element: 1
Erasing element: 2
it_end is still pointing to numbers.end()
Skipping element: 3
Erasing element: 4
it_end is not anymore pointing to numbers.end()
Erasing element: 0
it_end is not anymore pointing to numbers.end()
Erasing element: 0
it_end is not anymore pointing to numbers.end()
...
Segmentation fault (core dumped)

Here is one of the ways to fix this:

#include <iostream>
#include <deque>

using namespace std;
int main() 
{

  deque<int> numbers;
  bool done_iterating = false;

  numbers.push_back(0);
  numbers.push_back(1);
  numbers.push_back(2);
  numbers.push_back(3);
  numbers.push_back(4);

  if (!numbers.empty()) {
    deque<int>::iterator it = numbers.begin();
    while (!done_iterating) {
      if (it + 1 == numbers.end()) {
    done_iterating = true;
      } 
      if (*it % 2 == 0) {
    cout << "Erasing element: " << *it << "\n";
      numbers.erase(it++);
      }
      else {
    cout << "Skipping element: " << *it << "\n";
    ++it;
      }
    }
  }
}
Share:
114,686

Related videos on Youtube

pedromanoel
Author by

pedromanoel

Aspiring Software Craftsman since I read Apprenticeship Patterns by Dave Hoover and Adewale Oshineye. I identify myself with Agile Methodologies, and try to learn as much as I can about it. I also like elegant code, and I strive to become better at doing it.

Updated on October 30, 2020

Comments

  • pedromanoel
    pedromanoel over 3 years

    I need to go through a set and remove elements that meet a predefined criteria.

    This is the test code I wrote:

    #include <set>
    #include <algorithm>
    
    void printElement(int value) {
        std::cout << value << " ";
    }
    
    int main() {
        int initNum[] = { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 };
        std::set<int> numbers(initNum, initNum + 10);
        // print '0 1 2 3 4 5 6 7 8 9'
        std::for_each(numbers.begin(), numbers.end(), printElement);
    
        std::set<int>::iterator it = numbers.begin();
    
        // iterate through the set and erase all even numbers
        for (; it != numbers.end(); ++it) {
            int n = *it;
            if (n % 2 == 0) {
                // wouldn't invalidate the iterator?
                numbers.erase(it);
            }
        }
    
        // print '1 3 5 7 9'
        std::for_each(numbers.begin(), numbers.end(), printElement);
    
        return 0;
    }
    

    At first, I thought that erasing an element from the set while iterating through it would invalidate the iterator, and the increment at the for loop would have undefined behavior. Even though, I executed this test code and all went well, and I can't explain why.

    My question: Is this the defined behavior for std sets or is this implementation specific? I am using gcc 4.3.3 on ubuntu 10.04 (32-bit version), by the way.

    Thanks!

    Proposed solution:

    Is this a correct way to iterate and erase elements from the set?

    while(it != numbers.end()) {
        int n = *it;
        if (n % 2 == 0) {
            // post-increment operator returns a copy, then increment
            numbers.erase(it++);
        } else {
            // pre-increment operator increments, then return
            ++it;
        }
    }
    

    Edit: PREFERED SOLUTION

    I came around a solution that seems more elegant to me, even though it does exactly the same.

    while(it != numbers.end()) {
        // copy the current iterator then increment it
        std::set<int>::iterator current = it++;
        int n = *current;
        if (n % 2 == 0) {
            // don't invalidate iterator it, because it is already
            // pointing to the next element
            numbers.erase(current);
        }
    }
    

    If there are several test conditions inside the while, each one of them must increment the iterator. I like this code better because the iterator is incremented only in one place, making the code less error-prone and more readable.

    • Martin York
      Martin York almost 14 years
    • pedromanoel
      pedromanoel almost 14 years
      Actually, I read this question (and others) before asking mine, but since they were related to other STL containers and since my initial test apparently worked, I thought there was some difference between them. Only after Matt's answer I thought of using valgrind. Even though, I prefer my NEW solution over the others because it reduces the chances of errors by incrementing the iterator in only one place. Thank you all for the help!
    • Alnitak
      Alnitak over 11 years
      @pedromanoel ++it should be somewhat more efficient than it++ because it doesn't require the use of an invisible temporary copy of the iterator. Kornel's version whilst longer ensures that the non-filtered elements are iterated over most efficiently.
    • pedromanoel
      pedromanoel over 11 years
      @Alnitak I haven't thought about that, but I think that the difference in performance wouldn't be so great. The copy is created in his version too, but only for the elements that match. So the degree of optimization is totally dependent on the structure of the set. For quite some time I pre-optimized code, hurting readability and coding speed in the process... So I would perform some tests before using the other way.
    • bobobobo
      bobobobo over 11 years
    • Jeremy W. Murphy
      Jeremy W. Murphy over 10 years
      Is the original problem that you're trying solve to simply "remove elements (from a set) that meet a predefined criteria"? Because you don't even need iteration for that, which would make the more specific questions about erasing whilst iterating redundant. But if you have to erase whilst iterating, then I can't help you. :)
    • pedromanoel
      pedromanoel over 10 years
      Wow... I asked this question so long ago that I don't remember my original problem! What do you mean I don't need iteration in that case?
    • Jeremy W. Murphy
      Jeremy W. Murphy over 10 years
      Sorry, never mind, I had a misunderstanding about the STL. :)
    • Troyseph
      Troyseph over 8 years
      I'm having trouble with your preferred solution, I seem to be getting an infinite loop. I'm using a deque rather than a set however the rest of my code is a minimum test case for your proposed method...
  • Arkaitz Jimenez
    Arkaitz Jimenez almost 14 years
    Set<T>::erase version doesn't return iterator.
  • pedromanoel
    pedromanoel almost 14 years
    Oh, I am well aware that undefined behavior can also mean "It works for me, but not for everybody". That is why I asked this question, because I didn't know if this behavior was correct or not. If it was, than I would just leave like that. Using an while loop would solve my problem, then? I edited my question with my proposed solution. Please check it out.
  • UncleBens
    UncleBens almost 14 years
    It works for me too. But when I change the condition into if (n > 2 && n < 7 ) then I get 0 1 2 4 7 8 9. - The particular result here probably depends more on the implementation details of the erase method and set iterators, rather than on the phase of the moon (not that one should ever rely on implementation details). ;)
  • Eugene
    Eugene over 11 years
    Actually it does, but only on MSVC implementation. So this is truly an implementation specific answer. :)
  • kuroi neko
    kuroi neko over 9 years
    This does not work with deque on MSVC2013. Either their implementation is buggy or there is yet another requirement that prevents this from working on deque. The STL spec is so convoluted that you can't expect all implementations to follow it, let alone your casual programmer to memorize it. STL is a monster beyond taming, and since there is no unique implementation (and test suites, if any, apparently do not cover such obvious cases as deleting elements in a loop), that makes the STL a shiny brittle toy that can go up with a bang when you look at it sideways.
  • kuroi neko
    kuroi neko over 9 years
    STL adds a lot of new meanings to "undefined behaviour". For instance "Microsoft thought smart to enhance the spec by allowing std::set::erase to return an iterator, so your MSVC code will go up with a bang when compiled by gcc", or "Microsoft does bound checks on std::bitset::operator[] so your carefully optimized bitset algorithm will slow to a crawl when compiled with MSVC". STL has no unique implementation and its spec is an exponentially growing bloated mess, so no wonder deleting elements from inside a loop requires senior programmer expertise...
  • mastov
    mastov over 6 years
    @Eugene It does it for all implementations with C++11
  • tartaruga_casco_mole
    tartaruga_casco_mole over 6 years
    @MatthieuM. It does in C++11. In C++17, it takes iterator (const_iterator in C++11) now.
  • iammilind
    iammilind about 6 years
    If it's only condition which matters & doesn't require in-scope initialization or post-operation, then better to use while loop. i.e. for ( ; it != numbers.end(); ) is better visible with while (it != numbers.end())
  • Jesse Chisholm
    Jesse Chisholm about 5 years
    The key being do not trust an old remembered dq.end() value, always compare to a new call to dq.end().
  • Jesse Chisholm
    Jesse Chisholm about 5 years
    Some implementation of gcc 4.8 with c++1y have a bug in erase. it = collection.erase(it); is supposed to work, but it may be safer to use collection.erase(it++);
  • Jesse Chisholm
    Jesse Chisholm about 5 years
    This only works if you will always erase every item. The OP is about selectively erasing the items and still having valid iterators.
  • apple apple
    apple apple almost 3 years
    @kuroineko this would not work on deque because erase invalidate all iterator
  • apple apple
    apple apple almost 3 years
    (referring to 1st snippet, judge by history order)