distance between std::set begin() and std::set iterator in O(logn)
Solution 1
You can use sorted std::vector<int>
. If it is sorted, you can find element in O(log n)
. And you can find distance in constant time O(1)
.
By sorted vector I mean that after every insertion (or after many insertions) you do std::sort(v.begin(), v.end());
If your type inside std::set<T>
is not as light like int
- you can keep both - std::set<T>
and sorted vector of iterators std::vector<std::set<T>::iterator>
. But it could not trivial to keep these structures in sync. Maybe you can add some like position to T
? Or keep std::set<std::pair<T,int>, comp_first_of_pair<T>>
where comp_first_of_pair
is just to have set
sorted only by T
and the second int
is for keeping position in set?
Just a few ideas - to have even O(1)
distance time...
Solution 2
You can find the index of an element in a set in O(log(N)) with an ordered set: https://www.geeksforgeeks.org/ordered-set-gnu-c-pbds/ . This is implemented as a red-black tree. I know this topic is very old, but it might help readers in the future.
Solution 3
You can use the function std::set<>::find
to search for an element x
and compute the distance to the first iterator of the set.
std::distance(s.begin(), s.find(x))
However, as comments indicate the runtime of distance depends on the type of iterator used. In the case of a set this is a bidirectional iterator and distance is O(n).
Solution 4
If computing the index is really your bottleneck, then I see 2 options:
- Store the index. Either in the nodes themselves or in a separate
std::map
. Of course this means you have to keep this cache updated. - Use a
std::vector
. That is not as bad as it might look at first. If you keep the vector always sorted you can use it like aset
. The performance will be similar toset
. The biggest drawback is: the node might be copied a lot. (This can be compensated by using pointers,boost:shared_ptr
orstd::unique_ptr
[c++11 only])
To look up an element you usestd::lower_bound
.
Instead of insert/push_back you do:insert( lower_bound(b,e,x), x )
Solution 5
You cannot use matematics with bidirectional iterators. So only acceptable way is to count by yourself (how many int less than X you inserted into set).
But, if you have cleanly separated "data collection" and "data usage" stages - probably it is worth to replace std::set with sorted std::vector. Its harder to maintain, but have own benefits, including iterator matematics (so you can get search with O(log n) with std::binary_search and the distance with O(1) )
divanshu
An avid learner of algorithms. Interest in Application development.
Updated on August 07, 2022Comments
-
divanshu over 1 year
I need to find the index of an element in std::set. This index can be visualized as the distance of the iterator from the beginning. One way can be:
for(int i = 0, set<int>::iterator it = s.begin(); it != iteratorToBeFound; ++it, ++i);
This clearly takes O(n) time. But we know that the distance from the root in a binary search tree as implemented by set internally can be found in O(log n) time.
Is their any way to implement the same to find the index in O(log n) time in C++ set?
-
Xeo about 11 yearsThat's
O(log n + m)
, though. But the best you can do, AFAIK. -
juanchopanza about 11 yearsBut std::distance is O(N) here.
-
divanshu about 11 yearsI know about std::distance but this is implemented the same way as written in the question and definitely is O(n).
-
divanshu about 11 yearsBut sorting after every insertion in std::vector would cost me O(nlogn). Where's the advantage?
-
PiotrNycz about 11 years1) You can sort only after a series of consecutive insertions. 2) Cost of insertion in
std::set<>
isO(log n)
- n insertions:O(n Log n)
. 3) Maybe youinsert
once - but testing distance many times.... -
chb over 4 yearsRather than link to an external resource, please excerpt the relevant portion of it and include it in your answer.
-
Rithik Singh over 4 yearsinsert is a O(n) function in vector
-
vSzemkel about 2 yearsAnd remember to use specialized version of lower_bound for containers that implements it. Newer use std::lower_bound for ordered (multi)map and (multi)sets
-
Chris Vilches almost 2 yearsVery good code, and I could solve one problem I had where the bottleneck was
std::set
'sdistance
being O(N). This worked like magic!