C++11 std::set lambda comparison function
Solution 1
Yes, a std::function
introduces nearly unavoidable indirection to your set
. While the compiler can always, in theory, figure out that all use of your set
's std::function
involves calling it on a lambda that is always the exact same lambda, that is both hard and extremely fragile.
Fragile, because before the compiler can prove to itself that all calls to that std::function
are actually calls to your lambda, it must prove that no access to your std::set
ever sets the std::function
to anything but your lambda. Which means it has to track down all possible routes to reach your std::set
in all compilation units and prove none of them do it.
This might be possible in some cases, but relatively innocuous changes could break it even if your compiler managed to prove it.
On the other hand, a functor with a stateless operator()
has easy to prove behavior, and optimizations involving that are everyday things.
So yes, in practice I'd suspect std::function
could be slower. On the other hand, std::function
solution is easier to maintain than the make_set
one, and exchanging programmer time for program performance is pretty fungible.
make_set
has the serious disadvantage that any such set
's type must be inferred from the call to make_set
. Often a set
stores persistent state, and not something you create on the stack then let fall out of scope.
If you created a static or global stateless lambda auto MyComp = [](A const&, A const&)->bool { ... }
, you can use the std::set<A, decltype(MyComp)>
syntax to create a set
that can persist, yet is easy for the compiler to optimize (because all instances of decltype(MyComp)
are stateless functors) and inline. I point this out, because you are sticking the set
in a struct
. (Or does your compiler support
struct Foo {
auto mySet = make_set<int>([](int l, int r){ return l<r; });
};
which I would find surprising!)
Finally, if you are worried about performance, consider that std::unordered_set
is much faster (at the cost of being unable to iterate over the contents in order, and having to write/find a good hash), and that a sorted std::vector
is better if you have a 2-phase "insert everything" then "query contents repeatedly". Simply stuff it into the vector
first, then sort
unique
erase
, then use the free equal_range
algorithm.
Solution 2
It's unlikely that the compiler will be able to inline a std::function call, whereas any compiler that supports lambdas would almost certainly inline the functor version, including if that functor is a lambda not hidden by a std::function
.
You could use decltype
to get the lambda's comparator type:
#include <set>
#include <iostream>
#include <iterator>
#include <algorithm>
int main()
{
auto comp = [](int x, int y){ return x < y; };
auto set = std::set<int,decltype(comp)>( comp );
set.insert(1);
set.insert(10);
set.insert(1); // Dupe!
set.insert(2);
std::copy( set.begin(), set.end(), std::ostream_iterator<int>(std::cout, "\n") );
}
Which prints:
1
2
10
See it run live on Coliru.
Solution 3
A stateless lambda (i.e. one with no captures) can decay to a function pointer, so your type could be:
std::set<int, bool (*)(int, int)> numbers;
Otherwise I'd go for the make_set
solution. If you won't use a one-line creation function because it's non-standard you're not going to get much code written!
Solution 4
From my experience playing around with the profiler, the best compromise between performance and beauty is to use a custom delegate implementation, such as:
https://codereview.stackexchange.com/questions/14730/impossibly-fast-delegate-in-c11
As the std::function
is usually a bit too heavy. I can't comment on your specific circumstances, as I don't know them, though.
Solution 5
If you're determined to have the set
as a class member, initializing its comparator at constructor time, then at least one level of indirection is unavoidable. Consider that as far as the compiler knows, you could add another constructor:
Foo () : numbers ([](int x, int y)
{
return x < y;
})
{
}
Foo (char) : numbers ([](int x, int y)
{
return x > y;
})
{
}
Once the you have an object of type Foo
, the type of the set
doesn't carry information on which constructor initialized its comparator, so to call the correct lambda requires an indirection to the run-time selected lambda operator()
.
Since you're using captureless lambdas, you could use the function pointer type bool (*)(int, int)
as your comparator type, as captureless lambdas have the appropriate conversion function. This would of course involve an indirection through the function pointer.
Related videos on Youtube
![cfa45ca55111016ee9269f0a52e771](https://i.stack.imgur.com/vFmBF.jpg?s=256&g=1)
cfa45ca55111016ee9269f0a52e771
Updated on April 27, 2020Comments
-
cfa45ca55111016ee9269f0a52e771 about 4 years
I want to create a
std::set
with a custom comparison function. I could define it as a class withoperator()
, but I wanted to enjoy the ability to define a lambda where it is used, so I decided to define the lambda function in the initialization list of the constructor of the class which has thestd::set
as a member. But I can't get the type of the lambda. Before I proceed, here's an example:class Foo { private: std::set<int, /*???*/> numbers; public: Foo () : numbers ([](int x, int y) { return x < y; }) { } };
I found two solutions after searching: one, using
std::function
. Just have the set comparison function type bestd::function<bool (int, int)>
and pass the lambda exactly like I did. The second solution is to write a make_set function, likestd::make_pair
.SOLUTION 1:
class Foo { private: std::set<int, std::function<bool (int, int)> numbers; public: Foo () : numbers ([](int x, int y) { return x < y; }) { } };
SOLUTION 2:
template <class Key, class Compare> std::set<Key, Compare> make_set (Compare compare) { return std::set<Key, Compare> (compare); }
The question is, do I have a good reason to prefer one solution over the other? I prefer the first one because it makes use of standard features (make_set is not a standard function), but I wonder: does using
std::function
make the code (potentially) slower? I mean, does it lower the chance the compiler inlines the comparison function, or it should be smart enough to behave exactly the same like it would it was a lambda function type and notstd::function
(I know, in this case it can't be a lambda type, but you know, I'm asking in general) ?(I use GCC, but I'd like to know what popular compilers do in general)
SUMMARY, AFTER I GOT LOTS OF GREAT ANSWERS:
If speed is critical, the best solution is to use an class with
operator()
aka functor. It's easiest for the compiler to optimize and avoid any indirections.For easy maintenance and a better general-purpose solution, using C++11 features, use
std::function
. It's still fast (just a little bit slower than the functor, but it may be negligible) and you can use any function -std::function
, lambda, any callable object.There's also an option to use a function pointer, but if there's no speed issue I think
std::function
is better (if you use C++11).There's an option to define the lambda function somewhere else, but then you gain nothing from the comparison function being a lambda expression, since you could as well make it a class with
operator()
and the location of definition wouldn't be the set construction anyway.There are more ideas, such as using delegation. If you want a more thorough explanation of all solutions, read the answers :)
-
Admin over 11 yearsI smell premature optimization.
-
Kerrek SB over 11 yearsWhy not just
bool(*)(int, int)
? But it might be more efficient to make an explicit, default-constructible predicate class. -
cfa45ca55111016ee9269f0a52e771 over 11 years@Fanael how would you know, what if I have a long set of objects rendered by GUI and I really needed it to be as fast as possible
-
Admin over 11 years@fr33domlover: isn't the cost of
std::function
dwarfed by the cost of rendering in that case? -
cfa45ca55111016ee9269f0a52e771 over 11 years@Fanael if sorting is done while there's no rendering, I can still get more speed by making the sorting faster and giving the rendering code more execution time. Anyway, even if this is premature optimization, the question is still useful: look at the answers and upvotes...
-
user1095108 over 11 yearsI've added lambda support to the codereview article. Try it!
-
-
cfa45ca55111016ee9269f0a52e771 over 11 yearsI use GCC, but I'd like to know what popular compilers do in general. The possible indirection is the reason I don't simply pick the std::function solution
-
filmor over 11 yearsstackoverflow.com/questions/8298780/…, that one might be interesting.
-
cfa45ca55111016ee9269f0a52e771 over 11 yearsHmmm... it says compilers still don't fully treat simple cases of std::function
-
cfa45ca55111016ee9269f0a52e771 over 11 yearsYou can only use decltype after defining the lambda, but then I lose the ability to define the lambda in the constructor itself, so I could as well use a class with operator(). I want to define the comparison function in the constructor
-
metal over 11 yearsSee above. You could make it a static member of your class.
-
cfa45ca55111016ee9269f0a52e771 over 11 yearsComment to the edit (code example): look at the code example, it's different. What you suggest doesn't work because a lambda can't be used in unevaluated context + the type can't be deduced correctly because it's compiler-generated for each separate lambda you create. Also I found relevant SO questions, they say exactly that. Otherwise I'd just use the lambda...
-
metal over 11 yearsRight. Was just testing it out. Deleted that bit.
-
cfa45ca55111016ee9269f0a52e771 over 11 yearsLooks like an excellent general-purpose solution, but I just need my little case with std::set to I prefer to just use make_set which is a "special case" of the general-purpose delegate class. But in general it's very interesting, maybe can solve all these lambda issues
-
Yakk - Adam Nevraumont over 11 yearsSuch a delgate still has a layer of opaque indirection (ie, the equivalent of a pointer dereference) over a stateless functor. The ability to use stateless functors is one of the big reasons why
std::sort
outperformsqsort
. -
cfa45ca55111016ee9269f0a52e771 over 11 yearsOh, if it has non-inlined indirection I can as well use std::function and get the same speed...
-
cfa45ca55111016ee9269f0a52e771 over 11 yearsYou're right I could make it a static member, but I could just write a nested class with operator()... no big difference
-
cfa45ca55111016ee9269f0a52e771 over 11 yearsInteresting, I didn't know it casts to a function pointer... As to standatd/non-standard, I meant that if I rely on std::function being better implemented in the future, then future versions of compilers will make it as fast as the lambda, inlined, without me changing the code
-
Jonathan Wakely over 11 yearsI see. There's a limit to how efficient
std::function
can get, but have you actually measured to see if there's a performance problem before you worry about it?std:function
can be pretty damn fast: timj.testbit.eu/2013/01/25/cpp11-signal-system-performance -
metal over 11 yearsGood advice about alternate containers, but beware the hashing function for some other types (
int
s will work fine). That can kill performance if not done right -- either by being too slow or by having too many collisions. -
cfa45ca55111016ee9269f0a52e771 over 11 yearsI think I missed that point with make_set, it wouldn't work with lambdas. That leaves only the std::function solution, which does involve indirection but currently I don't have performance problems with it. Also the vector is very interesting, the set contents are read and rendered by GUI, so rendering happens much more often that users change the contents (which happens rarely)... maybe sorting a vector on each change will actually be faster than key lookup
-
David Rodríguez - dribeas over 11 years@fr33domlover: Your assumption is not necessarily true. The definition of
std::function
requires type erasure on the actual callable entity that is held, and that pretty much requires an indirection. Even in the case of a plain function pointer there's going to be more cost than if you create a functor for that particular purpose. This is exactly the reason whystd::sort
is faster than C'sqsort
. -
user1095108 over 11 yearsI may be inept with
gdb
, but it seemed to me, that the compiler often eliminated all dereferencing when I've used that delegate with -O3. -
Yakk - Adam Nevraumont over 11 yearsThe
auto functor = [](...){...};
syntax has the advantage of being shorter thanstruct functor { bool operator()(...)const{...}; };
syntax, and the disadvantage of requiring thefunctor
instance in order to call it (as opposed to any default constructed functor for thestruct
case). -
AzP almost 11 yearsSadly this doesn't seem to work on VS2012.. For some reason, the decltype for the lambda isn't implemented yet. Using a function<> as the compare-type works though.
-
Euri Pinhollow about 7 years@cfa45ca55111016ee9269f0a52e771 The lambda type only depends on return type and arguments types. You may use any lambda with same returns and args. You may use a small sample lambda to use
decltype
on it: decltype([](bool A, bool B){return bool(1)})`.