Equivalent C++ to Python generator pattern
Solution 1
Generators exist in C++, just under another name: Input Iterators. For example, reading from std::cin
is similar to having a generator of char
.
You simply need to understand what a generator does:
- there is a blob of data: the local variables define a state
- there is an init method
- there is a "next" method
- there is a way to signal termination
In your trivial example, it's easy enough. Conceptually:
struct State { unsigned i, j; };
State make();
void next(State&);
bool isDone(State const&);
Of course, we wrap this as a proper class:
class PairSequence:
// (implicit aliases)
public std::iterator<
std::input_iterator_tag,
std::pair<unsigned, unsigned>
>
{
// C++03
typedef void (PairSequence::*BoolLike)();
void non_comparable();
public:
// C++11 (explicit aliases)
using iterator_category = std::input_iterator_tag;
using value_type = std::pair<unsigned, unsigned>;
using reference = value_type const&;
using pointer = value_type const*;
using difference_type = ptrdiff_t;
// C++03 (explicit aliases)
typedef std::input_iterator_tag iterator_category;
typedef std::pair<unsigned, unsigned> value_type;
typedef value_type const& reference;
typedef value_type const* pointer;
typedef ptrdiff_t difference_type;
PairSequence(): done(false) {}
// C++11
explicit operator bool() const { return !done; }
// C++03
// Safe Bool idiom
operator BoolLike() const {
return done ? 0 : &PairSequence::non_comparable;
}
reference operator*() const { return ij; }
pointer operator->() const { return &ij; }
PairSequence& operator++() {
static unsigned const Max = std::numeric_limts<unsigned>::max();
assert(!done);
if (ij.second != Max) { ++ij.second; return *this; }
if (ij.first != Max) { ij.second = 0; ++ij.first; return *this; }
done = true;
return *this;
}
PairSequence operator++(int) {
PairSequence const tmp(*this);
++*this;
return tmp;
}
private:
bool done;
value_type ij;
};
So hum yeah... might be that C++ is a tad more verbose :)
Solution 2
In C++ there are iterators, but implementing an iterator isn't straightforward: one has to consult the iterator concepts and carefully design the new iterator class to implement them. Thankfully, Boost has an iterator_facade template which should help implementing the iterators and iterator-compatible generators.
Sometimes a stackless coroutine can be used to implement an iterator.
P.S. See also this article which mentions both a switch
hack by Christopher M. Kohlhoff and Boost.Coroutine by Oliver Kowalke. Oliver Kowalke's work is a followup on Boost.Coroutine by Giovanni P. Deretta.
P.S. I think you can also write a kind of generator with lambdas:
std::function<int()> generator = []{
int i = 0;
return [=]() mutable {
return i < 10 ? i++ : -1;
};
}();
int ret = 0; while ((ret = generator()) != -1) std::cout << "generator: " << ret << std::endl;
Or with a functor:
struct generator_t {
int i = 0;
int operator() () {
return i < 10 ? i++ : -1;
}
} generator;
int ret = 0; while ((ret = generator()) != -1) std::cout << "generator: " << ret << std::endl;
P.S. Here's a generator implemented with the Mordor coroutines:
#include <iostream>
using std::cout; using std::endl;
#include <mordor/coroutine.h>
using Mordor::Coroutine; using Mordor::Fiber;
void testMordor() {
Coroutine<int> coro ([](Coroutine<int>& self) {
int i = 0; while (i < 9) self.yield (i++);
});
for (int i = coro.call(); coro.state() != Fiber::TERM; i = coro.call()) cout << i << endl;
}
Solution 3
Since Boost.Coroutine2 now supports it very well (I found it because I wanted to solve exactly the same yield
problem), I am posting the C++ code that matches your original intention:
#include <stdint.h>
#include <iostream>
#include <memory>
#include <boost/coroutine2/all.hpp>
typedef boost::coroutines2::coroutine<std::pair<uint16_t, uint16_t>> coro_t;
void pair_sequence(coro_t::push_type& yield)
{
uint16_t i = 0;
uint16_t j = 0;
for (;;) {
for (;;) {
yield(std::make_pair(i, j));
if (++j == 0)
break;
}
if (++i == 0)
break;
}
}
int main()
{
coro_t::pull_type seq(boost::coroutines2::fixedsize_stack(),
pair_sequence);
for (auto pair : seq) {
print_pair(pair);
}
//while (seq) {
// print_pair(seq.get());
// seq();
//}
}
In this example, pair_sequence
does not take additional arguments. If it needs to, std::bind
or a lambda should be used to generate a function object that takes only one argument (of push_type
), when it is passed to the coro_t::pull_type
constructor.
Solution 4
All answers that involve writing your own iterator are completely wrong. Such answers entirely miss the point of Python generators (one of the language's greatest and unique features). The most important thing about generators is that execution picks up where it left off. This does not happen to iterators. Instead, you must manually store state information such that when operator++ or operator* is called anew, the right information is in place at the very beginning of the next function call. This is why writing your own C++ iterator is a gigantic pain; whereas, generators are elegant, and easy to read+write.
I don't think there is a good analog for Python generators in native C++, at least not yet (there is a rummor that yield will land in C++17). You can get something similarish by resorting to third-party (e.g. Yongwei's Boost suggestion), or rolling your own.
I would say the closest thing in native C++ is threads. A thread can maintain a suspended set of local variables, and can continue execution where it left off, very much like generators, but you need to roll a little bit of additional infrastructure to support communication between the generator object and its caller. E.g.
// Infrastructure
template <typename Element>
class Channel { ... };
// Application
using IntPair = std::pair<int, int>;
void yield_pairs(int end_i, int end_j, Channel<IntPair>* out) {
for (int i = 0; i < end_i; ++i) {
for (int j = 0; j < end_j; ++j) {
out->send(IntPair{i, j}); // "yield"
}
}
out->close();
}
void MyApp() {
Channel<IntPair> pairs;
std::thread generator(yield_pairs, 32, 32, &pairs);
for (IntPair pair : pairs) {
UsePair(pair);
}
generator.join();
}
This solution has several downsides though:
- Threads are "expensive". Most people would consider this to be an "extravagant" use of threads, especially when your generator is so simple.
- There are a couple of clean up actions that you need to remember. These could be automated, but you'd need even more infrastructure, which again, is likely to be seen as "too extravagant". Anyway, the clean ups that you need are:
- out->close()
- generator.join()
- This does not allow you to stop generator. You could make some modifications to add that ability, but it adds clutter to the code. It would never be as clean as Python's yield statement.
- In addition to 2, there are other bits of boilerplate that are needed each time you want to "instantiate" a generator object:
- Channel* out parameter
- Additional variables in main: pairs, generator
Solution 5
Using range-v3:
#include <iostream>
#include <tuple>
#include <range/v3/all.hpp>
using namespace std;
using namespace ranges;
auto generator = [x = view::iota(0) | view::take(3)] {
return view::cartesian_product(x, x);
};
int main () {
for (auto x : generator()) {
cout << get<0>(x) << ", " << get<1>(x) << endl;
}
return 0;
}
Noah Watkins
Updated on November 06, 2021Comments
-
Noah Watkins over 2 years
I've got some example Python code that I need to mimic in C++. I do not require any specific solution (such as co-routine based yield solutions, although they would be acceptable answers as well), I simply need to reproduce the semantics in some manner.
Python
This is a basic sequence generator, clearly too large to store a materialized version.
def pair_sequence(): for i in range(2**32): for j in range(2**32): yield (i, j)
The goal is to maintain two instances of the sequence above, and iterate over them in semi-lockstep, but in chunks. In the example below the
first_pass
uses the sequence of pairs to initialize the buffer, and thesecond_pass
regenerates the same exact sequence and processes the buffer again.def run(): seq1 = pair_sequence() seq2 = pair_sequence() buffer = [0] * 1000 first_pass(seq1, buffer) second_pass(seq2, buffer) ... repeat ...
C++
The only thing I can find for a solution in C++ is to mimic
yield
with C++ coroutines, but I haven't found any good reference on how to do this. I'm also interested in alternative (non general) solutions for this problem. I do not have enough memory budget to keep a copy of the sequence between passes.-
batbaatar over 12 yearsAs you can see from here stackoverflow.com/questions/3864410/… coroutine is not good idea to implement. Can't you do it with buffered reading? stackoverflow.com/questions/4685862/…
-
Admin over 12 yearsC++ iterators should support something like this.
-
Bernhard Barker over 4 yearsRelated: Equivalent in C++ of Yield in C#?
-
-
Noah Watkins over 12 yearsI accepted your answer (thanks!) because it is technically correct for the question i gave. Do you have any pointers for techniques in cases where the sequence that needs to be generated is more complex, or am I just beating a dead horse here with C++ and really coroutines are the only way for generality?
-
Matthieu M. over 12 years@NoahWatkins: coroutines make for an easy implementation when languages support them. Unfortunately C++ does not, so iteration is easier. If you really need coroutines, you actually need a full-blown thread to hold the "stack" of your function call on the side. It's definitely overkill to open such a can of worms just for that in this example, but your mileage may vary depending on your actual needs.
-
boycy over 9 yearsA non-thread-based coroutine implementation is available in Boost boost.org/doc/libs/1_57_0/libs/coroutine/doc/html/index.html with a proposal for standardization here: open-std.org/jtc1/sc22/wg21/docs/papers/2014/n3985.pdf
-
Matthieu M. over 9 years@boycy: There are actually multiple proposal for coroutines, notably one stack-less and the other stack-full. It's tough nut to crack, so for now I am waiting. In the meantime though, stack-less coroutines are implementable near directly as Input Iterators (just, without the sugar).
-
Огњен Шобајић about 8 yearsYet similar, iterators are not the same as generators.
-
acgtyrant over 7 yearsThis practice is out of date, we should use
explicit operator bool()
instead of Safe Bool idiom,using
instead oftypedef
. -
Matthieu M. over 7 years@acgtyrant:
using
vstypedef
is mostly a matter of preference here, though I'll admit to preferringusing
because it lines up the defined names nicely. On the other hand, I totally agree with theexplicit
, from C++11 onward it's definitely the recommendation. I updated the answer, care to give it another review? -
acgtyrant over 7 years@MatthieuM. I think C++03 does not has
nullptr
yet...And you can replacestatic unsigned const
bystatic unsigned constexpr
in C++11 too. -
Matthieu M. over 7 years@acgtyrant: Ugh! Good point on
nullptr
. Not sureconstexpr
gains us much here though, since we do not need a compile-time value. -
Tim Rae over 7 yearsThis code would read MUCH nicer if you split it up into two separate C++03 and C++11 versions... (Or just get rid of C++03 altogether; people shouldn't be writing new code with it)
-
Matthieu M. over 7 years@TimRae: Oh boy, I don't know where you work, but my previous company is still stuck (for most of its code) with a compiler that does still has the experimental
-std=c++0x
flag. Needless to say, migrating to a newer compiler (which has been in the books for a year or two already) is a pre-requisite before being able to actually use C++11 (or later). As for splitting... might be a good idea, though then again there's not much difference. -
Fred Schoen about 7 yearsI implemented this answer and it works fine. In the end, I have a loop like
for(PairSequence s; s; ++s) {...}
. But what is the point of inheriting from std::iterator? How would I use the generator with STL algorithms likefor_each
, for example? -
Matthieu M. about 7 years@FredSchoen: For
for_each
you would need some more work: a constructor for an "end" iterator (setting thedone
boolean totrue
) and a comparison function (==
&!=
) that compares this boolean. Then it should model aninput_iterator
appropriately and thus work withfor_each
. -
EvertW almost 7 yearsDownmarked the answer: iterators are NOT generators. Generators are much more elegant than iterators. Python has iterators as well as generators. C++ is working on coroutines (as in boost::coroutine2 and experimental work in clang) but they will not be included in C++17. Coroutines would allow generators in C++.
-
Matthieu M. almost 7 years@EvertW: From a client point of view, iterators are generators. Of course, they are much more unwieldy to write. Coroutines may require revisiting this answer finally, but I'll hold off until they are standard first... and there might be performance implication in the sugar that may warrant continuing hand-coding the iterators.
-
EvertW almost 7 years@MatthieuM.: True, but the question was about the provider-side of generators, not the client side... Anyway, I am looking forward to co-routines becoming standard. Implementing generators might be a useful bit of sugar coating, but replacing threads by coroutines in cases where this is feasible will be an enormous improvement!
-
Weston over 6 yearsNote that Coroutine2 requires c++11, for which visual studio 2013 is insufficient as it is only partially supported.
-
sudo over 5 yearsMaybe you could abuse macros to get generator syntax in C++. But I won't do that. Sad C++ doesn't have this ability, of all things.
-
Edy about 5 yearsYou are confusing syntax with functionality. A few answers above actually allow the C++ to pick up execution from where it's left off during the last call. It's nothing magical. As a matter of fact, Python is implemented in C, so whatever is possible in Python is possible in C, albeit not as convenient.
-
Kaitain almost 4 years@edy Isn’t that already addressed in the first paragraph? He’s not claiming that equivalent functionality can’t be created in conventional C++, only that it’s “a gigantic pain”.
-
Edy almost 4 years@Kaitain The question here isn't whether it's a pain to do generator in C++, but whether there is a pattern to do so. His claims that the approach "miss the point", that the "closest thing" is threads... are just misleading. Is it a pain? One could read through the other answers and decide for himself.
-
Kaitain almost 4 years@edy But doesn't this end up being a vacuous point, given that all Turing-complete languages are ultimately capable of the same functionality? "Whatever is possible in X is possible in Y" is guaranteed to be true for all such languages, but that doesn't seem to me to be a very illuminating observation.
-
Edy almost 4 years@Kaitain Precisely because all Turing-complete languages supposedly should have the same capability, thus the question of how to implement one feature in another language is legitimate. Nothing that Python has cannot be accomplished by another language; the question is efficiency and maintainability. In both regard, C++ would be a fine(r) choice.
-
Kaitain almost 4 yearsNo, the key question is elegance.
-
Paul over 3 yearsThanks for the example.
std::iterator
is deprecated in C++17, perhaps you'd like to update the code. -
Desmond Gold over 2 yearsc++20 has coroutines but generators were not shipped. (but proposed) you can just create a generator on your own.
-
jlh over 2 yearsThe whole point of Python's generators is the very convenient syntax. So this answer is indeed very correct: There is no equivalent in C++ and other answers do miss the point. Obviously, functionally equivalent things can be done in C++, but I believe that's not what this question is about.
-
Justin Meiners about 2 years@Paul iterators are not deprecated. Is there a source that suggested that?
-
Paul about 2 years@JustinMeiners it says that here, for example: en.cppreference.com/w/cpp/iterator/iterator
-
Justin Meiners about 2 years@Paul it says the Distance type of iterator is deprecated, not the concept. I am only seeing one instance of "deprecated" on the page?
-
Paul about 2 years@JustinMeiners isn't it exactly what you're using in your example? I just mentioned that it's deprecated and that perhaps you'd like to update the code. I didn't say anything about the concept of iterators.