Equivalent C++ to Python generator pattern

65,224

Solution 1

Generators exist in C++, just under another name: Input Iterators. For example, reading from std::cin is similar to having a generator of char.

You simply need to understand what a generator does:

  • there is a blob of data: the local variables define a state
  • there is an init method
  • there is a "next" method
  • there is a way to signal termination

In your trivial example, it's easy enough. Conceptually:

struct State { unsigned i, j; };

State make();

void next(State&);

bool isDone(State const&);

Of course, we wrap this as a proper class:

class PairSequence:
    // (implicit aliases)
    public std::iterator<
        std::input_iterator_tag,
        std::pair<unsigned, unsigned>
    >
{
  // C++03
  typedef void (PairSequence::*BoolLike)();
  void non_comparable();
public:
  // C++11 (explicit aliases)
  using iterator_category = std::input_iterator_tag;
  using value_type = std::pair<unsigned, unsigned>;
  using reference = value_type const&;
  using pointer = value_type const*;
  using difference_type = ptrdiff_t;

  // C++03 (explicit aliases)
  typedef std::input_iterator_tag iterator_category;
  typedef std::pair<unsigned, unsigned> value_type;
  typedef value_type const& reference;
  typedef value_type const* pointer;
  typedef ptrdiff_t difference_type;

  PairSequence(): done(false) {}

  // C++11
  explicit operator bool() const { return !done; }

  // C++03
  // Safe Bool idiom
  operator BoolLike() const {
    return done ? 0 : &PairSequence::non_comparable;
  }

  reference operator*() const { return ij; }
  pointer operator->() const { return &ij; }

  PairSequence& operator++() {
    static unsigned const Max = std::numeric_limts<unsigned>::max();

    assert(!done);

    if (ij.second != Max) { ++ij.second; return *this; }
    if (ij.first != Max) { ij.second = 0; ++ij.first; return *this; }

    done = true;
    return *this;
  }

  PairSequence operator++(int) {
    PairSequence const tmp(*this);
    ++*this;
    return tmp;
  }

private:
  bool done;
  value_type ij;
};

So hum yeah... might be that C++ is a tad more verbose :)

Solution 2

In C++ there are iterators, but implementing an iterator isn't straightforward: one has to consult the iterator concepts and carefully design the new iterator class to implement them. Thankfully, Boost has an iterator_facade template which should help implementing the iterators and iterator-compatible generators.

Sometimes a stackless coroutine can be used to implement an iterator.

P.S. See also this article which mentions both a switch hack by Christopher M. Kohlhoff and Boost.Coroutine by Oliver Kowalke. Oliver Kowalke's work is a followup on Boost.Coroutine by Giovanni P. Deretta.

P.S. I think you can also write a kind of generator with lambdas:

std::function<int()> generator = []{
  int i = 0;
  return [=]() mutable {
    return i < 10 ? i++ : -1;
  };
}();
int ret = 0; while ((ret = generator()) != -1) std::cout << "generator: " << ret << std::endl;

Or with a functor:

struct generator_t {
  int i = 0;
  int operator() () {
    return i < 10 ? i++ : -1;
  }
} generator;
int ret = 0; while ((ret = generator()) != -1) std::cout << "generator: " << ret << std::endl;

P.S. Here's a generator implemented with the Mordor coroutines:

#include <iostream>
using std::cout; using std::endl;
#include <mordor/coroutine.h>
using Mordor::Coroutine; using Mordor::Fiber;

void testMordor() {
  Coroutine<int> coro ([](Coroutine<int>& self) {
    int i = 0; while (i < 9) self.yield (i++);
  });
  for (int i = coro.call(); coro.state() != Fiber::TERM; i = coro.call()) cout << i << endl;
}

Solution 3

Since Boost.Coroutine2 now supports it very well (I found it because I wanted to solve exactly the same yield problem), I am posting the C++ code that matches your original intention:

#include <stdint.h>
#include <iostream>
#include <memory>
#include <boost/coroutine2/all.hpp>

typedef boost::coroutines2::coroutine<std::pair<uint16_t, uint16_t>> coro_t;

void pair_sequence(coro_t::push_type& yield)
{
    uint16_t i = 0;
    uint16_t j = 0;
    for (;;) {
        for (;;) {
            yield(std::make_pair(i, j));
            if (++j == 0)
                break;
        }
        if (++i == 0)
            break;
    }
}

int main()
{
    coro_t::pull_type seq(boost::coroutines2::fixedsize_stack(),
                          pair_sequence);
    for (auto pair : seq) {
        print_pair(pair);
    }
    //while (seq) {
    //    print_pair(seq.get());
    //    seq();
    //}
}

In this example, pair_sequence does not take additional arguments. If it needs to, std::bind or a lambda should be used to generate a function object that takes only one argument (of push_type), when it is passed to the coro_t::pull_type constructor.

Solution 4

All answers that involve writing your own iterator are completely wrong. Such answers entirely miss the point of Python generators (one of the language's greatest and unique features). The most important thing about generators is that execution picks up where it left off. This does not happen to iterators. Instead, you must manually store state information such that when operator++ or operator* is called anew, the right information is in place at the very beginning of the next function call. This is why writing your own C++ iterator is a gigantic pain; whereas, generators are elegant, and easy to read+write.

I don't think there is a good analog for Python generators in native C++, at least not yet (there is a rummor that yield will land in C++17). You can get something similarish by resorting to third-party (e.g. Yongwei's Boost suggestion), or rolling your own.

I would say the closest thing in native C++ is threads. A thread can maintain a suspended set of local variables, and can continue execution where it left off, very much like generators, but you need to roll a little bit of additional infrastructure to support communication between the generator object and its caller. E.g.

// Infrastructure

template <typename Element>
class Channel { ... };

// Application

using IntPair = std::pair<int, int>;

void yield_pairs(int end_i, int end_j, Channel<IntPair>* out) {
  for (int i = 0; i < end_i; ++i) {
    for (int j = 0; j < end_j; ++j) {
      out->send(IntPair{i, j});  // "yield"
    }
  }
  out->close();
}

void MyApp() {
  Channel<IntPair> pairs;
  std::thread generator(yield_pairs, 32, 32, &pairs);
  for (IntPair pair : pairs) {
    UsePair(pair);
  }
  generator.join();
}

This solution has several downsides though:

  1. Threads are "expensive". Most people would consider this to be an "extravagant" use of threads, especially when your generator is so simple.
  2. There are a couple of clean up actions that you need to remember. These could be automated, but you'd need even more infrastructure, which again, is likely to be seen as "too extravagant". Anyway, the clean ups that you need are:
    1. out->close()
    2. generator.join()
  3. This does not allow you to stop generator. You could make some modifications to add that ability, but it adds clutter to the code. It would never be as clean as Python's yield statement.
  4. In addition to 2, there are other bits of boilerplate that are needed each time you want to "instantiate" a generator object:
    1. Channel* out parameter
    2. Additional variables in main: pairs, generator

Solution 5

Using range-v3:

#include <iostream>
#include <tuple>
#include <range/v3/all.hpp>

using namespace std;
using namespace ranges;

auto generator = [x = view::iota(0) | view::take(3)] {
    return view::cartesian_product(x, x);
};

int main () {
    for (auto x : generator()) {
        cout << get<0>(x) << ", " << get<1>(x) << endl;
    }

    return 0;
}
Share:
65,224
Noah Watkins
Author by

Noah Watkins

Updated on November 06, 2021

Comments

  • Noah Watkins
    Noah Watkins over 2 years

    I've got some example Python code that I need to mimic in C++. I do not require any specific solution (such as co-routine based yield solutions, although they would be acceptable answers as well), I simply need to reproduce the semantics in some manner.

    Python

    This is a basic sequence generator, clearly too large to store a materialized version.

    def pair_sequence():
        for i in range(2**32):
            for j in range(2**32):
                yield (i, j)
    

    The goal is to maintain two instances of the sequence above, and iterate over them in semi-lockstep, but in chunks. In the example below the first_pass uses the sequence of pairs to initialize the buffer, and the second_pass regenerates the same exact sequence and processes the buffer again.

    def run():
        seq1 = pair_sequence()
        seq2 = pair_sequence()
    
        buffer = [0] * 1000
        first_pass(seq1, buffer)
        second_pass(seq2, buffer)
        ... repeat ...
    

    C++

    The only thing I can find for a solution in C++ is to mimic yield with C++ coroutines, but I haven't found any good reference on how to do this. I'm also interested in alternative (non general) solutions for this problem. I do not have enough memory budget to keep a copy of the sequence between passes.

  • Noah Watkins
    Noah Watkins over 12 years
    I accepted your answer (thanks!) because it is technically correct for the question i gave. Do you have any pointers for techniques in cases where the sequence that needs to be generated is more complex, or am I just beating a dead horse here with C++ and really coroutines are the only way for generality?
  • Matthieu M.
    Matthieu M. over 12 years
    @NoahWatkins: coroutines make for an easy implementation when languages support them. Unfortunately C++ does not, so iteration is easier. If you really need coroutines, you actually need a full-blown thread to hold the "stack" of your function call on the side. It's definitely overkill to open such a can of worms just for that in this example, but your mileage may vary depending on your actual needs.
  • boycy
    boycy over 9 years
    A non-thread-based coroutine implementation is available in Boost boost.org/doc/libs/1_57_0/libs/coroutine/doc/html/index.html with a proposal for standardization here: open-std.org/jtc1/sc22/wg21/docs/papers/2014/n3985.pdf
  • Matthieu M.
    Matthieu M. over 9 years
    @boycy: There are actually multiple proposal for coroutines, notably one stack-less and the other stack-full. It's tough nut to crack, so for now I am waiting. In the meantime though, stack-less coroutines are implementable near directly as Input Iterators (just, without the sugar).
  • Огњен Шобајић
    Огњен Шобајић about 8 years
    Yet similar, iterators are not the same as generators.
  • acgtyrant
    acgtyrant over 7 years
    This practice is out of date, we should use explicit operator bool() instead of Safe Bool idiom, using instead of typedef.
  • Matthieu M.
    Matthieu M. over 7 years
    @acgtyrant: using vs typedef is mostly a matter of preference here, though I'll admit to preferring using because it lines up the defined names nicely. On the other hand, I totally agree with the explicit, from C++11 onward it's definitely the recommendation. I updated the answer, care to give it another review?
  • acgtyrant
    acgtyrant over 7 years
    @MatthieuM. I think C++03 does not has nullptr yet...And you can replace static unsigned const by static unsigned constexpr in C++11 too.
  • Matthieu M.
    Matthieu M. over 7 years
    @acgtyrant: Ugh! Good point on nullptr. Not sure constexpr gains us much here though, since we do not need a compile-time value.
  • Tim Rae
    Tim Rae over 7 years
    This code would read MUCH nicer if you split it up into two separate C++03 and C++11 versions... (Or just get rid of C++03 altogether; people shouldn't be writing new code with it)
  • Matthieu M.
    Matthieu M. over 7 years
    @TimRae: Oh boy, I don't know where you work, but my previous company is still stuck (for most of its code) with a compiler that does still has the experimental -std=c++0x flag. Needless to say, migrating to a newer compiler (which has been in the books for a year or two already) is a pre-requisite before being able to actually use C++11 (or later). As for splitting... might be a good idea, though then again there's not much difference.
  • Fred Schoen
    Fred Schoen about 7 years
    I implemented this answer and it works fine. In the end, I have a loop like for(PairSequence s; s; ++s) {...}. But what is the point of inheriting from std::iterator? How would I use the generator with STL algorithms like for_each, for example?
  • Matthieu M.
    Matthieu M. about 7 years
    @FredSchoen: For for_each you would need some more work: a constructor for an "end" iterator (setting the done boolean to true) and a comparison function (== & !=) that compares this boolean. Then it should model an input_iterator appropriately and thus work with for_each.
  • EvertW
    EvertW almost 7 years
    Downmarked the answer: iterators are NOT generators. Generators are much more elegant than iterators. Python has iterators as well as generators. C++ is working on coroutines (as in boost::coroutine2 and experimental work in clang) but they will not be included in C++17. Coroutines would allow generators in C++.
  • Matthieu M.
    Matthieu M. almost 7 years
    @EvertW: From a client point of view, iterators are generators. Of course, they are much more unwieldy to write. Coroutines may require revisiting this answer finally, but I'll hold off until they are standard first... and there might be performance implication in the sugar that may warrant continuing hand-coding the iterators.
  • EvertW
    EvertW almost 7 years
    @MatthieuM.: True, but the question was about the provider-side of generators, not the client side... Anyway, I am looking forward to co-routines becoming standard. Implementing generators might be a useful bit of sugar coating, but replacing threads by coroutines in cases where this is feasible will be an enormous improvement!
  • Weston
    Weston over 6 years
    Note that Coroutine2 requires c++11, for which visual studio 2013 is insufficient as it is only partially supported.
  • sudo
    sudo over 5 years
    Maybe you could abuse macros to get generator syntax in C++. But I won't do that. Sad C++ doesn't have this ability, of all things.
  • Edy
    Edy about 5 years
    You are confusing syntax with functionality. A few answers above actually allow the C++ to pick up execution from where it's left off during the last call. It's nothing magical. As a matter of fact, Python is implemented in C, so whatever is possible in Python is possible in C, albeit not as convenient.
  • Kaitain
    Kaitain almost 4 years
    @edy Isn’t that already addressed in the first paragraph? He’s not claiming that equivalent functionality can’t be created in conventional C++, only that it’s “a gigantic pain”.
  • Edy
    Edy almost 4 years
    @Kaitain The question here isn't whether it's a pain to do generator in C++, but whether there is a pattern to do so. His claims that the approach "miss the point", that the "closest thing" is threads... are just misleading. Is it a pain? One could read through the other answers and decide for himself.
  • Kaitain
    Kaitain almost 4 years
    @edy But doesn't this end up being a vacuous point, given that all Turing-complete languages are ultimately capable of the same functionality? "Whatever is possible in X is possible in Y" is guaranteed to be true for all such languages, but that doesn't seem to me to be a very illuminating observation.
  • Edy
    Edy almost 4 years
    @Kaitain Precisely because all Turing-complete languages supposedly should have the same capability, thus the question of how to implement one feature in another language is legitimate. Nothing that Python has cannot be accomplished by another language; the question is efficiency and maintainability. In both regard, C++ would be a fine(r) choice.
  • Kaitain
    Kaitain almost 4 years
    No, the key question is elegance.
  • Paul
    Paul over 3 years
    Thanks for the example. std::iterator is deprecated in C++17, perhaps you'd like to update the code.
  • Desmond Gold
    Desmond Gold over 2 years
    c++20 has coroutines but generators were not shipped. (but proposed) you can just create a generator on your own.
  • jlh
    jlh over 2 years
    The whole point of Python's generators is the very convenient syntax. So this answer is indeed very correct: There is no equivalent in C++ and other answers do miss the point. Obviously, functionally equivalent things can be done in C++, but I believe that's not what this question is about.
  • Justin Meiners
    Justin Meiners about 2 years
    @Paul iterators are not deprecated. Is there a source that suggested that?
  • Paul
    Paul about 2 years
    @JustinMeiners it says that here, for example: en.cppreference.com/w/cpp/iterator/iterator
  • Justin Meiners
    Justin Meiners about 2 years
    @Paul it says the Distance type of iterator is deprecated, not the concept. I am only seeing one instance of "deprecated" on the page?
  • Paul
    Paul about 2 years
    @JustinMeiners isn't it exactly what you're using in your example? I just mentioned that it's deprecated and that perhaps you'd like to update the code. I didn't say anything about the concept of iterators.