What is the 'pythonic' equivalent to the 'fold' function from functional programming?

124,191

Solution 1

The Pythonic way of summing an array is using sum. For other purposes, you can sometimes use some combination of reduce (from the functools module) and the operator module, e.g.:

def product(xs):
    return reduce(operator.mul, xs, 1)

Be aware that reduce is actually a foldl, in Haskell terms. There is no special syntax to perform folds, there's no builtin foldr, and actually using reduce with non-associative operators is considered bad style.

Using higher-order functions is quite pythonic; it makes good use of Python's principle that everything is an object, including functions and classes. You are right that lambdas are frowned upon by some Pythonistas, but mostly because they tend not to be very readable when they get complex.

Solution 2

Starting Python 3.8, and the introduction of assignment expressions (PEP 572) (:= operator), which gives the possibility to name the result of an expression, we can use a list comprehension to replicate what other languages call fold/foldleft/reduce operations:

Given a list, a reducing function and an accumulator:

items = [1, 2, 3, 4, 5]
f = lambda acc, x: acc * x
accumulator = 1

we can fold items with f in order to obtain the resulting accumulation:

[accumulator := f(accumulator, x) for x in items]
# accumulator = 120

or in a condensed formed:

acc = 1; [acc := acc * x for x in [1, 2, 3, 4, 5]]
# acc = 120

Note that this is actually also a "scanleft" operation as the result of the list comprehension represents the state of the accumulation at each step:

acc = 1
scanned = [acc := acc * x for x in [1, 2, 3, 4, 5]]
# scanned = [1, 2, 6, 24, 120]
# acc = 120

Solution 3

Haskell

foldl (+) 0 [1,2,3,4,5]

Python

reduce(lambda a,b: a+b, [1,2,3,4,5], 0)

Obviously, that is a trivial example to illustrate a point. In Python you would just do sum([1,2,3,4,5]) and even Haskell purists would generally prefer sum [1,2,3,4,5].

For non-trivial scenarios when there is no obvious convenience function, the idiomatic pythonic approach is to explicitly write out the for loop and use mutable variable assignment instead of using reduce or a fold.

That is not at all the functional style, but that is the "pythonic" way. Python is not designed for functional purists. See how Python favors exceptions for flow control to see how non-functional idiomatic python is.

Solution 4

In Python 3, the reduce has been removed: Release notes. Nevertheless you can use the functools module

import operator, functools
def product(xs):
    return functools.reduce(operator.mul, xs, 1)

On the other hand, the documentation expresses preference towards for-loop instead of reduce, hence:

def product(xs):
    result = 1
    for i in xs:
        result *= i
    return result

Solution 5

Not really answer to the question, but one-liners for foldl and foldr:

a = [8,3,4]

## Foldl
reduce(lambda x,y: x**y, a)
#68719476736

## Foldr
reduce(lambda x,y: y**x, a[::-1])
#14134776518227074636666380005943348126619871175004951664972849610340958208L
Share:
124,191

Related videos on Youtube

mistertim
Author by

mistertim

I'm a Software Engineer based in London, contracting at various places. My interests and competencies include: Functional programming, Information Retrieval, Machine Learning, NLP, and Agent-based Simulation. Technologies I like, or know a bit about include : Haskell, Scala, Java, Ruby, R, Postgresql, Redis, Akka, Erlang, Hadoop, Mahout.

Updated on September 21, 2020

Comments

  • mistertim
    mistertim over 3 years

    What is the most idiomatic way to achieve something like the following, in Haskell:

    foldl (+) 0 [1,2,3,4,5]
    --> 15
    

    Or its equivalent in Ruby:

    [1,2,3,4,5].inject(0) {|m,x| m + x}
    #> 15
    

    Obviously, Python provides the reduce function, which is an implementation of fold, exactly as above, however, I was told that the 'pythonic' way of programming was to avoid lambda terms and higher-order functions, preferring list-comprehensions where possible. Therefore, is there a preferred way of folding a list, or list-like structure in Python that isn't the reduce function, or is reduce the idiomatic way of achieving this?

    • JBernardo
      JBernardo about 12 years
      sum isn't good enough?
    • jamylak
      jamylak about 12 years
      not sure if this is a good example for your question. It can easily be achieved with sum, you may want to provide some different types of examples.
    • mistertim
      mistertim about 12 years
      Hey JBernardo - Summing over a list of numbers was meant as a rather degenerate example, I'm more interested in the general idea of accumulating the elements of a list using some binary operation, and a starting value, not summing integers specifically.
    • Joel Cornett
      Joel Cornett about 12 years
      @mistertim: sum() actually provides limited functionality with this. sum([[a], [b, c, d], [e, f]], []) returns [a, b, c, d, e, f] for example.
    • lvc
      lvc almost 11 years
      Although the case of doing it with lists is a good demonstration of things to watch for with this technique - + on lists is a linear time operation in both time and memory, making the whole call quadratic. Using list(itertools.chain.from_iterable([a], [b,c,d],[e,f],[]]) is linear overall - and if you only need to iterate over it once, you can drop the call to list to make it constant in terms of memory.
    • dividebyzero
      dividebyzero over 7 years
      It should be noted foldLeft is generally something different from reduce... You can only reduce on monoids. foldLeft is a much more general thing.
  • JBernardo
    JBernardo about 12 years
    Higher-order functions may not always be pythonic. Otherwise why would reduce be removed from the Py3K builtin namespace?
  • Fred Foo
    Fred Foo about 12 years
    @JBernardo: you're saying that anything not in the builtins module is not pythonic?
  • jamylak
    jamylak about 12 years
    This would not be considered pythonic for an example such as this one.
  • JBernardo
    JBernardo about 12 years
    No, that would be stupid to say. But give me a single reason why do you think GvR would hate so much the reduce function at the point of removing it from builtins?
  • Fred Foo
    Fred Foo about 12 years
    Python loops are notoriously slow. Using (or abusing) reduce is a common way of optimizing a Python program.
  • JBernardo
    JBernardo about 12 years
    @larsmans Please, don't come to say reduce is faster than a simple loop... It will have always a function call overhead for each iteration. Also, again, Pypy can optimize loops to C speed
  • Fred Foo
    Fred Foo about 12 years
    @JBernardo: because people try to play too smart tricks with it. To quote from that blog post, "the applicability of reduce() is pretty much limited to associative operators, and in all other cases it's better to write out the accumulation loop explicitly." So, its use is limited, but even GvR apparently had to admit its useful enough to keep it in the standard library.
  • Fred Foo
    Fred Foo about 12 years
    @JBernardo: yes, that's what I'm claiming. I just profiled my version of product against one in your style, and it's faster (marginally, though).
  • Admin
    Admin about 12 years
    @JBernardo Assuming a builtin function (like operator.add) as argument to reduce: That extra call is a C call (which is much cheaper than a Python call), and it saves dispatching and interpreting a couple of bytecode instructions, which can easily cause dozens of function calls.
  • Wes
    Wes about 12 years
    @JBernardo, so does that mean that every usage of fold in Haskell and Scheme is equally bad? It's just a different style of programming, ignoring it and putting your fingers in your ears and saying it's unclear doesn't make it so. Like most things that are a different style it takes practice to get used to it. The idea is to put things into general categories so it's easier to reason about programs. "Oh I want to do this, hmm, looks like a fold" (or a map, or an unfold, or an unfold then a fold over that)
  • Fred Foo
    Fred Foo about 12 years
    @Wes: I was thinking primarily of functions passed to reduce or map that perform side-effects, or that take and return long tuples to perform too many computations at once. In Haskell, using foldl may sometimes the only reasonable option, but even in Scheme, there is a point where the loop gets so complicated that you want a named let instead.
  • Wes
    Wes about 12 years
    @larsmans: I personally try to avoid using lambdas in both Scheme and Haskell because I think it makes things less readable. On this point I agree 100% with pythonistas.
  • KayEss
    KayEss almost 11 years
    You swap the arguments to f around in your recursive case.
  • cemper93
    cemper93 over 9 years
    Because Python lacks tail recursion, this will break on longer lists and is wasteful. Furthermore, this is not truly the "fold" function, but merely a left fold, i.e. foldl, that is, exactly what reduce already offers (note that reduce's function signature is reduce(function, sequence[, initial]) -> value - it, too, includes the functionality of giving an initial value for the accumulator).
  • golem
    golem about 9 years
    Lambda in Python can't contain more than one expression. You can't make it complex even if you try hard. So Pythonistas who don't like them are probably just not used to and hence don't like functional programming style.
  • John 9631
    John 9631 almost 7 years
    For any current readers, the preference is list comprehensions or generator expressions. They are faster than the loop above. Should speed be a concern, pypy will usually give a 3-5x speedup (not C speed I'm afraid) but numba and cython can both approach C speed with the right data structures.
  • itsbruce
    itsbruce over 6 years
    @John9631 List comprehensions or generators would be a bad substitute for a general-purpose fold in pretty much any language, certainly Haskell and Python. Folds carry an accumulator, which is very bad style in Python comprehensions and impossible in Haskell. In Python, if you want accumulated state, the orthodox way to do it is a for loop, not a comprehension.
  • itsbruce
    itsbruce over 6 years
    folds are useful to more than functional "purists". They are general purpose abstractions. Recursive problems are pervasive in computing. Folds offer a way to remove the boilerplate and a way to make recursive solutions safe in languages which don't natively support recursion. So a very practical thing. GvR's prejudices in this area are unfortunate.
  • clay
    clay over 6 years
    reduce wasn't removed from the Python 3 standard library. reduce moved to the functools module as you show.
  • Kyr
    Kyr over 6 years
    @clay, I just took the phrase from Guido's release notes, but you may be right :)
  • Mateen Ulhaq
    Mateen Ulhaq over 5 years
    I think this is a better way to write your foldr: reduce(lambda y, x: x**y, reversed(a)). It now has a more natural usage, works with iterators, and consumes less memory.
  • Scott Skiles
    Scott Skiles about 5 years
    Hi rq_! I think your answer would be improved and add a great deal if you gave a non-trivial example of fold that is difficult to do cleanly in Python, and then "fold" that in Python :-)
  • iono
    iono about 3 years
    It's utterly bizarre to me that JavaScript has syntactically cleaner and more useful lambdas and higher-order functions than Python. This is really upsetting; Python is otherwise such a well-designed and attractive language.
  • dark_ruby
    dark_ruby over 2 years
    this should be an accepted answer and not a suggestion to use sum
  • Roman Kogan
    Roman Kogan over 2 years
    (Attn: @mistertim) OK, so this is the actual answer. I am so sad that this not the top answer, and that many other pages say that the pythonic analog of reduce is, well, reduce() For the sake of Googlability: the pythonic analog of reduce is using := to introduce an accumulator into a list comprehension. Please change the "accepted answer" to this
  • D0SBoots
    D0SBoots almost 2 years
    This isn't truly a fold, though. As noted, it's a "scanleft" that's throwing away the temporary list after. For a long list, that difference could matter.
  • D0SBoots
    D0SBoots almost 2 years
    Why this can matter: >>> timeit.Timer("acc=0\nfor i in range(10000):acc+=i").repeat(5, 100) [0.036283817142248154, 0.032444920390844345, 0.03235280327498913, 0.032462552189826965, 0.03250854276120663] >>> timeit.Timer("acc=0\n[acc:=acc+i for i in range(10000)]").repeat(5, 100) [0.04360721819102764, 0.03987771272659302, 0.03988228552043438, 0.03988049365580082, 0.039858657866716385]