python map a lambda function to a list

Solution 1

"because the lambda function needs to be created len(data) times, thus inefficient."

Not true. In the example, the lambda body is compiled once, and the lambda expression is evaluated once, when the arguments to map are evaluated, not len(data) times. There is no need to assign it to a name for performance reasons. See Sergey's answer, which shows that the lambda costs nothing extra in this case.
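One way to check this yourself is to count how many function objects get built. The sketch below uses a hypothetical factory, make_doubler (not part of the question), that returns a fresh lambda and records each time it does so:

```python
creations = 0  # how many function objects have been built so far

def make_doubler():
    """Hypothetical helper: builds a new lambda and records that it did."""
    global creations
    creations += 1
    return lambda x: 2 * x

data = list(range(1000))

# Arguments are evaluated once, before map() is called, so the factory
# runs a single time no matter how long data is.
new_data = list(map(make_doubler(), data))
```

After this runs, creations is 1 even though data has 1000 elements: map receives one function object and calls it for each item.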

If you do want to give it a name for the sake of clarity, use a def statement instead. Assigning a lambda to a name is considered bad style. Quoting the Programming Recommendations in PEP 8, the official style guide: "Always use a def statement instead of an assignment statement that binds a lambda expression directly to an identifier".

Yes:

def f(x): return 2*x

No:

f = lambda x: 2*x

The only difference between a lambda and a one-line def is that def also binds the function to a name (which costs one extra store instruction in the enclosing code). The function bodies themselves compile identically:

>>> import dis

>>> def _(x):
...     return f(x, 30)
...

>>> dis.dis(_)
  2           0 LOAD_GLOBAL              0 (f)
              2 LOAD_FAST                0 (x)
              4 LOAD_CONST               1 (30)
              6 CALL_FUNCTION            2
              8 RETURN_VALUE

>>> dis.dis(lambda x: f(x, 30))
  1           0 LOAD_GLOBAL              0 (f)
              2 LOAD_FAST                0 (x)
              4 LOAD_CONST               1 (30)
              6 CALL_FUNCTION            2
              8 RETURN_VALUE

As you can see above, both forms compile to the same bytecode.

The Lisp-inspired functions map, filter and reduce have always felt a bit alien in Python. Since list comprehensions were introduced (in Python 2.0), comprehensions have become the idiomatic way to achieve the same result. So this:

new_data = map(lambda x: f(x, 30), data)

is often written as:

new_data = [f(x, 30) for x in data]

If data is big and you are just iterating over it once, a generator expression saves memory: it produces the results one at a time instead of building the whole list first.

for value in (f(x, 30) for x in data):
    do_something_with(value)
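A rough sketch of the memory difference, using sys.getsizeof. The function f here is a hypothetical stand-in for the f in the question, and getsizeof only measures the container itself, not the objects it refers to:

```python
import sys

def f(x, n):
    """Hypothetical stand-in for the f used in the examples."""
    return x + n

data = range(100_000)

as_list = [f(x, 30) for x in data]   # materializes every result up front
as_gen = (f(x, 30) for x in data)    # produces results one at a time

list_size = sys.getsizeof(as_list)   # grows with len(data)
gen_size = sys.getsizeof(as_gen)     # small, independent of len(data)

total = sum(as_gen)  # consuming the generator yields the same values
```

Keep in mind that a generator can only be consumed once, so this only fits the "just iterating over it" case.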

Of the lispy constructs, only reduce was actually retired from the builtins: in Python 3 it lives in the functools module, while map and filter remain builtins that return lazy iterators. I still recommend list comprehensions and generator expressions in new code.
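For reference, a short sketch of how this looks in Python 3, side by side with the comprehension equivalents (the sample data is made up for illustration):

```python
from functools import reduce  # reduce is no longer a builtin in Python 3

data = [1, 2, 3, 4]

# map and filter are still builtins, but return lazy iterators
doubled = map(lambda x: 2 * x, data)
evens = filter(lambda x: x % 2 == 0, data)

# reduce has to be imported from functools
total = reduce(lambda acc, x: acc + x, data, 0)

# Comprehension / builtin equivalents
doubled_list = [2 * x for x in data]
evens_list = [x for x in data if x % 2 == 0]
total_sum = sum(data)
```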

Last, Python is surprisingly counterintuitive regarding performance. You should always profile in order to put your beliefs about performance in check.

Bottom line: never worry about "optimizing" a damn thing until you have profiled it and know for sure it's a relevant bottleneck.

Solution 2

The lambda is created only once, when map is called:

In [20]: l = list(range(100000))

In [21]: %timeit list(map(lambda x: x * 2, l))
100 loops, best of 3: 13.8 ms per loop

In [22]: g = lambda x: x * 2

In [23]: %timeit list(map(g, l))
100 loops, best of 3: 13.8 ms per loop

As you can see, the execution time is the same either way.
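If you are not using IPython, roughly the same comparison can be sketched with the standard timeit module. The repeat count of 10 is an arbitrary choice, and the absolute numbers will differ from machine to machine:

```python
import timeit

data = list(range(100_000))
g = lambda x: x * 2  # bound to a name only for the sake of the comparison

# Time building the full list with an inline lambda vs. a pre-built one
inline = timeit.timeit(lambda: list(map(lambda x: x * 2, data)), number=10)
named = timeit.timeit(lambda: list(map(g, data)), number=10)

print(f"inline lambda: {inline:.3f}s  named lambda: {named:.3f}s")
```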

Author: nos

Updated on July 24, 2022

Comments

  • nos, over 1 year ago

    I have the impression that the following code pattern is not good

    new_data = map(lambda x: f(x, 30), data)
    

    because the lambda function needs to be created len(data) times, thus inefficient. In that case, would the following workaround help?

    g = lambda x: f(x, 30)
    new_data = map(g, data)
    

    Also, would replacing the lambda function with partial help with speed, given data is big?
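For the partial variant, a minimal sketch follows. It assumes f takes its second argument under the name n, which is a guess since the question does not show f's signature. Note that partial binds leading positional arguments, so fixing the second argument has to be done by keyword:

```python
from functools import partial

def f(x, n):
    """Hypothetical stand-in for the f in the question."""
    return x * n

data = [1, 2, 3]

# Fix the second argument by keyword; x is still supplied by map()
g = partial(f, n=30)
new_data = list(map(g, data))
```

Whether this is any faster than the lambda depends on f and on the Python version, so it is worth timing both on your real data before switching.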