Python: append vs. extend efficiency
Solution 1
import timeit

def append2x(foo):
    foo.append(1)
    foo.append(1)

def extend_lst(foo):
    foo.extend([1, 1])

def extend_tup(foo):
    foo.extend((1, 1))

l1 = []
l2 = []
l3 = []

print timeit.timeit('append2x(l1)', setup='from __main__ import append2x, l1')
print timeit.timeit('extend_lst(l2)', setup='from __main__ import extend_lst, l2')
print timeit.timeit('extend_tup(l3)', setup='from __main__ import extend_tup, l3')
Here's a simple benchmark. My results (OS X 10.5.8, Core 2 Duo, FWIW):
0.520906925201 #append
0.602569103241 #extend-list
0.357008934021 #extend-tuple
And the same ordering of the results on my Linux box (Ubuntu, x86-64, Core i7):
0.307395935059 #append
0.319436073303 #extend-list
0.238317012787 #extend-tuple
To me, this says that extend is quicker than append, but that creating a list is relatively expensive compared to creating a tuple.
EDIT
As pointed out in the comments below, because of the immutability of tuples, the interpreter can optimize the creation of the tuple away (it creates the tuple once and re-uses it over and over). If we change the code to:
def extend_lst(foo):
    v = 1
    foo.extend([v, v])

def extend_tup(foo):
    v = 1
    foo.extend((v, v))
The timings are virtually identical:
0.297003984451 #append
0.344678163528 #extend-list
0.292304992676 #extend-tuple
Although the tuple version still consistently beats the list version and barely edges out the append version for all of the trials I have done.
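The literal-tuple optimization described above is visible directly in the bytecode. Here is a minimal sketch using the dis module (Python 3 syntax, while the answer's own code is Python 2; the constant-folding of literal tuples works the same way in CPython for both):

```python
import dis

def extend_tup(foo):
    foo.extend((1, 1))  # literal tuple: compiled to a single constant

def extend_lst(foo):
    foo.extend([1, 1])  # literal list: rebuilt on every call

tup_ops = [ins.opname for ins in dis.get_instructions(extend_tup)]
lst_ops = [ins.opname for ins in dis.get_instructions(extend_lst)]

print("BUILD_LIST" in tup_ops)  # False: the tuple is loaded as one LOAD_CONST
print("BUILD_LIST" in lst_ops)  # True: the list is constructed each call
```

With variables instead of literals, as in the revised functions above, the tuple version also needs a BUILD_TUPLE at call time, which is why the timings converge.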
One thing that I'm taking away from this is that if you're iterating over an object that consists of all literals, choose a tuple over a list. If it doesn't consist entirely of literals, then it really doesn't matter whether you choose a list or a tuple.
Solution 2
It is also worth pointing out that the answer to this question hinges on the small size of the list/tuple that is added on each iteration. For larger lists, extend is clearly superior (and lists vs tuples does not make a difference). Starting with mgilson's answer, I checked behaviour for collections with 600 items, rather than 2:
Calling append 600 times takes 8 times as long as using extend() with a manually defined list/tuple (i.e. [v,v,v,v,v,v,v...]):
42.4969689846
5.45146393776
5.38034892082
The bulk of these five seconds is actually the list/tuple creation. Preparing it before the timeit call brings times for extend down to
1.42491698265
0.657584905624
for list and tuple, respectively.
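The precomputed-collection comparison can be reproduced with something like the following (a sketch in Python 3 syntax; the iteration count is arbitrary and absolute timings depend entirely on the machine):

```python
import timeit

# Build the 600-item collections once, outside the timed statement,
# so only the extend() call itself is measured.
setup = "data_list = list(range(600)); data_tup = tuple(range(600))"

t_list = timeit.timeit("l = []; l.extend(data_list)", setup=setup, number=100000)
t_tup = timeit.timeit("l = []; l.extend(data_tup)", setup=setup, number=100000)

print("extend(list):  %.3fs" % t_list)
print("extend(tuple): %.3fs" % t_tup)
```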
For a more realistic (and fairer) case, one can dynamically generate the data within the function call:
import timeit

def append_loop(foo, reps):
    for i in range(reps):
        foo.append(i)

def append_comp(foo, reps):
    [foo.append(i) for i in range(reps)]

def extend_lst(foo, reps):
    foo.extend([i for i in range(reps)])

def extend_tup(foo, reps):
    foo.extend((i for i in range(reps)))

repetitions = 600

print timeit.timeit('append_loop([], repetitions)', setup='from __main__ import append_loop, repetitions')
print timeit.timeit('append_comp([], repetitions)', setup='from __main__ import append_comp, repetitions')
print timeit.timeit('extend_lst([], repetitions)', setup='from __main__ import extend_lst, repetitions')
print timeit.timeit('extend_tup([], repetitions)', setup='from __main__ import extend_tup, repetitions')
(Append is implemented both via for-loop and list comprehension to factor out efficiency differences between the two ways of looping.)
Timings are:
53.8211231232
57.1711571217
19.8829259872
28.5986201763
As we can see, extend with list comprehension is still over two times faster than appending. Also, tuple comprehension appears noticeably slower than list comprehension, and append_comp only introduces unnecessary list creation overhead.
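Whichever variant wins on speed, it is worth sanity-checking that the four strategies really build the same list. A quick sketch in Python 3 syntax (note, as pointed out in the comments below, that extend_tup actually receives a generator expression, not a tuple):

```python
def append_loop(foo, reps):
    for i in range(reps):
        foo.append(i)

def append_comp(foo, reps):
    [foo.append(i) for i in range(reps)]

def extend_lst(foo, reps):
    foo.extend([i for i in range(reps)])

def extend_tup(foo, reps):
    foo.extend((i for i in range(reps)))  # a genexp, not a tuple

results = []
for fn in (append_loop, append_comp, extend_lst, extend_tup):
    out = []
    fn(out, 5)
    results.append(out)

print(all(r == [0, 1, 2, 3, 4] for r in results))  # True
```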
Updated on January 31, 2020

Comments
-
Yam Mesicka over 4 years
Here is some code that I wrote using Python:
from math import sqrt

abundant_list = []

for i in range(12, 28123+1):
    dividor_list = [1]
    for j in range(2, int(sqrt(i))+1):
        if i % j == 0:
            dividor_list.extend([i/j, j])
    if sum(dividor_list) > i:
        abundant_list.append(i)

print abundant_list
As you can see, the code really tries to be as efficient as possible.
Is there any difference if I use list.append twice, or list.extend just once? I know the differences may be minor, but I would really like to know :)
-
NPE over 11 years: If you'd like to know, measure.
-
mgilson over 11 years: I would be pretty surprised if extend wasn't faster than two .appends.
-
Lauritz V. Thaulow over 11 years: If I were to optimize this, I'd use the sieve of Eratosthenes to find primes up to sqrt(28123), then for each i, I'd factorize it and use itertools.product to get all ways to combine the factors into divisors, and finally sum those.
-
Yam Mesicka over 11 years: Well, I just learned the time.clock function, and surprisingly, it shows that extend is much slower... will edit my message.
-
Admin over 11 years: The tuple version is faster because the tuple you use is a literal and hence re-used (cf. the bytecode), so it doesn't have to be constructed again and again. Use variables (e.g. pass an object to append to all functions) and the difference will be reduced.
-
Admin over 11 years: You'll have to include your timing code, as that's easy to mess up. And since you apparently didn't use timeit, you'll have to justify not using timeit ;-)
-
Steven Rumbalski over 11 years: mgilson's benchmark suggests dividor_list.extend((i/j,j)) is more efficient than both because it doesn't create an intermediate list. Creating an intermediate tuple is cheaper.
-
Yam Mesicka over 11 years: Great answers guys, thanks. Always worth opening a new question, always learning new stuff :)
-
mgilson over 11 years: @delnan -- Great catch! Although tuple still edges it out after a few repeated runs.
-
Admin over 11 years: @StevenRumbalski No, it just shows that one LOAD_CONST is faster than 2 to 3 LOAD_FASTs, some other operations, and a BUILD_LIST ;-)
-
Steven Rumbalski over 11 years: @mgilson: Even with passing in variables, the tuple version takes 90% of the time of the individual appends.
-
Steven Rumbalski over 11 years: @delnan: Point taken. The tuple version still clocks a little faster on my system with your suggested modifications. But it's too little improvement to really care which is used.
-
mgilson over 11 years: @delnan -- I was working on it. I've added an edit explaining this stuff. I don't usually leave the poor timings in the answer, but in this case, I felt it was instructive to leave both versions to show how literal tuples are much faster than literal lists, but when using variables they're relatively close.
-
mgilson over 11 years: @Infinity -- time.clock is not the right way to time a piece of code. There are a lot of subtleties that you need to consider when trying to benchmark things (as you can see in the error of my original answer). The timeit module keeps you from having to worry about 90% of those subtleties -- SO users usually catch the other 10% ;-)
-
Yam Mesicka over 11 years: OK, next time I will know to use timeit :) Really appreciate your answers guys, learned a lot from them :)
-
mgilson over 11 years: @Infinity -- I learned something from this one too :) -- I should have looked at the bytecode before posting :-P.
-
Steven Rumbalski over 11 years: A better speedup is to cache the function lookup outside the nested loops: dividor_list_extend = dividor_list.extend.
-
yoch over 6 years: The latter (extend_tup) is in fact a genexp and not a tuple, which explains the slowness.
-
Marc Schulder over 6 years: You're right about the output type, my mistake. However, I just tested tuple comprehension and it provides the same speed as the genexp. List comprehension is still faster. Obviously, if the tuple was precomputed the call would be faster, but the same is true for a precomputed list.
-
yoch over 6 years: Your benchmark is biased, because it includes the time needed to build the lists and tuples. Without that, extending a list with a tuple is a bit faster (at least for me, using Python 3.5).
-
Zvi over 3 years: Your test is misleading, and a proper test shows that append is the fastest method. Both append tests that you used are appending one item at a time, but the extend() tests are for the whole list. When I changed the code of append_comp so the append is outside the list comprehension (as in the last 2 tests), the times (in my tests on my slow machine, using the JupyterLab interpreter) for the last 3 tests were 91, 104, and 148 seconds respectively.
-
Marc Schulder over 3 years: @Zvi, I'm not sure what you find misleading. The original question asked whether using append multiple times is faster or slower than using extend once, so that code difference is by design. Your suggested change does not provide the correct result, as it would append the entire reps list as a single item, rather than appending the individual elements, i.e. you would get [[1,2,3...]] instead of [1,2,3...].
-
Marc Schulder over 3 years: @yoch I included the list creation as part of the call for fairness, as including the iteration can't be avoided for append. You are correct, though, that the list comprehension accounts for most of the time. On my current setup with Python 3.7, extend with precomputed lists/tuples took 0.9 seconds, vs 17 and 25 seconds for list/generator comprehension. The difference between precomputed list and tuple is less than 0.04 seconds, though, so I'm not convinced that tuples would be consistently faster.
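For reference, the optimization Lauritz V. Thaulow sketches in the comments (a sieve of Eratosthenes up to sqrt(28123), then factorizing each number and combining prime powers with itertools.product) could look something like the following. This is a hedged sketch in Python 3 syntax, not the original poster's code; variable names are made up:

```python
from itertools import product
from math import isqrt

LIMIT = 28123

# Sieve of Eratosthenes up to sqrt(LIMIT): these primes suffice to factorize
# any i <= LIMIT, since at most one prime factor can exceed sqrt(i).
n = isqrt(LIMIT) + 1
sieve = [True] * (n + 1)
sieve[:2] = [False, False]
for p in range(2, isqrt(n) + 1):
    if sieve[p]:
        sieve[p*p::p] = [False] * len(sieve[p*p::p])
primes = [p for p, is_prime in enumerate(sieve) if is_prime]

def divisor_sum(i):
    """Sum of proper divisors of i, via prime factorization + itertools.product."""
    rest, factors = i, {}
    for p in primes:
        while rest % p == 0:
            factors[p] = factors.get(p, 0) + 1
            rest //= p
    if rest > 1:  # one leftover prime factor larger than sqrt(i)
        factors[rest] = factors.get(rest, 0) + 1
    # Every divisor is a product of one prime power per prime.
    powers = [[p**k for k in range(m + 1)] for p, m in factors.items()]
    total = 0
    for combo in product(*powers):
        d = 1
        for x in combo:
            d *= x
        total += d
    return total - i  # proper divisors: exclude i itself

abundant_list = [i for i in range(12, LIMIT + 1) if divisor_sum(i) > i]
print(abundant_list[:5])  # [12, 18, 20, 24, 30]
```

Whether this is actually faster than the simple trial-division loop in the question would need to be measured with timeit, per the advice above.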