What is the fastest way to merge two lists in Python?
Solution 1
You can just use concatenation:
merged = list_1 + list_2  # avoid naming the result "list", which shadows the built-in
If you don't need to keep list_1 around, you can just modify it:
list_1.extend(list_2)
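To make the difference concrete, here is a minimal sketch (the names merged, alias and result are illustrative, not from the answer above). The + operator builds a new list and leaves both inputs untouched, while extend mutates list_1 in place, so every other reference to that list sees the change.

```python
list_1 = [1, 2, 3, 4]
list_2 = [5, 6, 7, 8]

# Concatenation builds a new list; both inputs are left unchanged.
merged = list_1 + list_2
print(merged)   # [1, 2, 3, 4, 5, 6, 7, 8]
print(list_1)   # [1, 2, 3, 4]

# extend() mutates list_1 in place and returns None.
alias = list_1                    # second reference to the same list object
result = list_1.extend(list_2)
print(result)                     # None
print(alias)                      # [1, 2, 3, 4, 5, 6, 7, 8]
print(alias is list_1)            # True
```

Because extend returns None, writing list_1 = list_1.extend(list_2) would throw the merged list away.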
Solution 2
If you are using Python 3.5 or later, there is one more way to do this, and it is slightly faster (tested only on Python 3.7):
[*list_1, *list_2]
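One practical difference, independent of speed: the unpacking form accepts any iterable, whereas + requires both operands to be lists. A small sketch (tuple_2 and range_3 are illustrative names):

```python
list_1 = [1, 2, 3, 4]
tuple_2 = (5, 6)
range_3 = range(7, 9)

# Unpacking works with any mix of iterables and always yields a new list.
merged = [*list_1, *tuple_2, *range_3]
print(merged)  # [1, 2, 3, 4, 5, 6, 7, 8]

# The + operator refuses to combine a list with a non-list.
try:
    list_1 + tuple_2
    concat_failed = False
except TypeError as exc:
    concat_failed = True
    print("TypeError:", exc)
```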
Benchmark
from timeit import timeit
x = list(range(10000))
y = list(x)
def one():
    x + y
def two():
    [*x, *y]
print(timeit(one, number=1000, globals={'x':x, 'y': y}))
print(timeit(two, number=1000, globals={'x':x, 'y': y}))
0.10456193100253586
0.09631731400440913
Solution 3
list_1 + list_2 does it. Example:
>>> list_1 = [1,2,3,4]
>>> list_2 = [5,6,7,8]
>>> list_1 + list_2
[1, 2, 3, 4, 5, 6, 7, 8]
Solution 4
I tested several ways to merge two lists (see below) and arrived at the following ranking after running each one several times to even out caching effects (which account for about a 15% difference).
import time
c = list(range(1,10000000))
c_n = list(range(10000000, 20000000))
start = time.time()
# insert method here
print(time.time() - start)
Method 1:
c.extend(c_n)
- Representative result: 0.11861872673034668
Method 2:
c += c_n
- Representative result: 0.10558319091796875
Method 3:
c = c + c_n
- Representative result: 0.25804924964904785
Method 4:
c = [*c, *c_n]
- Representative result: 0.22019600868225098
Conclusion
Use += or .extend() if you want to merge in place. They are significantly faster.
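A single time.time() measurement is noisy, and the in-place methods change c between runs. A slightly more careful sketch, assuming much smaller lists and the standard timeit module, rebuilds the inputs before every measurement and reports the best of several repeats. The list sizes, the helper name best_time and the repeat count are assumptions for illustration; absolute numbers will differ between machines and Python versions.

```python
from timeit import repeat

# Recreate both lists before each timed run so that the in-place methods
# always start from the same input. The sizes are arbitrary.
SETUP = "c = list(range(100_000)); c_n = list(range(100_000, 200_000))"

STATEMENTS = (
    "c.extend(c_n)",
    "c += c_n",
    "c = c + c_n",
    "c = [*c, *c_n]",
)

def best_time(stmt, repeats=5):
    """Return the fastest of `repeats` single executions of stmt."""
    # number=1 makes timeit run SETUP again before every execution.
    return min(repeat(stmt, setup=SETUP, repeat=repeats, number=1))

results = {stmt: best_time(stmt) for stmt in STATEMENTS}
for stmt, seconds in sorted(results.items(), key=lambda item: item[1]):
    print(f"{stmt:<16} {seconds:.6f} s")
```

Taking the minimum rather than the mean is a common choice here, since slower runs mostly reflect interference from other processes.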
Naffi
Updated on June 14, 2021
Comments
-
Naffi almost 3 years
Given,
list_1 = [1,2,3,4]
list_2 = [5,6,7,8]
What is the fastest way to achieve the following in Python?
list = [1,2,3,4,5,6,7,8]
Please note that there can be many ways to merge two lists in Python.
I am looking for the most time-efficient way. I tried the following, and here is my understanding.
CODE
import time

c = list(range(1,10000000))
c_n = list(range(10000000, 20000000))

# 1. Concatenation with +
start = time.time()
c = c+c_n
print(len(c))
print(time.time() - start)

# 2. Appending element by element
c = list(range(1,10000000))
start = time.time()
for i in c_n:
    c.append(i)
print(len(c))
print(time.time() - start)

# 3. extend
c = list(range(1,10000000))
start = time.time()
c.extend(c_n)
print(len(c))
print(time.time() - start)
OUTPUT
19999999
0.125061035156
19999999
1.02858018875
19999999
0.03928399086
So, if you don't need to keep the original list_1 from the question, extend is the way to go. If both input lists must stay unchanged, "+" is the fastest way.
I am not sure about other options though.
-
hepcat72 almost 8 years
What happens if list_1 is empty (or if list_2 is empty or both)?
-
phant0m almost 8 years
@hepcat72 Nothing special, it will produce the correct result.
-
hepcat72 almost 8 years
What's the correct result? If list_1 is empty and list_2 contained a single string "a", would list contain [undef,"a"] or just ["a"]?
-
phant0m almost 8 years
undef? There is no such thing. [] + ["a"] == ["a"]. If you add a list with zero items and a list with one item, you'll end up with a list containing one item, naturally, not two.
-
hepcat72 almost 8 years
I'm a Python newb. Thanks for the explanation. Just trying to figure out how Python works.
-
phant0m almost 8 years
No problem. In the future, just try your own example in the Python console. "Try it and see."
-
Nick stands with Ukraine almost 5 years
Interesting, I got similar numbers to you for lists of 10k elements, but when I went up to lists with 100k elements, one was faster (at least using 3.6).
-
Mohit Solanki almost 5 years
Hmm, I just tried with 100k elements, and the second one is still faster.
-
Nick stands with Ukraine almost 5 years
I'm on 3.6 though, so that may be related.
-
Mohit Solanki almost 5 years
Yeah, one: 40.26142632599658, two: 29.216321185995184. This is the result for 1 million items; the second one is significantly faster.
-
Nick stands with Ukraine almost 5 years
Laptop is falling over trying to run it with 1 million items, will take your word for it :)
-
AJ Gayeta almost 4 years
What if you're trying to merge half of list_1 and half of list_2 into one list called list_3? Will + still work?
-
Almenon over 3 years
I tested it with Python 3.8 and confirmed Kiran's results. Note that method 2 uses the INPLACE_ADD bytecode instruction, meaning it operates in place (on the same memory), which is why it is faster than method 3, which uses BINARY_ADD.
-
Gathide over 2 years
What is the result when the two lists have a common value? Will the value be repeated in the resulting list?
-
Jonathan about 2 years
@Almenon how can I see the bytecode instructions being used?
-
Almenon about 2 years
@Jonathan see docs.python.org/3/library/dis.html
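For reference, a minimal sketch of using dis to compare the two forms. The function names in_place, new_list and add_instructions are illustrative. Opcode names depend on the interpreter version: the Python 3.8 mentioned above shows INPLACE_ADD and BINARY_ADD, while CPython 3.11 and later report a single BINARY_OP whose argument distinguishes += from +. The sketch therefore prints whatever the running interpreter produces rather than assuming a particular name.

```python
import dis

def in_place(c, c_n):
    c += c_n
    return c

def new_list(c, c_n):
    c = c + c_n
    return c

def add_instructions(func):
    """Return (opname, argrepr) pairs for the addition instructions in func."""
    return [
        (ins.opname, ins.argrepr)
        for ins in dis.get_instructions(func)
        if "ADD" in ins.opname or ins.opname == "BINARY_OP"
    ]

print("c += c_n    ->", add_instructions(in_place))
print("c = c + c_n ->", add_instructions(new_list))

# Full disassembly of one of the functions:
dis.dis(in_place)
```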