How do I clone a list so that it doesn't change unexpectedly after assignment?
Solution 1
With new_list = my_list, you don't actually have two lists. The assignment just copies the reference to the list, not the actual list, so both new_list and my_list refer to the same list after the assignment.
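A minimal sketch of this aliasing, using the same names as above (the sample values are assumptions chosen for illustration):

```python
my_list = [1, 2, 3]   # sample values, assumed for illustration
new_list = my_list    # binds a second name to the same list object

new_list.append(4)    # mutate the list through one name...

print(my_list)              # ...and the change is visible through the other
print(new_list is my_list)  # True: both names refer to one object
```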
To actually copy the list, you have various possibilities:
- You can use the built-in list.copy() method (available since Python 3.3):
  new_list = old_list.copy()
- You can slice it:
  new_list = old_list[:]
  Alex Martelli's opinion (at least back in 2007) is that this is a weird syntax and that it never makes sense to use it. ;) (In his opinion, the next one is more readable.)
- You can use the built-in list() constructor:
  new_list = list(old_list)
- You can use the generic copy.copy():
  import copy
  new_list = copy.copy(old_list)
  This is a little slower than list() because it has to find out the datatype of old_list first.
- If the list contains objects and you want to copy them as well, use the generic copy.deepcopy():
  import copy
  new_list = copy.deepcopy(old_list)
  Obviously the slowest and most memory-hungry method, but sometimes unavoidable.
Example:
import copy

class Foo(object):
    def __init__(self, val):
        self.val = val

    def __repr__(self):
        return 'Foo({!r})'.format(self.val)

foo = Foo(1)

a = ['foo', foo]
b = a.copy()
c = a[:]
d = list(a)
e = copy.copy(a)
f = copy.deepcopy(a)

# edit original list and instance
a.append('baz')
foo.val = 5

print('original: %r\nlist.copy(): %r\nslice: %r\nlist(): %r\ncopy: %r\ndeepcopy: %r'
      % (a, b, c, d, e, f))
Result:
original: ['foo', Foo(5), 'baz']
list.copy(): ['foo', Foo(5)]
slice: ['foo', Foo(5)]
list(): ['foo', Foo(5)]
copy: ['foo', Foo(5)]
deepcopy: ['foo', Foo(1)]
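The result above follows from the difference between rebinding a slot in a copied list and mutating an object that both lists still share. A small sketch of that distinction, assuming a hypothetical Box class standing in for Foo:

```python
class Box:
    """Hypothetical stand-in for Foo: a mutable object holding one value."""
    def __init__(self, val):
        self.val = val

box = Box(1)
original = ['foo', box]
shallow = original.copy()

# Rebinding a slot only affects the list it is done on.
shallow[0] = 'bar'

# Mutating the shared object is visible through both lists.
shallow[1].val = 5

print(original[0], shallow[0])     # foo bar
print(original[1].val)             # 5
print(original[1] is shallow[1])   # True
```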
Solution 2
Felix already provided an excellent answer, but I thought I'd do a speed comparison of the various methods:
- 10.59 sec (105.9 µs/itn) - copy.deepcopy(old_list)
- 10.16 sec (101.6 µs/itn) - pure Python Copy() method copying classes with deepcopy
- 1.488 sec (14.88 µs/itn) - pure Python Copy() method not copying classes (only dicts/lists/tuples)
- 0.325 sec (3.25 µs/itn) - for item in old_list: new_list.append(item)
- 0.217 sec (2.17 µs/itn) - [i for i in old_list] (a list comprehension)
- 0.186 sec (1.86 µs/itn) - copy.copy(old_list)
- 0.075 sec (0.75 µs/itn) - list(old_list)
- 0.053 sec (0.53 µs/itn) - new_list = []; new_list.extend(old_list)
- 0.039 sec (0.39 µs/itn) - old_list[:] (list slicing)
So the fastest is list slicing. But be aware that copy.copy(), list[:] and list(list), unlike copy.deepcopy() and the pure Python version, don't copy any lists, dictionaries or class instances contained in the list, so if the originals change, they will change in the copied list too, and vice versa.
(Here's the script if anyone's interested or wants to raise any issues:)
from copy import deepcopy

class old_class:
    def __init__(self):
        self.blah = 'blah'

class new_class(object):
    def __init__(self):
        self.blah = 'blah'

dignore = {str: None, unicode: None, int: None, type(None): None}

def Copy(obj, use_deepcopy=True):
    t = type(obj)
    if t in (list, tuple):
        if t == tuple:
            # Convert to a list if a tuple to
            # allow assigning to when copying
            is_tuple = True
            obj = list(obj)
        else:
            # Otherwise just do a quick slice copy
            obj = obj[:]
            is_tuple = False
        # Copy each item recursively
        for x in xrange(len(obj)):
            if type(obj[x]) in dignore:
                continue
            obj[x] = Copy(obj[x], use_deepcopy)
        if is_tuple:
            # Convert back into a tuple again
            obj = tuple(obj)
    elif t == dict:
        # Use the fast shallow dict copy() method and copy any
        # values which aren't immutable (like lists, dicts etc)
        obj = obj.copy()
        for k in obj:
            if type(obj[k]) in dignore:
                continue
            obj[k] = Copy(obj[k], use_deepcopy)
    elif t in dignore:
        # Numeric or string/unicode?
        # It's immutable, so ignore it!
        pass
    elif use_deepcopy:
        obj = deepcopy(obj)
    return obj

if __name__ == '__main__':
    import copy
    from time import time

    num_times = 100000
    L = [None, 'blah', 1, 543.4532,
         ['foo'], ('bar',), {'blah': 'blah'},
         old_class(), new_class()]

    t = time()
    for i in xrange(num_times):
        Copy(L)
    print 'Custom Copy:', time()-t

    t = time()
    for i in xrange(num_times):
        Copy(L, use_deepcopy=False)
    print 'Custom Copy Only Copying Lists/Tuples/Dicts (no classes):', time()-t

    t = time()
    for i in xrange(num_times):
        copy.copy(L)
    print 'copy.copy:', time()-t

    t = time()
    for i in xrange(num_times):
        copy.deepcopy(L)
    print 'copy.deepcopy:', time()-t

    t = time()
    for i in xrange(num_times):
        L[:]
    print 'list slicing [:]:', time()-t

    t = time()
    for i in xrange(num_times):
        list(L)
    print 'list(L):', time()-t

    t = time()
    for i in xrange(num_times):
        [i for i in L]
    print 'list expression(L):', time()-t

    t = time()
    for i in xrange(num_times):
        a = []
        a.extend(L)
    print 'list extend:', time()-t

    t = time()
    for i in xrange(num_times):
        a = []
        for y in L:
            a.append(y)
    print 'list append:', time()-t

    t = time()
    for i in xrange(num_times):
        a = []
        a.extend(i for i in L)
    print 'generator expression extend:', time()-t
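The script above targets Python 2 (print statements, xrange, unicode). For readers on Python 3, a rough sketch of a comparable measurement with timeit might look like the following. The helper name time_copies, the list sizes and the repeat counts are assumptions chosen for illustration, and the timings it prints will differ from the figures quoted above.

```python
import copy
import timeit

def time_copies(size, number=500):
    """Time several shallow-copy approaches on a list of `size` ints.

    Hypothetical helper: returns a dict mapping a label to the best
    (minimum) time in seconds over three repeats of `number` calls.
    """
    old_list = list(range(size))
    candidates = {
        'old_list[:]': lambda: old_list[:],
        'old_list.copy()': lambda: old_list.copy(),
        'list(old_list)': lambda: list(old_list),
        'copy.copy(old_list)': lambda: copy.copy(old_list),
        '[i for i in old_list]': lambda: [i for i in old_list],
    }
    # Taking the minimum of several repeats reduces noise from other processes.
    return {
        label: min(timeit.repeat(func, number=number, repeat=3))
        for label, func in candidates.items()
    }

# Try a few list lengths, since tiny lists mostly measure call overhead.
for size in (10, 1_000, 10_000):
    print('size', size)
    for label, seconds in sorted(time_copies(size).items(), key=lambda kv: kv[1]):
        print('  {:<24} {:.6f} s'.format(label, seconds))
```

Only the relative ordering on a given machine and interpreter version is meaningful; the absolute numbers are not portable.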
Solution 3
I've been told that Python 3.3+ adds the list.copy() method, which should be as fast as slicing:
newlist = old_list.copy()
Solution 4
What are the options to clone or copy a list in Python?
In Python 3, a shallow copy can be made with:
a_copy = a_list.copy()
In Python 2 and 3, you can get a shallow copy with a full slice of the original:
a_copy = a_list[:]
Explanation
There are two semantically different ways to copy a list. A shallow copy creates a new list of the same objects; a deep copy creates a new list containing new, equivalent objects.
Shallow list copy
A shallow copy only copies the list itself, which is a container of references to the objects in the list. If the objects contained themselves are mutable and one is changed, the change will be reflected in both lists.
There are different ways to do this in Python 2 and 3. The Python 2 ways will also work in Python 3.
Python 2
In Python 2, the idiomatic way of making a shallow copy of a list is with a complete slice of the original:
a_copy = a_list[:]
You can also accomplish the same thing by passing the list through the list constructor,
a_copy = list(a_list)
but using the constructor is less efficient:
>>> import timeit
>>> l = range(20)
>>> min(timeit.repeat(lambda: l[:]))
0.30504298210144043
>>> min(timeit.repeat(lambda: list(l)))
0.40698814392089844
Python 3
In Python 3, lists get the list.copy method:
a_copy = a_list.copy()
In Python 3.5:
>>> import timeit
>>> l = list(range(20))
>>> min(timeit.repeat(lambda: l[:]))
0.38448613602668047
>>> min(timeit.repeat(lambda: list(l)))
0.6309100328944623
>>> min(timeit.repeat(lambda: l.copy()))
0.38122922903858125
Making another pointer does not make a copy
After new_list = my_list, new_list appears to change every time my_list changes. Why is this?
my_list is just a name that points to the actual list in memory. When you say new_list = my_list, you're not making a copy; you're just adding another name that points at that original list in memory. We can have similar issues when we make shallow copies of lists.
>>> l = [[], [], []]
>>> l_copy = l[:]
>>> l_copy
[[], [], []]
>>> l_copy[0].append('foo')
>>> l_copy
[['foo'], [], []]
>>> l
[['foo'], [], []]
The list is just an array of pointers to the contents, so a shallow copy just copies the pointers, and so you have two different lists, but they have the same contents. To make copies of the contents, you need a deep copy.
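The "same contents" claim can be checked directly with the is operator. A minimal sketch, reusing the names l and l_copy from the session above:

```python
l = [[], [], []]
l_copy = l[:]

print(l_copy is l)         # False: the outer lists are distinct objects
print(l_copy[0] is l[0])   # True: both outer lists point at the same inner list

# Every position in the shallow copy refers to the original element.
print(all(a is b for a, b in zip(l, l_copy)))  # True
```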
Deep copies
To make a deep copy of a list, in Python 2 or 3, use deepcopy in the copy module:
import copy
a_deep_copy = copy.deepcopy(a_list)
To demonstrate how this allows us to make new sub-lists:
>>> import copy
>>> l
[['foo'], [], []]
>>> l_deep_copy = copy.deepcopy(l)
>>> l_deep_copy[0].pop()
'foo'
>>> l_deep_copy
[[], [], []]
>>> l
[['foo'], [], []]
And so we see that the deep-copied list is an entirely different list from the original. You could roll your own function, but don't. You're likely to create bugs that you would have avoided by using the standard library's deepcopy function.
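As one illustration of what a hand-rolled function can easily get wrong, copy.deepcopy keeps a memo of objects it has already copied, so shared and self-referential structures survive the copy. A small sketch (the variable names are assumptions for illustration):

```python
import copy

inner = ['foo']
shared = [inner, inner]          # the same inner list appears twice
deep = copy.deepcopy(shared)

print(deep[0] is inner)          # False: the inner list was copied
print(deep[0] is deep[1])        # True: the sharing is preserved in the copy

recursive = []
recursive.append(recursive)      # a list that contains itself
deep_recursive = copy.deepcopy(recursive)   # terminates instead of recursing forever
print(deep_recursive[0] is deep_recursive)  # True
```

A naive recursive copy would duplicate the shared inner list and would never terminate on the self-referential one.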
Don't use eval
You may see this used as a way to deepcopy, but don't do it:
problematic_deep_copy = eval(repr(a_list))
- It's dangerous, particularly if you're evaluating something from a source you don't trust.
- It's not reliable, if a subelement you're copying doesn't have a representation that can be eval'd to reproduce an equivalent element.
- It's also less performant.
In 64 bit Python 2.7:
>>> import timeit
>>> import copy
>>> l = range(10)
>>> min(timeit.repeat(lambda: copy.deepcopy(l)))
27.55826997756958
>>> min(timeit.repeat(lambda: eval(repr(l))))
29.04534101486206
On 64-bit Python 3.5:
>>> import timeit
>>> import copy
>>> l = list(range(10))
>>> min(timeit.repeat(lambda: copy.deepcopy(l)))
16.84255409205798
>>> min(timeit.repeat(lambda: eval(repr(l))))
34.813894678023644
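The reliability point above can be seen with any object that keeps the default repr. A minimal sketch, assuming a hypothetical class named Plain:

```python
class Plain:
    """Hypothetical class that keeps the default object repr."""

a_list = [1, 'two', Plain()]
failure = None

try:
    problematic_deep_copy = eval(repr(a_list))
except SyntaxError as exc:
    # The default repr looks like <__main__.Plain object at 0x...>,
    # which is not valid Python source, so eval cannot rebuild the list.
    failure = exc
    print('eval(repr(a_list)) failed with', type(exc).__name__)
```

copy.deepcopy(a_list) handles the same list without any special support from the class.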
Solution 5
Let's start from the beginning and explore this question.
So let's suppose you have two lists:
list_1 = ['01', '98']
list_2 = [['01', '98']]
We want to copy both lists, starting with the first one.
First, let's try assigning our original list, list_1, to the variable copy:
copy = list_1
If you think copy is now a copy of list_1, you are wrong. The id function can show us whether two variables point to the same object. Let's try this:
print(id(copy))
print(id(list_1))
The output is:
4329485320
4329485320
Both variables refer to the exact same object. Are you surprised?
As we know, Python doesn't store anything in a variable. Variables just reference objects, and objects store the values. Here the object is a list, but we created two references to that same object under two different variable names. This means that both variables point to the same object, just with different names.
When you do copy = list_1, list_1 and copy are two variable names, but the object, a list, is the same for both.
So if you try to modify the copied list, you will modify the original list too, because there is only one list; you modify that list whether you go through copy or through list_1:
copy[0] = "modify"
print(copy)
print(list_1)
Output:
['modify', '98']
['modify', '98']
So it modified the original list.
Now let's move on to a Pythonic method for copying lists.
copy_1 = list_1[:]
This method fixes the first issue we had:
print(id(copy_1))
print(id(list_1))
4338792136
4338791432
As we can see, the two lists have different ids, which means that the two variables point to different objects.
Now let's try to modify the list and see if we still face the previous problem:
copy_1[0] = "modify"
print(list_1)
print(copy_1)
The output is:
['01', '98']
['modify', '98']
As you can see, it only modified the copied list. That means it worked.
Do you think we're done? No. Let's try to copy our nested list.
copy_2 = list_2[:]
copy_2 should now reference another object, which is a copy of list_2. Let's check:
print(id((list_2)), id(copy_2))
We get the output:
4330403592 4330403528
Now we can assume the two lists point to different objects, so let's try to modify one and see whether it gives us what we want:
copy_2[0][1] = "modify"
print(list_2, copy_2)
This gives us the output:
[['01', 'modify']] [['01', 'modify']]
This may seem a little bit confusing, because the same method we previously used worked. Let's try to understand this.
When you do:
copy_2 = list_2[:]
You're only copying the outer list, not the inner list. We can use the id function once again to check this.
print(id(copy_2[0]))
print(id(list_2[0]))
The output is:
4329485832
4329485832
When we do copy_2 = list_2[:], it creates a copy of the list, but only of the outer list, not of the nested list. The nested list is the same object for both variables, so if you modify the nested list through the copy, the change shows up in the original list too.
What is the solution? The solution is the deepcopy function.
from copy import deepcopy
deep = deepcopy(list_2)
Let's check this:
print(id((list_2)), id(deep))
4322146056 4322148040
Both outer lists have different IDs. Let's try this on the inner nested lists.
print(id(deep[0]))
print(id(list_2[0]))
The output is:
4322145992
4322145800
As you can see, the two IDs are different, meaning that the nested lists are now different objects.
This means that when you do deep = deepcopy(list_2), each outer list gets its own separate copy of the nested list.
Now let's try to modify the nested list and see whether that solved the previous issue:
deep[0][1] = "modify"
print(list_2, deep)
It outputs:
[['01', '98']] [['01', 'modify']]
As you can see, it didn't modify the original nested list; it only modified the copied list.
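For the specific case walked through here, a list whose elements are all flat lists of immutable items, copying one level down is also enough. A minimal sketch under that assumption (the name copy_3 is hypothetical; deepcopy remains the general answer for arbitrary nesting):

```python
list_2 = [['01', '98']]

# Copy the outer list and each inner list. This is sufficient only because
# the inner lists hold immutable items (strings here).
copy_3 = [inner[:] for inner in list_2]

copy_3[0][1] = "modify"

print(list_2, copy_3)
print(copy_3[0] is list_2[0])
```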
Comments
-
aF. almost 2 years
While using new_list = my_list, any modifications to new_list change my_list every time. Why is this, and how can I clone or copy the list to prevent it?
-
Andrew over 2 years
new_list = my_list just assigns the name new_list to the object my_list refers to.
-
Bharel over 2 years
See the Python FAQ.
-
PM 2Ring almost 9 years
This won't always work, since there's no guarantee that the string returned by repr() is sufficient to re-create the object. Also, eval() is a tool of last resort; see Eval really is dangerous by SO veteran Ned Batchelder for details. So when you advocate the use of eval() you really should mention that it can be dangerous.
-
AMR almost 9 years
Fair point. Though I think that Batchelder's point is that having the eval() function in Python in general is a risk. It isn't so much whether or not you make use of the function in code but that it is a security hole in Python in and of itself. My example isn't using it with a function that receives input from input(), sys.argv, or even a text file. It is more along the lines of initializing a blank multidimensional list once, and then just having a way of copying it in a loop instead of reinitializing at each iteration of the loop.
-
AMR almost 9 years
As @AaronHall has pointed out, there is likely a significant performance issue with using new_list = eval(repr(old_list)), so besides it being a bad idea, it probably is also way too slow to work.
-
not2qubit over 5 years
How does this method behave when modifying copies?
-
SCB over 5 years
@not2qubit do you mean appending to or editing elements of the new list? In the example, old_list and new_list are two different lists; editing one will not change the other (unless you're directly mutating the elements themselves, such as in a list of lists; none of these methods are deep copies).
-
CyberMew over 5 years
Yes, and as per the docs (docs.python.org/3/library/stdtypes.html#mutable-sequence-types), s.copy() creates a shallow copy of s (same as s[:]).
-
John Locke over 5 years
You don't need a deepcopy if the list is 2D. If it is a list of lists, and those lists don't have lists inside of them, you can use a for loop. Presently, I am using
list_copy=[]
for item in list: list_copy.append(copy(item))
and it is much faster.
-
SuperShoot about 4 years
Can confirm still a similar story on 3.8. b=[*a] - the one obvious way to do it ;).
-
loved.by.Jesus about 4 years
Actually it seems that currently, in Python 3.8, .copy() is slightly faster than slicing. See @AaronsHall's answer below.
-
Jean-François Fabre over 3 years
deepcopy must be used only when needed and one should be aware of what it really does.
-
ekhumoro over 3 years
Some of these timing comparisons aren't particularly meaningful when copying such tiny lists. It would be more informative to test with a range of list lengths (including some very large ones).
-
ShadowRanger over 3 years
@loved.by.Jesus: Yeah, they added optimizations for Python level method calls in 3.7 that were extended to C extension method calls in 3.8 by PEP 590 that remove the overhead of creating a bound method each time you call a method, so the cost to call alist.copy() is now a dict lookup on the list type, then a relatively cheap no-arg function call that ultimately invokes the same thing as slicing. Slicing still has to build a slice object, then go through type checks and unpacking to do the same thing.
-
ShadowRanger over 3 years
Of course, they're working on optimizing out the repeated builds of constant slices, so in 3.10 slicing might win again. It's all pretty meaningless though; the asymptotic performance is identical, and the fixed overhead relatively small, so it doesn't really matter which approach you use.
-
moojen over 3 years
As @Georgy points out correctly in the answer below, any changes to the new_list values will also change the values in my_list. So actually the copy.deepcopy() method is the only real copy without reference to the original list and its values.
-
moojen over 3 years
You're right, it was edited by you, but posted by @cryo. Sorry for the mixup!
-
uuu777 about 3 years
Which one is fastest?
-
uuu777 about 3 years
Does it mean that append and list comprehension are the best options?
-
uuu777 about 3 years
I have a cache containing a list of classes. I want to take the lock, copy out the list, and release the lock. I hope that it is enough to use the built-in copy to protect the copied-out list from changing when the cached copy is changed.
-
Tom almost 3 years
I was having the same issue with a list of json (each element of a list was a json) and the only one that worked was new_list = copy.deepcopy(old_list); I'm writing this since anyone can encounter the same issue. Thanks!
-
Peter Mortensen almost 3 years
The timing numbers ought to be rounded to the appropriate number of significant digits. 15 significant digits do not make any sense.
-
River almost 3 years
I've essentially just pasted the raw output of the timing code here. Seems like your gripe is more about how timeit displays timings, which I have little control over.
-
Raf over 2 years
+1 for slicing [:] it is a simple and compact syntax and it does make sense to use it every time you need to copy a list and can avoid a deepcopy
-
Klim Yadrintsev over 2 years
I keep on coming back to this answer to make sure that I am using the most efficient method. What is the easiest way to test this? Or is there a database with all of the best ways to minimise run time?
-
2pichar about 2 years
Technically, my_list[:] is a shallow copy. The only way to deepcopy a list is using copy.deepcopy()
-
Asaga almost 2 years
+4 for slicing [:] it is simple, compact, faster, and more powerful: one can copy a slice of a list [start:stop], which other methods cannot do.