What is the difference between shallow copy, deepcopy and normal assignment operation?

114,942

Solution 1

Normal assignment operations will simply point the new variable towards the existing object. The docs explain the difference between shallow and deep copies:

The difference between shallow and deep copying is only relevant for compound objects (objects that contain other objects, like lists or class instances):

  • A shallow copy constructs a new compound object and then (to the extent possible) inserts references into it to the objects found in the original.

  • A deep copy constructs a new compound object and then, recursively, inserts copies into it of the objects found in the original.

Here's a little demonstration:

import copy

a = [1, 2, 3]
b = [4, 5, 6]
c = [a, b]

Using normal assignment operatings to copy:

d = c

print id(c) == id(d)          # True - d is the same object as c
print id(c[0]) == id(d[0])    # True - d[0] is the same object as c[0]

Using a shallow copy:

d = copy.copy(c)

print id(c) == id(d)          # False - d is now a new object
print id(c[0]) == id(d[0])    # True - d[0] is the same object as c[0]

Using a deep copy:

d = copy.deepcopy(c)

print id(c) == id(d)          # False - d is now a new object
print id(c[0]) == id(d[0])    # False - d[0] is now a new object

Solution 2

For immutable objects, there is no need for copying because the data will never change, so Python uses the same data; ids are always the same. For mutable objects, since they can potentially change, [shallow] copy creates a new object.

Deep copy is related to nested structures. If you have list of lists, then deepcopy copies the nested lists also, so it is a recursive copy. With just copy, you have a new outer list, but inner lists are references.

Assignment does not copy. It simply sets the reference to the old data. So you need copy to create a new list with the same contents.

Solution 3

For immutable objects, creating a copy don't make much sense since they are not going to change. For mutable objects assignment,copy and deepcopy behaves differently. Lets talk about each of them with examples.

An assignment operation simply assigns the reference of source to destination e.g:

>>> i = [1,2,3]
>>> j=i
>>> hex(id(i)), hex(id(j))
>>> ('0x10296f908', '0x10296f908') #Both addresses are identical

Now i and j technically refers to same list. Both i and j have same memory address. Any updation to either of them will be reflected to the other. e.g:

>>> i.append(4)
>>> j
>>> [1,2,3,4] #Destination is updated

>>> j.append(5)
>>> i
>>> [1,2,3,4,5] #Source is updated

On the other hand copy and deepcopy creates a new copy of variable. So now changes to original variable will not be reflected to the copy variable and vice versa. However copy(shallow copy), don't creates a copy of nested objects, instead it just copies the reference of nested objects. Deepcopy copies all the nested objects recursively.

Some examples to demonstrate behaviour of copy and deepcopy:

Flat list example using copy:

>>> import copy
>>> i = [1,2,3]
>>> j = copy.copy(i)
>>> hex(id(i)), hex(id(j))
>>> ('0x102b9b7c8', '0x102971cc8') #Both addresses are different

>>> i.append(4)
>>> j
>>> [1,2,3] #Updation of original list didn't affected copied variable

Nested list example using copy:

>>> import copy
>>> i = [1,2,3,[4,5]]
>>> j = copy.copy(i)

>>> hex(id(i)), hex(id(j))
>>> ('0x102b9b7c8', '0x102971cc8') #Both addresses are still different

>>> hex(id(i[3])), hex(id(j[3]))
>>> ('0x10296f908', '0x10296f908') #Nested lists have same address

>>> i[3].append(6)
>>> j
>>> [1,2,3,[4,5,6]] #Updation of original nested list updated the copy as well

Flat list example using deepcopy:

>>> import copy
>>> i = [1,2,3]
>>> j = copy.deepcopy(i)
>>> hex(id(i)), hex(id(j))
>>> ('0x102b9b7c8', '0x102971cc8') #Both addresses are different

>>> i.append(4)
>>> j
>>> [1,2,3] #Updation of original list didn't affected copied variable

Nested list example using deepcopy:

>>> import copy
>>> i = [1,2,3,[4,5]]
>>> j = copy.deepcopy(i)

>>> hex(id(i)), hex(id(j))
>>> ('0x102b9b7c8', '0x102971cc8') #Both addresses are still different

>>> hex(id(i[3])), hex(id(j[3]))
>>> ('0x10296f908', '0x102b9b7c8') #Nested lists have different addresses

>>> i[3].append(6)
>>> j
>>> [1,2,3,[4,5]] #Updation of original nested list didn't affected the copied variable    

Solution 4

Let's see in a graphical example how the following code is executed:

import copy

class Foo(object):
    def __init__(self):
        pass


a = [Foo(), Foo()]
shallow = copy.copy(a)
deep = copy.deepcopy(a)

enter image description here

Solution 5

a, b, c, d, a1, b1, c1 and d1 are references to objects in memory, which are uniquely identified by their ids.

An assignment operation takes a reference to the object in memory and assigns that reference to a new name. c=[1,2,3,4] is an assignment that creates a new list object containing those four integers, and assigns the reference to that object to c. c1=c is an assignment that takes the same reference to the same object and assigns that to c1. Since the list is mutable, anything that happens to that list will be visible regardless of whether you access it through c or c1, because they both reference the same object.

c1=copy.copy(c) is a "shallow copy" that creates a new list and assigns the reference to the new list to c1. c still points to the original list. So, if you modify the list at c1, the list that c refers to will not change.

The concept of copying is irrelevant to immutable objects like integers and strings. Since you can't modify those objects, there is never a need to have two copies of the same value in memory at different locations. So integers and strings, and some other objects to which the concept of copying does not apply, are simply reassigned. This is why your examples with a and b result in identical ids.

c1=copy.deepcopy(c) is a "deep copy", but it functions the same as a shallow copy in this example. Deep copies differ from shallow copies in that shallow copies will make a new copy of the object itself, but any references inside that object will not themselves be copied. In your example, your list has only integers inside it (which are immutable), and as previously discussed there is no need to copy those. So the "deep" part of the deep copy does not apply. However, consider this more complex list:

e = [[1, 2],[4, 5, 6],[7, 8, 9]]

This is a list that contains other lists (you could also describe it as a two-dimensional array).

If you run a "shallow copy" on e, copying it to e1, you will find that the id of the list changes, but each copy of the list contains references to the same three lists -- the lists with integers inside. That means that if you were to do e[0].append(3), then e would be [[1, 2, 3],[4, 5, 6],[7, 8, 9]]. But e1 would also be [[1, 2, 3],[4, 5, 6],[7, 8, 9]]. On the other hand, if you subsequently did e.append([10, 11, 12]), e would be [[1, 2, 3],[4, 5, 6],[7, 8, 9],[10, 11, 12]]. But e1 would still be [[1, 2, 3],[4, 5, 6],[7, 8, 9]]. That's because the outer lists are separate objects that initially each contain three references to three inner lists. If you modify the inner lists, you can see those changes no matter if you are viewing them through one copy or the other. But if you modify one of the outer lists as above, then e contains three references to the original three lists plus one more reference to a new list. And e1 still only contains the original three references.

A 'deep copy' would not only duplicate the outer list, but it would also go inside the lists and duplicate the inner lists, so that the two resulting objects do not contain any of the same references (as far as mutable objects are concerned). If the inner lists had further lists (or other objects such as dictionaries) inside of them, they too would be duplicated. That's the 'deep' part of the 'deep copy'.

Share:
114,942

Related videos on Youtube

deeshank
Author by

deeshank

learn, code, make, think !

Updated on October 20, 2021

Comments

  • deeshank
    deeshank over 2 years
    import copy
    
    a = "deepak"
    b = 1, 2, 3, 4
    c = [1, 2, 3, 4]
    d = {1: 10, 2: 20, 3: 30}
    
    a1 = copy.copy(a)
    b1 = copy.copy(b)
    c1 = copy.copy(c)
    d1 = copy.copy(d)
    
    
    print("immutable - id(a)==id(a1)", id(a) == id(a1))
    print("immutable - id(b)==id(b1)", id(b) == id(b1))
    print("mutable - id(c)==id(c1)", id(c) == id(c1))
    print("mutable - id(d)==id(d1)", id(d) == id(d1))
    

    I get the following results:

    immutable - id(a)==id(a1) True
    immutable - id(b)==id(b1) True
    mutable - id(c)==id(c1) False
    mutable - id(d)==id(d1) False
    

    If I perform deepcopy:

    a1 = copy.deepcopy(a)
    b1 = copy.deepcopy(b)
    c1 = copy.deepcopy(c)
    d1 = copy.deepcopy(d)
    

    results are the same:

    immutable - id(a)==id(a1) True
    immutable - id(b)==id(b1) True
    mutable - id(c)==id(c1) False
    mutable - id(d)==id(d1) False
    

    If I work on assignment operations:

    a1 = a
    b1 = b
    c1 = c
    d1 = d
    

    then results are:

    immutable - id(a)==id(a1) True
    immutable - id(b)==id(b1) True
    mutable - id(c)==id(c1) True
    mutable - id(d)==id(d1) True
    

    Can somebody explain what exactly makes a difference between the copies? Is it something related to mutable & immutable objects? If so, can you please explain it to me?

  • deeshank
    deeshank almost 11 years
    is assginment is same as shallow copy?
  • grc
    grc almost 11 years
    @Dshank No. A shallow copy constructs a new object, while an assignment will simply point the new variable at the existing object. Any changes to the existing object will affect both variables (with assignment).
  • Neerav
    Neerav about 10 years
    @grc "Any changes to the existing object will affect both variables (with assignment)" - this statement is true only for mutable objects and not immutable types like string, float, tuples.
  • Alston
    Alston over 7 years
    @grc A shallow copy constructs a new compound object and then (to the extent possible) inserts references into it to the objects found in the original. Do you mean that as I do shallow copy to mutable object, the new object might be influenced by the original copied one?
  • Alston
    Alston over 7 years
    With just copy, you have a new outer list but inner lists are references. For the inner lists, would the copied one influenced by original one? I create a list of lists like list_=[[1,2],[3,4]] newlist = list_.copy() list_[0]=[7,8] and the newlist remains the same, so does the inner list are references?
  • grc
    grc over 7 years
    @Stallman If you shallow copy a compound object containing mutable objects, then the mutable objects only exist once. All changes to the mutable objects will be reflected in both the original compound object and the copied compound object.
  • Alston
    Alston over 7 years
    @grc But I have tried an example(I remove the new line here.) list_=[[1,2],[3,4]] newlist = list_.copy() list_[0]=[7,8] print(list_) print(newlist) The newlist still display [[1, 2], [3, 4]]. But list_[0] is a list which is mutable.
  • grc
    grc over 7 years
    @Stallman list_[0] is mutable, but you are not mutating/modifying it. Try list_[0].append(9) or list_[0][0] = 7 instead.
  • Alston
    Alston over 7 years
    @grc Oh, what I do is to change the reference to another list [7,8], but the real modification should change the element at the address list_[0]. Is this opinion correct?
  • perreal
    perreal over 7 years
    @Stallman you are not changing the referenced list here, just creating a new list and assigning it as the first item of one of the copies. try doing list_[0][0] = 7
  • Lee88
    Lee88 about 7 years
    @grc This has been a very helpful discussion, but I have a more "philosophical" question. I think it is fair to say most novice programmers intuitively expect = to accomplish deep copying. Could you shed light on why shallow copying has value? What situation would we not want deep copies? (And forgive me if my assumption that deep copying is what most people want most of the time is wrong)
  • Georgy
    Georgy over 4 years
    a is not a deepcopy of lst!
  • InfiniteLooper
    InfiniteLooper over 3 years
    To be clear, would it be better to describe deep copies as follows : A deep copy constructs a new compound object and then, recursively, inserts DEEP copies into it of the objects found in the original. ?
  • Union find
    Union find over 3 years
    how did you generate this?
  • user1767754
    user1767754 over 3 years
  • Tony Montana
    Tony Montana over 3 years
    I think one thing you got it wrong that even you deepcopy the object, still the child element has the same id. You may find more info here: pythontutor.com/…
  • Gino Mempin
    Gino Mempin about 3 years
    It's already mentioned and explained in the other answers.
  • user2357112
    user2357112 about 3 years
    @Neerav: It's true for immutables as well. Any changes to an immutable object will show up through both variables, because you can't change an immutable object - the statement is vacuously true for immutables.