The += operator in Python seems to be operating unexpectedly on lists. Can anyone tell me what is going on here?
class foo:
    bar = []
>>> elements=[[1],[2],[3]]
>>> subset=[]
>>> subset+=elements[0:1]
>>> subset
[[1]]
>>> elements
[[1], [2], [3]]
>>> subset[0][0]='change'
>>> elements
[['change'], [2], [3]]
>>> a=[1,2,3,4]
>>> b=a
>>> a+=[5]
>>> a,b
([1, 2, 3, 4, 5], [1, 2, 3, 4, 5])
>>> a=[1,2,3,4]
>>> b=a
>>> a=a+[5]
>>> a,b
([1, 2, 3, 4, 5], [1, 2, 3, 4])
The problem here is that bar is defined as a class attribute, not an instance variable.
In foo, the class attribute is modified in the __init__ method, which is why all instances are affected.
In foo2, an instance variable is defined using the (empty) class attribute, and every instance gets its own bar.
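A minimal sketch of the two variants discussed here, assuming (as the later answers describe) that foo uses += and foo2 uses + inside __init__. It is written for Python 3, so print is a function:

```python
class foo:
    bar = []                        # class attribute, shared by all instances

    def __init__(self, x):
        self.bar += [x]             # extends the shared list in place


class foo2:
    bar = []                        # class attribute, never mutated below

    def __init__(self, x):
        self.bar = self.bar + [x]   # builds a new list and binds it on the instance


f, g = foo(1), foo(2)
f2, g2 = foo2(1), foo2(2)

print(f.bar, g.bar)        # [1, 2] [1, 2]
print(f2.bar, g2.bar)      # [1] [2]
print(foo.bar, foo2.bar)   # [1, 2] []
```

Both foo instances end up looking at the one list stored on the class, while each foo2 instance holds its own list and the class attribute stays empty.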
The "correct" implementation would be:
class foo:
    def __init__(self, x):
        self.bar = [x]
Of course, class attributes are completely legal. In fact, you can access and modify them without creating an instance of the class like this:
class foo:
    bar = []

foo.bar = [x]
The other answers would seem to pretty much have it covered, though it seems worth quoting and referring to the Augmented Assignments PEP 203:
They [the augmented assignment operators] implement the same operator as their normal binary form, except that the operation is done `in-place' when the left-hand side object supports it, and that the left-hand side is only evaluated once.
...
The idea behind augmented assignment in Python is that it isn't just an easier way to write the common practice of storing the result of a binary operation in its left-hand operand, but also a way for the left-hand operand in question to know that it should operate `on itself', rather than creating a modified copy of itself.
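To make the quoted wording concrete, here is a small sketch with two hypothetical classes (InPlace and CopyOnly are made-up names, not part of the question). One defines __iadd__ and so operates "on itself"; the other only defines __add__, so += falls back to building a new object and rebinding the name:

```python
class InPlace:
    """Hypothetical container that supports in-place addition."""

    def __init__(self, items):
        self.items = list(items)

    def __add__(self, other):
        return InPlace(self.items + list(other))

    def __iadd__(self, other):
        self.items.extend(other)   # mutate self
        return self                # += rebinds the name to this same object


class CopyOnly:
    """Hypothetical container without __iadd__; += falls back to __add__."""

    def __init__(self, items):
        self.items = list(items)

    def __add__(self, other):
        return CopyOnly(self.items + list(other))


a = InPlace([1])
a_alias = a
a += [2]

b = CopyOnly([1])
b_alias = b
b += [2]

print(a is a_alias, a_alias.items)   # True [1, 2]
print(b is b_alias, b_alias.items)   # False [1]
```

Lists behave like InPlace here, which is why an alias of a list sees the effect of +=.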
Although much time has passed and many correct things have been said, there is no answer that covers both effects.
You have 2 effects:
1. the in-place behaviour of += on lists (as stated by Scott Griffiths)
2. the fact that both class attributes and instance attributes are involved
In class foo, the __init__ method modifies the class attribute. This is because self.bar += [x] translates to self.bar = self.bar.__iadd__([x]). __iadd__() is for in-place modification, so it modifies the list and returns a reference to it.
Note that the instance dict is modified as well, although this would normally not be necessary because the class dict already contains the same binding. So this detail goes almost unnoticed, unless you do a foo.bar = [] afterwards. In that case the instance's bar stays the same, because the instance holds its own reference to the list.
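The instance-dict detail can be checked directly. This is a sketch in Python 3, assuming foo is defined with += in __init__ as described above:

```python
class foo:
    bar = []

    def __init__(self, x):
        self.bar += [x]    # same as: self.bar = self.bar.__iadd__([x])


f = foo(1)

print("bar" in vars(f))    # True
print(f.bar is foo.bar)    # True

foo.bar = []               # rebind the class attribute to a fresh list

print(f.bar)               # [1]
print(foo.bar)             # []
```

vars(f) is the instance dict; the entry it gained from += is what keeps f.bar pointing at the old list after the class attribute is rebound.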
In class foo2, however, the class's bar is used, but not touched. Instead, [x] is added to it, forming a new object, because self.bar.__add__([x]) is called here, which doesn't modify the object. The result is then put into the instance dict, giving the instance the new list as its own attribute, while the class's attribute stays unmodified.
The distinction between ... = ... + ... and ... += ... also affects later assignments:
f = foo(1) # adds 1 to the class's bar and assigns f.bar to this as well.
g = foo(2) # adds 2 to the class's bar and assigns g.bar to this as well.
# Here, foo.bar, f.bar and g.bar refer to the same object.
print f.bar # [1, 2]
print g.bar # [1, 2]
f.bar += [3] # adds 3 to this object
print f.bar # As these still refer to the same object,
print g.bar # the output is the same.
f.bar = f.bar + [4] # Construct a new list with the values of the old ones, 4 appended.
print f.bar # Print the new one
print g.bar # Print the old one.
f = foo2(1) # Here a new list is created on every call.
g = foo2(2)
print f.bar # So these all only have one element.
print g.bar
You can verify the identity of the objects with print id(foo.bar), id(f.bar), id(g.bar) (don't forget the additional ()s if you are on Python 3).
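The identity check can also be written with the is operator. The following Python 3 sketch replays the foo part of the walkthrough above, assuming foo uses += in __init__; the variable names same_after_init, same_after_iadd and same_after_add are illustrative only:

```python
class foo:
    bar = []

    def __init__(self, x):
        self.bar += [x]


f = foo(1)
g = foo(2)

# After construction, all three names refer to one list object.
same_after_init = id(foo.bar) == id(f.bar) == id(g.bar)

f.bar += [3]                 # in place: g.bar sees the 3 as well
same_after_iadd = f.bar is g.bar

f.bar = f.bar + [4]          # new list: f.bar now refers to a different object
same_after_add = f.bar is g.bar

print(same_after_init, same_after_iadd, same_after_add)   # True True False
print(f.bar, g.bar)          # [1, 2, 3, 4] [1, 2, 3]
```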
BTW: The += operator is called "augmented assignment" and is generally intended to do in-place modifications as far as possible.
>>> a = 89
>>> id(a)
4434330504
>>> a = 89 + 1
>>> print(a)
90
>>> id(a)
4430689552 # this is different from before!
>>> test = [1, 2, 3]
>>> id(test)
48638344L
>>> test2 = test
>>> id(test)
48638344L
>>> test2 += [4]
>>> id(test)
48638344L
>>> print(test, test2)
([1, 2, 3, 4], [1, 2, 3, 4])
>>> id(test2)
48638344L # ID is the same here: the list was modified in place
We see that when we attempt to modify an immutable object (an integer in this case), Python simply gives us a different object instead. On the other hand, we are able to make changes to a mutable object (a list) and have it remain the same object throughout.
ref : https://medium.com/@tyastropheus/tricky-python-i-memory-management-for-mutable-immutable-objects-21507d1e5b95
Also see the URL below to understand shallow copy and deep copy:
https://www.geeksforgeeks.org/copy-python-deep-copy-shallow-copy/
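The immutable-versus-mutable contrast described above can be checked without comparing raw id values. This is a sketch in Python 3; the tuple case is an extra illustration that is not in the sessions above:

```python
# Immutable: += on an int rebinds the name; the alias keeps the old value.
n = 89
n_alias = n
n += 1
print(n, n_alias)            # 90 89

# Immutable: += on a tuple also builds a new object.
t = (1, 2)
t_alias = t
t += (3,)
print(t is t_alias)          # False

# Mutable: += on a list changes the one object both names refer to.
lst = [1, 2, 3]
lst_alias = lst
lst += [4]
print(lst is lst_alias)      # True
print(lst_alias)             # [1, 2, 3, 4]
```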
There are two things involved here:
1. class attributes and instance attributes
2. difference between the operators + and += for lists
The + operator calls the __add__ method on a list. It takes all the elements from its operands and makes a new list containing those elements, maintaining their order.
The += operator calls the __iadd__ method on the list. It takes an iterable and appends all the elements of the iterable to the list in place. It does not create a new list object.
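One visible consequence of the difference just described: += accepts any iterable, whereas + only accepts another list. This is a short Python 3 sketch; plus_failed is just an illustrative flag:

```python
a = [1, 2]
alias = a

a += (3, 4)      # a tuple works: its elements are appended in place
a += "xy"        # so does a string, one character per element

print(a)             # [1, 2, 3, 4, 'x', 'y']
print(alias is a)    # True

plus_failed = False
try:
    a + (5,)         # list + tuple is not supported
except TypeError:
    plus_failed = True

print(plus_failed)   # True
```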
In class foo, the statement self.bar += [x] is not a plain assignment statement but actually translates to
self.bar = self.bar.__iadd__([x]) # modifies the class attribute in place
which modifies the list in place and acts like the list method extend.
In class foo2, on the contrary, the assignment statement in the __init__ method
self.bar = self.bar + [x]
can be deconstructed as follows:
The instance has no attribute bar (there is a class attribute of the same name, though), so it accesses the class attribute bar and creates a new list from its elements plus x. The statement translates to:
self.bar = self.bar.__add__([x]) # bar on the rhs is the class attribute
Then it creates an instance attribute bar and assigns the newly created list to it. Note that bar on the rhs of the assignment is different from the bar on the lhs.
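The lookup-then-bind sequence described above can be watched step by step. This Python 3 sketch uses a hypothetical class Demo (a made-up name) with no __init__, so the assignment can be done by hand:

```python
class Demo:
    bar = []                   # class attribute only


obj = Demo()

# Before the assignment, the instance has no bar of its own,
# so obj.bar falls back to the class attribute.
print("bar" in vars(obj))      # False
print(obj.bar is Demo.bar)     # True

# rhs: reads Demo.bar and builds a new list; lhs: binds it on the instance.
obj.bar = obj.bar + [4]

print(vars(obj))               # {'bar': [4]}
print(Demo.bar)                # []
```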
For instances of class foo, bar refers to the class attribute's list, not to a separate list per instance. Hence any in-place change to the class attribute bar will be reflected in all instances.
On the contrary, each instance of the class foo2 has its own instance attribute bar, which is different from the class attribute of the same name.
f = foo2(4)
print f.bar # accessing the instance attribute. prints [4]
print f.__class__.bar # accessing the class attribute. prints []
Hope this clears things up.