Can someone please explain this to me? This doesn\'t make any sense to me.
I copy a dictionary into another and edit the second and both are changed. Why is this hap
In Depth:
Whenever you do dict2 = dict1, dict2 refers to dict1. Both dict1 and dict2 points to the same location in the memory. This is just a normal case while working with mutable objects in python. When you are working with mutable objects in python you must be careful as it is hard to debug.
Instead of using dict2 = dict1, you should be using copy(shallow copy) and deepcopy method from python's copy module to separate dict2 from dict1.
The correct way is:
>>> dict1 = {"key1": "value1", "key2": "value2"}
>>> dict2 = dict1.copy()
>>> dict2
{'key1': 'value1', 'key2': 'value2'}
>>> dict2["key2"] = "WHY?"
>>> dict2
{'key1': 'value1', 'key2': 'WHY?'}
>>> dict1
{'key1': 'value1', 'key2': 'value2'}
>>> id(dict1)
140641178056312
>>> id(dict2)
140641176198960
>>>
As you can see the id of both dict1 and dict2 are different, which means both are pointing/referencing to different locations in the memory.
This solution works for dictionaries with immutable values, this is not the correct solution for those with mutable values.
Eg:
>>> import copy
>>> dict1 = {"key1" : "value1", "key2": {"mutable": True}}
>>> dict2 = dict1.copy()
>>> dict2
{'key1': 'value1', 'key2': {'mutable': True}}
>>> dict2["key2"]["mutable"] = False
>>> dict2
{'key1': 'value1', 'key2': {'mutable': False}}
>>> dict1
{'key1': 'value1', 'key2': {'mutable': False}}
>>> id(dict1)
140641197660704
>>> id(dict2)
140641196407832
>>> id(dict1["key2"])
140641176198960
>>> id(dict2["key2"])
140641176198960
You can see that even though we applied copy for dict1, the value of mutable is changed to false on both dict2 and dict1 even though we only change it on dict2. This is because we changed the value of a mutable dict part of the dict1. When we apply a copy on dict, it will only do a shallow copy which means it copies all the immutable values into a new dict and does not copy the mutable values but it will reference them.
The ultimate solution is to do a deepycopy of dict1 to completely create a new dict with all the values copied, including mutable values.
>>>import copy
>>> dict1 = {"key1" : "value1", "key2": {"mutable": True}}
>>> dict2 = copy.deepcopy(dict1)
>>> dict2
{'key1': 'value1', 'key2': {'mutable': True}}
>>> id(dict1)
140641196228824
>>> id(dict2)
140641197662072
>>> id(dict1["key2"])
140641178056312
>>> id(dict2["key2"])
140641197662000
>>> dict2["key2"]["mutable"] = False
>>> dict2
{'key1': 'value1', 'key2': {'mutable': False}}
>>> dict1
{'key1': 'value1', 'key2': {'mutable': True}}
As you can see, id's are different, it means that dict2 is completely a new dict with all the values in dict1.
Deepcopy needs to be used if whenever you want to change any of the mutable values without affecting the original dict. If not you can use shallow copy. Deepcopy is slow as it works recursively to copy any nested values in the original dict and also takes extra memory.
You can copy and edit the newly constructed copy in one go by calling the dict
constructor with additional keyword arguments:
>>> dict1 = {"key1": "value1", "key2": "value2"}
>>> dict2 = dict(dict1, key2="WHY?!")
>>> dict1
{'key2': 'value2', 'key1': 'value1'}
>>> dict2
{'key2': 'WHY?!', 'key1': 'value1'}
This confused me too, initially, because I was coming from a C background.
In C, a variable is a location in memory with a defined type. Assigning to a variable copies the data into the variable's memory location.
But in Python, variables act more like pointers to objects. So assigning one variable to another doesn't make a copy, it just makes that variable name point to the same object.
While dict.copy()
and dict(dict1)
generates a copy, they are only shallow copies. If you want a deep copy, copy.deepcopy(dict1)
is required. An example:
>>> source = {'a': 1, 'b': {'m': 4, 'n': 5, 'o': 6}, 'c': 3}
>>> copy1 = x.copy()
>>> copy2 = dict(x)
>>> import copy
>>> copy3 = copy.deepcopy(x)
>>> source['a'] = 10 # a change to first-level properties won't affect copies
>>> source
{'a': 10, 'c': 3, 'b': {'m': 4, 'o': 6, 'n': 5}}
>>> copy1
{'a': 1, 'c': 3, 'b': {'m': 4, 'o': 6, 'n': 5}}
>>> copy2
{'a': 1, 'c': 3, 'b': {'m': 4, 'o': 6, 'n': 5}}
>>> copy3
{'a': 1, 'c': 3, 'b': {'m': 4, 'o': 6, 'n': 5}}
>>> source['b']['m'] = 40 # a change to deep properties WILL affect shallow copies 'b.m' property
>>> source
{'a': 10, 'c': 3, 'b': {'m': 40, 'o': 6, 'n': 5}}
>>> copy1
{'a': 1, 'c': 3, 'b': {'m': 40, 'o': 6, 'n': 5}}
>>> copy2
{'a': 1, 'c': 3, 'b': {'m': 40, 'o': 6, 'n': 5}}
>>> copy3 # Deep copy's 'b.m' property is unaffected
{'a': 1, 'c': 3, 'b': {'m': 4, 'o': 6, 'n': 5}}
Regarding shallow vs deep copies, from the Python copy module docs:
The difference between shallow and deep copying is only relevant for compound objects (objects that contain other objects, like lists or class instances):
- A shallow copy constructs a new compound object and then (to the extent possible) inserts references into it to the objects found in the original.
- A deep copy constructs a new compound object and then, recursively, inserts copies into it of the objects found in the original.
i ran into a peculiar behavior when trying to deep copy dictionary property of class w/o assigning it to variable
new = copy.deepcopy(my_class.a)
doesn't work i.e. modifying new
modifies my_class.a
but if you do old = my_class.a
and then new = copy.deepcopy(old)
it works perfectly i.e. modifying new
does not affect my_class.a
I am not sure why this happens, but hope it helps save some hours! :)
You can use directly:
dict2 = eval(repr(dict1))
where object dict2 is an independent copy of dict1, so you can modify dict2 without affecting dict1.
This works for any kind of object.