Question
I have a class that overrides __eq__ and __hash__ so that its objects can act as dictionary keys. Each object also carries a dictionary keyed by other objects of the same class. I get a weird AttributeError when I try to deepcopy the whole structure. I am using Python 3.6.0 on OS X.
From the Python docs it looks as if deepcopy uses a memo dictionary to cache the objects it has already copied, so nested structures should not be a problem. What am I doing wrong? Should I write my own __deepcopy__ method to work around this? If so, how?
from copy import deepcopy

class Node:
    def __init__(self, p_id):
        self.id = p_id
        self.edge_dict = {}
        self.degree = 0

    def __eq__(self, other):
        return self.id == other.id

    def __hash__(self):
        return hash(self.id)

    def add_edge(self, p_node, p_data):
        if p_node not in self.edge_dict:
            self.edge_dict[p_node] = p_data
            self.degree += 1
            return True
        else:
            return False

if __name__ == '__main__':
    node1 = Node(1)
    node2 = Node(2)
    node1.add_edge(node2, "1->2")
    node2.add_edge(node1, "2->1")
    node1_copy = deepcopy(node1)
  File ".../node_test.py", line 15, in __hash__
    return hash(self.id)
AttributeError: 'Node' object has no attribute 'id'
Answer 1:
Cyclic dependencies are a problem for deepcopy when you:
- Have classes that must be hashed and contain reference cycles, and
- Don't ensure that hash-related (and equality-related) invariants are established at object construction (__new__), rather than only at initialization (__init__)
The problem is that deepcopy, by default, copies custom objects through the same reduce/reconstruct protocol that pickle uses (unless a special __deepcopy__ method is defined): it creates the empty object without initializing it, then tries to fill in its attributes. When it tries to fill in the attributes of the copy of node1, it needs to copy node2, which in turn relies on the partially created copy of node1 (in both cases because of the edge_dict). At the time it is trying to fill in the edge_dict for one Node, the Node it is adding as a key doesn't have its id attribute set yet, so the attempt to hash it fails.
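The failure can be reproduced in isolation, without deepcopy, by allocating an instance the way the copy machinery does (skipping __init__) and then using it as a dictionary key. This is only a minimal sketch: it uses the Node class from the question trimmed to the relevant methods, and the names empty and error_message are introduced here for illustration.

```python
class Node:
    def __init__(self, p_id):
        self.id = p_id
        self.edge_dict = {}

    def __eq__(self, other):
        return self.id == other.id

    def __hash__(self):
        return hash(self.id)


# Roughly the first step of the default copy: allocate the object
# without running __init__, so it has no attributes yet.
empty = Node.__new__(Node)

try:
    # Using the half-built object as a dict key calls __hash__,
    # which needs self.id.
    {empty: "2->1"}
except AttributeError as exc:
    error_message = str(exc)

print(error_message)
```

The same thing happens inside deepcopy when the copy of node2's edge_dict is built: its key is the copy of node1, which has been allocated and recorded in the memo but not yet given its attributes.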
You can correct this by using __new__ to ensure the invariants are established before the mutable, possibly recursive attributes are initialized, and by defining the pickle helper __getnewargs__ (or __getnewargs_ex__) so that copying and unpickling pass the right arguments to __new__. Specifically, change your class definition to:
class Node:
    # __new__ instead of __init__ to establish necessary id invariant
    # You could use both __new__ and __init__, but that's usually more complicated
    # than you really need
    def __new__(cls, p_id):
        self = super().__new__(cls)  # Must explicitly create the new object
        # Aside from explicit construction and return, rest of __new__
        # is same as __init__
        self.id = p_id
        self.edge_dict = {}
        self.degree = 0
        return self  # __new__ returns the new object

    def __getnewargs__(self):
        # Return the arguments that *must* be passed to __new__
        return (self.id,)

    # ... rest of class is unchanged ...
Note: If this is Python 2 code, make sure to explicitly inherit from object and change super() to super(Node, cls) in __new__; the code given is the simpler Python 3 code.
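As a quick check of the __new__/__getnewargs__ approach, the sketch below puts the class together with the two-node cycle from the question and confirms that the cycle survives the copy. It assumes Python 3; the name node2_copy is introduced here for illustration only.

```python
from copy import deepcopy


class Node:
    def __new__(cls, p_id):
        self = super().__new__(cls)
        # id is set at construction time, so the object is hashable
        # as soon as it exists.
        self.id = p_id
        self.edge_dict = {}
        self.degree = 0
        return self

    def __getnewargs__(self):
        # Arguments that copy/pickle pass to __new__ when rebuilding.
        return (self.id,)

    def __eq__(self, other):
        return self.id == other.id

    def __hash__(self):
        return hash(self.id)

    def add_edge(self, p_node, p_data):
        if p_node not in self.edge_dict:
            self.edge_dict[p_node] = p_data
            self.degree += 1
            return True
        return False


node1 = Node(1)
node2 = Node(2)
node1.add_edge(node2, "1->2")
node2.add_edge(node1, "2->1")

node1_copy = deepcopy(node1)  # no AttributeError this time

# The only key in the copied edge_dict is the copy of node2.
node2_copy = next(iter(node1_copy.edge_dict))
print(node1_copy is not node1, node2_copy is not node2)
# The copied graph points back at the copied node1, not the original.
print(next(iter(node2_copy.edge_dict)) is node1_copy)
```

Because the copy of each Node is created through __new__ with its id already set, it can safely be hashed while the other node's edge_dict is still being rebuilt.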
An alternate solution that handles only copy.deepcopy, without supporting pickling or requiring __new__/__getnewargs__ (which require new-style classes), is to override deepcopying only. Define the following extra method on your original class (and make sure the module imports copy), and otherwise leave the class untouched:
    def __deepcopy__(self, memo):
        # Deepcopy only the id attribute, then construct the new instance and map
        # the id() of the existing copy to the new instance in the memo dictionary
        memo[id(self)] = newself = self.__class__(copy.deepcopy(self.id, memo))
        # Now that memo is populated with a hashable instance, copy the other attributes:
        newself.degree = copy.deepcopy(self.degree, memo)
        # Safe to deepcopy edge_dict now, because backreferences to self will
        # be remapped to newself automatically
        newself.edge_dict = copy.deepcopy(self.edge_dict, memo)
        return newself
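To see the __deepcopy__ variant working on a longer cycle, here is a minimal sketch that adds the method to the original __init__-based class and copies a ring of three nodes. The ring, and the names nodes, copied, walk and current, are assumptions made up for this illustration; they are not part of the question.

```python
import copy


class Node:
    def __init__(self, p_id):
        self.id = p_id
        self.edge_dict = {}
        self.degree = 0

    def __eq__(self, other):
        return self.id == other.id

    def __hash__(self):
        return hash(self.id)

    def add_edge(self, p_node, p_data):
        if p_node not in self.edge_dict:
            self.edge_dict[p_node] = p_data
            self.degree += 1
            return True
        return False

    def __deepcopy__(self, memo):
        # Register a fully hashable copy in memo before touching edge_dict.
        memo[id(self)] = newself = self.__class__(copy.deepcopy(self.id, memo))
        newself.degree = copy.deepcopy(self.degree, memo)
        newself.edge_dict = copy.deepcopy(self.edge_dict, memo)
        return newself


# Hypothetical test graph: a ring 0 -> 1 -> 2 -> 0.
nodes = [Node(i) for i in range(3)]
for i, node in enumerate(nodes):
    target = nodes[(i + 1) % 3]
    node.add_edge(target, "{}->{}".format(node.id, target.id))

copied = copy.deepcopy(nodes[0])

# Follow the single outgoing edge three times; we should end up
# back at the copied start node, never at an original.
walk = []
current = copied
for _ in range(3):
    walk.append(current.id)
    current = next(iter(current.edge_dict))

print(walk, current is copied)

# The copy is independent: changing it leaves the original alone.
copied.add_edge(Node(99), "0->99")
print(nodes[0].degree, copied.degree)
```

The important detail is the order inside __deepcopy__: the new instance goes into memo with its id already set before edge_dict is copied, so any back-reference found during the recursion resolves to an object that can be hashed.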
Source: https://stackoverflow.com/questions/46283738/attributeerror-when-using-python-deepcopy