How to provide additional initialization for a subclass of namedtuple?

前端未结

关注

 3  2038

Suppose I have a namedtuple like this:

EdgeBase = namedtuple(\"EdgeBase\", \"left, right\")

I want to implement a custom hash-

相关标签:

3条回答

春和景丽

2020-12-23 17:50
In Python 3.7+, you can now use dataclasses to build hashable classes with ease.

Code

Assuming int types of left and right, we use the default hashing via unsafe_hash⁺ keyword:
```
import dataclasses as dc


@dc.dataclass(unsafe_hash=True)
class Edge:
    left: int
    right: int


hash(Edge(1, 2))
# 3713081631934410656
```
Now we can use these (mutable) hashable objects as elements in a set or (keys in a dict).
```
{Edge(1, 2), Edge(1, 2), Edge(2, 1), Edge(2, 3)}
# {Edge(left=1, right=2), Edge(left=2, right=1), Edge(left=2, right=3)}
```
Details

We can alternatively override the __hash__ function:
```
@dc.dataclass
class Edge:
    left: int
    right: int

    def __post_init__(self):
        # Add custom hashing function here
        self._hash = hash((self.left, self.right))         # emulates default

    def __hash__(self):
        return self._hash


hash(Edge(1, 2))
# 3713081631934410656
```
Expanding on @ShadowRanger's comment, the OP's custom hash function is not reliable. In particular, the attribute values can be interchanged, e.g. hash(Edge(1, 2)) == hash(Edge(2, 1)), which is likely unintended.

_{⁺Note, the name "unsafe" suggests the default hash will be used despite being a mutable object. This may be undesired, particularly within a dict expecting immutable keys. Immutable hashing can be turned on with the appropriate keywords. See also more on hashing logic in dataclasses and a related issue.}
0 讨论(0)
发布评论:

提交评论
- 加载中...
醉话见心

2020-12-23 17:55
The code in the question could benefit from a super call in the __init__ in case it ever gets subclassed in a multiple inheritance situation, but otherwise is correct.
```
class Edge(EdgeBase):
    def __init__(self, left, right):
        super(Edge, self).__init__(left, right)
        self._hash = hash(self.left) * hash(self.right)

    def __hash__(self):
        return self._hash
```
While tuples are readonly only the tuple parts of their subclasses are readonly, other properties may be written as usual which is what allows the assignment to _hash regardless of whether it's done in __init__ or __new__. You can make the subclass fully readonly by setting it's __slots__ to (), which has the added benefit of saving memory, but then you wouldn't be able to assign to _hash.
0 讨论(0)
发布评论:

提交评论
- 加载中...
小鲜肉

2020-12-23 17:59
edit for 2017: turns out namedtuple isn't a great idea. attrs is the modern alternative.
```
class Edge(EdgeBase):
    def __new__(cls, left, right):
        self = super(Edge, cls).__new__(cls, left, right)
        self._hash = hash(self.left) * hash(self.right)
        return self

    def __hash__(self):
        return self._hash
```
__new__ is what you want to call here because tuples are immutable. Immutable objects are created in __new__ and then returned to the user, instead of being populated with data in __init__.

cls has to be passed twice to the super call on __new__ because __new__ is, for historical/odd reasons implicitly a staticmethod.
0 讨论(0)
发布评论:

提交评论
- 加载中...