I have some problems understanding how hashability of numpy objects is managed.
>>> import numpy as np
>>> class Vector(np.ndarray):
...     pass
>>> nparray = np.array([1., 2., 3.])  # example data; the original values are not shown
>>> vector = nparray.view(Vector)
>>> hash(nparray)  # raises TypeError, despite ndarray.__hash__ being defined
>>> hash(vector)   # the subclass, however, is hashable here
I get the same results in Python 2.6.6 and numpy 1.3.0. According to the Python glossary, an object should be hashable if `__hash__` is defined (and is not `None`), and either `__eq__` or `__cmp__` is defined. `ndarray.__eq__` and `ndarray.__hash__` are both defined and return something meaningful, so I don't see why `hash` should fail. After some quick googling, I found this post on the python.scientific.devel mailing list, which states that arrays have never been intended to be hashable - so why `ndarray.__hash__` is defined at all, I have no idea. Note that `isinstance(nparray, collections.Hashable)` returns `True`.
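The `Hashable` ABC check is purely structural: it only looks for a non-`None` `__hash__` somewhere in the class's MRO, which is why an inherited default-style `__hash__` passes it regardless of whether hashing is meaningful. A minimal sketch in plain Python (no numpy; `ArrayLike` is a hypothetical stand-in, not ndarray itself):

```python
from collections.abc import Hashable  # plain `collections.Hashable` on the Python versions above

class ArrayLike:
    """Hypothetical stand-in: __eq__ defined, __hash__ an id-based default-style hash."""
    def __eq__(self, other):
        return NotImplemented
    def __hash__(self):
        return id(self)

# The ABC only checks that a non-None __hash__ is defined; it says
# nothing about whether the hash is consistent with __eq__.
print(isinstance(ArrayLike(), Hashable))  # True
```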
EDIT: Note that `nparray.__hash__()` returns the same as `id(nparray)`, so this is just the default implementation. Maybe it was difficult or impossible to remove the implementation of `__hash__` in earlier versions of Python (the `__hash__ = None` technique was apparently introduced in 2.6), so they used some kind of C API magic to achieve this in a way that wouldn't propagate to subclasses, and wouldn't stop you from calling `ndarray.__hash__` explicitly?
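For comparison, the `__hash__ = None` technique does propagate to subclasses, unlike whatever mechanism old numpy used. A short sketch in plain Python (hypothetical classes, no numpy involved):

```python
class Base:
    def __eq__(self, other):
        return NotImplemented
    __hash__ = None  # the 2.6+ technique: explicitly mark the class unhashable

class Child(Base):
    pass  # inherits __hash__ = None, so it is unhashable as well

for cls in (Base, Child):
    try:
        hash(cls())
    except TypeError:
        print(cls.__name__, "is unhashable")  # both classes end up here
```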
Things are different in Python 3.2.2 and the current numpy 2.0.0 from the repo. The `__cmp__` method no longer exists, so hashability now requires `__hash__` and `__eq__` (see the Python 3 glossary). In this version of numpy, `ndarray.__hash__` is defined, but it is just `None`, so it cannot be called. `hash(nparray)` fails and `isinstance(nparray, collections.Hashable)` returns `False`, as expected. `hash(vector)` also fails.
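That matches Python 3's general rule: a class that defines `__eq__` without defining `__hash__` gets `__hash__` set to `None` automatically. A quick sketch of that rule in plain Python (hypothetical class, no numpy required):

```python
from collections.abc import Hashable

class Eq:
    def __eq__(self, other):  # __eq__ defined, __hash__ not
        return NotImplemented

# Python 3 implicitly sets __hash__ to None, the same state described
# for ndarray above: the attribute exists but is None.
print(Eq.__hash__ is None)         # True
print(isinstance(Eq(), Hashable))  # False
try:
    hash(Eq())
except TypeError as exc:
    print(exc)                     # unhashable type: 'Eq'
```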