I want to make a custom object hashable (via pickling). I found the __hash__
algorithm for Python 2.x (see code below), but it obviously differs from the one used in Python 3.
I looked up the new function in the source (in unicodeobject.c) and rebuilt it in Python. Here it is:
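    def my_hash(string):
        """Emulate the 64-bit CPython string hash (multiply-and-XOR variant)."""
        if not string:
            return 0  # the empty string hashes to 0
        x = ord(string[0]) << 7
        for c in string:
            x = (1000003 * x) ^ ord(c)
        x ^= len(string)
        # Bit 63 is the sign bit of the 64-bit value the C code would hold
        need_correction = x & (1 << 63)
        x %= 2 ** 64  # truncate to 64 bits, as C integer overflow would
        if need_correction:
            x -= 2 ** 64  # reinterpret the unsigned value as signed
        # -1 signals an error inside CPython, so it is mapped to -2
        if x == -1:
            x = -2
        return x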
This is 64-bit only, though. It now includes a correction for the fact that Python integers do not wrap around to negative values the way fixed-width C integers do. (Better not to think about this too much.)
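The negative-number correction amounts to reinterpreting an unsigned fixed-width value as a signed one. Here is a minimal sketch of just that step, with the width as a parameter so the same idea could be tried for a 32-bit build. The helper name `to_signed` is hypothetical and not part of CPython. Note also that interpreters with hash randomization enabled (the default since Python 3.3) will not reproduce hand-computed string hashes with the built-in `hash()`.

```python
def to_signed(value, bits=64):
    """Reinterpret the low `bits` bits of an int as a two's-complement signed value."""
    value &= (1 << bits) - 1        # keep only the low `bits` bits, like C overflow
    if value & (1 << (bits - 1)):   # sign bit set: the C value would be negative
        value -= 1 << bits
    return value


print(to_signed(2 ** 64 - 1))       # all 64 bits set
print(to_signed(2 ** 63))           # only the sign bit set
print(to_signed(5, bits=32))        # small values pass through unchanged
```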
The reason they differ is given in the "What's New in Python 3.2" notes:
Hash values are now values of a new type, Py_hash_t, which is defined to be the same size as a pointer. Previously they were of type long, which on some 64-bit operating systems is still only 32 bits long.
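To see what that means on a given machine, the sizes of the C types involved can be compared with `ctypes`. This is only a sketch: it assumes that `ctypes.c_ssize_t` has the same width as `Py_hash_t` on the running interpreter, which holds for CPython on common platforms but is not guaranteed elsewhere.

```python
import ctypes

# Sizes in bits of the C types mentioned in the quote above
long_bits = ctypes.sizeof(ctypes.c_long) * 8
pointer_bits = ctypes.sizeof(ctypes.c_void_p) * 8
ssize_bits = ctypes.sizeof(ctypes.c_ssize_t) * 8  # assumed to match Py_hash_t

print("long:", long_bits)
print("pointer:", pointer_bits)
print("ssize_t:", ssize_bits)
```

On 64-bit Windows, for example, `long` is 32 bits wide while pointers are 64 bits wide, which is the case the quote refers to.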
The hash computation also depends on some new parameters; take a look at
sys.hash_info
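A quick way to inspect those parameters is to print a few fields of `sys.hash_info`. This is a sketch, not an exhaustive listing: the `algorithm` field only exists on later Python versions, so it is read with `getattr` here.

```python
import sys

info = sys.hash_info
print("width:", info.width)      # number of bits used for hash values
print("modulus:", info.modulus)  # prime modulus used for numeric hashes
print("inf:", info.inf)          # hash value of positive infinity
# Not present on every version, so fall back to a placeholder string
print("algorithm:", getattr(info, "algorithm", "not reported"))
```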
For strings in Python 2, see string_hash(PyStringObject *a) at line 1263 of http://svn.python.org/view/python/trunk/Objects/stringobject.c?view=markup