I am trying to hash some Unicode strings:
hashlib.sha1(s).hexdigest()
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-81:
ordinal not in ra
Apparently hashlib.sha1 isn't expecting a unicode object, but rather a sequence of bytes in a str object. Encoding your unicode string to a sequence of bytes (using, say, the UTF-8 encoding) should fix it:
>>> import hashlib
>>> s = u'é'
>>> hashlib.sha1(s.encode('utf-8'))
<sha1 HASH object @ 029576A0>
The error occurs because Python tries to convert the unicode object to a str automatically, using the default ascii encoding, which can't handle non-ASCII characters (and your string isn't pure ASCII).
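To make that failure mode concrete, here is a minimal sketch that performs the ascii conversion explicitly instead of implicitly. It assumes the implicit conversion behaves like an explicit `.encode('ascii')`; the variable names are only for illustration.

```python
import hashlib

s = u'\u00e9'  # the character é, written as an escape so the source file stays ASCII

# Doing the ascii conversion by hand reproduces the same kind of error.
try:
    s.encode('ascii')
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True

# Choosing an encoding that can represent the character avoids the error.
digest = hashlib.sha1(s.encode('utf-8')).hexdigest()

print(ascii_failed)  # True
```

The ascii codec only covers code points 0 to 127, so any string containing other characters has to be encoded with something like UTF-8 before it is hashed.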
A good starting point for learning more about Unicode and encodings is the Python docs, and this article by Joel Spolsky.
Use the utf-8 encoding. Try this easy way:
>>> import hashlib
>>> import random
>>> hashlib.sha256(str(random.getrandbits(256)).encode('utf-8')).hexdigest()
'cd183a211ed2434eac4f31b317c573c50e6c24e3a28b82ddcb0bf8bedf387a9f'
You hash bytes, not strings. So you have to know which bytes you really want to hash: for example, the UTF-8 encoding of the string or the UTF-16 encoding of the string.
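As a rough illustration of why the choice of bytes matters, the sketch below hashes the same text under two encodings. The helper name `digest_text` is hypothetical, not part of hashlib.

```python
import hashlib

def digest_text(text, encoding):
    """Encode text to bytes with an explicit encoding, then SHA-1 the bytes."""
    return hashlib.sha1(text.encode(encoding)).hexdigest()

text = u'\u00e9'  # é

utf8_digest = digest_text(text, 'utf-8')       # hashes the bytes C3 A9
utf16_digest = digest_text(text, 'utf-16-le')  # hashes the bytes E9 00

print(utf8_digest != utf16_digest)  # True
```

Both digests are valid, but they are not interchangeable: whoever verifies the hash has to encode the string the same way.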