Hash a Unicode string in Python

难免孤独 2021-02-06 21:07

I am trying to hash some Unicode strings:

hashlib.sha1(s).hexdigest()
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-81: ordinal not in ra

3 Answers
  •  情深已故
    2021-02-06 21:20

    Apparently hashlib.sha1 isn't expecting a unicode object, but rather a sequence of bytes in a str object. Encoding your unicode string to a sequence of bytes (using, say, the UTF-8 encoding) should fix it:

    >>> import hashlib
    >>> s = u'é'
    >>> hashlib.sha1(s.encode('utf-8'))

    The error occurs because Python tries to convert the unicode object to a str automatically, using the default ASCII encoding, which cannot represent the non-ASCII characters in your string.

    A good starting point for learning more about Unicode and encodings is the Python docs, and this article by Joel Spolsky.
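
The fix described in this answer can be sketched as a small helper. This is only an illustration under assumptions: it targets Python 3 (where text is `str` and `hashlib` accepts only bytes), the name `sha1_hex` is hypothetical, and UTF-8 as the default encoding is a choice, not something the question specifies.

```python
import hashlib


def sha1_hex(text, encoding="utf-8"):
    """Return the SHA-1 hex digest of a text string.

    sha1_hex is a hypothetical helper; the UTF-8 default is an assumption.
    """
    # hashlib operates on bytes, so encode the text explicitly first.
    return hashlib.sha1(text.encode(encoding)).hexdigest()


if __name__ == "__main__":
    print(sha1_hex("é"))
    # The same text under a different encoding is a different byte
    # sequence, so the digest differs.
    print(sha1_hex("é", encoding="latin-1"))
```

Because the digest depends on the bytes rather than the characters, every party that needs to reproduce the hash has to agree on the same encoding.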
