Question
I am trying to encode and decode the Hebrew string "שלום". However, after encoding, I get gibberish:
>>> word = "שלום"
>>> word = word.decode('UTF-8')
>>> word
u'\u05e9\u05dc\u05d5\u05dd'
>>> print word
שלום
>>> word = word.encode('UTF-8')
>>> word
'\xd7\xa9\xd7\x9c\xd7\x95\xd7\x9d'
>>> print word
׳©׳׳•׳
How should I do it properly?
Thanks.
Answer 1:
You'll have to make sure your environment (shell or script) is using the right encoding. If you're running a script, include the following at the top:
#!/usr/bin/env python
# -*- coding: utf-8 -*-
The coding line tells Python that the source file is UTF-8. Your terminal may accept only ASCII or use some other code page, so make sure it supports UTF-8 as well. With a UTF-8 terminal, the same steps print correctly:
>>> word = "שלום"
>>> word
'\xd7\xa9\xd7\x9c\xd7\x95\xd7\x9d'
>>> print word
שלום
>>> word = word.decode('UTF-8')
>>> word
u'\u05e9\u05dc\u05d5\u05dd'
>>> print word
שלום
>>> word = word.encode('UTF-8')
>>> word
'\xd7\xa9\xd7\x9c\xd7\x95\xd7\x9d'
>>> print word
שלום
>>>
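The session above suggests the encode/decode calls themselves are fine and that the gibberish comes from the terminal reading UTF-8 bytes with a different code page. Below is a minimal sketch of that idea, written for Python 3 (an assumption; the transcripts above are Python 2). The choice of cp1255, a Windows Hebrew code page, is a guess that appears consistent with the garbled output in the question, not something the question states. The variable names are illustrative only.

```python
import sys

# The Hebrew word from the question, written with escapes so the example
# does not depend on the encoding of this source file.
word = u"\u05e9\u05dc\u05d5\u05dd"

# Text -> bytes: the same UTF-8 byte sequence shown in the sessions above.
utf8_bytes = word.encode("utf-8")

# Bytes -> text with the matching codec restores the original string.
round_tripped = utf8_bytes.decode("utf-8")

# Assumption: the terminal interprets those bytes as cp1255.
# errors="ignore" drops any byte values that cp1255 leaves undefined.
garbled = utf8_bytes.decode("cp1255", errors="ignore")

# The encoding Python believes standard output uses (may be None if unknown).
terminal_encoding = getattr(sys.stdout, "encoding", None)

if __name__ == "__main__":
    print(repr(utf8_bytes))        # ASCII-safe view of the encoded bytes
    print(round_tripped == word)   # whether the UTF-8 round trip is lossless
    print(garbled == word)         # whether the wrong codec preserved the text
    print(terminal_encoding)       # what the terminal is reported to expect
```

If `terminal_encoding` is not UTF-8, printing raw UTF-8 bytes (as `print word` does with a Python 2 byte string) is likely to show mojibake even though the data is correct; printing the Unicode string lets Python convert it for the terminal instead.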
Source: https://stackoverflow.com/questions/29850912/decoding-and-encoding-hebrew-string-in-python