Question
I am using Python 3.5 and imaplib to fetch an e-mail from Gmail and print its body. The body contains non-ASCII characters.
These are 'encoded' in a strange way and I cannot find out how to fix this.
import email
import imaplib
c = imaplib.IMAP4_SSL('imap.gmail.com')
c.login('example@gmail.com', 'password')
c.select('Inbox')
_, data = c.fetch(b'12345', '(RFC822)')
mail = data[0][1]
message = email.message_from_bytes(mail)
payload = message.get_payload()
# For a multipart message, payload is a list of parts; take the first one.
body = payload[0].as_string()
print(body)
This gives
>> ... Mit freundlichen Gr=C3=BC=C3=9Fen ...
instead of the desired
>> ... Mit freundlichen Grüßen ...
It looks to me like this is not an issue of encoding but one of conversion. But how do I tell Python to convert the characters correctly? Is there a more convenient library?
Answer 1:
The text is encoded with quoted-printable encoding, which is a way to represent non-ASCII characters in ASCII text. You can decode it using Python's quopri module.
>>> import quopri
>>> bs = b'Gr=C3=BC=C3=9Fen'
>>> # Decode quoted-printable to raw bytes.
>>> utf8 = quopri.decodestring(bs)
>>> # Decode bytes to text.
>>> s = utf8.decode('utf-8')
>>> print(s)
Grüßen
You may find that quoted-printable is the value of the email's Content-Transfer-Encoding header.
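As a possible alternative to calling quopri by hand, the standard email package can undo the transfer encoding itself: get_payload(decode=True) returns the part's bytes after decoding according to its Content-Transfer-Encoding header, and get_content_charset() reports the declared charset. The following is only a minimal sketch, assuming the message has a text/plain part; the raw bytes and the helper name decoded_text are made up for illustration and stand in for what imaplib would return.

```python
import email

# Hypothetical raw message, standing in for data[0][1] from imaplib's fetch().
raw = (
    b"Content-Type: text/plain; charset=utf-8\r\n"
    b"Content-Transfer-Encoding: quoted-printable\r\n"
    b"\r\n"
    b"Mit freundlichen Gr=C3=BC=C3=9Fen\r\n"
)


def decoded_text(message):
    """Return the first text/plain part as str, or None if there is none.

    Illustrative helper (not part of the email package).
    """
    for part in message.walk():
        if part.get_content_type() == "text/plain":
            # decode=True undoes quoted-printable/base64 and yields bytes.
            payload = part.get_payload(decode=True)
            # Fall back to UTF-8 if no charset is declared (an assumption).
            charset = part.get_content_charset() or "utf-8"
            return payload.decode(charset, errors="replace")
    return None


message = email.message_from_bytes(raw)
body = decoded_text(message)
print(body)
```

Walking the parts with walk() means the same helper works for both single-part and multipart messages, so there is no need to index into the payload list.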
Source: https://stackoverflow.com/questions/53871898/python-imaplib-display-non-ascii-characters-correctly