Python decoding from iso-8859-5

Question

When I parse my email messages with Python's email.parser.Parser, I get a lot of strings like this:

=?ISO-8859-5?Q?=C0=D5=D5=E1=E2=E0_=BF=DB=D0=E2=D5=D6=D5=D9_?=

How can I decode this to UTF-8 using Python?

Answer 1:

Your input is an RFC 2047 encoded-word: an email-header wrapper around text in the "Q" encoding, which is a variant of quoted-printable. You can use the quopri module to decode the payload:

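import quopri

incode = '=?ISO-8859-5?Q?=C0=D5=D5=E1=E2=E0_=BF=DB=D0=E2=D5=D6=D5=D9_?='
inencoding = incode[2:12]  # 'ISO-8859-5'
intext = incode[15:-2]     # payload between '?Q?' and the closing '?='
# header=True decodes '_' as a space, as the Q encoding requires
raw = quopri.decodestring(intext, header=True)
# decode (not encode) turns the raw ISO-8859-5 bytes into text
result = raw.decode(inencoding)

The result will then be

Реестр Платежей

followed by one trailing space, because the payload ends in '_'. To get UTF-8 bytes, call result.encode('utf-8').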
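The hard-coded slice offsets above only fit a charset name of exactly ten characters. As a minimal sketch (assuming Python 3 and a single well-formed encoded-word; decode_encoded_word is a hypothetical helper name, not a library function), the parts can instead be split on '?':

```python
import base64
import quopri


def decode_encoded_word(word):
    """Decode one encoded-word of the form '=?charset?Q?payload?='.

    Hypothetical helper for illustration: it assumes exactly one
    well-formed encoded-word and does no error handling.
    """
    # '=?charset?Q?payload?=' splits into ['=', charset, enc, payload, '=']
    _, charset, enc, payload, _ = word.split('?')
    data = payload.encode('ascii')
    if enc.upper() == 'Q':
        # header=True turns '_' into a space
        raw = quopri.decodestring(data, header=True)
    else:
        # assumption: anything else is the 'B' (base64) encoding
        raw = base64.b64decode(data)
    return raw.decode(charset)


incode = '=?ISO-8859-5?Q?=C0=D5=D5=E1=E2=E0_=BF=DB=D0=E2=D5=D6=D5=D9_?='
result = decode_encoded_word(incode)
utf8_bytes = result.encode('utf-8')  # the UTF-8 form asked for in the question
print(utf8_bytes)
```

This still handles only one encoded-word at a time; headers that mix plain text and several encoded-words need more than a single split.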

Around the quoted-printable payload there is also the email-header framing (=?charset?Q?...?=), which names the character encoding to apply after the quoted-printable decoding. The example code above slices out those parts manually, but you can also do all of it in one step:

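import email.header  # a bare "import email" does not reliably load the submodule

[(text, encoding)] = email.header.decode_header(incode)
result = text.decode(encoding)

result is now the decoded text 'Реестр Платежей ' (decode_header maps each '_' to a space, so the string ends in one).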
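A real header can mix plain text with several encoded-words, so decode_header may return more than one pair, and the single-pair unpacking above would then raise ValueError. The following is a sketch assuming Python 3, where a header without any encoded-word comes back as a str with charset None; the function name header_to_text, the fallback charset, and the sample subject line are made up for illustration:

```python
from email.header import decode_header


def header_to_text(value, fallback='ascii'):
    """Join all parts of a possibly encoded header into one str.

    Hypothetical helper; 'fallback' is an assumed charset for parts
    that carry no charset of their own.
    """
    parts = []
    for text, charset in decode_header(value):
        if isinstance(text, bytes):
            # errors='replace' keeps going on bytes the charset cannot map
            text = text.decode(charset or fallback, errors='replace')
        parts.append(text)
    return ''.join(parts)


subject = 'Re: =?ISO-8859-5?Q?=C0=D5=D5=E1=E2=E0_=BF=DB=D0=E2=D5=D6=D5=D9_?='
text = header_to_text(subject)
utf8_bytes = text.encode('utf-8')  # UTF-8 bytes, as asked in the question
print(utf8_bytes)
```

The standard library also offers email.header.make_header, which accepts the list returned by decode_header; calling str() on its result should give a comparable string without the manual loop.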

Source: https://stackoverflow.com/questions/24080233/python-decoding-from-iso-8859-5

Tags

python

python-2.7

encoding

decode
