Converting Unicode objects with non-ASCII symbols in them into string objects (in Python)

[亡魂溺海] submitted on 2019-12-05 14:24:24

When you get a unicode object and want to return a UTF-8 encoded byte string from it, use theobject.encode('utf8').
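As a small illustration (the sample string is my own, not from the question), encoding a unicode object that holds one non-ASCII character gives you the matching UTF-8 bytes. The `u""` and `b""` prefixes are there so the sketch should behave the same on Python 2.7 and Python 3:

```python
# Hypothetical sample text: "cafe" with an acute accent on the e,
# written as an escape so the source file itself stays pure ASCII.
text = u"caf\u00e9"

# encode() turns the unicode (text) object into a UTF-8 byte string.
encoded = text.encode("utf8")

# The single character U+00E9 is stored as two bytes, 0xC3 0xA9,
# so the byte string is one element longer than the text.
text_length = len(text)
byte_length = len(encoded)
```

Note that the lengths differ: `len` counts characters on the unicode object and bytes on the encoded one.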

It seems strange that you don't know whether the incoming object is a str or unicode -- surely you do control the call sites to that function, too?! But if that is indeed the case, for whatever weird reason, you may need something like:

def ensureutf8(s):
    # Python 2 only: `unicode` is the text type there, `str` is the byte string.
    if isinstance(s, unicode):
        s = s.encode('utf8')
    return s

which only encodes conditionally, that is, if it receives a unicode object, not if the object it receives is already a byte string. It returns a byte string in either case.
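The function above relies on the Python 2 name `unicode`, which does not exist on Python 3 (there the text type is `str` and byte strings are `bytes`). If the same helper has to run on both, one possible sketch is the following. The names `text_type` and `ensure_utf8_compat` are made up for this example, not part of any library:

```python
import sys

# Pick the text type for the running interpreter.
# On Python 2 that is `unicode`; on Python 3 it is `str`.
if sys.version_info[0] >= 3:
    text_type = str
else:
    text_type = unicode  # this branch is only evaluated on Python 2


def ensure_utf8_compat(s):
    """Return s as a UTF-8 byte string, encoding only if s is text."""
    if isinstance(s, text_type):
        return s.encode("utf8")
    # Already a byte string: passed through unchanged (and not validated).
    return s


from_text = ensure_utf8_compat(u"caf\u00e9")
from_bytes = ensure_utf8_compat(b"caf\xc3\xa9")
```

As with the original, a byte string that comes in is trusted as-is; this sketch makes no attempt to check that it really is valid UTF-8.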

BTW, part of your confusion seems to be due to the fact that you don't know that just entering an expression at the interpreter prompt will show you its repr, which is not the same effect you get with print;-).
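To see the difference without an interactive session, you can call `repr` yourself, since that is what the prompt does for a bare expression. A rough sketch, again with a sample string of my own choosing:

```python
text = u"caf\u00e9"
encoded = text.encode("utf8")

# What the interpreter prompt displays for the bare expression `encoded`:
# its repr, in which the non-ASCII bytes are spelled out as \x escapes.
at_prompt = repr(encoded)

# The repr is an ordinary ASCII string made of quotes, letters and
# backslash escapes, so it is longer than the 5 bytes it describes.
print(at_prompt)
```

The escapes belong to the displayed representation only; the byte string itself still holds the two raw bytes 0xC3 0xA9, and `print` on the text object writes the character itself (how that looks depends on your terminal's encoding).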
