Python: Replace non-ASCII characters in a list of strings

Submitted by 六月ゝ 毕业季﹏ on 2019-12-24 14:06:20

Question


I understand there are many questions about non-ASCII characters on Stack Overflow, but since I'm a total newbie I've had no luck implementing the answers, and I find the whole 'unicode' concept difficult to understand.

So I have a list -

mylist = ["apple", "samsung", "toshiba", "Don’t know", "Can’t recall"] 

I would like to replace the right single quotation marks (’) in the items at index 3 and 4 with a plain ASCII apostrophe (').

I tried this:

# -*- coding: utf-8 -*-
mylist = ["hello", "don't know", "Don’t know", "Can't recall"]
for word in mylist:
    word.replace(u"’", "'")
print mylist

I get the following error:

UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 3: ordinal not in range(128)

Not sure if this is useful, but I am using Python 2.x, and I know this problem might not occur if I were using Python 3.

Thanks!


Answer 1:


>>> mylist = ["apple", "samsung", "toshiba", "Don’t know", "Can’t recall"]
>>> [item.replace('\xe2\x80\x99',"'") for item in mylist]
['apple', 'samsung', 'toshiba', "Don't know", "Can't recall"]

If all the items are already unicode strings:

>>> mylist = [u"apple", u"samsung", u"toshiba", u"Don’t know", u"Can’t recall"]
>>> [item.replace(u'’',u"'") for item in mylist]
[u'apple', u'samsung', u'toshiba', u"Don't know", u"Can't recall"]
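
For a list that mixes byte strings and unicode strings, the two cases above can be combined. The following is a minimal sketch, not part of the original answer: `normalize_apostrophes` is a hypothetical helper name, and it assumes any byte strings are UTF-8 encoded. It builds a new list, since `replace` returns a new string rather than modifying the original in place.

```python
# -*- coding: utf-8 -*-
from __future__ import unicode_literals


def normalize_apostrophes(items, encoding="utf-8"):
    """Return a new list with U+2019 replaced by an ASCII apostrophe.

    Hypothetical helper for illustration. Byte strings are assumed to be
    encoded with `encoding` (UTF-8 by default) and are decoded first.
    """
    result = []
    for item in items:
        # In Python 2, `bytes` is an alias for `str`; decode it explicitly
        # so that no implicit ASCII decode happens inside replace().
        if isinstance(item, bytes):
            item = item.decode(encoding)
        # replace() returns a new string, so the result must be collected.
        result.append(item.replace("\u2019", "'"))
    return result


mixed = [b"apple", b"Don\xe2\x80\x99t know", "Can\u2019t recall"]
cleaned = normalize_apostrophes(mixed)
```

In Python 2, decoding explicitly avoids the implicit ASCII decode that raises the UnicodeDecodeError shown in the question. The result is always a list of unicode strings.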


Source: https://stackoverflow.com/questions/17273575/python-replace-non-ascii-characters-in-a-list-of-strings
