UnicodeEncodeError: 'ascii' codec can't encode character u'\u2019' in position 6: ordinal not in range(128)

一笑奈何 提交于 2019-12-07 01:40:41

问题


I am trying to pull a list of 500 restaurants in Amsterdam from TripAdvisor; however after the 308th restaurant I get the following error:

Traceback (most recent call last):
  File "C:/Users/dtrinh/PycharmProjects/TripAdvisorData/LinkPull-HK.py", line 43, in <module>
    writer.writerow(rest_array)
UnicodeEncodeError: 'ascii' codec can't encode character u'\u2019' in position 6: ordinal not in range(128)

I tried several things I found on StackOverflow, but nothing is working as of right now. I was wondering if someone could take a look at my code and see any potential solutions that would be great.

        for item in soup2.findAll('div', attrs={'class', 'title'}):
            if 'Cuisine' in item.text:
                item.text.strip()
                content = item.findNext('div', attrs=('class', 'content'))
                cuisine_type = content.text.encode('utf8', 'ignore').strip().split(r'\xa0')
        rest_array = [account_name, rest_address, postcode, phonenumber, cuisine_type]
        #print rest_array
        with open('ListingsPull-Amsterdam.csv', 'a') as file:
                writer = csv.writer(file)
                writer.writerow(rest_array)
    break

回答1:


The rest_array contains unicode strings. When you use csv.writer to write rows, you need to serialise bytes strings (you are on Python 2.7).

I suggest you to use "utf8" encoding:

with open('ListingsPull-Amsterdam.csv', mode='a') as fd:
    writer = csv.writer(fd)
    rest_array = [text.encode("utf8") for text in rest_array]
    writer.writerow(rest_array)

note: please, don't use file as variable because you shadow the built-in function file() (an alias of open() function).

If you want to open this CSV file with Microsoft Excel, you may consider using another encoding, for instance "cp1252" (it allows u"\u2019" character).




回答2:


You're writing a non-ascii character(s) to your csv output file. Make sure you open the output file with the appropriate character encoding that allows for the character(s) to be encoded. A safe bet is often UTF-8. Try this:

with open('ListingsPull-Amsterdam.csv', 'a', encoding='utf-8') as file:
    writer = csv.writer(file)
    writer.writerow(rest_array)

edit this is for Python 3.x, sorry.



来源:https://stackoverflow.com/questions/40619675/unicodeencodeerror-ascii-codec-cant-encode-character-u-u2019-in-position-6

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!