'ascii' codec can't encode character at position * ord not in range(128)

醉酒成梦 2021-02-14 16:28

There are a few threads on Stack Overflow, but I couldn't find a valid solution to the problem as a whole.

I have collected huge sums of textual data from the urllib rea

2 answers
  • 2021-02-14 16:53

    Your data is Unicode data. To write it to a file, use .encode():

    text = text.encode('ascii', 'ignore')
    

    but that would remove anything that isn't ASCII. Perhaps you wanted to encode to a more suitable encoding, like UTF-8, instead?
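
    As a rough sketch of the difference, the snippet below encodes the same string both ways and then writes it to a file as UTF-8. The sample text and the temporary file path are made up for illustration; io.open is used because it accepts an encoding argument on both Python 2 and Python 3.

```python
import io
import os
import tempfile

# Sample Unicode text (hypothetical, for illustration only).
text = u"It\u2019s caf\u00e9 time"

# 'ignore' silently drops every character that is not ASCII.
ascii_only = text.encode("ascii", "ignore")

# UTF-8 can represent every Unicode character, so nothing is lost.
utf8_bytes = text.encode("utf-8")

# io.open encodes on write, so the Unicode text is passed in directly.
path = os.path.join(tempfile.mkdtemp(), "out.txt")
with io.open(path, "w", encoding="utf-8") as f:
    f.write(text)

# Reading back with the same encoding returns the original text.
with io.open(path, "r", encoding="utf-8") as f:
    round_trip = f.read()
```

    Here ascii_only has lost the apostrophe and the accented letter, while utf8_bytes and the file contents still decode back to the original string.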

    You may want to read up on Python and Unicode:

    • The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!) by Joel Spolsky

    • The Python Unicode HOWTO

    • Pragmatic Unicode by Ned Batchelder

  • 2021-02-14 17:05

    You can do this with smart_str from Django's django.utils.encoding module. Try this:

    from django.utils.encoding import smart_str
    
    text = u'\u2019'
    # smart_str returns a native str (UTF-8 encoded bytes on Python 2)
    print(smart_str(text))
    

    You can install Django by opening a command shell with administrator privileges and running this command:

    pip install Django
    