Writing Unicode text to a text file?

眼角桃花 2020-11-22 16:46

I'm pulling data out of a Google doc, processing it, and writing it to a file (that eventually I will paste into a WordPress page).

It has some non-ASCII symbols. H

8 Answers
  • 2020-11-22 17:41

    That error arises when you call .encode() on a non-unicode string (a byte string): Python first tries to decode it, assuming it is plain ASCII, and fails on any non-ASCII byte. There are two possibilities:

    1. You're encoding it to a bytestring, but because you've used codecs.open, the write method expects a unicode object. So you encode it, and it tries to decode it again. Try: f.write(all_html) instead.
    2. all_html is not, in fact, a unicode object. When you do .encode(...), it first tries to decode it.
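
The first suggestion above can be sketched as follows. This is a minimal illustration, not the asker's actual code: it assumes `io.open` as a stand-in for `codecs.open` (both accept an `encoding` argument and expect text), and the sample content of `all_html` and the temporary file path are hypothetical.

```python
import io
import os
import tempfile

# Hypothetical stand-in for the question's all_html: a text (unicode) string.
all_html = u"<p>caf\u00e9 \u2603</p>"

# Hypothetical output path in a throwaway directory.
path = os.path.join(tempfile.mkdtemp(), "out.html")

# The file object does the encoding, so pass it text, not bytes.
with io.open(path, "w", encoding="utf-8") as f:
    f.write(all_html)  # no .encode(...) here

# Read the file back as text and as raw bytes to check what was written.
with io.open(path, "r", encoding="utf-8") as f:
    round_trip = f.read()
with io.open(path, "rb") as f:
    raw = f.read()
```

Here `round_trip` should equal `all_html`, and `raw` should be its UTF-8 encoding. If `all_html` turns out to be a byte string instead (the second possibility), it would need to be decoded with its real encoding before being written this way.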
  • 2020-11-22 17:42

    Unicode string handling is already standardized in Python 3.

    1. Strings (str) are already stored as Unicode code points in memory
    2. You only need to open the file with encoding="utf-8"
      (The conversion from in-memory Unicode text to variable-length UTF-8 bytes is performed automatically when writing to the file.)

      out1 = "(嘉南大圳 ㄐㄧㄚ ㄋㄢˊ ㄉㄚˋ ㄗㄨㄣˋ )"
      # The with block closes the file even if write() raises
      with open("t1.txt", "w", encoding="utf-8") as fobj:
          fobj.write(out1)
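
To make the "variable-byte-length" point concrete, here is a small sketch that writes a four-character string and compares the number of code points with the number of bytes that end up on disk. The sample string and the temporary path are assumptions chosen for illustration; only `open(..., encoding="utf-8")` comes from the answer itself.

```python
import os
import tempfile

# Four code points: 'a', e-acute, a CJK character, and an emoji.
sample = "a\u00e9\u5733\U0001F600"

# Hypothetical output path in a throwaway directory.
path = os.path.join(tempfile.mkdtemp(), "t1.txt")
with open(path, "w", encoding="utf-8") as fobj:
    fobj.write(sample)

code_points = len(sample)                                   # characters in memory
bytes_per_char = [len(ch.encode("utf-8")) for ch in sample]  # UTF-8 width of each
size_on_disk = os.path.getsize(path)                        # total bytes written

# Reading with the same encoding recovers the original text.
with open(path, encoding="utf-8") as fobj:
    read_back = fobj.read()
```

Each character takes between one and four bytes in UTF-8, so `size_on_disk` is the sum of `bytes_per_char` rather than a fixed multiple of `code_points`.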
      