I\'m pulling data out of a Google doc, processing it, and writing it to a file (that eventually I will paste into a Wordpress page).
It has some non-ASCII symbols. H
That error arises when you try to encode a non-unicode string: it tries to decode it, assuming it's in plain ASCII. There are two possibilities:
f.write(all_html)
instead..encode(...)
, it first tries to decode it.Unicode string handling is already standardized in Python 3.
You only need to open file in utf-8
(32-bit Unicode to variable-byte-length utf-8 conversion is automatically performed from memory to file.)
out1 = "(嘉南大圳 ㄐㄧㄚ ㄋㄢˊ ㄉㄚˋ ㄗㄨㄣˋ )"
fobj = open("t1.txt", "w", encoding="utf-8")
fobj.write(out1)
fobj.close()