Character showing up as diamond question mark only at end of line (Python>Text)

Question

I'm working on a Python script that reads a UTF-8 text file containing Japanese characters, takes some of the text, and writes it into a new UTF-8 text file.

The problem is that whenever the Japanese character だ appears at the end of a line in the original input file, it comes out as a diamond question mark (the replacement character, U+FFFD) in the output file.

Instances of だ elsewhere in a line come out fine, and in the original input file the character displays correctly even at the end of a line.

Answer 1:

Since you haven't shared a code snippet, I can only recommend a generic way of reading and writing UTF-8 files using the codecs module:

import codecs

# Read a UTF-8 encoded file; data is decoded text, not raw bytes
with codecs.open("in.txt", "r", encoding="utf-8") as input_data:
    data = input_data.read()

# Write the text back out as a UTF-8 encoded file
with codecs.open("out.txt", "w", encoding="utf-8") as output_data:
    output_data.write(data)

By the way, I tested it on the given character だ and it works fine.
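
Because a plain round trip works, the end-of-line-only failure may come from something the script does to the raw bytes rather than from the file I/O itself. One possible cause, which is an assumption since no code was posted: だ (U+3060) is encoded in UTF-8 as the bytes E3 81 A0, and 0xA0 is the non-breaking space in Latin-1, so a byte-level strip that treats 0xA0 as whitespace would cut off the last byte of a trailing だ. The sketch below simulates that explicitly and shows the safer order of decoding first and stripping the text afterwards. The function name copy_stripped_lines and the file paths are hypothetical.

```python
DA = "\u3060"  # the character だ

# UTF-8 encodes だ as three bytes; the last one is 0xA0.
raw = DA.encode("utf-8")

# Assumption: something strips 0xA0 from the end of the byte string.
# This is simulated here by naming the byte explicitly.
damaged = raw.rstrip(b"\xa0")

# The remaining two bytes are an incomplete sequence, so decoding
# leniently produces the replacement character (diamond question mark).
broken = damaged.decode("utf-8", errors="replace")


def copy_stripped_lines(src_path, dst_path):
    """Copy a UTF-8 file line by line, stripping trailing whitespace.

    The file is opened in text mode, so rstrip() sees decoded characters
    and cannot split a multi-byte sequence.
    """
    with open(src_path, "r", encoding="utf-8") as src, \
            open(dst_path, "w", encoding="utf-8") as dst:
        for line in src:
            dst.write(line.rstrip() + "\n")
```

If the script calls strip() or rstrip() on undecoded byte strings, decoding them to text first is worth trying before anything else.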

Source: https://stackoverflow.com/questions/41812162/character-showing-up-as-diamond-question-mark-only-at-end-of-line-pythontext

Tags

python

text

character

utf
