Question
Using the Python programming language, I'm having trouble outputting characters such as å, ä and ö. The following code gives me a question mark (?) as output, not an å:
#coding: iso-8859-1
input = "å"
print input
The following code lets you input arbitrary text. The for loop goes through each character of the input, appends it to the string variable a, and then outputs the resulting string. This code works correctly; you can input å, ä and ö and the output will still be correct. For example, "år" outputs "år" as expected.
#coding: iso-8859-1
input = raw_input("Test: ")
a = ""
for i in range(0, len(input)):
    a = a + input[i]
print a
What's interesting is that if I change input = raw_input("Test: ") to input = "år", it will output a question mark (?) for the "å".
#coding: iso-8859-1
input = "år"
a = ""
for i in range(0, len(input)):
    a = a + input[i]
print a
For what it's worth, I'm using TextWrangler, and my document's character encoding is set to ISO Latin 1. What causes this? How can I solve the problem?
Answer 1:
You're using Python 2, I assume running on a platform like Linux that encodes I/O in UTF-8.
Python 2's "" literals represent byte strings. So when you specify "år" in your ISO 8859-1-encoded source file, the variable input has the value b'\xe5r'. When you print this, the raw bytes are output to the console, but show up as a question mark because they are not valid UTF-8.
To demonstrate, try it with print repr(a) instead of print a.
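As a rough illustration of why the console shows a placeholder, the sketch below checks the same two bytes against UTF-8. It assumes the terminal decodes its input as UTF-8 and substitutes a replacement character for invalid bytes; the variable names are made up for this example, and the explicit b"" prefix and escapes are used so it behaves the same on Python 2.7 and Python 3.

```python
# The two bytes that the literal occupies in an ISO 8859-1 source file:
# 0xE5 (a with ring above) followed by ASCII "r".
raw = b"\xe5r"

# 0xE5 announces a three-byte UTF-8 sequence, but "r" is not a
# continuation byte, so strict UTF-8 decoding fails.
try:
    raw.decode("utf-8")
    valid_utf8 = True
except UnicodeDecodeError:
    valid_utf8 = False

# Roughly what a UTF-8 console does: replace the bad byte with a placeholder.
shown = raw.decode("utf-8", "replace")

# Interpreting the bytes with the encoding they were written in recovers
# the intended text, which can then be encoded as valid UTF-8.
text = raw.decode("iso-8859-1")
utf8_bytes = text.encode("utf-8")

print(repr(raw))
print(valid_utf8)
print(repr(utf8_bytes))
```

The placeholder is U+FFFD here; a real terminal may render it as "?" or a boxed question mark depending on its font and settings.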
When you use raw_input(), the user's input is already UTF-8-encoded, and so is output correctly.
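To fix this, do one of the following:
1. Re-encode your string as UTF-8 before printing it. Because a is an ISO 8859-1 byte string, it has to be decoded first; calling a.encode('utf-8') directly on it would raise a UnicodeDecodeError:
print a.decode('iso-8859-1').encode('utf-8')
2. Use Unicode strings (u'text') instead of byte strings. You will need to be careful with decoding the input, since on Python 2, raw_input() returns a byte string rather than a text string. If you know the input is UTF-8, use raw_input().decode('utf-8').
3. Encode your source file in UTF-8 instead of ISO 8859-1, and update the coding declaration to match. Then the byte-string literal will already be in UTF-8.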
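The Unicode-string approach could be sketched as follows: decode bytes to text once, work on text, and encode only when writing out. This is a minimal sketch, not the only way to do it. The helper latin1_to_text is hypothetical, and an in-memory BytesIO stands in for a UTF-8 console so the result can be inspected; the explicit b"" and u"" prefixes keep it running the same on Python 2.7 and Python 3.

```python
import io


def latin1_to_text(raw):
    """Decode ISO 8859-1 bytes into a Unicode string (hypothetical helper)."""
    return raw.decode("iso-8859-1")


# Bytes as they would appear in an ISO 8859-1 source file or input stream.
text = latin1_to_text(b"\xe5r")

# The same character-by-character loop as in the question, but operating
# on Unicode text, so each element is a whole character.
a = u""
for ch in text:
    a = a + ch

# Encode only at the output boundary. BytesIO is an assumption standing in
# for a console that expects UTF-8.
console = io.BytesIO()
console.write(a.encode("utf-8"))

print(len(a))
print(repr(console.getvalue()))
```

Keeping the decode and encode steps at the edges of the program means the loop in the middle never has to know which encoding the bytes arrived in.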
Source: https://stackoverflow.com/questions/19882935/special-characters-appearing-as-question-marks