Python, how to print Japanese, Korean, Chinese strings

情书的邮戳 2021-01-13 12:55

Python does not print Japanese, Chinese, and Korean strings correctly. For example, "hello" in Japanese, Korean, and Chinese is:

こんにちは
안녕하세요
你好

5 answers
  • 2021-01-13 13:26

    What you see is the difference between

    1. Printing a string
    2. Printing a list

    Or more generally, the difference between an object's "informal" and "official" string representations (see documentation).

    In the first case, the Unicode string is printed correctly, as you would expect, with the Unicode characters.

    In the second case, the items of the list are printed using their repr() representation rather than their str() value.

    for line in f.readlines():
        print line
    

    is the first (good) case, and

    print f.readlines()
    

    is the second case.

    You can check the difference by this example:

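    a = u'ð€œłĸªßð'
    print a          # prints the characters themselves
    print repr(a)    # prints the escaped representation
    l = [a, a]
    print l          # a list shows the repr of each item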
    

    This shows the difference between the special __str__() and __repr__() methods which you can play with yourself.

    class Person(object):
        def __init__(self, name):
            self.name = name
        def __str__(self):
            return self.name
        def __repr__(self):
            return '<Person name={}>'.format(self.name)
    
    p = Person('Donald')
    print p  #  Prints 'Donald' using __str__
    p # On the command line, prints '<Person name=Donald>' using __repr__
    

    I.e., the value you see when simply typing an object name on the console is defined by __repr__ while what you see when you use print is defined by __str__.
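
    The same rule explains why a list looks different from its items. The sketch below assumes Python 3 syntax and reuses the Person class from above; the second name, 'Daisy', is made up for illustration. It shows that converting a list to text uses repr() on every element, while joining the str() of each element gives the readable form.

```python
class Person(object):
    """Same shape as the Person class above; only the usage differs."""

    def __init__(self, name):
        self.name = name

    def __str__(self):
        return self.name

    def __repr__(self):
        return '<Person name={}>'.format(self.name)


people = [Person('Donald'), Person('Daisy')]

# str() of a list falls back to repr() for every element.
as_list = str(people)

# Joining the str() of each element gives the "informal" form instead.
as_text = ', '.join(str(p) for p in people)

print(as_list)
print(as_text)
```

    For the file case, this means printing each line, or a joined string, rather than the list returned by readlines().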

  • 2021-01-13 13:30

    My Python version is 2.7.11 and my operating system is Mac OS X. I wrote

    こんにちは
    안녕하세요
    你好
    

    to test.txt. My program is:

    # -*- coding: utf-8 -*-

    import json


    if __name__ == '__main__':
        # The with block closes the file even if an error occurs.
        with open("./test.txt", "r") as f:
            a = f.readlines()
        print json.dumps(a, ensure_ascii=False)
    

    Running the program gives:

    ["こんにちは\n", "안녕하세요\n", "你好"]
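
    The part that matters here is ensure_ascii=False. As a small comparison, assuming Python 3 and an in-memory list instead of test.txt, the sketch below dumps the same greetings with and without that flag.

```python
import json

greetings = ["こんにちは", "안녕하세요", "你好"]

# Default: every non-ASCII character becomes a \uXXXX escape.
escaped = json.dumps(greetings)

# ensure_ascii=False keeps the characters as they are.
readable = json.dumps(greetings, ensure_ascii=False)

print(escaped)
print(readable)
```

    Both strings decode back to the same list with json.loads; only the readability of the dumped text differs.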
    
  • 2021-01-13 13:32

    I was also bothered by the same problem.
    It is most likely a limitation of the font your console uses.
    In cmd, it is set to "Consolas" by default.

    You can change it to "MS Gothic" or "NSimSun". I personally prefer the latter. Both can display Japanese and Chinese characters, but make sure your system encoding is set to UTF-8, as mentioned by sami in the answer above.

    To change the font in cmd:

    1. Click the cmd icon at the top left of the cmd window.
    2. A drop-down menu appears. Select Properties.
    3. Select the font you prefer from the list shown in the second section.
    4. Click OK.
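
    Before changing the font, it can help to rule out an encoding problem. The following is a rough check, assuming Python 3; the helper can_encode is a made-up name for this sketch, not part of any library. If the console encoding can represent the text but the output still shows boxes or question marks, the font is the more likely cause.

```python
import sys


def can_encode(text, encoding):
    """Return True if every character of text exists in the given encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


sample = "こんにちは 안녕하세요 你好"

# sys.stdout.encoding may be None, e.g. when stdout has been replaced
# by an in-memory stream, so fall back to UTF-8 in that case.
console_encoding = sys.stdout.encoding or "utf-8"

if can_encode(sample, console_encoding):
    print("Encoding is fine; if glyphs still look wrong, suspect the font.")
else:
    print("The console encoding cannot represent this text.")
```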
  • 2021-01-13 13:33

    First, you need to read the text as Unicode:

    import codecs
    f = codecs.open('test.txt','r','utf-8')
    

    Second, when you print, encode the text like this:

    unicodeText.encode('utf-8')
    

    Third, make sure that your console supports Unicode display. Check Python's default encoding with:

    import sys
    print sys.getdefaultencoding()
    

    If it is not utf-8, try:

    reload(sys)  # Python 2 only; restores setdefaultencoding, which site.py removes
    sys.setdefaultencoding('utf-8')
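
    The difference between text and its encoded bytes is easier to see in isolation. A minimal sketch, assuming Python 3 (where a plain string literal is already Unicode), that encodes a string, decodes it with the right codec, and then with a wrong one:

```python
text = "你好"                 # a Unicode string: 2 characters
raw = text.encode("utf-8")   # its UTF-8 bytes: 3 bytes per character here

# Decoding with the right codec restores the original text.
restored = raw.decode("utf-8")

# latin-1 accepts any byte, so the wrong codec does not fail;
# it silently produces different, garbled characters.
garbled = raw.decode("latin-1")

print(len(text), len(raw))
print(restored == text, garbled == text)
```

    Garbled output of this kind usually means the bytes were decoded with a codec other than the one used to write them.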
    
  • 2021-01-13 13:46

    Try this:

    import codecs
    
    fp = codecs.open('test.txt', encoding='utf-8')
    
    for line in fp:
        print line
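
    In Python 3 the built-in open accepts an encoding argument directly, so codecs.open is not needed. A self-contained sketch under that assumption follows; the temporary test.txt is created only so the example can run on its own. Each line keeps its trailing newline when read, so it is stripped before printing to avoid blank lines in between.

```python
import os
import tempfile

# Hypothetical sample file, created here so the sketch is self-contained.
path = os.path.join(tempfile.mkdtemp(), "test.txt")
with open(path, "w", encoding="utf-8") as fp:
    fp.write("こんにちは\n안녕하세요\n你好")

lines = []
with open(path, encoding="utf-8") as fp:
    for line in fp:
        # Drop the trailing newline so print() does not emit a blank line.
        lines.append(line.rstrip("\n"))

for line in lines:
    print(line)
```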
    