Read special characters from .txt file in python

滥情空心 2021-01-19 21:56

The goal of this code is to find the frequency of words used in a book.

I am trying to read in the text of a book, but the following line keeps throwing my code off:

3 Answers
  • 2021-01-19 22:06

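    When you open a text file in Python without an explicit encoding, Python uses the platform's default (on Windows this is usually an ANSI code page such as cp1252), so a UTF-8 encoded é character is not decoded correctly. Try

    word_file = open("./words.txt", "r", encoding='utf-8')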
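A minimal sketch of how the explicit encoding fits into the word-frequency goal, assuming the book is saved as UTF-8. The function name `count_words` and the `\w+` tokenization are illustrative choices, not part of the original question.

```python
import re
from collections import Counter


def count_words(path, encoding="utf-8"):
    """Return a Counter of lowercase words found in the file at `path`."""
    # Passing the encoding explicitly avoids relying on the platform default
    with open(path, "r", encoding=encoding) as word_file:
        text = word_file.read()
    # \w matches Unicode letters (plus digits and "_") on str patterns in
    # Python 3, so a word containing é is kept as one token
    return Counter(re.findall(r"\w+", text.lower()))
```

With a file at `./words.txt`, `count_words("./words.txt").most_common(10)` would list the ten most frequent words.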
    
  • 2021-01-19 22:25

    Try:

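    def parseString(st):
        # Replace each non-ASCII character with "?"; decode so st stays a str
        # (in Python 3, encode() on its own returns bytes)
        st = st.encode("ascii", "replace").decode("ascii")

        # rest of code here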
    

    The new error you are getting is because you are calling isalpha on an int (i.e. a number). In Python 3, iterating over a bytes object yields ints rather than one-character strings.

    Try this:

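    for ch in st:
        # Iterating over bytes yields ints in Python 3; turn them back into characters
        if isinstance(ch, int):
            ch = chr(ch)
        # Match digits, whitespace, and é (U+00E9)
        if ch in "0123456789" or ch.isspace() or ch == '\xe9':
            print(ch)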
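To illustrate why the int error appears, here is a small sketch comparing iteration over a str with iteration over the bytes returned by `encode`. The sample text and variable names are made up for the example.

```python
text = "caf\u00e9 1"  # "café 1"

# encode() returns bytes; "replace" substitutes "?" for non-ASCII characters
encoded = text.encode("ascii", "replace")

# Iterating over a str yields one-character strings
str_items = list(text)

# Iterating over bytes yields ints, which have no isalpha() method
byte_items = list(encoded)

# chr() turns each int back into a one-character string
restored = [chr(b) for b in byte_items]
```

Note that the é is lost after the ASCII round trip, so this approach only suits cases where dropping accented characters is acceptable.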
    
  • 2021-01-19 22:26

    The best way I could think of is to read each character as its numeric code (using ord()), store the codes in a list, and then convert them back to characters. For example, 97 is the ASCII code for "a", and chr(97) returns "a". Special characters such as é fall outside ASCII (é is code point 233), so check a Unicode or Latin-1 table for their values.
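A rough sketch of that idea, using the built-ins `ord()` and `chr()`. The helper names `to_codes` and `from_codes` are hypothetical, chosen only for this example.

```python
def to_codes(text):
    """Return the Unicode code point of every character in `text`."""
    return [ord(ch) for ch in text]


def from_codes(codes):
    """Rebuild a string from a list of code points."""
    return "".join(chr(code) for code in codes)
```

For instance, `to_codes("a\u00e9")` gives `[97, 233]`, and `from_codes([97, 233])` gives the original two-character string back.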
