Reading UTF-8 Encoded Files and Text Files in Python3

前端未结


OK, so Python 3 and Unicode. I know that all Python 3 strings are actually Unicode strings and that Python 3 source code is stored as UTF-8 by default. But how does Python 3 read text files? Does

2 Answers

囚心锁ツ

2020-12-11 18:05

Do I need to call decode('utf-8') when reading a text file?

Try reading the text file as UTF-8 to confirm that its contents really are UTF-8 encoded. If they are not, the read fails with a UnicodeDecodeError.
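A minimal sketch of such a "try-read" check, in case it helps. The helper name is_utf8 and the two sample files are made up for illustration; the only real API used is open() with encoding="utf-8", which raises UnicodeDecodeError on bytes that are not valid UTF-8.

```python
import os
import tempfile


def is_utf8(path):
    """Return True if the whole file decodes as UTF-8 (hypothetical helper)."""
    try:
        with open(path, encoding="utf-8") as f:
            for _ in f:  # iterating over the lines forces the decoding
                pass
    except UnicodeDecodeError:
        return False
    return True


# Demo with two throwaway files: one UTF-8, one Latin-1.
with tempfile.TemporaryDirectory() as tmp:
    good = os.path.join(tmp, "good.txt")
    bad = os.path.join(tmp, "bad.txt")
    with open(good, "wb") as f:
        f.write("héllo wörld\n".encode("utf-8"))
    with open(bad, "wb") as f:
        f.write("héllo wörld\n".encode("latin-1"))
    good_ok = is_utf8(good)
    bad_ok = is_utf8(bad)

print(good_ok, bad_ok)  # True False
```

Note that a successful decode only shows the bytes are valid UTF-8. A file in another encoding can occasionally pass by coincidence (pure ASCII always does), so this is a sanity check rather than a proof.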

时光取名叫无心

2020-12-11 18:07
Python's built-in function open() has an optional parameter encoding:

encoding is the name of the encoding used to decode or encode the file. This should only be used in text mode. The default encoding is platform dependent (whatever locale.getpreferredencoding() returns), but any text encoding supported by Python can be used. See the codecs module for the list of supported encodings.
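To connect this to the decode('utf-8') question, here is a small sketch (the file name example.txt is just a placeholder). In text mode open() does the decoding and returns str, so no decode() call is needed. Only in binary mode do you get bytes that you decode yourself.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.txt")  # placeholder file name

    # Write some non-ASCII text, stating the encoding explicitly.
    with open(path, "w", encoding="utf-8") as f:
        f.write("naïve café")

    # Text mode: open() decodes the bytes and read() returns str.
    with open(path, encoding="utf-8") as f:
        text = f.read()

    # Binary mode: read() returns raw bytes.
    with open(path, "rb") as f:
        raw = f.read()

# Decoding by hand is only needed for the bytes from binary mode.
decoded = raw.decode("utf-8")

print(type(text).__name__, type(raw).__name__, text == decoded)  # str bytes True
```

Passing encoding="utf-8" explicitly keeps the result independent of the platform default mentioned in the quote above.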

An analogous parameter is available in pandas:
- pandas.read_csv(): encoding: str, default None. Encoding to use for UTF when reading/writing (ex. ‘utf-8’).
- Series.to_csv(): encoding: string, optional. A string representing the encoding to use if the contents are non-ascii, for python versions prior to 3.
- DataFrame.to_csv(): encoding: string, optional. A string representing the encoding to use in the output file, defaults to ‘ascii’ on Python 2 and ‘utf-8’ on Python 3.
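For the pandas side, a rough round-trip sketch, assuming pandas is installed. The column names and the file name cities.csv are invented for the example; encoding is the parameter described in the list above.

```python
import os
import tempfile

import pandas as pd

# Made-up sample data containing non-ASCII characters.
df = pd.DataFrame({"city": ["Zürich", "São Paulo"], "visits": [3, 5]})

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "cities.csv")  # placeholder file name

    # Write and read back with the same explicit encoding.
    df.to_csv(path, index=False, encoding="utf-8")
    loaded = pd.read_csv(path, encoding="utf-8")

print(loaded)
```

Using the same encoding on both sides is what matters here. Reading a UTF-8 file with a different encoding either raises an error or silently garbles the non-ASCII characters.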