UnicodeDecodeError when reading CSV file in Pandas with Python

野趣味 2020-11-22 04:27

I'm running a program which is processing 30,000 similar files. A random number of them are stopping and producing this error...

File "C:\\Importer\\src         


        
21 answers
  •  难免孤独
    2020-11-22 05:05

    This answer seems to be the catch-all for CSV encoding issues. If you are getting a strange encoding problem with your header like this:

    >>> from csv import DictReader
    >>> f = open(filename,"r")
    >>> reader = DictReader(f)
    >>> next(reader)
    OrderedDict([('\ufeffid', '1'), ... ])
    

    Then you have a byte order mark (BOM) character at the beginning of your CSV file. This answer addresses the issue:

    Python read csv - BOM embedded into the first key

    The solution is to load the CSV with encoding="utf-8-sig":

    >>> f = open(filename,"r", encoding="utf-8-sig")
    >>> reader = DictReader(f)
    >>> next(reader)
    OrderedDict([('id', '1'), ... ])
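
    To reproduce the difference without one of the real files, here is a minimal sketch. It assumes a throwaway two-column CSV written to a temporary file; the helper read_first_row and the sample data are made up for illustration and are not part of the original program.

```python
import csv
import os
import tempfile


def read_first_row(path, encoding):
    """Return the first data row of a CSV file as a plain dict."""
    with open(path, "r", encoding=encoding, newline="") as f:
        return dict(next(csv.DictReader(f)))


# Hypothetical sample file: the UTF-8 byte order mark (EF BB BF) followed by CSV text.
with tempfile.NamedTemporaryFile("wb", suffix=".csv", delete=False) as tmp:
    tmp.write(b"\xef\xbb\xbfid,name\r\n1,alice\r\n")
    sample_path = tmp.name

with_bom = read_first_row(sample_path, "utf-8")         # BOM ends up inside the first key
without_bom = read_first_row(sample_path, "utf-8-sig")  # BOM is stripped while decoding
os.remove(sample_path)

print(with_bom)     # first key is '\ufeffid'
print(without_bom)  # first key is 'id'
```

    Since the question is about Pandas: pandas.read_csv also takes an encoding parameter, so passing encoding="utf-8-sig" there should have the same effect on the first column name.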
    

    Hopefully this helps someone.
