Python/Pandas: how to read a CSV in cp1252 with a first row to delete?

鱼传尺愫 2021-01-21 15:35

Solution:

As the answer explains, the file was not encoded in CP1252 but in UTF-16. The solution code is:

import pandas as pd

df = p         


        
1 answer
  • 2021-01-21 15:39

    CP1252 is the plain old Latin code page for Western European languages, and it supports all Western European accented characters. If the file had really been written in that code page, there would be no garbled characters.

    The image of the data you posted is just that: an image. It says nothing about the file's raw format. Is it a UTF-8 file? UTF-16? It is definitely not CP1252.

    Neither UTF-8 nor CP1252 would produce NaNs either. Any single-byte code page (and UTF-8, which is ASCII-compatible) would at least read the numeric digits correctly, which means the file is saved in a multi-byte encoding.

    The two strange characters at the start look like a byte order mark (BOM). If you check Wikipedia's BOM entry, you'll see that ÿþ is how the UTF-16LE BOM (bytes FF FE) appears when it is decoded as CP1252.
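To make the BOM check concrete, here is a minimal sketch that looks at the first bytes of some raw data and suggests a Python codec name. The helper name `sniff_bom` and the sample text are hypothetical, chosen only for illustration; the BOM constants come from the standard `codecs` module.

```python
import codecs

# Known byte order marks, longest first so that the UTF-32LE BOM
# (FF FE 00 00) is not mistaken for the UTF-16LE BOM (FF FE).
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def sniff_bom(raw):
    """Return a codec name if raw starts with a known BOM, else None."""
    for bom, codec in _BOMS:
        if raw.startswith(bom):
            return codec
    return None


# Hypothetical sample: UTF-16LE text preceded by its BOM.
text = "name,price\ncafé,1.5\n"
sample = codecs.BOM_UTF16_LE + text.encode("utf-16-le")

# Decoding the first two bytes as CP1252 shows the two strange characters.
first_two_as_cp1252 = sample[:2].decode("cp1252")
guessed = sniff_bom(sample)
decoded = sample.decode(guessed)
```

In practice the bytes would come from something like `open(path, "rb").read(4)`. A file without a BOM returns `None` here, so this only confirms a guess; it does not detect encodings in general.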

    Try using utf-16 or utf-16-le instead of cp1252.
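As a rough sketch of that suggestion combined with the question's "first row to delete", the example below writes a small UTF-16 file and reads it back with `pandas.read_csv`. The file contents, the comma separator, and the temporary path are assumptions made up for the demonstration; the original file's layout is not shown in the thread.

```python
import codecs
import os
import tempfile

import pandas as pd

# Build a small UTF-16LE file with a BOM and an unwanted first row.
# The contents are hypothetical; only the encoding mirrors the question.
text = "exported by some tool\nname,price\ncafé,1.5\nthé,2.0\n"
path = os.path.join(tempfile.mkdtemp(), "sample.csv")
with open(path, "wb") as f:
    f.write(codecs.BOM_UTF16_LE + text.encode("utf-16-le"))

# encoding="utf-16" uses the BOM to pick the byte order and strips it.
# skiprows=1 drops the first line, so the second line becomes the header.
df = pd.read_csv(path, encoding="utf-16", skiprows=1)
print(df)
```

One difference between the two codec names: Python's `utf-16` consumes the BOM, while `utf-16-le` leaves it in the text as the character U+FEFF at the start of the first line. When the first row is skipped anyway that is harmless, but otherwise it can end up inside the first column name.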
