How can I detect the encoding/codepage of a text file?

梦如初夏 2020-11-21 22:42

In our application, we receive text files (.txt, .csv, etc.) from diverse sources. When reading, these files sometimes contain garbage, because the

20 answers
  •  星月不相逢
    2020-11-21 23:25

    Since it basically comes down to heuristics, it may help to use the encoding of previously received files from the same source as a first hint.

    Most people (or applications) do things in pretty much the same order every time, often on the same machine, so it's quite likely that when Bob creates a .csv file and sends it to Mary, it'll always be in Windows-1252 or whatever his machine defaults to.
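As a rough illustration of the "remember what worked last time" idea, here is a minimal Python sketch. The class name `SourceEncodingHints`, its methods, and the fallback list are all made up for this example; nothing here comes from a particular library beyond the built-in `bytes.decode`. Keep in mind that a successful strict decode is only a weak signal: a single-byte codepage such as cp1252 accepts almost any byte sequence, so the hint should ideally come from a file whose encoding somebody has actually confirmed.

```python
from typing import Dict, List, Sequence, Tuple

# Assumed fallback order for this sketch; adjust it to the encodings
# your own senders actually use.
DEFAULT_FALLBACKS: Tuple[str, ...] = ("utf-8", "cp1252")


class SourceEncodingHints:
    """Hypothetical helper: tries the encoding last seen for a source first."""

    def __init__(self, fallbacks: Sequence[str] = DEFAULT_FALLBACKS) -> None:
        self._fallbacks = tuple(fallbacks)
        self._last_good: Dict[str, str] = {}  # source name -> encoding

    def remember(self, source: str, encoding: str) -> None:
        """Record an encoding that is known to be right for this source."""
        self._last_good[source] = encoding

    def candidates(self, source: str) -> List[str]:
        """Return encodings to try: the per-source hint first, then fallbacks."""
        hint = self._last_good.get(source)
        ordered = [hint] if hint else []
        ordered.extend(e for e in self._fallbacks if e != hint)
        return ordered

    def decode(self, source: str, data: bytes) -> Tuple[str, str]:
        """Decode with the first candidate that does not raise, and remember it."""
        for encoding in self.candidates(source):
            try:
                text = data.decode(encoding)  # strict mode: raises on invalid bytes
            except UnicodeDecodeError:
                continue
            self._last_good[source] = encoding
            return text, encoding
        raise ValueError(f"none of the candidate encodings fit data from {source!r}")
```

Typical use would be something like `hints.remember("bob", "cp1252")` once Bob's first file has been checked by hand, then `text, enc = hints.decode("bob", raw_bytes)` for his later files. Sources without a hint simply go through the fallback list in order.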

    Where possible, a bit of customer training never hurts either :-)
