How can I detect the encoding/codepage of a text file?

梦如初夏 2020-11-21 22:42

In our application, we receive text files (.txt, .csv, etc.) from diverse sources. When reading, these files sometimes contain garbage, because the

20 answers

栀梦 (original poster)

2020-11-21 23:16
I know it's very late for this question, and this solution won't appeal to some (because of its English-centric bias and its lack of statistical/empirical testing), but it has worked very well for me, especially for processing uploaded CSV data:

http://www.architectshack.com/TextFileEncodingDetector.ashx

Advantages:
- BOM detection built-in
- Default/fallback encoding customizable
- Pretty reliable (in my experience) for Western European files containing some exotic data (e.g. French names), with a mixture of UTF-8 and Latin-1-style files; basically the bulk of US and Western European environments.
Note: I'm the one who wrote this class, so obviously take it with a grain of salt! :)
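
To make the general idea concrete, here is a minimal Python sketch of the approach described above: check for a BOM first, then try a strict UTF-8 decode, and otherwise return a customizable fallback. This is not the code of the linked class; the names `guess_encoding` and `fallback`, and the choice of `cp1252` as the default, are assumptions made for illustration only.

```python
import codecs

# Longer signatures come first: the UTF-32 LE BOM begins with the
# same two bytes as the UTF-16 LE BOM, so order matters.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def guess_encoding(data: bytes, fallback: str = "cp1252") -> str:
    """Return a codec name for `data` (hypothetical helper, not a library API).

    1. If a BOM is present, trust it.
    2. Otherwise, if the bytes decode as strict UTF-8, assume UTF-8.
    3. Otherwise, return the caller-supplied fallback (assumed Latin-1-style).
    """
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    try:
        data.decode("utf-8")  # strict mode raises on any invalid sequence
        return "utf-8"
    except UnicodeDecodeError:
        return fallback
```

A caller would then decode with `data.decode(guess_encoding(data))`. The returned codec names (`utf-8-sig`, `utf-16`, `utf-32`) are the Python codecs that consume the BOM while decoding. Like the approach in the answer, this only separates UTF-8 from a single assumed legacy encoding, so it will guess wrong for files in other code pages.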