What's the difference between UTF-8 and UTF-8 without BOM?


佛祖请我去吃肉 2020-11-21 05:45

What's the difference between UTF-8 and UTF-8 without a BOM? Which is better?

21 answers

予麋鹿 (original poster)

2020-11-21 06:15
There are at least three problems with putting a BOM in UTF-8 encoded files.
1. Files that hold no text are no longer empty because they always contain the BOM.
2. Files whose text lies entirely within the ASCII subset of UTF-8 are no longer ASCII themselves, because the BOM is not ASCII. This breaks some existing tools, and users may be unable to replace such legacy tools.
3. Several files can no longer be concatenated by simply joining their bytes, because each file begins with a BOM, and every BOM after the first ends up in the middle of the result.
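The three problems above can be reproduced in a few lines. This is only a minimal sketch: `with_bom` is a hypothetical helper that mimics what an editor does when it saves "UTF-8 with BOM", and the sample strings are made up for illustration.

```python
import codecs

BOM = codecs.BOM_UTF8  # the three bytes EF BB BF


def with_bom(text: str) -> bytes:
    """Hypothetical helper: encode text as UTF-8 and prepend a BOM."""
    return BOM + text.encode("utf-8")


# 1. A file that holds no text is not empty: it is three bytes long.
empty_file = with_bom("")

# 2. ASCII-only text stops being ASCII once the BOM is prepended.
ascii_text = "hello\n"
ascii_file = with_bom(ascii_text)

# 3. Joining two files byte-for-byte leaves a BOM in the middle.
#    The "utf-8-sig" codec strips only a leading BOM, so the second
#    one survives as the character U+FEFF inside the decoded text.
joined = with_bom("a\n") + with_bom("b\n")
decoded = joined.decode("utf-8-sig")

if __name__ == "__main__":
    print(len(empty_file))
    print(ascii_text.encode("utf-8").isascii(), ascii_file.isascii())
    print(repr(decoded))
```

A tool that wants to concatenate such files safely would have to strip the BOM from every input after the first, which a plain byte-level join does not do.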
And, as others have mentioned, it is neither sufficient nor necessary to have a BOM to detect that something is UTF-8:
- It is not sufficient because an arbitrary byte sequence can happen to start with the exact sequence that constitutes the BOM.
- It is not necessary because you can just decode the bytes as UTF-8; if that succeeds, the data is, by definition, valid UTF-8.
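Both points can be checked directly. The sketch below is illustrative only: `starts_with_bom` and `is_valid_utf8` are hypothetical helper names, and the byte strings are invented examples.

```python
import codecs


def starts_with_bom(data: bytes) -> bool:
    """True if the data begins with the bytes EF BB BF."""
    return data.startswith(codecs.BOM_UTF8)


def is_valid_utf8(data: bytes) -> bool:
    """True if the whole byte string decodes as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# Not sufficient: these bytes start with the BOM sequence, but the
# byte FF can never appear in UTF-8, so the data is not UTF-8.
not_utf8 = codecs.BOM_UTF8 + b"\xff\xfe"

# Not necessary: valid UTF-8 with non-ASCII characters and no BOM.
plain = "naïve café".encode("utf-8")

if __name__ == "__main__":
    print(starts_with_bom(not_utf8), is_valid_utf8(not_utf8))
    print(starts_with_bom(plain), is_valid_utf8(plain))
```

Trying to decode the whole input checks every byte, whereas looking for a BOM only inspects the first three.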