What's the difference between UTF-8 and UTF-8 without BOM?

佛祖请我去吃肉 2020-11-21 05:45

What's the difference between UTF-8 and UTF-8 without a BOM? Which is better?

21 answers
  •  误落风尘
    2020-11-21 06:15

    The Unicode Byte Order Mark (BOM) FAQ provides a concise answer:

    Q: How should I deal with BOMs?

    A: Here are some guidelines to follow:

    1. A particular protocol (e.g. Microsoft conventions for .txt files) may require use of the BOM on certain Unicode data streams, such as files. When you need to conform to such a protocol, use a BOM.

    2. Some protocols allow optional BOMs in the case of untagged text. In those cases,

      • Where a text data stream is known to be plain text, but of unknown encoding, BOM can be used as a signature. If there is no BOM, the encoding could be anything.

      • Where a text data stream is known to be plain Unicode text (but not which endian), then BOM can be used as a signature. If there is no BOM, the text should be interpreted as big-endian.

    3. Some byte oriented protocols expect ASCII characters at the beginning of a file. If UTF-8 is used with these protocols, use of the BOM as encoding form signature should be avoided.

    4. Where the precise type of the data stream is known (e.g. Unicode big-endian or Unicode little-endian), the BOM should not be used. In particular, whenever a data stream is declared to be UTF-16BE, UTF-16LE, UTF-32BE or UTF-32LE a BOM must not be used.
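
To make the guidelines above concrete, here is a minimal Python sketch of what the BOM actually is at the byte level and how it can serve as a signature. The helper name `sniff_bom` and its fallback to UTF-8 are my own assumptions for illustration and are not part of the FAQ; the codec names (`utf-8`, `utf-8-sig`, `utf-16`, `utf-32`) and the `codecs.BOM_*` constants come from the Python standard library.

```python
import codecs

text = "héllo"

# "UTF-8 with BOM" is ordinary UTF-8 with three extra bytes (EF BB BF) in front.
without_bom = text.encode("utf-8")
with_bom = text.encode("utf-8-sig")
print(with_bom == codecs.BOM_UTF8 + without_bom)

# A plain UTF-8 decoder keeps the BOM as a leading U+FEFF character;
# the "utf-8-sig" codec strips it, and also accepts input that has no BOM.
print(repr(with_bom.decode("utf-8")))
print(repr(with_bom.decode("utf-8-sig")))
print(repr(without_bom.decode("utf-8-sig")))

# Hypothetical helper: use the BOM as a signature (guideline 2).
# UTF-32 is checked before UTF-16 because the UTF-32LE BOM (FF FE 00 00)
# begins with the UTF-16LE BOM (FF FE).
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def sniff_bom(data: bytes, default: str = "utf-8") -> str:
    """Return a codec name that consumes the BOM, or `default` if none is found.

    The default is an assumption: without a BOM the encoding could be anything.
    """
    for bom, codec_name in _BOMS:
        if data.startswith(bom):
            return codec_name
    return default


print(sniff_bom(with_bom))
print(sniff_bom(without_bom))

# Guideline 3: a byte-oriented consumer that expects ASCII at the start
# (for example a "#!" line) no longer sees it once a BOM is prepended.
script = "#!/bin/sh\n".encode("utf-8-sig")
print(script.startswith(b"#!"))
```

If you control both ends and the format is declared to be UTF-8, writing with `utf-8` (no BOM) and reading with `utf-8-sig` is a tolerant combination, since the reader then handles files from tools that add the signature as well as files from tools that do not.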
