Why Utf8 is compatible with ascii

后端 未结 3 1309
一向
一向 2021-02-08 04:16

A in UTF-8 is U+0041 LATIN CAPITAL LETTER A. A in ASCII is 065.

How is UTF-8 is backwards-compatible with ASCII?

3条回答
  •  滥情空心
    2021-02-08 04:39

    Unicode is backward compatible with ASCII because ASCII is a subset of Unicode. Unicode simply uses all character codes in ASCII and adds more.

    Although character codes are generally written out as 0041 in Unicode, the character codes are numeric so 0041 is the same value as (hexadecimal) 41.

    UTF-8 is not a character set but an encoding used with Unicode. It happens to be compatible with ASCII too, because the codes used for multiple byte encodings lie in the part of the ASCII character set that is unused.

    Note that it's only the 7-bit ASCII character set that is compatible with Unicode and UTF-8, the 8-bit character sets based on ASCII, like IBM850 and windows-1250, uses the part of the character set where UTF-8 has codes for multiple byte encodings.

提交回复
热议问题