What is the difference between UTF-8 and Unicode?

独厮守ぢ 2020-11-22 17:08

I have heard conflicting opinions from people - according to the Wikipedia UTF-8 page.

They are the same thing, aren't they? Can someone clarify?

15 answers
  •  遇见更好的自我
    2020-11-22 17:27

    They are the same thing, aren't they?

    No, they aren't.


    I think the first sentence of the Wikipedia page you referenced gives a nice, brief summary:

    UTF-8 is a variable width character encoding capable of encoding all 1,112,064 valid code points in Unicode using one to four 8-bit bytes.

    To elaborate:

    • Unicode is a standard that defines a mapping from characters to numbers, the so-called code points (as in the example below). For the full mapping, you can have a look here.

      ! -> U+0021 (21),  
      " -> U+0022 (22),  
      # -> U+0023 (23)
      
    • UTF-8 is one of the ways to encode these code points in a form a computer can store and transmit, that is, as bits. In other words, it's an algorithm for converting each of those code points to a sequence of bits, and for converting a sequence of bits back to the equivalent code points. Note that there are other encodings for Unicode as well, such as UTF-16 and UTF-32.
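
To make the distinction in the two points above concrete, here is a minimal Python sketch. The helper names `code_point` and `encoded_hex` are made up for this illustration; only the built-ins `ord` and `str.encode` come from Python itself. It shows that a character has a single code point, while the bytes depend on which encoding you pick.

```python
def code_point(ch):
    """Return the Unicode code point of one character in U+XXXX notation."""
    return f"U+{ord(ch):04X}"


def encoded_hex(ch, encoding):
    """Return the bytes that `encoding` produces for `ch`, as spaced hex pairs."""
    return " ".join(f"{b:02X}" for b in ch.encode(encoding))


# One code point per character (Unicode), but different bytes per encoding.
# The samples need 1, 1, 1, 2, 3 and 4 bytes in UTF-8 respectively.
for ch in ["!", '"', "#", "é", "€", "😀"]:
    print(
        code_point(ch),
        "| UTF-8:", encoded_hex(ch, "utf-8"),
        "| UTF-16BE:", encoded_hex(ch, "utf-16-be"),
    )

# Decoding runs the mapping in reverse: bytes -> code point -> character.
assert bytes([0xE2, 0x82, 0xAC]).decode("utf-8") == "€"
```

For example, `€` is always U+20AC as far as Unicode is concerned, but it is stored as `E2 82 AC` in UTF-8 and as `20 AC` in big-endian UTF-16.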


    Joel gives a really nice explanation and an overview of the history here.
