How does a file with Chinese characters know how many bytes to use per character?
Question

I have read Joel's article "The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)" but still don't understand all the details. An example will illustrate my issue. Look at the file below:

(source: yart.com.au)

I opened the file in a binary editor to closely examine the last of the three a's, the one next to the first Chinese character:

(source: yart.com.au)

According to Joel:

In UTF-8, every code point from 0-127 is stored