Unicode characters necessary for Japanese, Korean, and Chinese

梦毁少年i 2021-01-25 02:14

I'm trying to answer these basic questions without getting a degree in linguistics and early human history, which seems to be where every Google search has led.

3 answers
  •  野趣味 (original poster)
    2021-01-25 02:53

    It depends on how much coverage you want to give each of these languages. The most commonly used characters in all of them amount to only a few thousand, but once in a while you will encounter a character outside that coverage. As you increase the number of characters your system supports, you become less likely to run into missing characters, up to the point where you cover all CJK characters.
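    One way to make this trade-off concrete is to measure how much of a sample text a given character set covers. The sketch below is only an illustration: the names `coverage` and `missing` are hypothetical, and `supported` stands for whatever set of characters your system or font actually provides.

```python
def coverage(text, supported):
    """Return the fraction of non-whitespace characters in `text`
    that appear in `supported` (a set of single-character strings)."""
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return 1.0  # nothing to display, so nothing is missing
    covered = sum(1 for c in chars if c in supported)
    return covered / len(chars)


def missing(text, supported):
    """Return the sorted list of distinct characters in `text`
    that are not in `supported`."""
    return sorted({c for c in text if not c.isspace() and c not in supported})


if __name__ == "__main__":
    # Hypothetical tiny "font" that supports only three characters.
    supported = set("日本語")
    sample = "日本語の"
    print(coverage(sample, supported))
    print(missing(sample, supported))
```

    Running this over a corpus that is representative of your users gives a rough idea of how often they would hit an unsupported character.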

    A common approach among modern font developers, to cut the time and effort of making a font while still supporting enough characters to display most text, is to use the ranges defined by pre-Unicode character sets such as Big5(-HKSCS), GB2312, or GB 18030, as mentioned in a comment on another answer. Even then, it is fairly common to encounter characters that are not supported.
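    As a rough way to see what such a legacy character set contains, you can ask Python's built-in codecs whether a character is encodable. This is a sketch under assumptions: the helper names `encodable` and `repertoire` are made up for illustration, and Python's `gb2312`, `big5`, and `big5hkscs` codecs are taken as approximations of the corresponding standards, which may differ in detail from what a particular font vendor uses.

```python
def encodable(ch, codec):
    """Return True if the single character `ch` can be encoded with `codec`."""
    try:
        ch.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


def repertoire(codec, first=0x4E00, last=0x9FFF):
    """Return the set of characters in the code point range [first, last]
    that `codec` can encode. The default range is the CJK Unified
    Ideographs block."""
    return {chr(cp) for cp in range(first, last + 1) if encodable(chr(cp), codec)}


if __name__ == "__main__":
    for codec in ("gb2312", "big5"):
        print(codec, len(repertoire(codec)))
    # A traditional form is typically missing from GB2312,
    # and a simplified form from Big5.
    print(encodable("們", "gb2312"), encodable("们", "big5"))
```

    Characters outside the resulting set are the ones a font built on that range would fail to display.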

    Unicode also defines something called IICore, a set of about ten thousand characters considered minimally essential for supporting these languages, and the Unicode database records whether each one is essential for Chinese, Japanese, Korean, and so on. Nowadays, however, barely anyone uses it.

    Google and Adobe now produce the Noto CJK fonts, also known as Source Han, which are meant to cover as many CJK characters as possible. However, due to a limitation of the font file format, they can fit only about 65,535 glyphs into a font, so characters have to be added or dropped in the process of making them.
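    To see why that glyph limit forces choices, it helps to add up the sizes of a few Unicode CJK blocks. The sketch below uses the 65,535 figure from this answer as the budget and counts every code point in each block range, which slightly overstates the number of assigned characters; the names `MAX_GLYPHS`, `BLOCKS`, and `block_size` are illustrative only.

```python
# Glyph budget, using the figure quoted in the answer above.
MAX_GLYPHS = 65535

# Unicode block ranges (first, last code point), inclusive.
BLOCKS = {
    "CJK Unified Ideographs": (0x4E00, 0x9FFF),
    "CJK Unified Ideographs Extension A": (0x3400, 0x4DBF),
    "CJK Unified Ideographs Extension B": (0x20000, 0x2A6DF),
}


def block_size(first, last):
    """Number of code points in the inclusive range [first, last]."""
    return last - first + 1


total = sum(block_size(first, last) for first, last in BLOCKS.values())

if __name__ == "__main__":
    for name, (first, last) in BLOCKS.items():
        print(f"{name}: {block_size(first, last)}")
    print("total:", total)                       # 70304
    print("over budget by:", total - MAX_GLYPHS)  # 4769
```

    These three blocks alone already exceed the budget, before counting kana, Hangul, punctuation, or regional glyph variants, so a single font cannot simply include everything.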

    Finally, specifically for Korean, supporting only Hangul/Jamo is probably good enough in many cases, because Hanja (the ideographic characters) has largely fallen out of use outside specialized areas. Note that personal names and some words in titles may still use Hanja, so it depends on whether those are important to you.
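    If you want to check whether your own Korean data actually needs Hanja support, a simple scan by code point range is usually enough for a first estimate. This is a minimal sketch: the names `HAN_RANGES`, `HANGUL_RANGES`, `hanja_in`, and `is_hangul` are hypothetical, and the ranges cover only the main blocks, not every extension.

```python
# Main Hangul ranges (inclusive): Jamo, Compatibility Jamo, Syllables.
HANGUL_RANGES = [(0x1100, 0x11FF), (0x3130, 0x318F), (0xAC00, 0xD7A3)]

# Main ideograph ranges (inclusive): Extension A, the basic block,
# and CJK Compatibility Ideographs. Later extensions are omitted.
HAN_RANGES = [(0x3400, 0x4DBF), (0x4E00, 0x9FFF), (0xF900, 0xFAFF)]


def _in_ranges(ch, ranges):
    """Return True if the code point of `ch` falls in any inclusive range."""
    cp = ord(ch)
    return any(first <= cp <= last for first, last in ranges)


def is_hangul(ch):
    """Return True if `ch` is a Hangul syllable or Jamo."""
    return _in_ranges(ch, HANGUL_RANGES)


def hanja_in(text):
    """Return the ideographs found in `text`, in order of appearance."""
    return [c for c in text if _in_ranges(c, HAN_RANGES)]


if __name__ == "__main__":
    print(hanja_in("한국어"))              # no ideographs
    print(hanja_in("大韓民國 대한민국"))    # the four Hanja
```

    If `hanja_in` comes back empty for nearly all of your data, Hangul-only support may be sufficient; names and titles are the places most worth checking.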
