When to use Unicode Normalization Forms NFC and NFD?

温柔的废话 2021-02-05 05:24

The Unicode Normalization FAQ includes the following paragraph:

Programs should always compare canonical-equivalent Unicode strings as equal ... The Unico

2 Answers
  •  [愿得一人]
    2021-02-05 05:42

    1. NFC is the sensible default form to use: in NFC, ä is a single code point (U+00E4), which is what most people and most software expect.

    2. NFD is good for certain internal processing. If you want accent-insensitive searching or sorting, having your string in NFD makes it much easier and faster, because each base letter is separated from its combining marks. Another use is generating more robust slug titles. These are just the most obvious cases; I am sure there are plenty of others.

    3. If two strings x and y are canonical equivalents, then
      toNFC(x) = toNFC(y)
      toNFD(x) = toNFD(y)

      Is that what you meant?
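
To make the three points above concrete, here is a minimal sketch in Python using the standard `unicodedata` module. The helper names `canonically_equal`, `strip_accents`, and `slugify` are made up for this illustration and are not part of any library; the slug rule (lowercase ASCII letters and digits joined by hyphens) is also just an assumption, since real slug schemes vary.

```python
import re
import unicodedata


def canonically_equal(x: str, y: str) -> bool:
    """Compare two strings after normalizing both to NFC (point 3)."""
    return unicodedata.normalize("NFC", x) == unicodedata.normalize("NFC", y)


def strip_accents(text: str) -> str:
    """Decompose to NFD, then drop combining marks (point 2)."""
    decomposed = unicodedata.normalize("NFD", text)
    # unicodedata.combining() returns 0 for characters that are not combining marks.
    return "".join(ch for ch in decomposed if unicodedata.combining(ch) == 0)


def slugify(title: str) -> str:
    """Hypothetical slug rule: strip accents, lowercase, hyphenate the rest."""
    ascii_like = strip_accents(title).lower()
    return re.sub(r"[^a-z0-9]+", "-", ascii_like).strip("-")


composed = "\u00e4"      # ä as one precomposed code point
decomposed = "a\u0308"   # a followed by COMBINING DIAERESIS

# Point 1: NFC gives the single-code-point form, NFD the two-code-point form.
nfc_length = len(unicodedata.normalize("NFC", decomposed))  # 1
nfd_length = len(unicodedata.normalize("NFD", composed))    # 2

# Point 3: the raw strings differ, but they are canonically equivalent.
raw_equal = composed == decomposed                        # False
normalized_equal = canonically_equal(composed, decomposed)  # True
```

With these helpers, `strip_accents("Zürich café")` gives `"Zurich cafe"`, which is the kind of key you could use for an accent-insensitive search. Note that this only removes marks that NFD actually separates out; letters such as ø or ß have no canonical decomposition and pass through unchanged.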
