Unicode letters with more than 1 alphabetic Latin character?

Submitted by 我只是一个虾纸丫 on 2020-02-06 18:59:31

Question


I'm not really sure how to express it, but I'm searching for single Unicode characters that look like more than one Latin letter.

So far I have found these in Word:

  • Ǳ
  • ǲ
  • ǳ
  • Ǌ
  • ǈ
  • Ǉ
  • ǋ
  • ǌ

Any others?


Answer 1:


Sorry about the formatting: wide characters are hard to align to a monospace font's cell width. A picture would align better, but then the characters could not be copied or zoomed without loss.

Digraphs

+-------------+----------+-----------------------+-------------------------+
| Two Glyphs  | Digraph  |  Unicode Code Point   |          HTML           |
+-------------+----------+-----------------------+-------------------------+
| DZ, Dz, dz  | Ǳ, ǲ, ǳ  | U+01F1 U+01F2 U+01F3  | &#x1F1; &#x1F2; &#x1F3; |
| DŽ, Dž, dž  | Ǆ, ǅ, ǆ  | U+01C4 U+01C5 U+01C6  | &#x1C4; &#x1C5; &#x1C6; |
| IJ, ij      | Ĳ, ĳ     | U+0132 U+0133         | &#x132; &#x133;         |
| LJ, Lj, lj  | Ǉ, ǈ, ǉ  | U+01C7 U+01C8 U+01C9  | &#x1C7; &#x1C8; &#x1C9; |
| NJ, Nj, nj  | Ǌ, ǋ, ǌ  | U+01CA U+01CB U+01CC  | &#x1CA; &#x1CB; &#x1CC; |
+-------------+----------+-----------------------+-------------------------+

Ligatures

+--------------------+---------------+-----------------+-------------------+
| Non-ligature       | Ligature      | Unicode         | HTML              |
+--------------------+---------------+-----------------+-------------------+
| AA, aa             | Ꜳ, ꜳ          | U+A732, U+A733  | &#xA732; &#xA733; |
| AE, ae             | Æ, æ          | U+00C6, U+00E6  | &AElig; &aelig;   |
| AO, ao             | Ꜵ, ꜵ          | U+A734, U+A735  | &#xA734; &#xA735; |
| AU, au             | Ꜷ, ꜷ          | U+A736, U+A737  | &#xA736; &#xA737; |
| AV, av             | Ꜹ, ꜹ          | U+A738, U+A739  | &#xA738; &#xA739; |
| AV, av (with bar)  | Ꜻ, ꜻ          | U+A73A, U+A73B  | &#xA73A; &#xA73B; |
| AY, ay             | Ꜽ, ꜽ          | U+A73C, U+A73D  | &#xA73C; &#xA73D; |
| et                 | 🙰            | U+1F670         | &#x1F670;         |
| ff                 | ﬀ             | U+FB00          | &#xFB00;          |
| ffi                | ﬃ             | U+FB03          | &#xFB03;          |
| ffl                | ﬄ             | U+FB04          | &#xFB04;          |
| fi                 | ﬁ             | U+FB01          | &#xFB01;          |
| fl                 | ﬂ             | U+FB02          | &#xFB02;          |
| OE, oe             | Œ, œ          | U+0152, U+0153  | &OElig; &oelig;   |
| OO, oo             | Ꝏ, ꝏ          | U+A74E, U+A74F  | &#xA74E; &#xA74F; |
| ſs, ſz             | ẞ, ß          | U+1E9E, U+00DF  | &szlig;           |
| st                 | ﬆ             | U+FB06          | &#xFB06;          |
| ſt                 | ﬅ             | U+FB05          | &#xFB05;          |
| TZ, tz             | Ꜩ, ꜩ          | U+A728, U+A729  | &#xA728; &#xA729; |
| ue                 | ᵫ             | U+1D6B          | &#x1D6B;          |
| VY, vy             | Ꝡ, ꝡ          | U+A760, U+A761  | &#xA760; &#xA761; |
+--------------------+---------------+-----------------+-------------------+

There are a few other ligatures that are used for phonetic transcription but look like Latin letters:

+---------------+---------------+-----------------+-----------------+
| Non-ligature  | Ligature      | Unicode         | HTML            |
+---------------+---------------+-----------------+-----------------+
| db            | ȸ             | U+0238          | &#x238;         |
| dz            | ʣ             | U+02A3          | &#x2A3;         |
| IJ, ij        | Ĳ, ĳ          | U+0132, U+0133  | &#x132; &#x133; |
| ls            | ʪ             | U+02AA          | &#x2AA;         |
| lz            | ʫ             | U+02AB          | &#x2AB;         |
| qp            | ȹ             | U+0239          | &#x239;         |
| ts            | ʦ             | U+02A6          | &#x2A6;         |
| ui            | ꭐ             | U+AB50          | &#xAB50;        |
| turned ui     | ꭑ             | U+AB51          | &#xAB51;        |
+---------------+---------------+-----------------+-----------------+

https://en.wikipedia.org/wiki/List_of_precomposed_Latin_characters_in_Unicode#Digraphs_and_ligatures
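
Many of the characters in these tables can also be found programmatically, because Unicode gives them a compatibility decomposition into plain letters. The following is only a sketch under that assumption, using Python's standard `unicodedata` module; the helper names `latin_expansion` and `multi_letter_chars` are made up for illustration. It will not find letters such as Æ, Œ or Ꜳ, which have no decomposition at all.

```python
import unicodedata


def latin_expansion(ch):
    """Return the ASCII letters a character decomposes to, or None.

    Uses the NFKD compatibility decomposition and drops combining marks,
    so the caron of the DŽ digraph is discarded and only "DZ" is returned.
    """
    decomposed = unicodedata.normalize("NFKD", ch)
    base = "".join(c for c in decomposed if not unicodedata.combining(c))
    if len(base) >= 2 and base.isascii() and base.isalpha():
        return base
    return None


def multi_letter_chars(start=0x0080, stop=0x10000):
    """Collect (code point, character, expansion) for the range [start, stop)."""
    found = []
    for cp in range(start, stop):
        expansion = latin_expansion(chr(cp))
        if expansion is not None:
            found.append((f"U+{cp:04X}", chr(cp), expansion))
    return found


# Small demonstration over the Croatian digraph range only.
for code, ch, expansion in multi_letter_chars(0x01C4, 0x01CD):
    print(code, ch, expansion)
```

Calling `multi_letter_chars()` with its default range scans the whole Basic Multilingual Plane, which also turns up letterlike symbols, Roman numerals and squared unit symbols.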


Edit:

Besides ℻ and ℡, which the OP mentioned in a comment, there are more letterlike symbols:

℀ ℁ ⅍ ℅ ℆ ℔ ℠ ™

Longer letters are mainly from the CJK Compatibility block

        0   1   2   3   4   5   6   7   8   9   A   B   C   D   E   F
U+338x  ㎀   ㎁   ㎂   ㎃   ㎄   ㎅   ㎆   ㎇   ㎈   ㎉   ㎊   ㎋   ㎌   ㎍   ㎎   ㎏
U+339x  ㎐   ㎑   ㎒   ㎓   ㎔   ㎕   ㎖   ㎗   ㎘   ㎙   ㎚   ㎛   ㎜   ㎝   ㎞   ㎟
U+33Ax  ㎠   ㎡   ㎢   ㎣   ㎤   ㎥   ㎦   ㎧   ㎨   ㎩   ㎪   ㎫   ㎬   ㎭   ㎮   ㎯
U+33Bx  ㎰   ㎱   ㎲   ㎳   ㎴   ㎵   ㎶   ㎷   ㎸   ㎹   ㎺   ㎻   ㎼   ㎽   ㎾   ㎿
U+33Cx  ㏀   ㏁   ㏂   ㏃   ㏄   ㏅   ㏆   ㏇   ㏈   ㏉   ㏊   ㏋   ㏌   ㏍   ㏎   ㏏
U+33Dx  ㏐   ㏑   ㏒   ㏓   ㏔   ㏕   ㏖   ㏗   ㏘   ㏙   ㏚   ㏛   ㏜   ㏝   ㏞   ㏟

Among the three-letter-like symbols are ㎈ ㎑ ㎒ ㎓ ㎔ ㏒ ㏕ ㏖ ㏙ ㎪ ㎫ ㎬ ㎭ ㏆ ㏿ ㍱... The ones with the most letters are probably ㎉ and ㎯.
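
To check which squared symbols expand to the most characters, one option is to compare the lengths of their NFKC expansions. This is a rough sketch using the standard `unicodedata` module; the function names and the default range U+3380..U+33DF (the rows shown above) are my own choices, and the expansion can contain non-letters such as a division slash or a digit.

```python
import unicodedata


def expansions(start=0x3380, stop=0x33E0):
    """Map each character in [start, stop) to its NFKC expansion."""
    return {
        chr(cp): unicodedata.normalize("NFKC", chr(cp))
        for cp in range(start, stop)
    }


def longest(table, count=5):
    """Return the `count` entries with the longest expansions."""
    ranked = sorted(table.items(), key=lambda item: len(item[1]), reverse=True)
    return ranked[:count]


# Print the symbols that expand to the most characters.
for symbol, text in longest(expansions()):
    print(symbol, text, len(text))
```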

Unicode even has code points for Roman numerals. Another four-letter-like character can be found here: Ⅷ

        0   1   2   3   4   5   6   7   8   9   A   B   C   D   E   F
U+215x  ⅐   ⅑   ⅒   ⅓   ⅔   ⅕   ⅖   ⅗   ⅘   ⅙   ⅚   ⅛   ⅜   ⅝   ⅞   ⅟
U+216x  Ⅰ   Ⅱ   Ⅲ   Ⅳ   Ⅴ   Ⅵ   Ⅶ   Ⅷ   Ⅸ   Ⅹ   Ⅺ   Ⅻ   Ⅼ   Ⅽ   Ⅾ   Ⅿ
U+217x  ⅰ   ⅱ   ⅲ   ⅳ   ⅴ   ⅵ   ⅶ   ⅷ   ⅸ   ⅹ   ⅺ   ⅻ   ⅼ   ⅽ   ⅾ   ⅿ
U+218x  ↀ   ↁ   ↂ   Ↄ   ↄ   ↅ   ↆ   ↇ   ↈ   ↉   ↊   ↋               

If ordinary digits count, there are also code points for multi-digit numbers, such as ⒆ ⒇ ⓳ ⓴ in the Enclosed Alphanumerics block

        0   1   2   3   4   5   6   7   8   9   A   B   C   D   E   F
U+246x  ①   ②   ③   ④   ⑤   ⑥   ⑦   ⑧   ⑨   ⑩   ⑪   ⑫   ⑬   ⑭   ⑮   ⑯
U+247x  ⑰   ⑱   ⑲   ⑳   ⑴   ⑵   ⑶   ⑷   ⑸   ⑹   ⑺   ⑻   ⑼   ⑽   ⑾   ⑿
U+248x  ⒀   ⒁   ⒂   ⒃   ⒄   ⒅   ⒆   ⒇   ⒈   ⒉   ⒊   ⒋   ⒌   ⒍   ⒎   ⒏
U+249x  ⒐   ⒑   ⒒   ⒓   ⒔   ⒕   ⒖   ⒗   ⒘   ⒙   ⒚   ⒛   ⒜   ⒝   ⒞   ⒟
U+24Ax  ⒠   ⒡   ⒢   ⒣   ⒤   ⒥   ⒦   ⒧   ⒨   ⒩   ⒪   ⒫   ⒬   ⒭   ⒮   ⒯
U+24Bx  ⒰   ⒱   ⒲   ⒳   ⒴   ⒵   Ⓐ   Ⓑ   Ⓒ   Ⓓ   Ⓔ   Ⓕ   Ⓖ   Ⓗ   Ⓘ   Ⓙ
U+24Cx  Ⓚ   Ⓛ   Ⓜ   Ⓝ   Ⓞ   Ⓟ   Ⓠ   Ⓡ   Ⓢ   Ⓣ   Ⓤ   Ⓥ   Ⓦ   Ⓧ   Ⓨ   Ⓩ
U+24Dx  ⓐ   ⓑ   ⓒ   ⓓ   ⓔ   ⓕ   ⓖ   ⓗ   ⓘ   ⓙ   ⓚ   ⓛ   ⓜ   ⓝ   ⓞ   ⓟ
U+24Ex  ⓠ   ⓡ   ⓢ   ⓣ   ⓤ   ⓥ   ⓦ   ⓧ   ⓨ   ⓩ   ⓪   ⓫   ⓬   ⓭   ⓮   ⓯
U+24Fx  ⓰   ⓱   ⓲   ⓳   ⓴   ⓵   ⓶   ⓷   ⓸   ⓹   ⓺   ⓻   ⓼   ⓽   ⓾   ⓿

and in the Enclosed Alphanumeric Supplement block

🅫, 🅪, 🆋, 🆌, 🆍, 🄭, 🄮, 🅊, 🅋, 🅌, 🅍, 🅎, 🅏
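
For the Roman numeral and enclosed-number characters, the Unicode character database stores both a compatibility expansion and a numeric value. A minimal sketch for inspecting them, assuming Python's standard `unicodedata` module (the helper name `describe_number` is hypothetical):

```python
import unicodedata


def describe_number(ch):
    """Return the NFKC expansion and numeric value of a single character.

    The value is None when the character has no numeric property.
    """
    return {
        "expansion": unicodedata.normalize("NFKC", ch),
        "value": unicodedata.numeric(ch, None),
    }


# Roman numeral eight, parenthesized twenty, negative circled twenty.
for ch in "\u2167\u2487\u24f4":
    print(ch, describe_number(ch))
```

Not every such character has an expansion, so the numeric value is often the more reliable of the two.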

A few more:

Currency symbol group

₧ ₨ ₶ ₯ ₠ ₢ ₷

Miscellaneous technical group

⎂ ⏨

Control pictures (you will probably need to zoom in to see them)

        0   1   2   3   4   5   6   7   8   9   A   B   C   D   E   F
U+240x  ␀   ␁   ␂   ␃   ␄   ␅   ␆   ␇   ␈   ␉   ␊   ␋   ␌   ␍   ␎   ␏
U+241x  ␐   ␑   ␒   ␓   ␔   ␕   ␖   ␗   ␘   ␙   ␚   ␛   ␜   ␝   ␞   ␟
U+242x  ␠   ␡   ␢   ␣   ␤   ␥   ␦                                   

Alchemical Symbols

🜀 🜅 🜆 🜇 🜈 🝪 🝫 🝬 🝛 🝜 🝝

Musical Symbols

𝄶 𝄷 𝄸 𝄹 𝄉 𝄊 𝄫

And then there are the emoji: 🔟 💤 🆔 🚾 🆖 🆗 🔢 🔡 🔠 💯 🆘 🆎 🆑 ™ 🔙 🔚 🔜 🔝 🔛 📆 🗓 🔞

Vertical bars may be considered uppercase I or lowercase l (like your 〷 example, which is actually the IDEOGRAPHIC TELEGRAPH LINE FEED SEPARATOR SYMBOL), so we also have

  • Vai syllable see: ꔖ U+A516
  • Large triple vertical bar operator: ⫼ U+2AFC
  • Counting rod tens digit three: 𝍫 U+1D36B
  • Suzhou numerals: 〢 〣
  • Chinese character for river: 川
  • ║ BOX DRAWINGS DOUBLE VERTICAL...
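
When a glyph only looks like a run of letters or bars, looking up its official name settles what it really is. The following sketch assumes Python's standard `unicodedata` module; `identify` is a made-up helper name, and names of very recent characters may be missing in older Python versions.

```python
import unicodedata


def identify(text):
    """Return 'U+XXXX NAME' for each character in text."""
    return [
        f"U+{ord(ch):04X} {unicodedata.name(ch, '<unnamed>')}"
        for ch in text
    ]


# The lookalikes from the list above.
for line in identify("\u3037\u2afc\u2551\u5ddd"):
    print(line)
```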


Source: https://stackoverflow.com/questions/49079499/unicode-letters-with-more-than-1-alphabetic-latin-character
