Question
There are a few ways to get a list of all Unicode character names: for example, using the Python module unicodedata, as explained in List of unicode character names, or using the website https://unicode.org/charts/charindex.html. However, that index is incomplete, and you have to open and parse PDFs to find the names.
But what is the official source / repository of all Unicode character names? I'm looking for the original source of these names, in a machine-readable format, which is updated whenever a new character is added.
I'm looking for a list with just the code point and name, in CSV or any other format:
code character name
...
0102 LATIN CAPITAL LETTER A WITH BREVE
0103 LATIN SMALL LETTER A WITH BREVE
...
Answer 1:
The official source for the actual character data (which includes the character names and many, many other details) is the Unicode Character Database.
The latest version of the data files can be accessed via http://www.unicode.org/Public/UCD/latest/.
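Names specifically can be found in the file NamesList.txt. The format of that file is described here.
The list is also available in a semicolon-separated, CSV-like format, with the code point in the first field and the name in the second: https://www.unicode.org/Public/UCD/latest/ucd/UnicodeData.txt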
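To get just the code point and name out of UnicodeData.txt, something like the following Python sketch might work. It assumes the file's fields are separated by semicolons, with the code point first and the name second. The function names `parse_unicode_data` and `fetch_unicode_data` are made up for this illustration. Entries whose name field is a placeholder in angle brackets (such as `<control>` or the first/last markers of large ranges) are skipped here, which is a simplification you may not want.

```python
import urllib.request
from typing import Iterable, Iterator, Tuple

# URL taken from the answer above.
UNICODE_DATA_URL = "https://www.unicode.org/Public/UCD/latest/ucd/UnicodeData.txt"


def parse_unicode_data(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (code point, name) pairs from UnicodeData.txt-style lines.

    Hypothetical helper: assumes semicolon-separated fields with the
    code point in field 0 and the character name in field 1.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        fields = line.split(";")
        if len(fields) < 2:
            continue  # not a data line we understand
        code_point, name = fields[0], fields[1]
        # Skip placeholder names such as "<control>" or range markers.
        if name.startswith("<"):
            continue
        yield code_point, name


def fetch_unicode_data(url: str = UNICODE_DATA_URL) -> Iterator[Tuple[str, str]]:
    """Download the file and parse it (hypothetical helper, needs network access)."""
    with urllib.request.urlopen(url) as response:
        text = response.read().decode("utf-8")
    return parse_unicode_data(text.splitlines())
```

Calling `fetch_unicode_data()` and printing each pair as `code_point + " " + name` should give a listing in the shape asked for in the question. Parsing is kept separate from downloading so that it can also be run on a local copy of the file.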
Source: https://stackoverflow.com/questions/65158620/official-repository-of-unicode-character-names