Question
I've found places on the web such as http://www.chinesetopinyin.com/ that convert Chinese characters to pinyin (romanization). Does anyone know how to do this, or have a database that can be parsed?
EDIT: I'm using C# but would actually prefer a database/flatfile.
Answer 1:
A possible solution using Python:
I think the Unicode database contains pinyin romanizations for Chinese characters, but these are not included in the data shipped with the unicodedata module.
However, you can use an external library such as cjklib. Example:
# coding: UTF-8
from cjklib.characterlookup import CharacterLookup

c = u'好'
# 'T' selects the traditional Chinese locale
cjk = CharacterLookup('T')
# look up every Pinyin reading recorded for the character
readings = cjk.getReadingForCharacter(c, 'Pinyin')
for r in readings:
    print(r)
output:
hāo
hǎo
hào
UPDATE
cjklib comes with a standalone cjknife utility, which might help. Some usage is described here
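Since the question asks for a database or flat file, here is a minimal sketch (Python 3) of reading pinyin out of Unihan-style data, the part of the Unicode database mentioned above. It assumes the tab-separated layout of Unihan_Readings.txt (code point, field name, value) and its kMandarin field. The SAMPLE string and the parse_kmandarin helper are made up for illustration; with the real file you would pass an open file object instead.

```python
# A few lines in the Unihan_Readings.txt layout: code point, field, value,
# separated by tabs. This is a tiny hand-made sample, not the full file.
SAMPLE = (
    "# comment lines start with '#'\n"
    "U+4E2D\tkMandarin\tzhōng\n"
    "U+597D\tkMandarin\thǎo\n"
    "U+597D\tkDefinition\tgood, excellent, fine\n"
)

def parse_kmandarin(lines):
    """Build a {character: [readings]} dict from Unihan-style lines."""
    table = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comments
        codepoint, field, value = line.split("\t", 2)
        if field != "kMandarin":
            continue  # ignore other fields such as kDefinition
        char = chr(int(codepoint[2:], 16))  # "U+597D" -> "好"
        table[char] = value.split()  # a field may hold several readings
    return table

table = parse_kmandarin(SAMPLE.splitlines())
print(table["好"])
```

The same loop ports directly to C#, since the file is plain UTF-8 text.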
Answer 2:
If you use Java, you can use pinyin4j.
http://pinyin4j.sourceforge.net/
Answer 3:
Okay, first I used my question here to get the Unicode code point of each character:
Converting Chinese character to Unicode
Then I took a file like this one to convert it: http://www.ic.unicamp.br/~stolfi/voynich/Notes/061/uc-to-py.tbl
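The two steps above (character to code point, then code point to pinyin via a table file) might look roughly like this in Python 3. The table layout is an assumption: one "hex code point, whitespace, reading" pair per line. I have not checked it against the linked file, so parse_table may need adjusting; TABLE_TEXT, parse_table and to_pinyin are illustrative names only.

```python
# Hypothetical table contents: "<hex code point> <reading>" per line.
# The real file's layout may differ; adapt parse_table if so.
TABLE_TEXT = """\
4E2D zhong1
597D hao3
"""

def parse_table(text):
    """Map upper-case hex code points to their reading."""
    table = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) >= 2:
            table[parts[0].upper()] = parts[1]
    return table

def to_pinyin(text, table):
    """Replace each character found in the table; keep the rest unchanged."""
    out = []
    for ch in text:
        key = format(ord(ch), "04X")  # '好' -> "597D"
        out.append(table.get(key, ch))
    return " ".join(out)

table = parse_table(TABLE_TEXT)
print(to_pinyin("中好", table))
```

Note that a per-character lookup like this cannot pick the right reading for characters with several pronunciations; it just returns whatever the table lists.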
Answer 4:
Yes, it's easy. Use Google Translate instead. It always shows both Chinese characters and pinyin as well... That's a BIG shortcoming of MS (or Bing) translators.
Most non-Chinese people need to have pinyin available if they wish to stand any chance of pronouncing Chinese correctly while "in the field" (in a Chinese-speaking environment).
Again, the solution is simple... use Google Translate instead!
Source: https://stackoverflow.com/questions/3571480/converting-chinese-to-pinyin