Are there any scripts, libraries, or programs using Python, or BASH tools (e.g. awk, perl, sed) which can correc
The cjklib library does cover your needs:
Either use the Python shell:
>>> from cjklib.reading import ReadingFactory
>>> f = ReadingFactory()
>>> print(f.convert('Bei3jing1', 'Pinyin', 'Pinyin', sourceOptions={'toneMarkType': 'numbers'}))
Běijīng
Or just the command line:
$ cjknife -m Bei3jing1
Běijīng
Disclaimer: I developed that library.
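If installing cjklib is not an option, the conversion itself can be roughly sketched in plain Python 3. This is not how cjklib does it; it is a minimal illustration that assumes the usual placement rule (an `a` or `e` takes the mark, in `ou` the `o` takes it, otherwise the last vowel does) and treats tone 5 as the neutral tone. The names `TONE_MARKS`, `SYLLABLE`, `mark_syllable` and `numbered_to_marked` are made up for this example.

```python
import re

# Tone-marked forms of each vowel, indexed by tone number 1-4.
TONE_MARKS = {
    "a": "āáǎà",
    "e": "ēéěè",
    "i": "īíǐì",
    "o": "ōóǒò",
    "u": "ūúǔù",
    "ü": "ǖǘǚǜ",
}

# One syllable: a run of letters (including ü) followed by a tone digit 1-5.
SYLLABLE = re.compile(r"([A-Za-zÜü]+)([1-5])")


def mark_syllable(letters, tone):
    """Return one numbered syllable with its tone mark applied (hypothetical helper)."""
    if tone == 5:
        return letters  # neutral tone: drop the digit, add no mark
    lower = letters.lower()
    if "a" in lower:
        index = lower.index("a")
    elif "e" in lower:
        index = lower.index("e")
    elif "ou" in lower:
        index = lower.index("ou")  # the 'o' of 'ou' takes the mark
    else:
        vowels = [i for i, ch in enumerate(lower) if ch in TONE_MARKS]
        if not vowels:
            return letters + str(tone)  # no vowel to mark: leave input unchanged
        index = vowels[-1]  # otherwise the last vowel takes the mark
    marked = TONE_MARKS[lower[index]][tone - 1]
    if letters[index].isupper():
        marked = marked.upper()
    return letters[:index] + marked + letters[index + 1:]


def numbered_to_marked(text):
    """Replace every letters+digit syllable in text with its tone-marked form."""
    return SYLLABLE.sub(
        lambda m: mark_syllable(m.group(1), int(m.group(2))), text
    )


print(numbered_to_marked("Bei3jing1"))  # Běijīng
```

The sketch rewrites any run of letters followed by a digit from 1 to 5, so it will also touch non-pinyin tokens such as `abc1`, and it does not handle `v` or `u:` as spellings of `ü`. A library like cjklib is the safer choice for real text.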