Question
The Language Analysis framework is deprecated, and it is not even available in 64-bit. The documentation says to use CFStringTokenizer instead, but the tokenizer doesn't provide all of the functionality that the Language Analysis framework offered.
What is the replacement for the morpheme analysis APIs that the Language Analysis framework provided?
EDIT: Pantong's reply helped, but it doesn't work in all cases. For example, for words with 3-4 kanji characters it returns an incorrect result. (By incorrect I mean that it is not the same as what the Language Analysis framework API returned for the same string.)
a) 現人神 is converted to 'gen ren shen' in Latin and 'げんじんしん' in hiragana, whereas it should be 'Arahitogami' in Latin and 'あらひとがみ' in hiragana.
b) 安本丹 is converted to 'an ben dan' in Latin and 'やすもとまこと' in hiragana, whereas it should be 'Yasumoto makoto' in Latin and 'あんぽんたん' in hiragana.
Answer 1:
One feature the deprecated morpheme analysis APIs had is getting ruby text (reading annotations) for Japanese/Chinese text. If you are asking about a replacement for that particular feature, the following code is an example. However, I don't know of a replacement for the other features of the morpheme analysis APIs.
CFStringRef testString = CFSTR("のちに検知されたトークンの範囲用として使用");
// "ja" is the locale identifier for Japanese.
CFLocaleRef locale = CFLocaleCreate(kCFAllocatorDefault, CFSTR("ja"));
CFStringTokenizerRef tokenizer = CFStringTokenizerCreate(kCFAllocatorDefault,
                                                         testString,
                                                         CFRangeMake(0, CFStringGetLength(testString)),
                                                         kCFStringTokenizerUnitWordBoundary,
                                                         locale);
// Walk the tokens until the tokenizer reports that none are left.
while (CFStringTokenizerAdvanceToNextToken(tokenizer) != kCFStringTokenizerTokenNone) {
    CFStringRef originalToken = CFStringCreateWithSubstring(kCFAllocatorDefault,
                                                            testString,
                                                            CFStringTokenizerGetCurrentTokenRange(tokenizer));
    // Get the Latin transcription of the Japanese token (NULL if unavailable).
    CFStringRef latin = (CFStringRef)CFStringTokenizerCopyCurrentTokenAttribute(tokenizer,
                                                                                kCFStringTokenizerAttributeLatinTranscription);
    if (latin != NULL) {
        NSLog(@"token: %@ -> latin: %@", originalToken, latin);
        // CFStringTransform needs a mutable string, so work on a mutable copy.
        CFMutableStringRef kana = CFStringCreateMutableCopy(kCFAllocatorDefault, 0, latin);
        // Get kana from the Latin transcription.
        CFStringTransform(kana, NULL, kCFStringTransformLatinHiragana, false);
        NSLog(@"token: %@ -> kana: %@", originalToken, kana);
        CFRelease(kana);
        CFRelease(latin);
    }
    CFRelease(originalToken);
}
// Objects obtained from Create/Copy functions must be released by the caller.
CFRelease(tokenizer);
CFRelease(locale);
Source: https://stackoverflow.com/questions/15339643/what-is-the-replacement-for-language-analysis-frameworks-morpheme-analysis-depr