Localization: How to map culture info to a script name or Unicode character range?

温柔的废话 2020-12-21 15:50

I need some information about localization. I am using .NET 2.0 with C# 2.0, which takes care of most localization-related issues. However, I need to manually draw the

5 Answers
  • 2020-12-21 16:24

    Fascinating topic. While it might not answer your question, Omniglot is a good resource.

    The correct answer is likely to be complex and to depend on the exact problem you're solving. Assuming your goal is to show only the letters used in a particular language, to separate phonebook sections (as in Outlook), a few of the issues are:

    • People who have contact names spanning several scripts/languages.
    • Two-glyph letters (digraphs, e.g. 'Lj' in Serbian). It is one phoneme, always treated as a single letter although it is written as two Unicode characters. It would have its own section in the phonebook (separate from 'L').
    • Too many glyphs to list (e.g. Chinese)
    • Unorthodox ordering (e.g. Thai -- a phone book would be separated by consonants only, ignoring the vowels).
    • Uppercase / lowercase distinction (presumably you'd want only one case for languages that have it, which breaks down in minor ways, e.g. for the Turkish 'i').
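
To illustrate the digraph point, here is a minimal sketch in C of finding the first "letter" of a UTF-8 name while treating a few Serbian Latin digraphs as one letter. The function name `first_letter_len` and the digraph table are hypothetical and only cover Lj, Nj and Dž; a real phonebook would need per-language data and proper collation.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, incomplete digraph table for Serbian Latin (UTF-8 bytes). */
static const char *const kDigraphs[] = {
    "Lj", "LJ", "lj",
    "Nj", "NJ", "nj",
    "D\xC5\xBE", /* Dž */
    "D\xC5\xBD", /* DŽ */
    "d\xC5\xBE", /* dž */
};

/* Returns the byte length of the first letter of a UTF-8 name,
   counting the digraphs listed above as a single letter. */
size_t first_letter_len(const char *name)
{
    size_t i, n, avail = strlen(name);
    unsigned char c = (unsigned char)name[0];

    for (i = 0; i < sizeof kDigraphs / sizeof kDigraphs[0]; i++) {
        n = strlen(kDigraphs[i]);
        if (strncmp(name, kDigraphs[i], n) == 0)
            return n;
    }

    /* Otherwise take one UTF-8 encoded character, sized by its lead byte. */
    if (c == 0)                  n = 0;
    else if (c < 0x80)           n = 1;
    else if ((c & 0xE0) == 0xC0) n = 2;
    else if ((c & 0xF0) == 0xE0) n = 3;
    else if ((c & 0xF8) == 0xF0) n = 4;
    else                         n = 1; /* invalid lead byte: take one byte */

    return n < avail ? n : avail; /* never run past a truncated string */
}
```

With this, "Ljubica" would be filed under a two-byte section key ("Lj") while "Lazar" stays under "L".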
  • 2020-12-21 16:29

    In native code there's LOCALE_SSCRIPTS for GetLocaleInfoEx() (Vista and above), which tells you what scripts are expected for a locale. There isn't a similar concept in .NET at this time.
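
As far as I recall from the Win32 documentation, LOCALE_SSCRIPTS yields a list of four-letter ISO 15924 script tags, each followed by a semicolon. Assuming that format, a rough sketch of checking such a string for one script could look like this. The helper name `locale_has_script` is hypothetical, and the sample strings in the note below are illustrative, not captured output.

```c
#include <stddef.h>
#include <wchar.h>

/* Returns 1 if `tag` (e.g. L"Taml") appears as a complete entry in a
   semicolon-delimited script list such as L"Cyrl;Latn;", else 0.
   The list is assumed to be what GetLocaleInfoEx() wrote into the
   caller's buffer for LOCALE_SSCRIPTS. */
int locale_has_script(const wchar_t *sscripts, const wchar_t *tag)
{
    size_t tag_len = wcslen(tag);
    const wchar_t *p = sscripts;

    if (tag_len == 0)
        return 0;

    while (*p != L'\0') {
        const wchar_t *end = wcschr(p, L';');
        size_t len = end ? (size_t)(end - p) : wcslen(p);

        if (len == tag_len && wcsncmp(p, tag, len) == 0)
            return 1;
        if (end == NULL)
            break;      /* last entry had no trailing semicolon */
        p = end + 1;
    }
    return 0;
}
```

For example, `locale_has_script(L"Hani;Hira;Kana;", L"Kana")` is true, while looking for `L"Taml"` in `L"Cyrl;Latn;"` is false.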

  • 2020-12-21 16:29

    I fully agree with mikiemacman. In addition, a given language doesn't necessarily use all the letters of a script.

    Anyway, the closest I can think of is CultureInfo.TextInfo.ANSICodePage. There are only a handful of ANSI code pages, so you could create a table (or a switch() statement, whatever) that lists the script for each ANSI code page.
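
A minimal sketch of such a table, written in C to match the native snippet elsewhere in this thread (the same switch ports directly to C#). The function name `script_for_ansi_codepage` and the script labels are my own choices. Note the assumption that cultures without an ANSI code page (Unicode-only ones, which I believe includes Tamil) report a value that is not in the table, so this approach cannot identify their script.

```c
#include <stddef.h>

/* Maps a Windows ANSI code page number to a coarse script label.
   Returns NULL for code pages not in the table. */
const char *script_for_ansi_codepage(unsigned int cp)
{
    switch (cp) {
    case 874:  return "Thai";
    case 932:  return "Japanese";            /* kanji plus kana */
    case 936:  return "Simplified Chinese";
    case 949:  return "Korean";
    case 950:  return "Traditional Chinese";
    case 1250: return "Latin";               /* Central European */
    case 1251: return "Cyrillic";
    case 1252: return "Latin";               /* Western European */
    case 1253: return "Greek";
    case 1254: return "Latin";               /* Turkish */
    case 1255: return "Hebrew";
    case 1256: return "Arabic";
    case 1257: return "Latin";               /* Baltic */
    case 1258: return "Latin";               /* Vietnamese */
    default:   return NULL;                  /* unknown or no ANSI code page */
    }
}
```

This only narrows things down to a script, not to the letters a given language actually uses within it.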

  • 2020-12-21 16:44

    Chinese has thousands of characters, so it might not be feasible to show all the characters in their character set. There's no native concept of 'alphabet' in Chinese, and I don't think Chinese has a syllabary like Japanese does.

    Pinyin (Chinese written in the Roman alphabet) can be used to represent Chinese characters, and that might help you index them. I know this doesn't answer your question, but I hope it's helpful.

  • 2020-12-21 16:46

    Proto, wait! There's a much more accurate solution. It's an unmanaged one, hence you may have to P/Invoke.

    GetLocaleInfoW(MAKELCID(wLangId, SORT_DEFAULT), LOCALE_FONTSIGNATURE, wcBuf, MAXWCBUF);
    

    This gives you a LOCALESIGNATURE structure. The answer is in the lsUsb field: the Unicode subset bitfield. Rats! The MS page for this structure is empty. But look it up in your MSDN copy. It's fully documented there: a whole set of flags that describe which scripts are supported. And yes, there's a flag for Tamil ;-)
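
Once the call has filled the structure, reading the flags is plain bit testing. Below is a portable sketch, assuming lsUsb is four 32-bit words and that bit n lives in word n / 32 at position n % 32, which is how I understand the Unicode subset bitfield numbering. The names `usb_bit_is_set`, `usb_find` and the table are hypothetical, and the table is only a short excerpt of the full list with each block's primary range.

```c
#include <stddef.h>
#include <stdint.h>

struct usb_entry {
    unsigned    bit;          /* bit number in the 128-bit field */
    const char *name;         /* Unicode block name */
    uint32_t    first, last;  /* primary code point range */
};

/* Excerpt only; the documented list has many more entries. */
static const struct usb_entry kUsbExcerpt[] = {
    {  0, "Basic Latin",      0x0000, 0x007F },
    {  7, "Greek and Coptic", 0x0370, 0x03FF },
    {  9, "Cyrillic",         0x0400, 0x04FF },
    { 11, "Hebrew",           0x0590, 0x05FF },
    { 13, "Arabic",           0x0600, 0x06FF },
    { 15, "Devanagari",       0x0900, 0x097F },
    { 20, "Tamil",            0x0B80, 0x0BFF },
    { 24, "Thai",             0x0E00, 0x0E7F },
};

/* Tests one bit of a 128-bit field stored as four 32-bit words. */
int usb_bit_is_set(const uint32_t usb[4], unsigned bit)
{
    if (bit >= 128)
        return 0;
    return (int)((usb[bit / 32] >> (bit % 32)) & 1u);
}

/* Returns the index-th table entry whose bit is set, or NULL if
   fewer than index + 1 entries of the excerpt are flagged. */
const struct usb_entry *usb_find(const uint32_t usb[4], size_t index)
{
    size_t i;
    for (i = 0; i < sizeof kUsbExcerpt / sizeof kUsbExcerpt[0]; i++) {
        if (usb_bit_is_set(usb, kUsbExcerpt[i].bit) && index-- == 0)
            return &kUsbExcerpt[i];
    }
    return NULL;
}
```

On Windows you would pass the lsUsb member of the filled LOCALESIGNATURE; each entry found gives a code point range you can enumerate to draw the characters.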

    HTH.

    EDIT: Oops! Hadn't seen Shawne's answer. Wow! Answer from an in-house expert! ;-) Anyway, you may still be interested in a Pre-Vista compatible answer.
