Localization of singular/plural words - what are the different language rules for grammatical numbers?

粉色の甜心 2021-02-05 18:39

I have been developing a .NET string formatting library to assist with localization of an application. It's called SmartFormat and is open source on GitHub.

3 Answers
  • 2021-02-05 18:48

    The approach you have taken might work in most cases for English and Spanish, but it most likely fails for many other languages. The problem is that a single pattern tries to handle every grammatical number.

    var message = "There {0:is|are} {0} {0:item|items} remaining";
    

    You need one pattern for each grammatical number. Here I have combined two patterns into a single multi-pattern string.

    var message = PluralFormat("one;There is {0} item remaining;other;There are {0} items remaining", count);
    

    English uses two grammatical numbers: singular and plural. "one" starts the singular pattern and "other" starts the plural pattern.

    When translated, for example, into Finnish, which has the same number of grammatical numbers, you would use

    "one;{0} kappale jäljellä;other;{0} kappaletta jäljellä"
    

    However, Japanese uses only one grammatical number, so a Japanese translation would only use "other". Polish uses three grammatical numbers, so it would contain "one", "few" and "many".
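    To make the multi-pattern idea concrete, here is a minimal sketch of how such a string could be split into category/pattern pairs, with a fallback to "other" when a language defines fewer forms. The names MultiPatternSketch, Parse and Select are hypothetical and are not taken from any library; the sketch also assumes that patterns never contain a ';' character.

```csharp
using System;
using System.Collections.Generic;

static class MultiPatternSketch
{
    // Splits "one;text;other;text" into a category -> pattern map.
    // Assumption: patterns contain no ';' (no escaping is handled here).
    public static Dictionary<string, string> Parse(string multiPattern)
    {
        string[] parts = multiPattern.Split(';');
        if (parts.Length % 2 != 0)
            throw new FormatException("Expected category;pattern pairs.");

        var patterns = new Dictionary<string, string>();
        for (int i = 0; i < parts.Length; i += 2)
            patterns[parts[i]] = parts[i + 1];
        return patterns;
    }

    // Picks the pattern for a category, falling back to "other".
    public static string Select(Dictionary<string, string> patterns, string category)
    {
        if (patterns.TryGetValue(category, out var pattern))
            return pattern;
        return patterns["other"];
    }

    static void Main()
    {
        var english = Parse("one;There is {0} item remaining;other;There are {0} items remaining");
        Console.WriteLine(string.Format(Select(english, "one"), 1));
        Console.WriteLine(string.Format(Select(english, "other"), 5));

        // A language with a single form only supplies "other"
        // (English text used here as a stand-in for the translation).
        var singleForm = Parse("other;{0} items remaining");
        Console.WriteLine(string.Format(Select(singleForm, "one"), 1));
    }
}
```

    Which category to ask for is a separate question; it depends on the language's plural rules.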

    Secondly, you need proper rules to choose the right pattern among multiple patterns. The Unicode Consortium's CLDR contains these rules in an XML file.

    I have implemented an open source library that uses the CLDR rules (converted from XML into C# code and included in the library) and multi-pattern strings to support both grammatical numbers and grammatical genders.
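    As an illustration of what such rules look like once turned into code, the following sketch hand-codes the integer rules for a few languages. It is a simplification written for this answer, not code generated from CLDR: it covers only en, fi, ja and pl, ignores fractional values, and the name PluralRulesSketch is hypothetical.

```csharp
using System;

static class PluralRulesSketch
{
    // Returns a CLDR-style plural category for a non-negative integer count.
    // Simplified: only en, fi, ja and pl are handled, and fractions are ignored.
    public static string Category(string language, int count)
    {
        switch (language)
        {
            case "en":
            case "fi":
                return count == 1 ? "one" : "other";
            case "ja":
                // Single grammatical number: everything is "other".
                return "other";
            case "pl":
            {
                if (count == 1) return "one";
                int mod10 = count % 10;
                int mod100 = count % 100;
                // "few": ends in 2..4, except 12..14.
                bool few = mod10 >= 2 && mod10 <= 4 && !(mod100 >= 12 && mod100 <= 14);
                return few ? "few" : "many";
            }
            default:
                return "other";
        }
    }

    static void Main()
    {
        foreach (int n in new[] { 0, 1, 2, 5, 12, 22, 25 })
            Console.WriteLine($"{n}: en={Category("en", n)}, pl={Category("pl", n)}, ja={Category("ja", n)}");
    }
}
```

    Note how Polish assigns 22 to "few" but 12 to "many"; a two-way singular/plural choice cannot express that.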

    https://github.com/jaska45/I18N

    Using this library, your sample turns into

    var message = MultiPattern.Format("one;There is {0} item remaining;other;There are {0} items remaining", count);
    
  • 2021-02-05 19:06

    Definitely, different languages have different pluralization rules. Arabic and Polish are especially interesting, as both have quite a few plural forms.

    If you want to learn more about these rules, see the Unicode Common Locale Data Repository (CLDR), namely its Language Plural Rules.

    There is quite a lot of interesting information there; unfortunately, some of it is wrong. I hope the plural forms are correct (at least for Polish they are, as far as I could tell :) ).

  • 2021-02-05 19:09

    It would be nice if you provided a sample of the rules you're using in the question body. What format do they take?

    Anyway, in your example:

    var message = "There {0:is:are} {0} {0:item:items} remaining";
    

    you seem to be assuming that the selection in both choice segments is based on the same single rule, and that there is a direct correspondence between the two choices; that is, the same single rule would choose (is, item) or (are, items).

    This assumption is not necessarily correct for other languages. Take, for example, the fictitious language English-ez (I find examples in foreign languages irritating, so to make things easier for the reader I'm borrowing from Arabic but simplifying a lot). The rules for this language are as follows:

    The first selection segment is the same as normal English:

    is: count=1
    are: count=0, count=2..infinity
    

    The second selection segment has a different rule from normal English, assume the following simple rule:

    item: count=1
    item-da: count=2 # this language has a special dual form.
    items: count=0, count=3..infinity 
    

    Now the single-rule solution is no longer adequate; we can suggest a different form:

    var message = "There {0:is:are@rule1} {0} {0:item:items@rule2} remaining";
    

    This solution might have problems in other situations, but we are discussing the example you provided.
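    The point about independent rules can be sketched in plain C#, without any formatting library. Here Verb plays the role of rule1 and Noun the role of rule2 for the fictitious English-ez; the class and method names are hypothetical and chosen only for this illustration.

```csharp
using System;

static class EnglishEzSketch
{
    // rule1: verb agreement, same as ordinary English.
    static string Verb(int count) => count == 1 ? "is" : "are";

    // rule2: noun form, including the special dual for exactly two.
    static string Noun(int count)
    {
        if (count == 1) return "item";
        if (count == 2) return "item-da";
        return "items";
    }

    // Each segment is chosen by its own rule, so (are, item-da) is possible.
    public static string Format(int count) =>
        $"There {Verb(count)} {count} {Noun(count)} remaining";

    static void Main()
    {
        for (int n = 0; n <= 3; n++)
            Console.WriteLine(Format(n));
    }
}
```

    For a count of 2 this yields "There are 2 item-da remaining", a combination that one shared two-way rule could not produce.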

    Check gettext (allows selection of the full message at a single level) and ICU (allows selection of the full message at multiple levels, i.e. on multiple variables).
