Sorting words according to letters of an old Semitic language

Posted by 纵然是瞬间 on 2019-12-11 02:19:32

Question


I use XSLT 3.0, Saxon-PE 9.7.

I need to sort orth elements according to the alphabetical order of Ugaritic, a language close to Hebrew but with additional characters.

I have tried:

 <xsl:sort select="orth" data-type="text" order="ascending" lang="uga"/>

But the resulting order is not correct, so I think I need to describe the Ugaritic alphabetical order myself. How can I do that?

Thank you very much in advance.


Answer 1:


Saxon allows you to define your own collation in its configuration file. You basically have to set up a configuration file with a section like

 <collations>
      <collation uri="http://example.com/uga-trans"
      rules="&lt; ʾa &lt; b &lt; g &lt; ḫ &lt; d &lt; h &lt; w &lt; z &lt; ḥ &lt; ṭ &lt; y &lt; k &lt; š &lt; l &lt; m &lt; ḏ &lt; n &lt; ẓ &lt; s &lt; ʿ &lt; p &lt; ṣ &lt; q &lt; r &lt; ṯ &lt; ġ &lt; t &lt; ʾi &lt; ʾu &lt; s2"/>
 </collations>

where the uri attribute defines a URI as the name for your collation that you can then use in the collation attribute of an xsl:sort:

            <xsl:perform-sort select="$input-seq">
                <xsl:sort select="string()" collation="http://example.com/uga-trans"/>
            </xsl:perform-sort> 

The syntax used in the rules attribute is the one defined for the Java class RuleBasedCollator (https://docs.oracle.com/javase/7/docs/api/java/text/RuleBasedCollator.html), whose documentation includes an example for Norwegian. The only caveat is that the Java syntax is plain text while the Saxon configuration is XML, so the < that defines the ordering has to be escaped in the rules attribute as &lt;.

I set up the rule above based on the transcription sequence presented in the Wikipedia article https://en.wikipedia.org/wiki/Ugaritic_alphabet. I am not sure whether that is the one you are looking for.

You can run Saxon from the command line with -config:yourconfigurationfile.xml to use such a configuration. oXygen has a field in the Saxon-specific transformation scenario dialog to select a configuration file.
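Since the rules use the RuleBasedCollator syntax, it can help to try a rule string in plain Java before wiring it into Saxon. The following is only a minimal sketch under some assumptions: the class name UgariticCollationDemo and the helper sortWords are made up for illustration, the non-ASCII transliteration characters are written as Unicode escapes whose code points (precomposed forms) are my assumption and must match the characters actually used in your data, and the sample strings are arbitrary letter sequences rather than real Ugaritic words.

```java
import java.text.ParseException;
import java.text.RuleBasedCollator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class UgariticCollationDemo {
    // Letters in the same order as the rules attribute of the configuration.
    // Non-ASCII transliteration characters are given as Unicode escapes;
    // the chosen code points are an assumption and must match your data.
    static final String[] ORDER = {
        "\u02BEa", "b", "g", "\u1E2B", "d", "h", "w", "z", "\u1E25", "\u1E6D",
        "y", "k", "\u0161", "l", "m", "\u1E0F", "n", "\u1E93", "s", "\u02BF",
        "p", "\u1E63", "q", "r", "\u1E6F", "\u0121", "t", "\u02BEi", "\u02BEu", "s2"
    };

    // Builds "< x < y < ..." with a plain '<' (no XML escaping needed in Java).
    static String rules() {
        return "< " + String.join(" < ", ORDER);
    }

    // Returns a sorted copy of the given words using the custom rules.
    static List<String> sortWords(List<String> words) {
        try {
            RuleBasedCollator collator = new RuleBasedCollator(rules());
            List<String> sorted = new ArrayList<>(words);
            sorted.sort(collator);
            return sorted;
        } catch (ParseException e) {
            // A malformed rule string is reported when the collator is built.
            throw new IllegalArgumentException("Invalid collation rules", e);
        }
    }

    public static void main(String[] args) {
        // Arbitrary sample strings, not real vocabulary.
        System.out.println(sortWords(Arrays.asList("pr", "dm", "sp", "gr", "bt")));
    }
}
```

With these rules g is placed before d and s before p, so the sample strings should come out in a different order than a plain Latin alphabetical sort would give. That makes it easy to see whether the rules are taking effect.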




Answer 2:


I'm not sure if this is the best solution, but it's the one I know.

The code you are searching for is:

      <xsl:sort select="(number(orth='character1') * 1) + (number(orth='character2') * 2) + (number(orth='character3') * 3) ..." data-type="number" order="ascending"/>

You need to do this for every character of the alphabet. The lower the multiplier, the earlier the value appears in the result. Basically, you are defining your own order for specified values.



Source: https://stackoverflow.com/questions/48412851/sorting-words-according-to-letters-of-an-old-semitic-language
