Question
I use XSLT 3.0 with Saxon-PE 9.7.
I need to sort orth according to the alphabetical order of Ugaritic, a language close to Hebrew but with additional characters.
I have tried:
<xsl:sort select="orth" data-type="text" order="ascending" lang="uga"/>
But the resulting order is not correct, so I think I need to describe the Ugaritic alphabetical order myself. How can I do that?
Thank you very much in advance.
Answer 1:
Saxon allows you to define your own collation in its configuration file. You basically have to set up a configuration file with a section like
<collations>
<collation uri="http://example.com/uga-trans"
rules="&lt; ʾa &lt; b &lt; g &lt; ḫ &lt; d &lt; h &lt; w &lt; z &lt; ḥ &lt; ṭ &lt; y &lt; k &lt; š &lt; l &lt; m &lt; ḏ &lt; n &lt; ẓ &lt; s &lt; ʿ &lt; p &lt; ṣ &lt; q &lt; r &lt; ṯ &lt; ġ &lt; t &lt; ʾi &lt; ʾu &lt; s2"/>
</collations>
where the uri attribute defines a URI as the name for your collation, which you can then use in the collation attribute of an xsl:sort:
<xsl:perform-sort select="$input-seq">
<xsl:sort select="string()" collation="http://example.com/uga-trans"/>
</xsl:perform-sort>
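For context, here is a minimal sketch of a complete stylesheet that uses the collation URI in an xsl:sort. The input structure is an assumption, since the question does not show it: a root element whose children are entry elements (a hypothetical name), each with exactly one orth child in no namespace.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Assumption: the root element contains <entry> children (hypothetical name),
       each with exactly one <orth> child. For namespaced input (for example TEI),
       an xpath-default-namespace attribute would be needed on xsl:stylesheet. -->
  <xsl:template match="/*">
    <xsl:copy>
      <xsl:copy-of select="@*"/>
      <xsl:for-each select="entry">
        <!-- The collation URI must match the uri attribute declared
             in the Saxon configuration file. -->
        <xsl:sort select="orth" collation="http://example.com/uga-trans"/>
        <xsl:copy-of select="."/>
      </xsl:for-each>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

The stylesheet only resolves the collation URI when Saxon is run with the configuration file that declares it.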
The syntax to be used in the rules attribute is the one defined for the Java class RuleBasedCollator (https://docs.oracle.com/javase/7/docs/api/java/text/RuleBasedCollator.html), which includes an example for Norwegian. The only caveat is that the Java syntax is plain text while the Saxon configuration is XML, so the < that defines the ordering has to be escaped in the rules attribute as &lt;.
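Put together, a complete configuration file could look roughly like the sketch below. The root element and namespace follow the Saxon configuration file format. The edition value PE is assumed from the question, and the rules are shortened to the first few letters here for readability.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch of a Saxon configuration file; edition="PE" is assumed from the question. -->
<configuration xmlns="http://saxon.sf.net/ns/configuration" edition="PE">
  <collations>
    <!-- Each "<" of the RuleBasedCollator syntax is written as &lt; because this is XML.
         Only the first letters are listed; the remaining letters follow the same pattern. -->
    <collation uri="http://example.com/uga-trans"
               rules="&lt; ʾa &lt; b &lt; g &lt; ḫ &lt; d &lt; h &lt; w &lt; z"/>
  </collations>
</configuration>
```

Saving the file as UTF-8 matters, because the rules contain non-ASCII letters.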
I have set up the rule above based on the transcription sequence presented in the Wikipedia article https://en.wikipedia.org/wiki/Ugaritic_alphabet. I am not sure whether that is the order you are looking for.
You can run Saxon from the command line with -config:yourconfigurationfile.xml to use such a configuration. oXygen has a field in the Saxon-specific transformation scenario dialog for selecting a configuration file.
Answer 2:
I'm not sure whether this is the best solution, but it's the one I know.
The code you are looking for is:
<xsl:sort select="(number(orth='character1') * 1) + (number(orth='character2') * 2) + (number(orth='character3') * 3) ..." data-type="number" order="ascending"/>
You need to do this for every character of the alphabet. The lower the multiplier, the earlier the value appears in the result. Basically you are defining your own order for specified values. Because the sort key is a number, data-type has to be "number", and in XSLT 2.0 and later each comparison has to be wrapped in number() because a boolean cannot be multiplied directly.
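In XSLT 2.0 and later, the same idea of a hand-defined order can be sketched more compactly with index-of, which returns the position of a value in a sequence. This is an alternative formulation, not the code from this answer. The variable name order and the entry element are hypothetical, and, like the expression above, it compares the whole value of orth, so it orders single letters rather than whole words.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Hypothetical variable: the values listed in the desired order. -->
  <xsl:variable name="order" select="('character1', 'character2', 'character3')"/>

  <!-- Assumption: <entry> children (hypothetical name), each with one <orth> child. -->
  <xsl:template match="/*">
    <xsl:copy>
      <xsl:for-each select="entry">
        <!-- index-of yields the position of the orth value in $order;
             values that are not listed produce an empty key and sort first. -->
        <xsl:sort select="index-of($order, string(orth))"/>
        <xsl:copy-of select="."/>
      </xsl:for-each>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```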
Source: https://stackoverflow.com/questions/48412851/sorting-words-according-to-letters-of-an-old-semitic-language