Alphabetize Arabic and Japanese text that is in Unicode?


Does anyone have any code for alphabetizing Arabic and Japanese text that is in Unicode? If the code were in Ruby, that would be great.

5 Answers

生来不讨喜

2021-01-03 03:06

To ask the obvious question, what don't you like about mylist.sort?

礼貌的吻别

2021-01-03 03:12

mylist.sort should work out of the box in Ruby 1.9 (which has built-in Unicode support). In Ruby 1.8, where Unicode support isn't built in, I think you'd have to use the character-encodings gem to extend the String class with UTF-8 string comparisons. (And then mylist.sort would work.)
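As a rough illustration of what mylist.sort does on Ruby 1.9 or later, here is a minimal sketch. It assumes all strings are valid UTF-8; the variable names are made up for the example. String comparison in Ruby is byte-wise, which for UTF-8 gives the same result as comparing code points, so this is code point order rather than a true linguistic collation.

```ruby
# Three Arabic letters, deliberately out of order.
letters = [
  "\u062A", # ت (ta)
  "\u0627", # ا (alif)
  "\u0628"  # ب (ba)
]

# Plain sort uses String#<=>, a byte-wise comparison.
sorted = letters.sort

# Sorting explicitly by code points gives the same order for UTF-8 strings.
by_codepoints = letters.sort_by { |s| s.codepoints.to_a }

puts sorted.join(" ")
puts sorted == by_codepoints
```

For these three letters the code point order happens to match the alphabet (alif, ba, ta); that is a property of how this part of the Arabic block is laid out, not something sort guarantees in general.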

栀梦

2021-01-03 03:23
I don't know Ruby, but Python has a function, ord(), that translates a Unicode character to its Unicode code point. For example,
```
>>> a = u'ل'
>>> ord(a)
1604
>>> b = u'ع'
>>> ord(b)
1593
```
Look for something like that in Ruby. I assume that the Arabic letters are listed in Unicode in alphabetical order.
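For what it's worth, Ruby 1.9 and later do seem to have a direct counterpart: String#ord returns the code point of the first character, and String#codepoints returns all of them. A small sketch using the same two letters as above (the variable names are only for illustration):

```ruby
lam = "\u0644" # ل
ain = "\u0639" # ع

puts lam.ord # 1604
puts ain.ord # 1593

# codepoints lists the code point of every character in a string.
puts (ain + lam).codepoints.to_a.inspect # [1593, 1604]
```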
长发绾君心

2021-01-03 03:24

Unicode code points are not listed in alphabetical order (Z < a, for example), but they try to be approximately in that order anyway. There is a canonical Unicode order, defined by the Unicode Collation Algorithm, and there are also language-specific orderings (French order is not exactly the same as German or Czech order, even with the same alphabet), which can be specified in locale information. I think the ICU library contains the language-specific algorithms you are looking for.
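The Z < a point is easy to see in Ruby. The sketch below uses only the standard library; the case-folding sort key is a crude stand-in chosen for illustration, not the Unicode Collation Algorithm, and real locale-aware ordering would need something like an ICU binding.

```ruby
words = %w[banana Zebra apple]

# Code point order: every uppercase ASCII letter sorts before every lowercase one.
by_codepoint = words.sort
puts by_codepoint.inspect # ["Zebra", "apple", "banana"]

# A simple workaround for this one case: compare case-folded text first,
# and fall back to the original string to break ties deterministically.
folded = words.sort_by { |w| [w.downcase, w] }
puts folded.inspect # ["apple", "banana", "Zebra"]
```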

小鲜肉

2021-01-03 03:30

Depending on your needs, words.sort in Ruby will be fine for Japanese. The order in which the characters appear in Unicode is a reasonably good sorting order. I can't vouch for Arabic, but my guess is that it's OK as well.
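A quick sketch of where the "depending on your needs" caveat bites, assuming Ruby 2.0 or later with UTF-8 source files (the word lists are made up for the example). Hiragana words sort in gojūon order by code point, but the katakana block comes after the whole hiragana block, so mixed-script lists will not match dictionary order.

```ruby
# Hiragana only: code point order matches the usual kana order.
hiragana = ["さくら", "あい", "かき"]
puts hiragana.sort.inspect # ["あい", "かき", "さくら"]

# Mixed hiragana and katakana: "アイ" reads the same as "あい",
# but katakana (U+30A0 and up) sorts after all hiragana (U+3040 to U+309F).
mixed = ["かき", "アイ", "あい"]
puts mixed.sort.inspect # ["あい", "かき", "アイ"]
```

Kanji are a further complication: they sort by code point, which says nothing about their reading, so a reading-based order would need the readings supplied separately.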
