How to use distribution sort (radix sort, etc) to sort strings?

僤鯓⒐⒋嵵緔 提交于 2020-01-14 10:28:12

问题


I know how to use radix sort to sort integers.

But how to use it to sort strings? or float numbers?


回答1:


Radix sort or any other distribution sort may be used to sort floating point numbers if you ignore some peculiarities of them like infinity, not-a-number values and two different representations of zero. IEEE 754-2008 floating point numbers have binary representations, compatible in sorting order with integer numbers. So, if you exclude not-a-numbers and reinterpret float or double as int32 or int64, you can directly apply any distribution sort to them. Edit: Negative floating point numbers need special treatment (as pointed out by AShelly), because their sorting order is opposite to the sorting order of integer numbers.

With strings, it is more difficult because of their variable length. Other kind of distribution sort (bucket sort) may be used and is often used for strings. Several starting characters of the string are used for bucket indexing, then any comparative sort is used to sort strings inside the buckets.

If all strings have almost equal length and/or some technique is used to amplify differences between strings (like described in chapter 6 of "FAST: Fast Architecture Sensitive Tree Search on Modern CPUs and GPUs"), then radix sort may be used as well: split the string to groups of characters (or better, to groups of bits) of equal length, reinterpret these groups as integers, and continue as if it is radix sort for integers.

Edit: All kinds of distribution sort are guaranteed to work properly only for ASCII strings. Other string encodings may require different sort order or may depend on the "collate" parameter of the locale.




回答2:


Yes it is possible.

See Radix Sort, Sorting a float data for floats. It uses the fact that floats cast to integer types compare correctly (once negatives are corrected for). See this article for details

For strings, you can solve the variable-length problem by doing an MSD radix sort and ensuring that you stop descending when you encounter Nulls. See Radix sort implemented in c++ for string.



来源:https://stackoverflow.com/questions/9638602/how-to-use-distribution-sort-radix-sort-etc-to-sort-strings

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!