I\'ve been researching around, and while I\'ve figured out the general idea of using Radix Sort to alphabetize an array of strings, I know I\'m going the wrong direction.
The slides you've found are great! But where did those queues come from in your code?
Anyway, here you are (live example):
template
size_t bin(const E& elem, size_t digit)
{
return elem.size() > digit ? size_t(elem[digit]) + 1 : 0;
}
template
void radix_sort(P& pos, const C& data, size_t digit)
{
using A = std::array;
A count = {};
P prev(pos);
for (auto i : prev)
++count[bin(data[i], digit)];
A done = {}, offset = {{0}};
std::partial_sum(count.begin(), count.end() - 1, offset.begin() + 1);
for (auto i : prev)
{
size_t b = bin(data[i], digit);
pos[offset[b] + done[b]++] = i;
}
}
struct shorter
{
template
bool operator()(const A& a, const A& b) { return a.size() < b.size(); }
};
template
std::vector radix_sort(const C& data)
{
std::vector pos(data.size());
std::iota(pos.begin(), pos.end(), 0);
size_t width = std::max_element(data.begin(), data.end(), shorter())->size();
for (long digit = long(width) - 1; digit >= 0; --digit)
radix_sort(pos, data, size_t(digit));
return pos;
}
which you can use like that
int main()
{
std::vector data = generate();
std::vector pos = radix_sort<128>(data);
for (auto i : pos)
std::cout << data[i] << std::endl;
}
where generate()
is included in the live example and generates the strings given in your question.
I am not trying to explain how this works here, I assume you can figure out since you are working on the problem. But a few comments are in order.
We are neither sorting the input sequence in-place, nor returning a sorted copy; we are just returning a sequence of positions of input elements in the sorted sequence.
We are processing strings from right to left.
The complexity is O(lw)
where l
is the input length (number of input strings) and w
is the maximum input width (max. length of all input strings). So this algorithm makes sense if the string width does not vary too much.
The first template parameter R
of radix_sort()
is the number of possible values for each digit (letter) in the input. E.g. R = 128
means that possible values are 0..127
. This should be fine for your input. I haven't tried to do anything clever with respect to ASCII codes, but you can customize function bin()
for that.
In the output of bin()
, value 0
is reserved to mean "we are past the end of this string". Such strings are placed before others that are still continuing.
I have tried to give self-explanatory names to variables and functions, and use standard library calls for common tasks where possible.
The code is generic, e.g. it can sort any random access container containing random access containers, not just vectors of strings.
I am using C++11 features here and there for convenience, but nothing is really necessary: one could easily do the same just with C++03.