In my application, I need to sort large arrays (between 100,000 and 1,000,000 elements) of random numbers.
I've been using the built-in array.sort(comparisonFunction).
There are sort implementations that consistently beat the stock .sort
(V8 at least), node-timsort being one of them. Example:
var SIZE = 1 << 20;
var a = [], b = [];
for (var i = 0; i < SIZE; i++) {
    // Fill both arrays with the same random integers in [0, 10000).
    var r = (Math.random() * 10000) >>> 0;
    a.push(r);
    b.push(r);
}
console.log(navigator.userAgent);
console.time("timsort");
timsort.sort(a, (x, y) => x - y);
console.timeEnd("timsort");
console.time("Array#sort");
b.sort((x, y) => x - y);
console.timeEnd("Array#sort");
<script src="https://rawgithub.com/mziccard/node-timsort/master/build/timsort.js"></script>
Here are some timings from different browsers I have around (Chakra anyone?):
Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/53.0.2785.113 Safari/537.36
timsort: 256.120ms
Array#sort: 341.595ms
Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/602.2.14 (KHTML, like Gecko) Version/10.0.1 Safari/602.2.14
timsort: 189.795ms
Array#sort: 245.725ms
Mozilla/5.0 (Macintosh; Intel Mac OS X 10.11; rv:51.0) Gecko/20100101 Firefox/51.0
timsort: 402.230ms
Array#sort: 187.900ms
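So the Firefox engine behaves very differently from Chrome and Safari: in the timings above, timsort beats the built-in sort in Chrome and Safari, while in Firefox the built-in sort is roughly twice as fast as timsort.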
No need to mark this as an answer, since it's not JavaScript and doesn't have introsort's depth check to switch to heapsort.
Example C++ quicksort. It uses median of 3 to choose the pivot value and the Hoare partition scheme, then excludes middle values equal to the pivot (which may or may not improve time, since values equal to the pivot can end up anywhere during the partition step). It recurses only on the smaller partition and loops back on the larger one, which limits stack space to O(log2(n)) in the worst case. The worst-case time complexity is still O(n^2), but this would require median of 3 to repeatedly choose small or large values, an unusual pattern. Sorted or reverse-sorted arrays are not an issue. If all values are the same, the time complexity is O(n). Adding a depth check to switch to heapsort (making this an introsort) would limit time complexity to O(n log(n)), but with a higher constant factor depending on how often the heapsort path is taken.
#include <cstddef>   // size_t
#include <cstdint>   // uint32_t
#include <utility>   // std::swap

// Sorts a[lo..hi] in place; both bounds are inclusive.
void QuickSort(uint32_t a[], size_t lo, size_t hi) {
    while (lo < hi) {
        size_t i = lo, j = (lo + hi) / 2, k = hi;
        uint32_t p;
        if (a[k] < a[i])                    // median of 3
            std::swap(a[k], a[i]);
        if (a[j] < a[i])
            std::swap(a[j], a[i]);
        if (a[k] < a[j])
            std::swap(a[k], a[j]);
        p = a[j];                           // pivot value
        i--;                                // Hoare partition (unsigned wrap when lo == 0 is undone by ++i)
        k++;
        while (1) {
            while (a[++i] < p);
            while (a[--k] > p);
            if (i >= k)
                break;
            std::swap(a[i], a[k]);
        }
        i = k++;                            // left part is a[lo..i], right part is a[k..hi]
        while (i > lo && a[i] == p)         // exclude middle values == pivot
            i--;
        while (k < hi && a[k] == p)
            k++;
        // recurse on smaller part, loop on larger part
        if ((i - lo) <= (hi - k)) {
            QuickSort(a, lo, i);
            lo = k;
        } else {
            QuickSort(a, k, hi);
            hi = i;
        }
    }
}
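Since the question is about JavaScript, here is a rough port of the C++ function above. It is only a sketch: the names quickSort and swap are my own, it assumes an array or typed array of plain numbers with no NaN values, and like the C++ version it has no depth check, so the O(n^2) worst case still applies.

```javascript
// Hypothetical helper: swap two elements of an array or typed array.
function swap(a, x, y) {
  const t = a[x];
  a[x] = a[y];
  a[y] = t;
}

// Sketch of a JavaScript port of the C++ QuickSort above.
// Sorts a[lo..hi] in place; both bounds are inclusive.
function quickSort(a, lo, hi) {
  while (lo < hi) {
    let i = lo, j = (lo + hi) >>> 1, k = hi;
    // Median of 3: order a[i] <= a[j] <= a[k], then use a[j] as the pivot.
    if (a[k] < a[i]) swap(a, k, i);
    if (a[j] < a[i]) swap(a, j, i);
    if (a[k] < a[j]) swap(a, k, j);
    const p = a[j];
    // Hoare partition.
    i--;
    k++;
    while (true) {
      while (a[++i] < p);
      while (a[--k] > p);
      if (i >= k) break;
      swap(a, i, k);
    }
    i = k++; // left part is a[lo..i], right part is a[k..hi]
    // Exclude middle values equal to the pivot.
    while (i > lo && a[i] === p) i--;
    while (k < hi && a[k] === p) k++;
    // Recurse on the smaller part, loop on the larger part.
    if (i - lo <= hi - k) {
      quickSort(a, lo, i);
      lo = k;
    } else {
      quickSort(a, k, hi);
      hi = i;
    }
  }
}

// Usage: pass the inclusive index range to sort.
const data = Uint32Array.of(5, 3, 9, 3, 1);
quickSort(data, 0, data.length - 1);
console.log(Array.from(data)); // sorted: 1, 3, 3, 5, 9
```

Whether this beats the built-in sort will depend on the engine, so it is worth timing it the same way as the timsort snippet earlier in the thread.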
If space isn't an issue, then the merge sorts here may be better:
Native JavaScript sort performing slower than implemented mergesort and quicksort
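To give an idea of what "space isn't an issue" means here, this is a minimal bottom-up merge sort sketch in JavaScript. It is not the code from the linked question; the name mergeSort is made up, and it assumes an array or typed array of numbers. It allocates one auxiliary buffer of the same length as the input.

```javascript
// Sketch of a bottom-up merge sort that uses one auxiliary buffer (O(n) extra space).
// Sorts the input in ascending order and returns it.
function mergeSort(a) {
  const n = a.length;
  let src = a;
  let dst = a.slice(); // auxiliary buffer; slice() works for arrays and typed arrays
  for (let width = 1; width < n; width *= 2) {
    // Merge adjacent runs of length `width` from src into dst.
    for (let lo = 0; lo < n; lo += 2 * width) {
      const mid = Math.min(lo + width, n);
      const hi = Math.min(lo + 2 * width, n);
      let i = lo, j = mid, k = lo;
      while (i < mid && j < hi) {
        // Take from the left run on ties, which keeps the sort stable.
        dst[k++] = src[j] < src[i] ? src[j++] : src[i++];
      }
      while (i < mid) dst[k++] = src[i++];
      while (j < hi) dst[k++] = src[j++];
    }
    // The merged data is now in dst; swap roles for the next pass.
    const t = src;
    src = dst;
    dst = t;
  }
  // If the final pass ended in the auxiliary buffer, copy the result back.
  if (src !== a) {
    for (let i = 0; i < n; i++) a[i] = src[i];
  }
  return a;
}
```

The running time is O(n log(n)) regardless of the input pattern, at the cost of the second buffer.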
If just sorting numbers, and again assuming space isn't an issue, radix sort would be fastest.
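As a sketch of the radix sort idea, assuming the values are unsigned 32-bit integers stored in a Uint32Array (my assumption; the question only says "random numbers"), a least-significant-digit radix sort with 8-bit digits could look like this. The name radixSortUint32 is made up for illustration.

```javascript
// Sketch of an LSD radix sort for a Uint32Array: four passes, one per byte.
// Uses a second buffer of the same size, so extra space is O(n).
function radixSortUint32(input) {
  const n = input.length;
  let a = input;
  let b = new Uint32Array(n);
  const count = new Uint32Array(256);
  for (let shift = 0; shift < 32; shift += 8) {
    // Count how many values have each byte value at this digit position.
    count.fill(0);
    for (let i = 0; i < n; i++) count[(a[i] >>> shift) & 0xff]++;
    // Turn the counts into starting offsets (exclusive prefix sums).
    let sum = 0;
    for (let d = 0; d < 256; d++) {
      const c = count[d];
      count[d] = sum;
      sum += c;
    }
    // Scatter values into the other buffer; equal digits keep their order.
    for (let i = 0; i < n; i++) {
      const v = a[i];
      b[count[(v >>> shift) & 0xff]++] = v;
    }
    // Swap buffers for the next pass.
    const t = a;
    a = b;
    b = t;
  }
  // After an even number of passes (4), the sorted data is back in `input`.
  return a;
}
```

This does O(n) work per pass with no comparisons at all, which is why it tends to win on large arrays of integers. It does not handle negative numbers or floats as written.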