Algorithms: Hybrid MergeSort and InsertionSort Execution Time

随声附和 posted on 2019-12-05 04:06:55

The example code in the question isn't a conventional merge sort. Its merge function shifts elements within a single array, instead of merging runs from the original array into a temporary working array and back.
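
For contrast, here is a minimal sketch of the conventional approach, assuming a top-down merge sort with one temporary working array of the same size as the input. The class and method names (ConventionalMergeSort, sortRange, merge) are illustrative only; this is not the code that was benchmarked.

```java
import java.util.Arrays;

class ConventionalMergeSort {

    // Sort a[] using one temporary working array of the same size.
    static void sort(int[] a) {
        int[] tmp = new int[a.length];
        sortRange(a, tmp, 0, a.length);
    }

    // Recursively sort the half-open range a[lo:hi].
    static void sortRange(int[] a, int[] tmp, int lo, int hi) {
        if (hi - lo < 2)
            return;                         // 0 or 1 element: already sorted
        int mid = lo + (hi - lo) / 2;
        sortRange(a, tmp, lo, mid);
        sortRange(a, tmp, mid, hi);
        merge(a, tmp, lo, mid, hi);
    }

    // Merge runs a[lo:mid] and a[mid:hi] into tmp[lo:hi], then copy back to a.
    static void merge(int[] a, int[] tmp, int lo, int mid, int hi) {
        int i = lo, j = mid, k = lo;
        while (i < mid && j < hi)
            tmp[k++] = (a[j] < a[i]) ? a[j++] : a[i++];  // ties take the left run
        while (i < mid)
            tmp[k++] = a[i++];
        while (j < hi)
            tmp[k++] = a[j++];
        System.arraycopy(tmp, lo, a, lo, hi - lo);
    }

    public static void main(String[] args) {
        int[] a = {5, 2, 9, 1, 5, 6, 0};
        sort(a);
        System.out.println(Arrays.toString(a));  // prints [0, 1, 2, 5, 5, 6, 9]
    }
}
```

Each merge here moves every element a constant number of times, whereas a merge that shifts the array to open a slot can move many elements per insertion.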

I tested top-down and bottom-up merge sorts, and both take about 42 ms (0.042 seconds) to sort 500,000 32-bit integers. The results in the graph appear to be about 1000 times slower: about 42 seconds instead of 42 ms. I also tested with 10,000,000 integers, which takes a bit over 1 second to sort.

In the past, using C++, I compared a bottom-up merge sort with a hybrid bottom-up merge / insertion sort. For 16 million (2^24 == 16,777,216) 32-bit integers, the hybrid sort was about 8% faster with S == 16. S == 64 was slightly slower than S == 16. Visual Studio's std::stable_sort is a hybrid of a bottom-up merge sort variant (the temp array is 1/2 the size of the original array) and insertion sort, and uses S == 32.

For small arrays, insertion sort is quicker than merge sort, due to a combination of cache locality and the fewer instructions insertion sort needs to sort a small array. For pseudo-random data and S == 16 to 64, insertion sort was about twice as fast as merge sort.
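
A hybrid bottom-up merge / insertion sort of this kind can be sketched as follows. This is a minimal sketch under stated assumptions, not the C++ code that was timed: it uses a full-size temporary array (not the half-size one mentioned for std::stable_sort), the names are illustrative, and it assumes array lengths well below 2^30 so that the run width does not overflow.

```java
import java.util.Arrays;
import java.util.Random;

class HybridBottomUpSort {

    static final int S = 16;   // run size handled by insertion sort (assumed value)

    // Insertion sort on the half-open range a[lo:hi].
    static void insertionSort(int[] a, int lo, int hi) {
        for (int n = lo + 1; n < hi; n++) {
            int t = a[n];
            int m = n - 1;
            while (m >= lo && a[m] > t) {   // shift larger elements right
                a[m + 1] = a[m];
                m--;
            }
            a[m + 1] = t;
        }
    }

    // Merge src[lo:mid] and src[mid:hi] into dst[lo:hi].
    static void merge(int[] src, int[] dst, int lo, int mid, int hi) {
        int i = lo, j = mid, k = lo;
        while (i < mid && j < hi)
            dst[k++] = (src[j] < src[i]) ? src[j++] : src[i++];
        while (i < mid)
            dst[k++] = src[i++];
        while (j < hi)
            dst[k++] = src[j++];
    }

    static void sort(int[] a) {
        int n = a.length;
        // Step 1: build sorted runs of up to S elements with insertion sort.
        for (int lo = 0; lo < n; lo += S)
            insertionSort(a, lo, Math.min(lo + S, n));
        // Step 2: bottom-up merge passes, doubling the run width each pass.
        int[] src = a, dst = new int[n];
        for (int width = S; width < n; width *= 2) {
            for (int lo = 0; lo < n; lo += 2 * width) {
                int mid = Math.min(lo + width, n);
                int hi = Math.min(lo + 2 * width, n);
                merge(src, dst, lo, mid, hi);
            }
            int[] t = src; src = dst; dst = t;   // swap roles for the next pass
        }
        if (src != a)                            // result ended up in the temp array
            System.arraycopy(src, 0, a, 0, n);
    }

    public static void main(String[] args) {
        Random r = new Random(0);
        int[] a = new int[100000];
        for (int i = 0; i < a.length; i++)
            a[i] = r.nextInt();
        int[] expected = a.clone();
        Arrays.sort(expected);
        sort(a);
        System.out.println("sorted correctly: " + Arrays.equals(a, expected));
    }
}
```

Starting the merge passes at width S instead of width 1 is what replaces the first log2(S) passes with insertion sort.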

The relative gain diminishes as the array size increases. Considering the effect on bottom-up merge sort, with S == 16, only the first 4 merge passes (run sizes 1, 2, 4, and 8) are replaced by insertion sort. In my test case with 2^24 == 16,777,216 elements, that's 4/24 = 1/6 ~= 16.7% of the number of passes, resulting in about an 8% improvement (so the insertion sort is about twice as fast as merge sort for those 4 passes). The total times were about 1.52 seconds for the merge-only sort, and about 1.40 seconds for the hybrid sort, a 0.12 second gain on a process that only takes 1.52 seconds. For a top-down merge sort, with S == 16, the 4 deepest levels of recursion would be optimized.
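
The estimate above can be reproduced with a few lines of arithmetic. This is a rough sketch that assumes every merge pass costs about the same and that the replaced passes run a fixed factor faster; PassEstimate and its methods are hypothetical names used only for illustration.

```java
class PassEstimate {

    // Integer log2 of a power of two: the number of bottom-up merge
    // passes needed to grow runs of 1 element into one run of n elements.
    static int log2(int n) {
        return 31 - Integer.numberOfLeadingZeros(n);
    }

    // Estimated fractional gain when the first log2(s) passes are replaced
    // by something `speedup` times faster (assumes equal cost per pass).
    static double estimatedGain(int n, int s, double speedup) {
        double replacedFraction = (double) log2(s) / log2(n);
        return replacedFraction * (1.0 - 1.0 / speedup);
    }

    public static void main(String[] args) {
        int n = 1 << 24;   // 16,777,216 elements
        int s = 16;        // insertion sort run size
        System.out.println("total passes    = " + log2(n));
        System.out.println("replaced passes = " + log2(s));
        System.out.println("estimated gain  = " + estimatedGain(n, s, 2.0));
        System.out.println("measured gain   = " + (1.52 - 1.40) / 1.52);
    }
}
```

With n = 2^24, S = 16, and a speedup of 2, the estimate is (4/24) * (1/2), about 8.3%, close to the measured 0.12 / 1.52, about 7.9%.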

Update - Example Java code for a hybrid in-place merge sort / insertion sort with O(n log(n)) time complexity. (Note - auxiliary storage is still consumed on the stack due to recursion.) The in-place part is accomplished during merge steps by swapping the data in the area merged into with the data in the area merged from. This is not a stable sort (the order of equal elements is not preserved, due to the swapping during merge steps).

Sorting 500,000 integers takes about 1/8th of a second, so I increased this to 16 million (2^24 == 16,777,216) integers, which takes a bit over 4 seconds. Without the insertion sort, the sort takes about 4.524 seconds; with the insertion sort and S == 64, it takes about 4.150 seconds, a gain of about 8.3%. With essentially the same code in C, the improvement was smaller: from 2.88 seconds to 2.75 seconds, a gain of about 4.5%.

package msortih;
import java.util.Random;

public class msortih {

    static final int S = 64;    // use insertion sort if size <= S

    static void swap(int[] a, int i, int j) {
        int tmp = a[i]; a[i] = a[j]; a[j] = tmp;
    }

    // a[w:] = merged a[i:m]+a[j:n]
    // a[i:] = reordered a[w:]
    static void wmerge(int[] a, int i, int m, int j, int n, int w) {
        while (i < m && j < n)
            swap(a, w++, a[i] < a[j] ? i++ : j++);
        while (i < m)
            swap(a, w++, i++);
        while (j < n)
            swap(a, w++, j++);
    }

    // a[w:]  = sorted    a[b:e]
    // a[b:e] = reordered a[w:]
    static void wsort(int[] a, int b, int e, int w) {
        int m;
        if (e - b > 1) {
            m = b + (e - b) / 2;
            imsort(a, b, m);
            imsort(a, m, e);
            wmerge(a, b, m, m, e, w);
        }
        else
            // 0 or 1 element: just swap it into the working area
            while (b < e)
                swap(a, b++, w++);
    }

    // inplace merge sort a[b:e]
    static void imsort(int[] a, int b, int e) {
        int m, n, w, x;
        int t;
        // if <= S elements, use insertion sort
        if (e - b <= S){
            for (n = b + 1; n < e; n++) {
                t = a[n];
                m = n - 1;
                // shift larger elements right to open a slot for t
                while (m >= b && a[m] > t) {
                    a[m + 1] = a[m];
                    m--;
                }
                a[m + 1] = t;
            }
            return;
        }
        if (e - b > 1) {
            // split a[b:e]
            m = b + (e - b) / 2;
            w = b + e - m;
            // wsort -> a[w:e] = sorted    a[b:m]
            //          a[b:m] = reordered a[w:e]
            wsort(a, b, m, w);
            while (w - b > 2) {
                // split a[b:w], w = new mid point
                n = w;
                w = b + (n - b + 1) / 2;
                x = b + n - w;
                // wsort -> a[b:x] = sorted    a[w:n]
                //          a[w:n] = reordered a[b:x]
                wsort(a, w, n, b);
                // wmerge -> a[w:e] = merged    a[b:x]+a[n:e]
                //           a[b:x] = reordered a[w:n]
                wmerge(a, b, x, n, e, w);
            }
            // insert a[b:w] into a[b:e] using left shift
            for (n = w; n > b; --n) {
                t = a[n-1];
                for (m = n; m < e && a[m] < t; ++m)
                    a[m-1] = a[m];
                a[m-1] = t;
            }
        }
    }

    public static void main(String[] args) {
        int[] a = new int[16*1024*1024];
        Random r = new Random(0);
        for(int i = 0; i < a.length; i++)
            a[i] = r.nextInt();
        long bgn, end;
        bgn = System.currentTimeMillis();
        imsort(a, 0, a.length);
        end = System.currentTimeMillis();
        // verify the result is in ascending order
        for(int i = 1; i < a.length; i++){
            if(a[i-1] > a[i]){
                System.out.println("failed");
                break;
            }
        }
        System.out.println("milliseconds " + (end-bgn));
    }
}