问题
I'm trying to write OpenMP solution for Insertion sort but I'm having problems to make it run in parallel and give correct results :). Is there any way to make Insertion sort it run in parallel.
Here is my code:
void insertionsort(int *A, int num)
{
// clock_t start, stop;
//
// start=clock();
int k;
#pragma omp parallel for shared(A) private(k)
for(int n = 1; n < num; n++)
{
int key = A[n];
k = n;
#pragma omp critical
for(;k>0 && A[k-1]> key;k--)
{
A[k] = A[k-1];
}
A[k] = key;
}
// stop=clock();
// cas = (double)(stop-start)/CLOCKS_PER_SEC;
}
回答1:
You can not parallelize the Insertion Sort algorithm this way. As you can see from the inner loop condition A[k-1]> key;
, this algorithm is assuming that for a given key
in the position k
of the array, if the actual key is bigger than the keys stored on the previous position of the array the swap
should stop. Therefore, the algorithm is assuming that the keys on the positions below k
are already sorted.
When you introduce parallelization, with two thread, for example, thread 0 will start from the beginning of the array, and thread 1 will start from the half. The problem is that the first half is not sorted, according to the assumption made by the algorithm, thus this will lead to problems.
Let me give you an example, sorting an array = [-1,2,-3,4,-5,6,-7,8]
with 2 threads:
Let's fix a given execution ordered (in reality is non-deterministic)
- 1) Thread 0 takes k = 1 and key = 2; array status
[-1,2,-3,4,-5,6,-7,8]
- 2) Thread 1 takes k = 5 and key = 6; array status
[-1,2,-3,4,-5,6,-7,8]
- 3) Thread 0 takes k = 2 and key = -3; array status
[-3,-1,2,4,-5,6,-7,8]
- 4) Thread 1 takes k = 6 and key = -7; array status
[-7,-3,-1,2,4,-5,6,8]
- 5) Thread 0 takes k = 3 and key = 2; array status
[-7,-3,-1,2,4,-5,6,8]
- 6) Thread 1 takes k = 7 and key = 8; array status
[-7,-3,-1,2,4,-5,6,8]
- 7) Thread 0 takes k = 4 and key = 4; array status
[-7,-3,-1,2,4,-5,6,8]
Final result : [-7,-3,-1,2,4,-5,6,8]
On line 4 thread 1 takes the key -7
from position 6
and puts at the end of the array sifting all the elements from positions 1 to 6
(included) one position to the right, so now -5
is on the old position of -7
. Since, the old position of -7 (6) will never be compared again -5
will stay there untouchable. Hence, making the algorithm not sorted.
An easy but a poor solution would be to add the OpenMP ordered
clause to the parallel for
construct. But, using this would make your code basically a sequential one.
Another possible solution, although I am not 100% sure it can fit on your case, would be making your algorithm parallel by Regular Sampling. You can see here an example of this latter technique apply on quicksort
.
The struct of your algorithm is not the best one to be parallelize directly and achieve speedup it. Since each iteration of the inner loop is interdependent, this will required the use of methods to unsure mutual exclusion, thus resulting in overhead. You have far better sorting algorithm that you can directly parallelize, typically the ones that use an divide and conquer strategy such as Radix Sort or Quick Sort among others.
来源:https://stackoverflow.com/questions/13905410/insertion-sort-in-openmp