Question
In this question: https://www.quora.com/What-is-randomized-quicksort
Alejo Hausner said, under "Cost of quicksort, in the worst case", that
Ironically, if you apply quicksort to an array that is already sorted, you will probably get this costly behavior
I don't understand this. Can someone explain it to me?
https://www.quora.com/What-will-be-the-complexity-of-quick-sort-if-array-is-already-sorted may answer this, but it did not give me a complete explanation.
Answer 1:
The Quicksort algorithm is this:
- select a pivot
- move elements smaller than the pivot to the beginning, and elements larger than pivot to the end
- now the array looks like
[<=p, <=p, <=p, p, >p, >p, >p]
- recursively sort the first and second "halves" of the array
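The steps above can be sketched roughly as follows. This is only an illustration, not a tuned implementation: it builds new lists instead of partitioning in place, and it assumes the first element is used as the pivot, which is just one of many possible choices.

```python
def quicksort(items):
    """Return a sorted copy of items (sketch: the first element is the pivot)."""
    if len(items) <= 1:
        return list(items)
    pivot, rest = items[0], items[1:]             # assumption: first element as pivot
    smaller = [x for x in rest if x <= pivot]     # moved before the pivot
    larger = [x for x in rest if x > pivot]       # moved after the pivot
    # The array now looks like [<=p ..., p, >p ...]; sort both sides recursively.
    return quicksort(smaller) + [pivot] + quicksort(larger)


print(quicksort([5, 2, 8, 1, 9, 3]))  # [1, 2, 3, 5, 8, 9]
```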
Quicksort will be efficient, with a running time close to n log n, if the pivot always ends up close to the middle of the array. This works perfectly if the pivot is the median value, but selecting the actual median would be costly in itself. If the pivot happens, out of bad luck, to be the smallest or largest element in the array, you'll get an array like this: [p, >p, >p, >p, >p, >p, >p]. If this happens too often, your "quicksort" effectively behaves like selection sort. In that case, since the size of the subarray to be recursively sorted shrinks by only 1 at every step, there will be n levels of recursion, each one costing up to n operations, so the overall complexity will be n^2.
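The worst case can be written as a recurrence. In this sketch, T(n) is the cost of sorting n elements and c is an assumed constant cost per element examined while partitioning; neither symbol comes from the original answer.

```latex
\begin{align*}
T(n) &= T(n-1) + c\,n, \qquad T(1) = c \\
     &= c \sum_{k=1}^{n} k \\
     &= c \, \frac{n(n+1)}{2} \;=\; \Theta(n^2)
\end{align*}
```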
Now, since we're not willing to use costly operations to find a good pivot, we might as well pick an element at random. And since we also don't really care about any kind of true randomness, we can just pick an arbitrary element from the array, for instance the first one.
If the array was shuffled uniformly at random, then picking the first element is great. You can reasonably hope it will regularly give you an "average" element. But if the array was already sorted... Then by definition the first element is the smallest. So we're in the bad case where the complexity is n^2.
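To make the difference concrete, here is a small sketch that counts comparisons against the pivot when the first element is always chosen. The function name `count_comparisons`, the size n = 200 and the fixed seed are illustrative assumptions, and the list-based partition is a simplification of a real in-place quicksort.

```python
import random


def count_comparisons(items):
    """Count pivot comparisons made by quicksort with the first element as pivot."""
    if len(items) <= 1:
        return 0
    pivot, rest = items[0], items[1:]
    smaller = [x for x in rest if x <= pivot]
    larger = [x for x in rest if x > pivot]
    # Each remaining element is compared against the pivot once at this level.
    return len(rest) + count_comparisons(smaller) + count_comparisons(larger)


n = 200
sorted_input = list(range(n))
shuffled_input = sorted_input[:]
random.Random(0).shuffle(shuffled_input)  # fixed seed only for repeatability

sorted_cost = count_comparisons(sorted_input)
shuffled_cost = count_comparisons(shuffled_input)
print(sorted_cost)    # n * (n - 1) // 2 = 19900
print(shuffled_cost)  # much smaller; the exact value depends on the shuffle
```

On the sorted input every pivot is the minimum, so each level removes only one element and the total is (n-1) + (n-2) + ... + 1.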
A simple way to avoid "bad lists" is to pick a truly random element instead of an arbitrary element. Or, if you have reasons to believe that quicksort will often be called on lists that are almost sorted, you could pick the element in position n/2 instead of the one in position 1.
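Both alternatives can be sketched by making the pivot choice a parameter. The names `choose_pivot_index`, `random_index` and `middle_index` are hypothetical helpers introduced only for this example, and the list-based partition is again a simplification.

```python
import random


def quicksort(items, choose_pivot_index):
    """Sort a list, asking choose_pivot_index(items) which position to use as pivot."""
    if len(items) <= 1:
        return list(items)
    i = choose_pivot_index(items)
    pivot = items[i]
    rest = items[:i] + items[i + 1:]          # everything except the pivot
    smaller = [x for x in rest if x <= pivot]
    larger = [x for x in rest if x > pivot]
    return (quicksort(smaller, choose_pivot_index)
            + [pivot]
            + quicksort(larger, choose_pivot_index))


def random_index(items):
    """Pick a position uniformly at random."""
    return random.randrange(len(items))


def middle_index(items):
    """Pick the position n/2, which suits (almost) sorted input."""
    return len(items) // 2


data = list(range(1000))  # already sorted input
print(quicksort(data, middle_index) == data)
print(quicksort(data, random_index) == data)
```

With `middle_index`, an already sorted list is split almost exactly in half at every level, so the recursion stays shallow.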
There are also several research papers about different ways to select the pivot, with precise calculations on the impact on complexity. For instance, you could pick three random elements, rank them from smallest to largest and keep the middle one. But the conclusion usually is: if you try to write a better pivot-selection, then it will also be more costly, and the overall complexity of the algorithm won't be improved that much.
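The "three random elements" idea could look something like the sketch below. The function name `median_of_three_index` and the optional `rng` parameter are assumptions made for this example; positions are drawn with replacement for simplicity, so the same position may be picked more than once.

```python
import random


def median_of_three_index(items, rng=random):
    """Sample three random positions and return the one holding the middle value."""
    candidates = [rng.randrange(len(items)) for _ in range(3)]
    candidates.sort(key=lambda i: items[i])   # rank the three sampled values
    return candidates[1]                      # keep the middle one


values = [50, 10, 30, 20, 40]
i = median_of_three_index(values)
print(i, values[i])
```

This costs three random draws and a tiny sort per partition step, which illustrates the trade-off mentioned above: a better pivot is bought with extra work at every level.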
Answer 2:
Depending on the implementation, there are several 'common' ways to choose the pivot.
In general, for an 'unsorted' source there is no good or bad way to choose it, so some implementations just take the first element as the pivot.
In the case of an already sorted source, this results in the worst possible pivot, because the left interval will always be empty.
-> recursion depth = O(n) instead of the desired O(log n).
This leads to O(n²) complexity, which is very bad for sorting.
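The recursion depth can be measured directly. The sketch below is only an illustration: `recursion_depth` is a hypothetical helper, n = 200 and the seed are arbitrary, and the partition builds new lists rather than working in place.

```python
import random


def recursion_depth(items, choose_pivot_index):
    """Return the deepest recursion level quicksort would reach on items."""
    if len(items) <= 1:
        return 0
    i = choose_pivot_index(items)
    pivot = items[i]
    rest = items[:i] + items[i + 1:]
    left = [x for x in rest if x <= pivot]    # empty when the pivot is the minimum
    right = [x for x in rest if x > pivot]
    return 1 + max(recursion_depth(left, choose_pivot_index),
                   recursion_depth(right, choose_pivot_index))


n = 200
already_sorted = list(range(n))

first_depth = recursion_depth(already_sorted, lambda items: 0)
rng = random.Random(1)  # fixed seed only for repeatability
random_depth = recursion_depth(already_sorted,
                               lambda items: rng.randrange(len(items)))

print(first_depth)   # n - 1 = 199: one level per element
print(random_depth)  # much smaller; the exact value depends on the random choices
```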
Choosing the pivot at random avoids this behavior. It is extremely unlikely that the randomly chosen pivot will have the same bad characteristics in every recursion as described above.
Also, it is not possible to deliberately construct a bad input, because you cannot predict the choices of the random generator (if it's a good one).
Source: https://stackoverflow.com/questions/63686324/quicksort-to-already-sorted-array