Should sorting algorithm pass same element in the comparison function

不羁岁月 提交于 2019-12-10 13:31:45

问题


The std::sort of libcxx (llvm version of c++ standard library) calls the comparison predicate with the same element i.e., both the arguments of comparison functor refer to the same position in the sequence to be sorted. A reduced example to illustrate the point.

$ cat a.cc
#include <algorithm>
#include <vector>
#include <cassert>

int main(int argc, char** argv) {
  int size = 100;
  std::vector<int> v(size);

  // Elements in v are unique.
  for (int i = 0; i < size; ++i)
    v[i] = i;

  std::sort(v.begin(), v.end(),
            [&](int x, int y) { assert(x != y); return x < y; });

  return 0;
}

$ clang++ -std=c++11 -stdlib=libc++ a.cc -o a.out
$ ./a.out
a.out: a.cc:14: auto main(int, char **)::(anonymous class)::operator()(int, int) const: Assertion `x != y' failed.
./go.sh: line 5: 19447 Aborted                 (core dumped) ./a.out

Works fine with libstdc++.

$ clang++ -std=c++11 -stdlib=libstdc++ a.cc -o a.out
$ ./a.out

Is this okay to call the comparison function with same element. Isn't this redundant.


回答1:


I can speak with some authority on this question as I'm the one who wrote this code.

Here is the comparison which is asserting in your example:

https://github.com/llvm-mirror/libcxx/blob/master/include/algorithm#L3994-L3995

As the link is likely to go stale (point to the wrong line) as time goes on, I'll also quote the code here:

            // __m still guards upward moving __i
            while (__comp(*__i, *__m))
                ++__i;

This is known as an "unguarded" loop because there is no check for the iterator __i running off the end of the sequence as it is incremented. The reason this works is because an invariant of this algorithm is that at this point it is known that __i <= __m (which is also in a comment 3 lines above this quote).

If you look at the code above this quote, you will see these comments:

    // The search going up is known to be guarded but the search coming down isn't.
    // Prime the downward search with a guard.

So before we get to this point, a guarded search down the sequence is done. That is, this test:

if (__i == --__j)

After this test finds the lower guard, the algorithm then jumps to the unguarded loops which have only one test per iteration, whereas otherwise there would be two tests per iteration (a test on the iterator and a test on the dereferenced value of the iterator).

The use of "unguarded loops" is the cause of an element being compared with itself. During development I measured that the cost of the one extra comparison in the loop was a better deal than including two comparisons per iteration in the loop.

Of course this is an engineering tradeoff. If the comparison function turned out to be horribly expensive compared to the cost of comparing the iterators themselves, one could come to a different conclusion.

This answer is completely consistent with rici's answer which is also correct (and I've upvoted it). I added my voice as I could drop the "presumably" and point to specific lines of code in the algorithm.




回答2:


Presumably, in the opinion of the standard-library authors, it is faster to do a test which is guaranteed to return false than to constantly check for equal indices as well as comparing elements. This might come about because the pivot element is used as a loop sentinel.

It is certainly permitted for the comparison function to be called in this way, and the assert in your code is not permitted.



来源:https://stackoverflow.com/questions/38966516/should-sorting-algorithm-pass-same-element-in-the-comparison-function

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!