What's faster: inserting into a priority queue, or sorting retrospectively?

闹比i 2020-12-12 16:58

I am generating some items that I need to be sorted at the end. I was wondering, what is

10 Answers
  • 2020-12-12 17:15

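    In a max-priority queue backed by a binary heap, insert and remove-max are each O(lg n), so pushing n items and then popping them all costs O(n lg n) in total.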
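
    To make the O(lg n) bound concrete, here is a minimal sketch of a hand-written insert into a binary max-heap kept in a vector. The function name `heap_insert` is hypothetical (it is not part of the standard library, and not necessarily how `std::priority_queue` is implemented on your platform); it returns the number of swaps so the logarithmic height is visible.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Insert `value` into a binary max-heap stored in `heap`, where the parent
// of index i is (i - 1) / 2. Returns the number of swaps performed, which
// is at most the height of the tree: floor(log2(n)) for n elements.
std::size_t heap_insert(std::vector<int>& heap, int value) {
    heap.push_back(value);
    std::size_t i = heap.size() - 1;
    std::size_t swaps = 0;
    while (i > 0) {
        std::size_t parent = (i - 1) / 2;
        if (heap[parent] >= heap[i]) {
            break;                         // heap property holds again
        }
        std::swap(heap[parent], heap[i]);  // sift the new value up one level
        i = parent;
        ++swaps;
    }
    return swaps;
}
```

    Inserting the values 1 through 8 in ascending order is the worst case for this sketch: every new value climbs to the root, and the eighth insert needs three swaps.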

  • 2020-12-12 17:20

    To your first question (which is faster): it depends. Just test it. Assuming you want the final result in a vector, the alternatives might look something like this:

    #include <iostream>
    #include <vector>
    #include <queue>
    #include <cstdlib>
    #include <functional>
    #include <algorithm>
    #include <iterator>
    
    #ifndef NUM
        #define NUM 10
    #endif
    
    int main() {
        std::srand(1038749);
        std::vector<int> res;
    
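        #ifdef USE_VECTOR
            // Collect everything first, then sort once in descending order.
            for (int i = 0; i < NUM; ++i) {
                res.push_back(std::rand());
            }
            std::sort(res.begin(), res.end(), std::greater<int>());
        #else
            // Push into a max-heap one item at a time, then drain it;
            // popping yields the values largest first.
            std::priority_queue<int> q;
            for (int i = 0; i < NUM; ++i) {
                q.push(std::rand());
            }
            res.resize(q.size());
            for (int i = 0; i < NUM; ++i) {
                res[i] = q.top();
                q.pop();
            }
        #endif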
        #if NUM <= 10
            std::copy(res.begin(), res.end(), std::ostream_iterator<int>(std::cout,"\n"));
        #endif
    }
    
    $ g++ sortspeed.cpp -o sortspeed -DNUM=10000000 && time ./sortspeed
    
    real    0m20.719s
    user    0m20.561s
    sys     0m0.077s
    
    $ g++ sortspeed.cpp -o sortspeed -DUSE_VECTOR -DNUM=10000000 && time ./sortspeed
    
    real    0m5.828s
    user    0m5.733s
    sys     0m0.108s
    

    So std::sort beats std::priority_queue in this case. But maybe you have a better or worse std::sort, and maybe you have a better or worse implementation of a heap. Or, if not better or worse, just more or less suited to your exact usage, which is different from my invented usage: "create a sorted vector containing the values".

    I can say with a lot of confidence that random data won't hit the worst case of std::sort, so in a sense this test might flatter it. But for a good implementation of std::sort, its worst case will be very difficult to construct, and might not actually be all that bad anyway.
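
    If you would rather time both approaches inside one process (so compilation and program startup are excluded), a minimal sketch along these lines may help. The names `via_sort`, `via_queue`, and `elapsed_ms` are hypothetical helpers introduced for illustration, and the sketch assumes C++11 for `std::chrono`.

```cpp
#include <algorithm>
#include <chrono>
#include <functional>
#include <queue>
#include <vector>

// Sort a copy of the input in descending order with std::sort.
std::vector<int> via_sort(std::vector<int> v) {
    std::sort(v.begin(), v.end(), std::greater<int>());
    return v;
}

// Push every value into a max-heap, then pop them out largest first.
std::vector<int> via_queue(const std::vector<int>& v) {
    std::priority_queue<int> q;
    for (int x : v) {
        q.push(x);
    }
    std::vector<int> out;
    out.reserve(v.size());
    while (!q.empty()) {
        out.push_back(q.top());
        q.pop();
    }
    return out;
}

// Run `f` once and return the wall-clock time it took, in milliseconds.
template <typename F>
double elapsed_ms(F&& f) {
    auto start = std::chrono::steady_clock::now();
    f();
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

    A call such as `elapsed_ms([&] { result = via_sort(data); })` then gives one measurement; repeat it several times and compile with optimization enabled before drawing conclusions.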

    Edit: I added use of a multiset, since some people have suggested a tree:

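        // Needs #include <set> alongside the other headers.
        #elif defined(USE_SET)
            std::multiset<int,std::greater<int> > s;
            for (int i = 0; i < NUM; ++i) {
                s.insert(std::rand());
            }
            res.resize(s.size());
            int j = 0;
            // The iterator type must name the same comparator as the container.
            for (std::multiset<int,std::greater<int> >::iterator i = s.begin(); i != s.end(); ++i, ++j) {
                res[j] = *i;
            }
        #else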
    
    $ g++ sortspeed.cpp -o sortspeed -DUSE_SET -DNUM=10000000 && time ./sortspeed
    
    real    0m26.656s
    user    0m26.530s
    sys     0m0.062s
    

    To your second question (complexity): they're all O(n log n), ignoring fiddly implementation details like whether memory allocation is O(1) or not (vector::push_back and other forms of insert at the end are amortized O(1)) and assuming that by "sort" you mean a comparison sort. Other kinds of sort can have lower complexity.
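
    As one illustration of a sort with lower complexity, here is a minimal counting sort sketch. It assumes every value lies in a small known range [0, max_value], which is an assumption of this example rather than something the question states; the function name `counting_sort_desc` is hypothetical. It runs in O(n + k) time for k = max_value + 1 and never compares two elements.

```cpp
#include <cstddef>
#include <vector>

// Sort values known to lie in [0, max_value] into descending order.
// Assumes 0 <= v <= max_value for every element; no bounds checks are done.
std::vector<int> counting_sort_desc(const std::vector<int>& values, int max_value) {
    // counts[k] = how many times the key k occurs in the input.
    std::vector<std::size_t> counts(static_cast<std::size_t>(max_value) + 1, 0);
    for (int v : values) {
        ++counts[static_cast<std::size_t>(v)];
    }
    std::vector<int> out;
    out.reserve(values.size());
    // Walk the keys from largest to smallest, emitting each one as many
    // times as it was seen.
    for (int key = max_value; key >= 0; --key) {
        out.insert(out.end(), counts[static_cast<std::size_t>(key)], key);
    }
    return out;
}
```

    This only pays off when the key range is small relative to n; for the full range of std::rand() the counts table itself could be very large.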

  • 2020-12-12 17:22

    There are a lot of great answers to this question. A reasonable "rule of thumb" is:

    • If you have all your elements "up front" then choose sorting.
    • If you will be adding elements / removing minimal elements "on the fly" then use a priority queue (e.g., a heap).

    For the first case, heap sort already has one of the best worst-case bounds among comparison sorts (O(n log n), in place), and you'll often get better cache performance by doing all the sorting at once (i.e., instead of interleaving it with other operations).
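
    For the "everything up front" case, heap sort can be sketched with the standard heap algorithms on a plain vector, with no priority queue involved. The wrapper name `heap_sort_desc` is hypothetical; `std::make_heap` and `std::sort_heap` are the standard `<algorithm>` functions.

```cpp
#include <algorithm>
#include <functional>
#include <vector>

// Heap sort a copy of the input into descending order.
std::vector<int> heap_sort_desc(std::vector<int> v) {
    // With std::greater the range becomes a min-heap in O(n) ...
    std::make_heap(v.begin(), v.end(), std::greater<int>());
    // ... and sort_heap with the same comparator leaves it sorted with
    // respect to that comparator, i.e. largest first, in O(n log n).
    std::sort_heap(v.begin(), v.end(), std::greater<int>());
    return v;
}
```

    Both calls work in place on the vector, so no extra container is allocated beyond the copy taken by the parameter.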

  • 2020-12-12 17:25

    As far as I understand, your problem does not require a priority queue, since your task sounds like "make many insertions, then sort everything". That's like shooting birds with a laser: not the appropriate tool. Use standard sorting techniques for that.

    You would need a priority queue if your task were to simulate a sequence of operations, where each operation can be either "add an element to the set" or "remove the smallest/greatest element from the set". This comes up, for example, when finding a shortest path in a graph. Here you cannot just use standard sorting techniques.
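
    The shortest-path case can be sketched with Dijkstra's algorithm, where pushes and pops of the smallest element are interleaved and a one-off sort would not do. This is a minimal sketch assuming non-negative edge weights and C++11; the names `Graph` and `shortest_paths` are hypothetical, chosen for this example.

```cpp
#include <functional>
#include <limits>
#include <queue>
#include <utility>
#include <vector>

// adjacency[u] holds (neighbor, non-negative edge weight) pairs.
using Graph = std::vector<std::vector<std::pair<int, long long>>>;

// Returns the shortest distance from `source` to every vertex;
// unreachable vertices keep the maximum long long value.
std::vector<long long> shortest_paths(const Graph& adjacency, int source) {
    const long long unreachable = std::numeric_limits<long long>::max();
    std::vector<long long> dist(adjacency.size(), unreachable);

    using Entry = std::pair<long long, int>;  // (distance so far, vertex)
    // std::greater turns the default max-heap into a min-heap.
    std::priority_queue<Entry, std::vector<Entry>, std::greater<Entry>> q;

    dist[source] = 0;
    q.push(Entry(0, source));
    while (!q.empty()) {
        Entry top = q.top();                  // "remove smallest element"
        q.pop();
        long long d = top.first;
        int u = top.second;
        if (d > dist[u]) {
            continue;                         // stale entry, already improved
        }
        for (const auto& edge : adjacency[u]) {
            long long candidate = d + edge.second;
            if (candidate < dist[edge.first]) {
                dist[edge.first] = candidate;
                q.push(Entry(candidate, edge.first));  // "add an element"
            }
        }
    }
    return dist;
}
```

    Each pop decides which vertex to expand next, and that expansion produces new pushes, so the order of elements is not known up front.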
