C++ Implementing Heap Median Function

核能气质少年 提交于 2019-12-21 02:42:10

问题


Following the answer found here, https://stackoverflow.com/a/10931091/1311773, I am attempting to implement two heaps so I can calculate a running median.

I'm unfamiliar with heaps, and am not sure where to begin implementing this function described here. http://programmingpraxis.com/2012/05/29/streaming-median/

My goal is to create a small test program that calculates running medians efficiently, so as the list grows the median doesn't need to be recalculated from scratch. Using two heaps, I should be able to do it, I'm just shaky on how to start implementing it.

Any advice on this would be appreciated.


回答1:


The std::priority_queue template provides all the properties of a heap. Constant time maximum or minimum extraction (depending on how the items are compared), and logarithmic time insertion. It can be found in the <queue> header file.

By default, the priority_queue is a max-heap.

int numbers[11] = { 0, 9, 3, 4, 8, 12, 2, 11, 10, 1, 5 };
std::priority_queue<int> myheap(numbers, numbers + 11);
std::cout << "biggest " << myheap.top() << std::endl; // 12
myheap.pop();
std::cout << "biggest " << myheap.top() << std::endl; // 11
myheap.push(6);
std::cout << "biggest " << myheap.top() << std::endl; // 11
myheap.push(13);
std::cout << "biggest " << myheap.top() << std::endl; // 13

Here is an example of how to create a min-heap:

std::priority_queue<
    int,
    std::priority_queue<int>::container_type,
    std::greater<int>
>;

To implement the streaming median algorithm, the approach is similar to this:

  • create a max-heap for items that are smaller than the median
  • create a min-heap for items that are greater than the median
  • when pushing new items into the heap
    • decide which heap it should be pushed into, and push it there
    • if one of the heaps' size is more than 1 greater than the other heap, then pop the bigger one, and put that element into the smaller one

Then, the median is the either the top of the larger heap, or the average of the tops of both heaps.

If you feel you need to manage the heap manually, C++ provides algorithms that allow you to do so on your own random access data structure.

  • std::make_heap -- heapify a region specified by iterator endpoints
  • std::push_heap -- assumes the first N-1 elements are already a heap, and incoporates the N'th element into the heap
  • std::pop_heap -- places the first element in the region into the last position, and reheapifies the region, but leaving the last element alone



回答2:


I guess this will help. Thanks.

    #include<cstdio>
    #include<iostream>
    #include<queue>
    #include <vector>         
    #include <functional>

    typedef priority_queue<unsigned int> type_H_low;
    typedef priority_queue<unsigned int, std::vector<unsigned int>, std::greater<unsigned int> > type_H_high;

    size_t signum(int left, int right) {
        if (left == right){
            return 0;
        }
        return (left < right)?-1:1;
    }

    void get_median( unsigned int x_i, unsigned int &m, type_H_low *l, type_H_high *r) {

        switch (signum( l->size(), r->size() )) {
        case 1: // There are more elements in left (max) heap
            if (x_i < m) {
                r->push(l->top());
                l->pop();
                l->push(x_i);
            } else {
                r->push(x_i);
            }
            break;

        case 0: // The left and right heaps contain same number of elements
            if (x_i < m){
                l->push(x_i);
            } else {
                r->push(x_i);
            }
            break;

        case -1: // There are more elements in right (min) heap
            if (x_i < m){
                l->push(x_i);
            } else {
                l->push(r->top());
                r->pop();
                r->push(x_i);
            }
            break;
        }

        if (l->size() == r->size()){
            m = l->top();
        } else if (l->size() > r->size()){
            m = l->top();
        } else {
            m = r->top();
        }

        return;
    }

    void print_median(vector<unsigned int> v) {
        unsigned int median = 0;
        long int sum = 0;
        type_H_low  H_low;
        type_H_high H_high;

        for (vector<unsigned int>::iterator x_i = v.begin(); x_i != v.end(); x_i++) {
            get_median(*x_i, median, &H_low, &H_high);
            std::cout << median << std::endl;
        }
    }


来源:https://stackoverflow.com/questions/10966298/c-implementing-heap-median-function

标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!