Efficient set intersection of a collection of sets in C++

醉酒当歌 提交于 2019-12-03 13:14:53

You might want to try a generalization of std::set_intersection(): the algorithm is to use iterators for all sets:

  1. If any iterator has reached the end() of its corresponding set, you are done. Thus, it can be assumed that all iterators are valid.
  2. Take the first iterator's value as the next candidate value x.
  3. Move through the list of iterators and std::find_if() the first element at least as big as x.
  4. If the value is bigger than x make it the new candidate value and search again in the sequence of iterators.
  5. If all iterators are on value x you found an element of the intersection: Record it, increment all iterators, start over.

Night is a good adviser and I think I may have an idea ;)

  • Memory is much slower than CPU these days, if all data fits in the L1 cache no big deal, but it easily spills over to L2 or L3: 5 sets of 1000 elements is already 5000 elements, meaning 5000 nodes, and a set node contains at least 3 pointers + the object (ie, at least 16 bytes on a 32 bits machine and 32 bytes on a 64 bits machine) => that's at least 80k memory and the recent CPUs only have 32k for the L1D so we are already spilling into L2
  • The previous fact is compounded by the problem that sets nodes are probably scattered around memory, and not tightly packed together, meaning that part of the cache line is filled with completely unrelated stuff. This could be alleviated by provided an allocator that keeps nodes close to each others.
  • And this is further compounded by the fact that CPUs are much better at sequential reads (where they can prefetch memory before you need it, so you don't wait for it) rather than random reads (and a tree structure unfortunately leads to quite random reads)

This is why where speeds matter, a vector (or perhaps a deque) are so great structures: they play very well with memory. As such, I would definitely recommend using vector as our intermediary structures; although care need be taken to only ever insert/delete from an extremity to avoid relocation.

So I thought about a rather simple approach:

#include <cassert>

#include <algorithm>
#include <set>
#include <vector>

// Do not call this method if you have a single set...
// And the pointers better not be null either!
std::vector<int> intersect(std::vector< std::set<int> const* > const& sets) {
    for (auto s: sets) { assert(s && "I said no null pointer"); }

    std::vector<int> result; // only return this one, for NRVO to kick in

    // 0. Check obvious cases
    if (sets.empty()) { return result; }

    if (sets.size() == 1) {
        result.assign(sets.front()->begin(), sets.front()->end());
        return result;
    }


    // 1. Merge first two sets in the result
    std::set_intersection(sets[0]->begin(), sets[0]->end(),
                          sets[1]->begin(), sets[1]->end(),
                          std::back_inserter(result));

    if (sets.size() == 2) { return result; }


    // 2. Merge consecutive sets with result into buffer, then swap them around
    //    so that the "result" is always in result at the end of the loop.

    std::vector<int> buffer; // outside the loop so that we reuse its memory

    for (size_t i = 2; i < sets.size(); ++i) {
        buffer.clear();

        std::set_intersection(result.begin(), result.end(),
                              sets[i]->begin(), sets[i]->end(),
                              std::back_inserter(buffer));

        swap(result, buffer);
    }

    return result;
}

It seems correct, I cannot guarantee its speed though, obviously.

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!