elegant way to remove all elements of a vector that are contained in another vector?

后端 未结 5 768
Happy的楠姐
Happy的楠姐 2021-02-14 00:08

While looking over some code I found loopy and algorithmically slow implementation of std::set_difference :

 for(int i = 0; i < a.size(); i++)
 {
  iter = std         


        
相关标签:
5条回答
  • 2021-02-14 00:23

    This one does it in O(N+M), assuming both arrays are sorted.

      auto ib = std::begin(two);
      auto iter = std::remove_if (
           std::begin(one), std::end(one),
           [&ib](int x) -> bool {
                           while  (ib != std::end(two) && *ib < x) ++ib;
                           return (ib != std::end(two) && *ib == x);
                         });
    
    0 讨论(0)
  • 2021-02-14 00:30

    One solution that comes to mind is combining remove_if and binary_search. It's effectively the same as your manual looping solution but might be a bit more "elegant" as it uses more STL features.

    sort(begin(b), end(b));
    auto iter = remove_if(begin(a), end(a), 
                          [](auto x) { 
                              return binary_search(begin(b), end(b), x); 
                          });
    // Now [begin(a), iter) defines a new range, and you can erase them however
    // you see fit, based on the type of a.
    
    0 讨论(0)
  • 2021-02-14 00:40

    The current code is quite clear, in that it should be obvious to any programmer what's going on.

    The current performance is O(a.size() * b.size()), which may be pretty bad depending upon the actual sizes.

    A more concise and STL-like way to describe it is to use remove_if with a predicate that tells you if a value in in a.

    b.erase(std::remove_if(b.begin(), b.end(), [](const auto&x) {
      return std::find(a.begin(), a.end(), x) != a.end();
    }), b.end());
    

    (Not tested, so I might have made a syntax error.) I used a lambda, but you can create a functor if you're not using a C++11 compiler.

    Note that the original code removes just one instance of a value in b that's also in a. My solution will remove all instances of such a value from b.

    Note that the find operation happens again and again, so it's probably better to do that on the smaller vector for better locality of reference.

    0 讨论(0)
  • 2021-02-14 00:47

    Sort b so you can binary search it in order to reduce time complexity. Then use the erase-remove idiom in order to throw away all elements from a that are contained in b:

    sort( begin(b), end(b) );
    a.erase( remove_if( begin(a),end(a),
        [&](auto x){return binary_search(begin(b),end(b),x);}), end(a) );
    

    Of course, you can still sacrifice time complexity for simplicity and reduce your code by removing the sort() and replacing binary_search() by find():

    a.erase( remove_if( begin(a),end(a),
        [&](auto x){return find(begin(b),end(b),x)!=end(b);}), end(a) );
    

    This is a matter of taste. In both cases you don't need heap allocations. By the way, I'm using lambda auto parameters which are C++14. Some compilers already implement that feature such as clang. If you don't have such a compiler, but only C++11 then replace auto by the element type of the container.

    By the way, this code does not mention any types! You can write a template function so it works for all kind of types. The first variant requires random access iteration of b while the second piece of code does not require that.

    0 讨论(0)
  • 2021-02-14 00:50

    after thinking for a while I thought of this
    (note:by answering my own Q im not claiming this is a superior to offered A):

    vector<int64_t> a{3,2,7,5,11,13}, b{2,3,13,5};
    set<int64_t> bs(b.begin(), b.end());
    for (const auto& num: bs)
        cout << num << " ";
    cout  << endl;
    for (const auto& num: a)
        bs.erase(num);
    vector<int64_t> result(bs.begin(), bs.end());
    for (const auto& num: result)
        cout << num << " ";
    
    0 讨论(0)
提交回复
热议问题