elegant way to remove all elements of a vector that are contained in another vector?

后端未结

关注

 5  781

While looking over some code I found loopy and algorithmically slow implementation of std::set_difference :

 for(int i = 0; i < a.size(); i++)
 {
  iter = std


                      
              相关标签:


      
      
        
          5条回答        

        
                         				            
            
           
            
                              
                
              
              
                
                  我在风中等你        
                
              
                            
                2021-02-14 00:23
              
            
            
                                                                       
This one does it in O(N+M), assuming both arrays are sorted.

  auto ib = std::begin(two);
  auto iter = std::remove_if (
       std::begin(one), std::end(one),
       [&ib](int x) -> bool {
                       while  (ib != std::end(two) && *ib < x) ++ib;
                       return (ib != std::end(two) && *ib == x);
                     });

                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  天命终不由人        
                
              
                            
                2021-02-14 00:30
              
            
            
                                                                       
One solution that comes to mind is combining remove_if and binary_search. It's effectively the same as your manual looping solution but might be a bit more "elegant" as it uses more STL features.

sort(begin(b), end(b));
auto iter = remove_if(begin(a), end(a), 
                      [](auto x) { 
                          return binary_search(begin(b), end(b), x); 
                      });
// Now [begin(a), iter) defines a new range, and you can erase them however
// you see fit, based on the type of a.

                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  深忆病人        
                
              
                            
                2021-02-14 00:40
              
            
            
                                                                       
The current code is quite clear, in that it should be obvious to any programmer what's going on.

The current performance is O(a.size() * b.size()), which may be pretty bad depending upon the actual sizes.

A more concise and STL-like way to describe it is to use remove_if with a predicate that tells you if a value in in a.

b.erase(std::remove_if(b.begin(), b.end(), [](const auto&x) {
  return std::find(a.begin(), a.end(), x) != a.end();
}), b.end());


(Not tested, so I might have made a syntax error.)  I used a lambda, but you can create a functor if you're not using a C++11 compiler.

Note that the original code removes just one instance of a value in b that's also in a.  My solution will remove all instances of such a value from b.

Note that the find operation happens again and again, so it's probably better to do that on the smaller vector for better locality of reference.
                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  不知归路        
                
              
                            
                2021-02-14 00:47
              
            
            
                                                                       
Sort b so you can binary search it in order to reduce time complexity. Then use the erase-remove idiom in order to throw away all elements from a that are contained in b:

sort( begin(b), end(b) );
a.erase( remove_if( begin(a),end(a),
    [&](auto x){return binary_search(begin(b),end(b),x);}), end(a) );


Of course, you can still sacrifice time complexity for simplicity and reduce your code by removing the sort() and replacing binary_search() by find():

a.erase( remove_if( begin(a),end(a),
    [&](auto x){return find(begin(b),end(b),x)!=end(b);}), end(a) );


This is a matter of taste. In both cases you don't need heap allocations. By the way, I'm using lambda auto parameters which are C++14. Some compilers already implement that feature such as clang. If you don't have such a compiler, but only C++11 then replace auto by the element type of the container. 

By the way, this code does not mention any types! You can write a template function so it works for all kind of types. The first variant requires random access iteration of b while the second piece of code does not require that. 
                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  夕颜        
                
              
                            
                2021-02-14 00:50
              
            
            
                                                                       
after thinking for a while I thought of this

(note:by answering my own Q im not claiming this is a superior to offered A):

vector<int64_t> a{3,2,7,5,11,13}, b{2,3,13,5};
set<int64_t> bs(b.begin(), b.end());
for (const auto& num: bs)
    cout << num << " ";
cout  << endl;
for (const auto& num: a)
    bs.erase(num);
vector<int64_t> result(bs.begin(), bs.end());
for (const auto& num: result)
    cout << num << " ";

                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
                             
        
        
          
            
            
              
              
            
    


                                 
              
            
                          
    

        
         
                验证码
                
                  
                
                
                   看不清?
                
              
                                  
                    
   
                 
             
              提交回复