What is the most time efficient way to remove unordered duplicates in a 2D array?

前端未结

关注

 3  1061

忘了有多久 2021-01-12 14:08

I\'ve generated a list of combinations, using itertools and I\'m getting a result that looks like this:

nums = [-5,5,4,-3,0,0,4,-2]
x = [x for x


      
      
        
          3条回答        

        
                    
            
            
                         
                
              
              
                
                   南笙
                                             
                
                
                (楼主)
            
              
              
                2021-01-12 14:18
              

            
            
                        
The best way is to not generate the duplicates in the first place.

The idea is to first create all possible combinations of values that appear multiple times, where each appears 0, 1, ... times. Then, we complete them with all possible combinations of the unique elements.

from itertools import combinations, product, chain
from collections import Counter

nums = [-5,5,4,-3,0,0,4,-2]

def combinations_without_duplicates(nums, k):
    counts = Counter(nums)
    multiples = {val: count for val, count in counts.items() if count >= 2 }
    uniques = set(counts) - set(multiples)              
    possible_multiples = [[[val]*i for i in range(count+1)] for val, count in multiples.items()]
    multiples_part = (list(chain(*x)) for x in product(*possible_multiples))
    # omit the ones that are too long
    multiples_part = (lst for lst in multiples_part if len(lst) <= k)
    # Would be at this point:
    # [[], [0], [0, 0], [4], [4, 0], [4, 0, 0], [4, 4], [4, 4, 0], [4, 4, 0, 0]]
    for m_part in multiples_part:
        missing = k - len(m_part)
        for c in combinations(uniques, missing):
            yield m_part + list(c)


list(combinations_without_duplicates(nums, 4))


Output:

[[-3, -5, 5, -2],
 [0, -3, -5, 5],
 [0, -3, -5, -2],
 [0, -3, 5, -2],
 [0, -5, 5, -2],
 [0, 0, -3, -5],
 [0, 0, -3, 5],
 [0, 0, -3, -2],
 [0, 0, -5, 5],
 [0, 0, -5, -2],
 [0, 0, 5, -2],
 [4, -3, -5, 5],
 [4, -3, -5, -2],
 [4, -3, 5, -2],
 [4, -5, 5, -2],
 [4, 0, -3, -5],
 [4, 0, -3, 5],
 [4, 0, -3, -2],
 [4, 0, -5, 5],
 [4, 0, -5, -2],
 [4, 0, 5, -2],
 [4, 0, 0, -3],
 [4, 0, 0, -5],
 [4, 0, 0, 5],
 [4, 0, 0, -2],
 [4, 4, -3, -5],
 [4, 4, -3, 5],
 [4, 4, -3, -2],
 [4, 4, -5, 5],
 [4, 4, -5, -2],
 [4, 4, 5, -2],
 [4, 4, 0, -3],
 [4, 4, 0, -5],
 [4, 4, 0, 5],
 [4, 4, 0, -2],
 [4, 4, 0, 0]]

    
             
                                                        
            
            
              
                
                0
              
                   
                
               讨论(0)
              
                                                  
              
              
                          
             
       
          
              
                                       
     查看其它3个回答


            
                         
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
                              			
        
        
        
          
            
            
              
              
            
    


                                 
              
            
                          
    

        
         
                验证码
                
                  
                
                
                   看不清?
                
              
                                  
                    
   
                 
             
              提交回复