I am trying to remove the duplicates from the following list:
distinct_cur = [{'rtc': 0, 'vf': 0, 'mtc': 0, 'doc': 'good job', 'foc': 195, 'st': 0
You're creating a set out of different elements and expecting it to remove duplicates based on a criterion that only you know.
You have to iterate through your list and add a dictionary to the result only if its doc value hasn't been seen before, for instance like this:
done = set()
result = []
for d in distinct_cur:
    if d['doc'] not in done:
        done.add(d['doc'])  # note it down for further iterations
        result.append(d)
This keeps only the first occurrence of each dictionary with a given doc key, by recording the seen keys in an auxiliary set.
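A runnable sketch of the loop above, using hypothetical sample data (the real distinct_cur has more keys per dictionary):

```python
# Hypothetical sample data; the real distinct_cur is longer.
distinct_cur = [
    {'doc': 'good job', 'foc': 195},
    {'doc': 'well done', 'foc': 10},
    {'doc': 'good job', 'foc': 7},   # duplicate 'doc', will be dropped
]

done = set()
result = []
for d in distinct_cur:
    if d['doc'] not in done:
        done.add(d['doc'])  # remember this key for later iterations
        result.append(d)

print(result)
# keeps only the first occurrence of each 'doc' value
```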
Another possibility is to build a dictionary keyed on the "doc" value, iterating over the list in reverse so that earlier items overwrite later ones:
result = {i['doc']:i for i in reversed(distinct_cur)}.values()
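A quick demonstration with hypothetical sample data: because the comprehension runs over reversed(distinct_cur), the first occurrence of each doc key is processed last and wins.

```python
# Hypothetical sample data for illustration.
distinct_cur = [
    {'doc': 'good job', 'foc': 195},
    {'doc': 'well done', 'foc': 10},
    {'doc': 'good job', 'foc': 7},   # processed first, then overwritten
]

result = {i['doc']: i for i in reversed(distinct_cur)}.values()
print(list(result))
```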
I see 2 similar solutions, depending on your problem domain: do you want to keep the first instance of a key or the last? Keeping the last (so later matches overwrite earlier ones) is simpler:
d = {r['doc']: r for r in distinct_cur}.values()
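Sketching the last-instance variant with the same hypothetical data: iterating forward means the last dictionary with a given doc key overwrites the earlier ones.

```python
# Hypothetical sample data for illustration.
distinct_cur = [
    {'doc': 'good job', 'foc': 195},  # overwritten by the later entry
    {'doc': 'well done', 'foc': 10},
    {'doc': 'good job', 'foc': 7},
]

d = {r['doc']: r for r in distinct_cur}.values()
print(list(d))
```

Note that dicts preserve insertion order, so the surviving entries keep the positions where their keys were first inserted.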
Try this:
distinct_cur = [dict(t) for t in set(tuple(d.items()) for d in distinct_cur)]
Worked for me... Note that this only removes dictionaries that are identical in every key, not those that merely share the same doc value, and the resulting order is not guaranteed.
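A small sketch (with hypothetical data) showing what this approach does and does not remove: exact duplicates collapse, but two dictionaries that differ in any value both survive.

```python
# Hypothetical sample data for illustration.
distinct_cur = [
    {'doc': 'good job', 'foc': 195},
    {'doc': 'good job', 'foc': 195},  # exact duplicate -> removed
    {'doc': 'good job', 'foc': 7},    # same 'doc' but different 'foc' -> kept
]

deduped = [dict(t) for t in set(tuple(d.items()) for d in distinct_cur)]
print(len(deduped))  # order of a set is not guaranteed, so check length/membership
```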
A one-liner to deduplicate the list of dictionaries distinct_cur on the doc key, keeping the last occurrence of each:
[i for n, i in enumerate(distinct_cur) if i.get('doc') not in [y.get('doc') for y in distinct_cur[n + 1:]]]
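To illustrate with hypothetical data: an item is kept only if no later item shares its doc value, so the last occurrence of each key survives. Note this scans the tail of the list for every element, so it is O(n²).

```python
# Hypothetical sample data for illustration.
distinct_cur = [
    {'doc': 'good job', 'foc': 195},  # a later 'good job' exists -> dropped
    {'doc': 'well done', 'foc': 10},
    {'doc': 'good job', 'foc': 7},
]

result = [i for n, i in enumerate(distinct_cur)
          if i.get('doc') not in [y.get('doc') for y in distinct_cur[n + 1:]]]
print(result)
```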