Python: sort an array of dictionaries with custom comparator?


I have the following Python array of dictionaries:

myarr = [ { 'name': 'Richard', 'rank': 1 },
{ 'name': 'Reuben', 'rank': 4 },
{ 'name': 'Reece'


                      


      
      
        
8 Answers

        
                         				            
            
           
            
                              
                
              
              
                
                  灰色年华        
                
              
                            
                2021-02-08 10:46
              
            
            
                                                                       
Try:

from operator import itemgetter

sorted_master_list = sorted(myarr, key=itemgetter('rank'), reverse=True)
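As a minimal sketch of what this call does, assuming a small sample list (only Richard and Reuben come from the question; the other entries are made up because the question's list is cut off): reverse=True sorts by rank in descending order, so rank-0 entries do land last, but the non-zero ranks come out highest first.

```python
from operator import itemgetter

# Hypothetical sample data; only Richard and Reuben appear in the question.
myarr = [
    {'name': 'Richard', 'rank': 1},
    {'name': 'Reuben', 'rank': 4},
    {'name': 'Alice', 'rank': 0},
    {'name': 'Bob', 'rank': 2},
]

# itemgetter('rank') pulls out d['rank']; reverse=True flips the order.
sorted_master_list = sorted(myarr, key=itemgetter('rank'), reverse=True)
ranks = [d['rank'] for d in sorted_master_list]
print(ranks)  # [4, 2, 1, 0]
```

If the non-zero ranks need to stay ascending with the zeros at the end, a custom key is needed rather than reverse=True.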
                                                                        
                                                        
            
            
              
                
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  刺人心        
                
              
                            
                2021-02-08 10:49
              
            
            
                                                                       
Just pass "key" an arbitrary function or callable object; that is what it takes.
itemgetter happens to be one such function, but key works with any function you write:
it just has to take a single parameter as input and return an object that is directly
comparable, so that sorting on the returned values gives the order you want.

In this case:

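def key_func(item):
    # Map rank 0 to a large value so it sorts after every real rank
    # (assumes no real rank reaches 100000).
    return item["rank"] if item["rank"] != 0 else 100000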

sorted_master_list = sorted(myarr, key=key_func)


(it can also be written as a lambda expression)
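For what it's worth, here is a sketch of the lambda form. It assumes rank 0 should sort last and that no real rank reaches the sentinel; SENTINEL and the sample entries other than Richard and Reuben are made up for illustration.

```python
# Hypothetical sample data for illustration.
myarr = [
    {'name': 'Alice', 'rank': 0},
    {'name': 'Reuben', 'rank': 4},
    {'name': 'Richard', 'rank': 1},
]

SENTINEL = 100000  # assumption: larger than every real rank

# Same idea as a named key function, written inline as a lambda.
sorted_master_list = sorted(
    myarr,
    key=lambda item: item['rank'] if item['rank'] != 0 else SENTINEL,
)
names = [d['name'] for d in sorted_master_list]
print(names)  # ['Richard', 'Reuben', 'Alice']
```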
                                                                        
                                                        
            
            
              
                
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  时光取名叫无心        
                
              
                            
                2021-02-08 10:55
              
            
            
                                                                       
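I'm leaning more toward creating a compare function to handle the "0" specifically:

from functools import cmp_to_key

def compare(x, y):
    if x == y:
        return 0
    elif x == 0:
        return 1    # rank 0 sorts after everything else
    elif y == 0:
        return -1
    else:
        return (x > y) - (x < y)    # Python 3 stand-in for cmp(x, y)

# Python 3 dropped sorted(..., cmp=...); wrap the comparator with cmp_to_key instead.
sorted(myarr, key=cmp_to_key(lambda a, b: compare(a['rank'], b['rank'])))


However, a custom compare function carries a performance penalty.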
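To make the performance remark a bit more concrete, here is a rough Python 3 sketch (using functools.cmp_to_key) that counts how often a comparator runs versus a key function on a made-up list of ranks. The counters and the tuple key are illustrative additions, not part of the answer above.

```python
from functools import cmp_to_key

ranks = [0, 3, 1, 0, 4, 2]  # hypothetical rank values

cmp_calls = 0
key_calls = 0

def compare(x, y):
    """Comparator that places 0 after every other value."""
    global cmp_calls
    cmp_calls += 1
    if x == y:
        return 0
    if x == 0:
        return 1
    if y == 0:
        return -1
    return (x > y) - (x < y)

def key_func(x):
    """Key with the same ordering: zeros last, the rest ascending."""
    global key_calls
    key_calls += 1
    return (x == 0, x)

by_cmp = sorted(ranks, key=cmp_to_key(compare))
by_key = sorted(ranks, key=key_func)

print(by_cmp, by_key)        # both orderings are identical
print(cmp_calls, key_calls)  # key_func runs exactly once per element
```

The key function is called once per element, while the comparator is called once per comparison the sort performs, which is where the extra cost comes from on longer lists.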
                                                                        
                                                        
            
            
              
                
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  南方客        
                
              
                            
                2021-02-08 10:56
              
            
            
                                                                       
Option 1:

key=lambda d:(d['rank']==0, d['rank'])


Option 2:

key=lambda d:d['rank'] if d['rank']!=0 else float('inf')


Demo:


  "I'd like to sort it by the rank values, ordering as follows: 1-2-3-4-0-0-0." --original poster


>>> sorted([0,0,0,1,2,3,4], key=lambda x:(x==0, x))
[1, 2, 3, 4, 0, 0, 0]

>>> sorted([0,0,0,1,2,3,4], key=lambda x:x if x!=0 else float('inf'))
[1, 2, 3, 4, 0, 0, 0]
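The demo above sorts bare integers. A sketch of the same two keys applied to a list of dictionaries might look like this (the sample entries other than Richard and Reuben are invented, since the question's list is truncated):

```python
# Hypothetical sample data for illustration.
myarr = [
    {'name': 'Alice', 'rank': 0},
    {'name': 'Richard', 'rank': 1},
    {'name': 'Bob', 'rank': 0},
    {'name': 'Reuben', 'rank': 4},
    {'name': 'Carol', 'rank': 2},
]

# Option 1: tuple key, the boolean pushes rank-0 entries to the end.
option1 = sorted(myarr, key=lambda d: (d['rank'] == 0, d['rank']))

# Option 2: treat rank 0 as infinity.
option2 = sorted(myarr, key=lambda d: d['rank'] if d['rank'] != 0 else float('inf'))

names1 = [d['name'] for d in option1]
names2 = [d['name'] for d in option2]
print(names1)  # ['Richard', 'Carol', 'Reuben', 'Alice', 'Bob']
```

Because sorted() is stable, the rank-0 entries keep their original relative order (Alice before Bob) under both options.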




 

Additional comments:


  "Please could you explain to me (a Python novice) what it's doing? I can see that it's a lambda, which I know is an anonymous function: what's the bit in brackets?" – OP comment


Indexing/slice notation:

itemgetter('rank') is the same thing as lambda x: x['rank'], which is the same thing as the function:

def getRank(myDict):
    return myDict['rank']


The [...] is called the indexing/slice notation; see "Explain Python's slice notation". Also note that someArray[n] is common notation for indexing in many programming languages, though those languages may not support slices of the form [start:end] or [start:end:step].
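A small sketch of the equivalence and of indexing versus slicing, using a made-up record and list:

```python
from operator import itemgetter

def getRank(myDict):
    return myDict['rank']

record = {'name': 'Richard', 'rank': 1}

# Three spellings of "look up the 'rank' key".
results = [
    itemgetter('rank')(record),
    (lambda x: x['rank'])(record),
    getRank(record),
]
print(results)  # [1, 1, 1]

letters = ['a', 'b', 'c', 'd', 'e']  # hypothetical list
single = letters[1]      # indexing: one element
part = letters[1:4]      # slice [start:end]
stepped = letters[::2]   # slice [start:end:step] with defaults for start/end
print(single, part, stepped)  # b ['b', 'c', 'd'] ['a', 'c', 'e']
```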

key= vs cmp= vs rich comparison:

As for what is going on, there are two common ways to specify how a sorting algorithm orders elements: one is with a key function, and the other is with a cmp function (removed from sorted() in Python 3, where functools.cmp_to_key adapts an old-style comparator, but a lot more versatile). A cmp function lets you specify arbitrarily how two elements should compare (input: a,b; output: a<b or a>b or a==b). Though legitimate, it gives us no major benefit here (we'd have to duplicate code in an awkward manner), and a key function is more natural for your case. (See "object rich comparison" for how to implicitly define cmp= in an elegant but possibly-excessive way.)

Implementing your key function:

Unfortunately 0 is an integer and thus has a natural ordering: 0 is normally < 1,2,3... Thus if we want to impose an extra rule, we need to sort the list at a "higher level". We do this by making the key a tuple: tuples are sorted first by their 1st element, then by their 2nd element. True always sorts after False, so all the tuples starting with True come after those starting with False; within each group they sort as normal: (False,1)<(False,2)<..., (True,1)<(True,2)<..., (False,*)<(True,*). The alternative (option 2) merely assigns rank-0 dictionaries a value of infinity, since that is guaranteed to be above any possible rank.
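The ordering rules described here can be checked directly. This is just a sketch with made-up key tuples:

```python
# Booleans compare like 0 and 1, so False sorts before True.
bool_order = False < True

# Tuples compare element by element: first item, then second item.
keys = [(True, 0), (False, 4), (False, 1), (True, 0), (False, 3)]
ordered = sorted(keys)
print(ordered)
# [(False, 1), (False, 3), (False, 4), (True, 0), (True, 0)]

# Option 2 relies on infinity comparing above any integer rank.
above_any_rank = 10**100 < float('inf')
print(bool_order, above_any_rank)  # True True
```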

More general alternative - object rich comparison: 

The even more general solution would be to create a class representing records, then implement __lt__, __le__, __eq__, __ne__, __gt__, __ge__ (the rich comparison operators), or alternatively just implement one of the ordering methods plus __eq__ and use the @functools.total_ordering decorator. This will cause objects of that class to use the custom logic whenever you use comparison operators (e.g. x=Record(name='Joe', rank=12) y=Record(...) x<y); since the sorted(...) function compares elements with < by default, this makes the behavior automatic when sorting, and in other places where you use comparison operators. This may or may not be excessive depending on your use case.
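A minimal sketch of that idea, assuming a Record class with name and rank fields (the class layout and the _sort_key helper are hypothetical, chosen only to illustrate the decorator):

```python
from functools import total_ordering

@total_ordering
class Record:
    def __init__(self, name, rank):
        self.name = name
        self.rank = rank

    def _sort_key(self):
        # Hypothetical helper: rank 0 sorts after every other rank.
        return (self.rank == 0, self.rank)

    def __eq__(self, other):
        # Note: equality here looks only at rank, not at name.
        if not isinstance(other, Record):
            return NotImplemented
        return self._sort_key() == other._sort_key()

    def __lt__(self, other):
        if not isinstance(other, Record):
            return NotImplemented
        return self._sort_key() < other._sort_key()

records = [Record('Alice', 0), Record('Joe', 12), Record('Richard', 1)]
ordered_names = [r.name for r in sorted(records)]
print(ordered_names)  # ['Richard', 'Joe', 'Alice']

# total_ordering fills in the operators that were not written by hand.
joe_ge_richard = Record('Joe', 12) >= Record('Richard', 1)
```

No key= is passed to sorted() here; the ordering comes entirely from the class. One side effect worth knowing: defining __eq__ without __hash__ makes instances unhashable.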

Cleaner alternative - don't overload 0 with semantics:

I should however point out that it's a bit artificial to put 0s behind 1,2,3,4,etc. Whether this is justified depends on whether rank=0 really means rank=0, i.e. whether rank=0 entries are really "lower" than rank=1 (which in turn are really "lower" than rank=2...). If this is truly the case, then your method is perfectly fine. If it is not, then you might consider omitting the 'rank':... entry as opposed to setting 'rank':0. Then you could sort by Lev Levitsky's answer using 'rank' in d, or by:

Option 1 with different scheme:

key=lambda d: ('rank' not in d, d.get('rank', 0))


Option 2 with different scheme:

key=lambda d: d.get('rank', float('inf'))
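Here is a sketch of both schemes on made-up data where unranked entries simply omit the 'rank' key. The tuple variant uses d.get(...) for the second element so that a missing key does not raise KeyError.

```python
# Hypothetical sample data: unranked entries have no 'rank' key at all.
myarr = [
    {'name': 'Alice'},
    {'name': 'Reuben', 'rank': 4},
    {'name': 'Richard', 'rank': 1},
    {'name': 'Bob'},
]

# Option 1: tuple key; the placeholder 0 is never compared against real ranks
# because the first tuple element already separates the two groups.
option1 = sorted(myarr, key=lambda d: ('rank' not in d, d.get('rank', 0)))

# Option 2: a missing rank counts as infinity.
option2 = sorted(myarr, key=lambda d: d.get('rank', float('inf')))

names1 = [d['name'] for d in option1]
names2 = [d['name'] for d in option2]
print(names1)  # ['Richard', 'Reuben', 'Alice', 'Bob']
```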


Sidenote: relying on the existence of infinity in Python is borderline a hack, which makes any of the other mentioned solutions (tuples, object comparison), Lev's filter-then-concatenate solution, and maybe even the slightly-more-complicated cmp solution (typed up by wilson) more generalizable to other languages.
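The filter-then-concatenate answer is not shown on this page, so the following is only one plausible reading of that approach, not a quote of it: sort the ranked entries on their own, then append the rank-0 entries unchanged. The sample data is made up.

```python
# Hypothetical sample data for illustration.
myarr = [
    {'name': 'Alice', 'rank': 0},
    {'name': 'Reuben', 'rank': 4},
    {'name': 'Bob', 'rank': 0},
    {'name': 'Richard', 'rank': 1},
]

# Sort only the entries that actually have a non-zero rank...
ranked = sorted((d for d in myarr if d['rank'] != 0), key=lambda d: d['rank'])
# ...and keep the rank-0 entries in their original order.
unranked = [d for d in myarr if d['rank'] == 0]

result = ranked + unranked
names = [d['name'] for d in result]
print(names)  # ['Richard', 'Reuben', 'Alice', 'Bob']
```

This needs neither infinity nor tuple comparison, which is why it ports easily to other languages.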
                                                                        
                                                        
            
            
              
                
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  梦谈多话        
                
              
                            
                2021-02-08 11:00
              
            
            
                                                                       
You can use a function in the key param:

For an ascending sort:

sorted_master_list = sorted(myarr, key=lambda x: x.get('rank'))


Or for a descending sort:

sorted_master_list = sorted(myarr, key=lambda x: -x.get('rank'))


You can also read about the sorted function here: http://wiki.python.org/moin/HowTo/Sorting
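A quick sketch of both directions on made-up data, with reverse=True shown as an alternative to negating the key. Negation only works when every entry has a numeric rank; x.get('rank') returns None for a missing key, which cannot be negated or compared with integers in Python 3.

```python
# Hypothetical sample data; every entry has a numeric rank.
myarr = [
    {'name': 'Richard', 'rank': 1},
    {'name': 'Reuben', 'rank': 4},
    {'name': 'Bob', 'rank': 2},
]

ascending = sorted(myarr, key=lambda x: x.get('rank'))
descending = sorted(myarr, key=lambda x: -x.get('rank'))

# Alternative spelling of the descending sort without negating the key.
descending_alt = sorted(myarr, key=lambda x: x.get('rank'), reverse=True)

asc_ranks = [d['rank'] for d in ascending]
desc_ranks = [d['rank'] for d in descending]
print(asc_ranks, desc_ranks)  # [1, 2, 4] [4, 2, 1]
```

Note that a plain ascending or descending sort does not by itself move rank-0 entries behind the non-zero ranks while keeping those ascending.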
                                                                        
                                                        
            
            
              
                
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  一整个雨季        
                
              
                            
                2021-02-08 11:06
              
            
            
                                                                       
A hacky way to do it is:

sorted_master_list = sorted(myarr, key=lambda x: 99999 if x['rank'] == 0 else x['rank'])


This works fairly well if you know your maximum rank.
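If the maximum rank is not known in advance, one possible variation (an assumption on my part, not something the answer above proposes) is to compute it from the data and use max + 1 as the stand-in for rank 0:

```python
# Hypothetical sample data for illustration.
myarr = [
    {'name': 'Alice', 'rank': 0},
    {'name': 'Reuben', 'rank': 4},
    {'name': 'Richard', 'rank': 1},
]

# default=0 keeps max() from raising ValueError on an empty list.
max_rank = max((d['rank'] for d in myarr), default=0)

sorted_master_list = sorted(
    myarr,
    key=lambda x: max_rank + 1 if x['rank'] == 0 else x['rank'],
)
names = [d['name'] for d in sorted_master_list]
print(names)  # ['Richard', 'Reuben', 'Alice']
```

This costs one extra pass over the list but removes the guess behind a hard-coded 99999.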
                                                                        
                                                        
            
            
              
                
                        
                    
                
          
          	          
   
          