I have the following Python array of dictionaries:
myarr = [ { 'name': 'Richard', 'rank': 1 },
{ 'name': 'Reuben', 'rank': 4 },
{ 'name': 'Reece'
Option 1:
key=lambda d:(d['rank']==0, d['rank'])
Option 2:
key=lambda d:d['rank'] if d['rank']!=0 else float('inf')
Demo:
"I'd like to sort it by the rank values, ordering as follows: 1-2-3-4-0-0-0." --original poster
>>> sorted([0,0,0,1,2,3,4], key=lambda x:(x==0, x))
[1, 2, 3, 4, 0, 0, 0]
>>> sorted([0,0,0,1,2,3,4], key=lambda x:x if x!=0 else float('inf'))
[1, 2, 3, 4, 0, 0, 0]
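The demo above sorts bare integers. As a rough sketch of the same two keys applied to a list of dictionaries, here is a small example; the records are hypothetical stand-ins (only the 'name' and 'rank' keys come from the question).

```python
# Hypothetical records; only the 'name' and 'rank' keys come from the question.
people = [
    {'name': 'A', 'rank': 0},
    {'name': 'B', 'rank': 3},
    {'name': 'C', 'rank': 1},
    {'name': 'D', 'rank': 0},
    {'name': 'E', 'rank': 2},
]

# Option 1: tuple key, the first element pushes rank-0 entries to the end.
by_tuple = sorted(people, key=lambda d: (d['rank'] == 0, d['rank']))

# Option 2: treat rank 0 as infinity.
by_inf = sorted(people, key=lambda d: d['rank'] if d['rank'] != 0 else float('inf'))

ranks_tuple = [d['rank'] for d in by_tuple]
ranks_inf = [d['rank'] for d in by_inf]
print(ranks_tuple)  # [1, 2, 3, 0, 0]
print(ranks_inf)    # [1, 2, 3, 0, 0]
```

Because Python's sort is stable, the rank-0 entries keep their original relative order (A before D) under both keys.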
Additional comments:
"Please could you explain to me (a Python novice) what it's doing? I can see that it's a lambda, which I know is an anonymous function: what's the bit in brackets?" – OP comment
Indexing/slice notation:
itemgetter('rank') is the same thing as lambda x: x['rank'], which is the same thing as the function:
def getRank(myDict):
    return myDict['rank']
The [...] is called the indexing/slice notation; see "Explain Python's slice notation". Also note that someArray[n] is common notation in many programming languages for indexing, though those languages may not support slices of the form [start:end] or [start:end:step].
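To make the equivalence concrete, here is a minimal sketch that calls all three forms on the same dictionary and then shows plain indexing next to the two slice forms. The name get_rank and the letters list are illustrative, not from the original.

```python
from operator import itemgetter

def get_rank(my_dict):
    """Named-function version of the same lookup."""
    return my_dict['rank']

record = {'name': 'Richard', 'rank': 1}

# All three callables perform the lookup record['rank'].
results = [
    itemgetter('rank')(record),
    (lambda x: x['rank'])(record),
    get_rank(record),
]
print(results)  # [1, 1, 1]

# Indexing vs. slicing on a list (illustrative data).
letters = ['a', 'b', 'c', 'd', 'e']
print(letters[1])    # b
print(letters[1:4])  # ['b', 'c', 'd']
print(letters[::2])  # ['a', 'c', 'e']
```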
key= vs cmp= vs rich comparison:
As for what is going on, there are two common ways to specify how a sort orders its elements: one is with a key function, and the other is with a cmp function (deprecated, and removed entirely in Python 3, but a lot more versatile). A cmp function lets you specify arbitrarily how two elements should compare (input: a, b; output: whether a<b, a>b, or a==b). Though legitimate, it gives us no major benefit here (we'd have to duplicate code in an awkward manner), and a key function is more natural for your case. (See "object rich comparison" for how to implicitly define cmp= in an elegant but possibly excessive way.)
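For comparison, here is a rough sketch of what a cmp-style solution could look like. Since sorted() no longer accepts cmp= in Python 3, it wraps the comparison with functools.cmp_to_key. The function name compare_ranks is hypothetical, and this is not the cmp solution from the other answer, just one possible shape of it.

```python
from functools import cmp_to_key

def compare_ranks(a, b):
    """Old-style comparison: negative if a sorts first, positive if b does, 0 if tied."""
    a_zero, b_zero = a == 0, b == 0
    if a_zero != b_zero:
        # Exactly one of the two is zero: the zero sorts last.
        return 1 if a_zero else -1
    # Both zero or both non-zero: ordinary numeric comparison.
    return (a > b) - (a < b)

ranks = [0, 3, 0, 1, 4, 2, 0]
result = sorted(ranks, key=cmp_to_key(compare_ranks))
print(result)  # [1, 2, 3, 4, 0, 0, 0]
```

Note how the special case for zero has to be spelled out for both arguments, which is the kind of duplication a key function avoids.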
Implementing your key function:
Unfortunately 0 is an element of the integers and thus has a natural ordering: 0 is normally < 1,2,3... Thus if we want to impose an extra rule, we need to sort the list at a "higher level". We do this by making the key a tuple whose first element is d['rank']==0: tuples are sorted first by their 1st element, then by their 2nd element. True will always be ordered after False, so all the Trues will be ordered after the Falses; within each group they will then sort as normal: (True,1)<(True,2)<(True,3)<..., (False,1)<(False,2)<..., (False,*)<(True,*). The alternative (option 2) merely assigns rank-0 dictionaries a value of infinity, since that is guaranteed to be above any possible rank.
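The ordering rules described above can be checked directly. This is a small sketch that builds the tuple keys by hand for a few illustrative ranks and sorts the keys themselves, so the grouping by the first element is visible.

```python
# Tuples compare element by element, left to right.
assert (False, 1) < (False, 2)   # same first element: second element decides
assert (False, 4) < (True, 0)    # False sorts before True, whatever follows
assert False < True              # bool is a subclass of int: False == 0, True == 1

# Option 2 relies on infinity comparing greater than any finite rank.
assert float('inf') > 10**100

ranks = [0, 2, 0, 1]             # illustrative ranks
keys = [(r == 0, r) for r in ranks]
sorted_keys = sorted(keys)
print(sorted_keys)  # [(False, 1), (False, 2), (True, 0), (True, 0)]
```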
More general alternative - object rich comparison:
The even more general solution would be to create a class representing records, then implement __lt__, __le__, __eq__, __ne__, __gt__, and __ge__ (the rich comparison operators), or alternatively just implement one of the ordering methods plus __eq__ and use the @functools.total_ordering decorator. This will cause objects of that class to use the custom logic whenever you use comparison operators (e.g. x=Record(name='Joe', rank=12); y=Record(...); x<y). Since the sorted(...) function compares elements with < by default, this will make the behavior automatic when sorting, and in other instances where you use < and other comparison operators. This may or may not be excessive depending on your use case.
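A minimal sketch of the class-based approach follows, assuming a Record class with name and rank attributes (the class body, the _sort_key helper, and the sample names other than 'Joe' are illustrative). It defines only __eq__ and __lt__ and lets functools.total_ordering derive the rest.

```python
from functools import total_ordering

@total_ordering
class Record:
    """Hypothetical record type: rank 0 sorts after every non-zero rank."""

    def __init__(self, name, rank):
        self.name = name
        self.rank = rank

    def _sort_key(self):
        # Same tuple trick as Option 1, kept in one place.
        return (self.rank == 0, self.rank)

    def __eq__(self, other):
        if not isinstance(other, Record):
            return NotImplemented
        return self._sort_key() == other._sort_key()

    def __lt__(self, other):
        if not isinstance(other, Record):
            return NotImplemented
        return self._sort_key() < other._sort_key()

    def __repr__(self):
        return f"Record({self.name!r}, {self.rank})"

records = [Record(name='Joe', rank=12), Record(name='Ann', rank=0), Record(name='Bob', rank=3)]
ordered = sorted(records)  # no key= needed; sorted() uses Record.__lt__
print(ordered)  # [Record('Bob', 3), Record('Joe', 12), Record('Ann', 0)]
```

One design consequence worth knowing: with this sketch two records with the same rank compare equal regardless of name, and defining __eq__ without __hash__ makes instances unhashable.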
Cleaner alternative - don't overload 0 with semantics:
I should however point out that it's a bit artificial to put 0s behind 1, 2, 3, 4, etc. Whether this is justified depends on whether rank=0 really means rank=0, i.e. whether rank-0 entries are really "lower" than rank-1 entries (which in turn are really "lower" than rank-2 entries, and so on). If this is truly the case, then your method is perfectly fine. If this is not the case, then you might consider omitting the 'rank':... entry as opposed to setting 'rank':0. Then you could sort by Lev Levitsky's answer using 'rank' in d, or by:
Option 1 with different scheme:
key=lambda d: ('rank' not in d, d.get('rank', 0))
Option 2 with different scheme:
key=lambda d: d.get('rank', float('inf'))
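As a quick sketch of the "omit the key" scheme, the example below sorts hypothetical records where unranked entries simply have no 'rank' entry. The tuple version uses d.get('rank', 0) for the second element so that a missing key does not raise KeyError when the tuple is built.

```python
# Hypothetical records; the unranked one has no 'rank' key at all.
people = [
    {'name': 'A'},
    {'name': 'B', 'rank': 2},
    {'name': 'C', 'rank': 1},
]

# Tuple scheme: entries without a rank get (True, 0) and sort last.
by_tuple = sorted(people, key=lambda d: ('rank' not in d, d.get('rank', 0)))

# Infinity scheme: a missing rank counts as infinity.
by_inf = sorted(people, key=lambda d: d.get('rank', float('inf')))

names_tuple = [d['name'] for d in by_tuple]
names_inf = [d['name'] for d in by_inf]
print(names_tuple)  # ['C', 'B', 'A']
print(names_inf)    # ['C', 'B', 'A']
```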
Sidenote: relying on the existence of infinity in Python is borderline a hack, which makes any of the other mentioned solutions (tuples, object comparison, Lev's filter-then-concatenate solution, and maybe even the slightly more complicated cmp solution typed up by wilson) more generalizable to other languages.
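For reference, a filter-then-concatenate approach needs neither infinity nor tuple keys. The sketch below is an assumption about what such an approach could look like, not code taken from the referenced answer, and the sample ranks are illustrative.

```python
ranks = [0, 3, 0, 1, 2]  # illustrative data

# Sort only the non-zero ranks, then append the zeros unchanged.
ranked = sorted(r for r in ranks if r != 0)
unranked = [r for r in ranks if r == 0]
result = ranked + unranked
print(result)  # [1, 2, 3, 0, 0]
```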