I have a collections.defaultdict(int) that I'm building to keep count of how many times a key shows up in a set of data. I later want to be able to sort it (obviously by turnin…
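(For context, a minimal sketch of the kind of counting dict the question describes; the sample data below is made up.)

from collections import defaultdict

# Tally how many times each key shows up in the data.
data = ["apple", "banana", "apple", "cherry", "apple", "banana"]
counts = defaultdict(int)
for item in data:
    counts[item] += 1
# counts == {'apple': 3, 'banana': 2, 'cherry': 1}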
Note: I'm putting this in as an answer so that it gets seen. I don't want upvotes. If you want to upvote anyone, upvote Nadia.
The currently accepted answer gives timing results that are based on a trivially small dataset (size == 6 - (-5) == 11), so the differences in cost between the various methods are masked by fixed overhead. Real use cases like "what are the most frequent words in a text?" or "what are the most frequent names in a membership list or census?" involve much larger datasets.
Repeating the experiment with range(-n,n+1) (Windows box, Python 2.6.4, all times in microseconds):
n=5: 11.5, 9.34, 11.3
n=50: 65.5, 46.2, 68.1
n=500: 612, 423, 614
These results are NOT "slightly" different. The itemgetter answer is a clear winner on speed.
There was also mention of "the simplicity of the get idiom". Putting them close together for ease of comparison:
[(k, adict[k]) for k in sorted(adict, key=adict.get, reverse=True)]
sorted(adict.iteritems(), key=itemgetter(1), reverse=True)
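If you want to reproduce this kind of measurement yourself, here is a rough timeit harness (my own sketch, not the exact script used for the numbers above; it assumes Python 2 so that iteritems() works, and the dict contents are made up):

from timeit import Timer

setup_template = """
from operator import itemgetter
n = {n}
adict = dict((i, i * i) for i in range(-n, n + 1))
"""

stmts = [
    ("get idiom",
     "[(k, adict[k]) for k in sorted(adict, key=adict.get, reverse=True)]"),
    ("itemgetter",
     "sorted(adict.iteritems(), key=itemgetter(1), reverse=True)"),
]

for n in (5, 50, 500):
    for label, stmt in stmts:
        setup = setup_template.format(n=n)
        # best of 3 repeats, 1000 runs each; report microseconds per run
        usec = min(Timer(stmt, setup).repeat(3, 1000)) / 1000 * 1e6
        print("%-10s n=%-4d %8.1f usec" % (label, n, usec))

The exact numbers will differ by machine and Python version; it's the trend as n grows that matters.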
The get idiom not only looks up the dict twice (as J. F. Sebastian has pointed out), it also builds an intermediate list (the result of sorted()) and then iterates over that list to build the final result list (the sketch below makes the double lookup concrete). I'd call that baroque, not simple. YMMV.
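Here is a small instrumented dict (the class and attribute names are my own, and the data is made up) that counts every key lookup, just to show where the extra work goes; again this assumes Python 2 for iteritems():

from operator import itemgetter

class CountingDict(dict):
    """A dict that counts how many times a key is looked up."""
    def __init__(self, *args, **kwargs):
        dict.__init__(self, *args, **kwargs)
        self.lookups = 0
    def __getitem__(self, key):
        self.lookups += 1
        return dict.__getitem__(self, key)
    def get(self, key, default=None):
        self.lookups += 1
        return dict.get(self, key, default)

adict = CountingDict(apple=3, banana=1, cherry=2)

adict.lookups = 0
[(k, adict[k]) for k in sorted(adict, key=adict.get, reverse=True)]
print(adict.lookups)   # 6 -- each of the 3 keys goes through get() and [] once

adict.lookups = 0
sorted(adict.iteritems(), key=itemgetter(1), reverse=True)
print(adict.lookups)   # 0 -- iteritems() hands out (key, value) pairs directly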