After reading Guido\'s Sorting a million 32-bit integers in 2MB of RAM using Python, I discovered the heapq
module, but the concept is pretty abstract to me.
The heapq module is commonly use to implement priority queues.
You see priority queues in event schedulers that are constantly adding new events and need to use a heap to efficiently locate the next scheduled event. Some examples include:
The heapq docs include priority queue implementation notes which address the common use cases.
In addition, heaps are great for implementing partial sorts. For example, heapq.nsmallest and heapq.nlargest can be much more memory efficient and do many fewer comparisons than a full sort followed by a slice:
>>> from heapq import nlargest
>>> from random import random
>>> nlargest(5, (random() for i in xrange(1000000)))
[0.9999995650034837, 0.9999985756262746, 0.9999971934450994, 0.9999960394998497, 0.9999949126363714]