Given a list of tuples where each tuple consists of a probability and an item, I'd like to sample an item according to its probability. For example, given the list [ (.3, 'a'),
I reckon NumPy's multinomial function is still a fairly easy way to get samples from a distribution in random order. This is just one way:
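    import numpy

    def getSamples(inputs, size):
        # Split [(probability, item), ...] into two parallel tuples.
        probabilities, items = zip(*inputs)
        # Draw how many times each item occurs among `size` samples.
        sampleCounts = numpy.random.multinomial(size, probabilities)
        samples = numpy.array(tuple(countsToSamples(sampleCounts, items)))
        # The expanded samples are grouped by item, so shuffle in place.
        numpy.random.shuffle(samples)
        return samples

    def countsToSamples(counts, items):
        # Yield each item as many times as it was drawn.
        for value, repeats in zip(items, counts):
            for _i in range(repeats):
                yield value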
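Here the first argument is a list in the format from the question, e.g. [(.2, 'a'), (.4, 'b'), (.3, 'c')], and size is the number of samples you need. Note that multinomial expects the probabilities to sum to 1, and this example list sums to 0.9.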
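To make the count-expansion step concrete, here is a minimal sketch that uses only the standard library. The counts [2, 0, 3] are hand-picked for illustration and stand in for one multinomial draw of size 5. The name counts_to_samples is a hypothetical list-returning variant of the generator above, not part of NumPy.

```python
import random
from itertools import chain, repeat


def counts_to_samples(counts, items):
    """Expand per-item counts into a flat list of items."""
    return list(
        chain.from_iterable(repeat(item, n) for item, n in zip(items, counts))
    )


# Hand-picked counts (an assumption for illustration), as if a multinomial
# draw of size 5 had returned 2 x 'a', 0 x 'b', 3 x 'c'.
items = ['a', 'b', 'c']
counts = [2, 0, 3]

samples = counts_to_samples(counts, items)
print(samples)  # ['a', 'a', 'c', 'c', 'c']

# The expanded list is grouped by item, so shuffle it to get a random order.
random.shuffle(samples)
print(samples)
```

The shuffle matters because the expansion alone always emits every 'a' before every 'c'. Without it, the order of the samples would leak the grouping.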