How do I “randomly” select numbers with a specified bias toward a particular number

Question

How do I generate random numbers with a specified bias toward one number? For example, how would I pick between two numbers, 1 and 2, with a 90% bias toward 1? The best I can come up with is...

import random

print(random.choice([1, 1, 1, 1, 1, 1, 1, 1, 1, 2]))

Is there a better way to do this? The method I showed works in simple examples but eventually I'll have to do more complicated selections with biases that are very specific (such as 37.65% bias) which would require a very long list.

EDIT: I should have added that I'm stuck on numpy 1.6 so I can't use numpy.random.choice.

Answer 1:

np.random.choice (added in NumPy 1.7) has a p parameter that specifies the probability of each choice:

np.random.choice([1,2], p=[0.9, 0.1])
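For readers who are not tied to NumPy, the standard library has a comparable call. The sketch below assumes Python 3.6 or later, which is an assumption on my part since the question's code is Python 2; random.choices accepts a weights argument and a sample count k. The seed is there only to make the run repeatable.

```python
import random

# Assumes Python 3.6 or later, where random.choices exists.
rng = random.Random(42)  # seeded only so the run is repeatable

# Draw 10 values from [1, 2] with a 90% bias toward 1.
sample = rng.choices([1, 2], weights=[0.9, 0.1], k=10)
print(sample)

# A very specific bias such as 37.65% needs no long list either.
biased = rng.choices([1, 2], weights=[37.65, 62.35], k=100000)
print(biased.count(1) / len(biased))  # close to 0.3765
```

The weights do not have to sum to 1; they are treated as relative weights.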

Answer 2:

The algorithm used by np.random.choice() is relatively simple to replicate if you only need to draw one item at a time.

import numpy as np

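def simple_weighted_choice(choices, weights, prng=np.random):
    # Running totals of the weights; the last entry is the total weight.
    running_sum = np.cumsum(weights)
    # Uniform draw between 0 and the total weight.
    u = prng.uniform(0.0, running_sum[-1])
    # Index of the first running total that is >= u.
    i = np.searchsorted(running_sum, u, side='left')
    return choices[i]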
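The same cumulative-sum-and-search idea does not strictly need NumPy. The following is a minimal sketch using only the standard library; the name weighted_choice_stdlib and the rng parameter are illustrative and not part of any library. It uses bisect_right rather than a left-sided search so that an item with zero weight is never returned.

```python
import bisect
import random


def weighted_choice_stdlib(choices, weights, rng=random):
    """Pick one item from choices with probability proportional to weights."""
    # Build the running totals, e.g. [37.65, 100.0] for weights [37.65, 62.35].
    running = []
    total = 0.0
    for w in weights:
        total += w
        running.append(total)
    # u is uniform on [0, total); find the first running total greater than u.
    u = rng.random() * total
    i = bisect.bisect_right(running, u)
    # Guard against floating-point rounding pushing the index past the end.
    i = min(i, len(choices) - 1)
    return choices[i]


if __name__ == "__main__":
    rng = random.Random(0)  # seeded only so the run is repeatable
    draws = [weighted_choice_stdlib([1, 2], [37.65, 62.35], rng)
             for _ in range(100000)]
    print(draws.count(1) / float(len(draws)))  # close to 0.3765
```

Rebuilding the running totals on every call is fine for occasional draws; for many draws from the same weights it would be cheaper to compute them once and reuse them.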

Answer 3:

For random sampling with replacement, the essential code in np.random.choice is

            cdf = p.cumsum()
            cdf /= cdf[-1]
            uniform_samples = self.random_sample(shape)
            idx = cdf.searchsorted(uniform_samples, side='right')

So we can use that in a new function that does the same thing (but without error checking and other niceties):

import numpy as np


def weighted_choice(values, p, size=1):
    values = np.asarray(values)

    cdf = np.asarray(p, dtype=float).cumsum()  # float so the in-place division works for integer weights
    cdf /= cdf[-1]

    uniform_samples = np.random.random_sample(size)
    idx = cdf.searchsorted(uniform_samples, side='right')
    sample = values[idx]

    return sample

Examples:

In [113]: weighted_choice([1, 2], [0.9, 0.1], 20)
Out[113]: array([1, 1, 1, 1, 1, 2, 1, 1, 1, 2, 1, 2, 1, 1, 1, 1, 1, 1, 1, 1])

In [114]: weighted_choice(['cat', 'dog', 'goldfish'], [0.3, 0.6, 0.1], 15)
Out[114]: 
array(['cat', 'dog', 'cat', 'dog', 'dog', 'dog', 'dog', 'dog', 'dog',
       'dog', 'dog', 'dog', 'goldfish', 'dog', 'dog'], 
      dtype='|S8')
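The side='right' argument in the code above matters when a choice has zero probability. Python's bisect module follows the same left/right convention as searchsorted, so the difference can be sketched without NumPy. The cdf values here are made up for illustration.

```python
import bisect

# Cumulative distribution for probabilities [0.0, 0.9, 0.1]:
# the first item has zero probability and must never be selected.
cdf = [0.0, 0.9, 1.0]

u = 0.0  # the smallest value a uniform sample on [0, 1) can take

left = bisect.bisect_left(cdf, u)    # analogous to side='left'
right = bisect.bisect_right(cdf, u)  # analogous to side='right'

print(left)   # 0: selects the zero-probability item
print(right)  # 1: skips it and selects the first item with positive mass
```

Because the uniform sample is always strictly less than 1.0, the right-sided search also never returns an index past the end of the array.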

Answer 4:

Something like this should do the trick. It works with arbitrary floating-point probabilities and does not need a long list of repeated choices.

import random

try:
    from itertools import accumulate  # Python 3.2+
except ImportError:
    def accumulate(l):  # fallback for Python 2.x
        tmp = 0
        for n in l:
            tmp += n
            yield tmp

def random_choice(a, p):
    sums = float(sum(p))  # float avoids integer division on Python 2
    accum = accumulate(p)  # cumulative sums of the probabilities
    accum = [n / sums for n in accum]  # normalize so the last entry is 1.0
    rnd = random.random()  # uniform on [0.0, 1.0)
    for i, item in enumerate(accum):
        if rnd < item:
            return a[i]
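If even the normalized list is unwanted, one possible variation is to scale the random number by the total weight and walk the running sum directly. This is a sketch under that assumption, not the answer author's code; the name random_choice_streaming and the rng parameter are hypothetical.

```python
import random


def random_choice_streaming(a, p, rng=random):
    """Return one element of a, chosen with probability proportional to p."""
    # Scale the uniform draw by the total weight instead of normalizing p.
    threshold = rng.random() * sum(p)
    running = 0.0
    for item, weight in zip(a, p):
        running += weight
        if threshold < running:
            return item
    # Floating-point rounding can leave threshold >= running at the end;
    # fall back to the last item in that rare case.
    return a[-1]


if __name__ == "__main__":
    rng = random.Random(1)  # seeded only so the run is repeatable
    draws = [random_choice_streaming([1, 2], [0.9, 0.1], rng)
             for _ in range(100000)]
    print(draws.count(1) / float(len(draws)))  # close to 0.9
```

This keeps memory use constant regardless of how many choices there are, at the cost of a linear scan per draw.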

Answer 5:

It is easy to get an index into a table of cumulative probabilities. Make a table with one cumulative threshold per choice, ending at 1, for example: prb = [0.5, 0.65, 0.8, 1]

Get index with something like this:

def get_in_range(prb, pointer):
    """Returns index of matching range in table prb"""
    found = 0
    for p in prb:
        if pointer > p:
            found += 1
    return found

Index returned by get_in_range may be used to point in corresponding table of values.

Example usage:

import random
values = [1, 2, 3]
weights = [0.9, 0.99, 1]  # cumulative thresholds, not individual weights
result = values[get_in_range(weights, random.random())]

This chooses 1 with probability 90%, 2 with 9%, and 3 with 1%.
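A cumulative table can be hard to read at a glance, so it may help to convert it back into per-item probabilities by taking differences between neighbouring thresholds. The helper below is a hypothetical illustration, not part of the answer's code.

```python
def table_to_probabilities(prb):
    """Convert a cumulative table into per-item probabilities."""
    probabilities = []
    previous = 0.0
    for upper in prb:
        # Each item owns the interval between the previous threshold and its own.
        probabilities.append(upper - previous)
        previous = upper
    return probabilities


# Rounded to two places to hide floating-point noise in the differences.
print([round(x, 2) for x in table_to_probabilities([0.9, 0.99, 1])])
# prints [0.9, 0.09, 0.01]
```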

Source: https://stackoverflow.com/questions/25507558/how-do-i-randomly-select-numbers-with-a-specified-bias-toward-a-particular-num

Tags

python

numpy

scipy