A weighted version of random.choice

闹比i 2020-11-21 06:29

I needed to write a weighted version of random.choice (each element in the list has a different probability of being selected). This is what I came up with:



        
25 Answers
  • 2020-11-21 06:51

    If you don't mind using numpy, you can use numpy.random.choice.

    For example:

    import numpy
    
    # [element, probability] pairs; the probabilities must sum to 1
    items = [["item1", 0.2], ["item2", 0.3], ["item3", 0.45], ["item4", 0.05]]
    elems = [i[0] for i in items]
    probs = [i[1] for i in items]
    
    trials = 1000
    results = [0] * len(items)
    for i in range(trials):
        # numpy.random.choice needs a 1-D sequence, so pass elems rather than items
        res = numpy.random.choice(elems, p=probs)  # This is where the item is selected!
        results[elems.index(res)] += 1
    results = [r / float(trials) for r in results]
    print("item\texpected\tactual")
    for i in range(len(probs)):
        print("%s\t%0.4f\t%0.4f" % (elems[i], probs[i], results[i]))
    

    If you know how many selections you need to make in advance, you can do it without a loop like this:

    numpy.random.choice(elems, trials, p=probs)
    
  • 2020-11-21 06:51

    Crude, but may be sufficient:

    import random
    weighted_choice = lambda s : random.choice(sum(([v]*wt for v,wt in s),[]))
    

    Does it work?

    # define choices and relative weights
    choices = [("WHITE",90), ("RED",8), ("GREEN",2)]
    
    # initialize tally dict, keyed by value rather than by (value, weight) pair
    tally = dict.fromkeys((v for v, wt in choices), 0)
    
    # tally up 1000 weighted choices
    for i in range(1000):
        tally[weighted_choice(choices)] += 1
    
    print(list(tally.items()))
    

    Prints:

    [('WHITE', 904), ('GREEN', 22), ('RED', 74)]
    

    Assumes that all weights are integers. They don't have to add up to 100, I just did that to make the test results easier to interpret. (If weights are floating point numbers, multiply them all by 10 repeatedly until all weights >= 1.)

    weights = [.6, .2, .001, .199]
    while any(w < 1.0 for w in weights):
        weights = [w*10 for w in weights]
    # round before converting so float error cannot truncate a weight by one
    weights = [int(round(w)) for w in weights]
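
    The list-expansion trick above needs integer weights. As an alternative, here is a minimal sketch that accepts float weights directly by bisecting a list of cumulative sums. The function name weighted_choice_float, the rng parameter and the "BLUE" label are assumptions made for illustration; they are not part of the answer above.

```python
import bisect
import itertools
import random

def weighted_choice_float(pairs, rng=random):
    """Pick one value from (value, weight) pairs; weights may be floats."""
    values = [v for v, _ in pairs]
    cum_weights = list(itertools.accumulate(w for _, w in pairs))
    total = cum_weights[-1]
    if total <= 0:
        raise ValueError("total weight must be positive")
    # rng.random() is in [0, 1); capping hi at the last index guards
    # against rounding pushing the target up to `total` itself
    idx = bisect.bisect(cum_weights, rng.random() * total, 0, len(values) - 1)
    return values[idx]

# Hypothetical labels attached to the float weights used above
choices = [("WHITE", 0.6), ("RED", 0.2), ("GREEN", 0.001), ("BLUE", 0.199)]
print(weighted_choice_float(choices))
```

    This uses memory proportional to the number of distinct values rather than to the sum of the weights, so very small weights such as 0.001 need no scaling.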
    
  • 2020-11-21 06:51

    If you have a weighted dictionary instead of a list you can write this

    import random
    
    items = { "a": 10, "b": 5, "c": 1 }
    random.choice([k for k in items for dummy in range(items[k])])
    

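    Note that [k for k in items for dummy in range(items[k])] produces a list such as ['a', 'a', 'a', 'a', 'a', 'a', 'a', 'a', 'a', 'a', 'c', 'b', 'b', 'b', 'b', 'b']. The order of the elements follows the dict's iteration order and does not affect the result of random.choice.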
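
    If the weights are large, building the expanded list can be wasteful. A minimal sketch of the same selection without the intermediate list walks the dict while keeping a running total. The name weighted_choice_dict and the rng parameter are assumptions for illustration.

```python
import random

def weighted_choice_dict(items, rng=random):
    """Pick a key from {key: weight} without building the expanded list."""
    total = sum(items.values())
    if total <= 0:
        raise ValueError("total weight must be positive")
    threshold = rng.random() * total
    running = 0
    for key, weight in items.items():
        running += weight
        if threshold < running:
            return key
    return key  # only reached if float rounding leaves threshold == total

items = {"a": 10, "b": 5, "c": 1}
print(weighted_choice_dict(items))
```

    Each call is a single pass over the dict, and the weights no longer have to be integers.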

  • 2020-11-21 06:51

    I needed something like this that was really fast and really simple. After searching for ideas, I finally built this template. The idea is to receive the weighted values as JSON from an API, which is simulated here by the dict.

    Then translate it into a list in which each value repeats in proportion to its weight, and use random.choice to select a value from the list.

    I tried running it with 10, 100 and 1000 iterations. The distribution seems pretty solid.

    import random
    
    def weighted_choice(weighted_dict):
        """Input example: dict(apples=60, oranges=30, pineapples=10)"""
        weight_list = []
        # repeat each key as many times as its (integer) weight
        for key, weight in weighted_dict.items():
            weight_list += [key] * weight
        return random.choice(weight_list)
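
    The informal check with 10, 100 and 1000 iterations could be reproduced roughly as follows. This is only a sketch: observed_shares, the rng parameter and the fixed seed are assumptions added so the run is repeatable, and weighted_choice is restated here so the block runs on its own.

```python
import random
from collections import Counter

def weighted_choice(weighted_dict, rng=random):
    """Same list-expansion idea as above; weights must be non-negative integers."""
    weight_list = []
    for key, weight in weighted_dict.items():
        weight_list += [key] * weight
    return rng.choice(weight_list)

def observed_shares(weighted_dict, iterations, seed=0):
    """Return {key: fraction of draws} over `iterations` draws."""
    rng = random.Random(seed)  # seeded only so repeated runs match
    counts = Counter(weighted_choice(weighted_dict, rng) for _ in range(iterations))
    return {key: counts[key] / iterations for key in weighted_dict}

fruit = dict(apples=60, oranges=30, pineapples=10)
for n in (10, 100, 1000):
    print(n, observed_shares(fruit, n))
```

    With more iterations, the observed shares should settle near 0.6, 0.3 and 0.1.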
    
  • 2020-11-21 06:53

    Here is the version that was included in the standard library as of Python 3.6:

    import random
    import itertools as _itertools
    import bisect as _bisect
    
    class Random36(random.Random):
        "Show the code included in the Python 3.6 version of the Random class"
    
        def choices(self, population, weights=None, *, cum_weights=None, k=1):
            """Return a k sized list of population elements chosen with replacement.
    
            If the relative weights or cumulative weights are not specified,
            the selections are made with equal probability.
    
            """
            random = self.random
            if cum_weights is None:
                if weights is None:
                    _int = int
                    total = len(population)
                    return [population[_int(random() * total)] for i in range(k)]
                cum_weights = list(_itertools.accumulate(weights))
            elif weights is not None:
                raise TypeError('Cannot specify both weights and cumulative weights')
            if len(cum_weights) != len(population):
                raise ValueError('The number of weights does not match the population')
            bisect = _bisect.bisect
            total = cum_weights[-1]
            return [population[bisect(cum_weights, random() * total)] for i in range(k)]
    

    Source: https://hg.python.org/cpython/file/tip/Lib/random.py#l340
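
    On Python 3.6 or later the method can be called directly as random.choices. A short usage sketch, in which the population, the weights and the seed are illustrative assumptions:

```python
import random

population = ["WHITE", "RED", "GREEN"]
weights = [90, 8, 2]

rng = random.Random(0)  # seeded only for repeatability

# Ten draws with replacement, using relative weights
sample = rng.choices(population, weights=weights, k=10)

# Same distribution expressed as cumulative weights, which skips the accumulate step
cum_sample = rng.choices(population, cum_weights=[90, 98, 100], k=10)

# choices() always returns a list, so a single pick is its first element
single = rng.choices(population, weights=weights)[0]
print(sample, cum_sample, single)
```

    Passing both weights and cum_weights raises TypeError, as the code above shows.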

  • 2020-11-21 06:53
    import numpy as np
    w=np.array([ 0.4,  0.8,  1.6,  0.8,  0.4])
    np.random.choice(w, p=w/sum(w))
    