Weighted random selection from array

醉酒成梦 2020-11-28 03:13

I would like to randomly select one element from an array, but each element has a known probability of selection.

All chances together (within the array) sum to 1.

13 answers
  • 2020-11-28 03:17

    This can be done in O(1) expected time per sample as follows.

    Compute the CDF F(i) for each element i as the sum of the probabilities of all elements with index less than or equal to i, with F(0) = 0.

    Define the range r(i) of an element i to be the half-open interval [F(i - 1), F(i)).

    For each interval [(i - 1)/n, i/n], create a bucket consisting of the list of the elements whose range overlaps the interval. This takes O(n) time in total for the full array as long as you are reasonably careful.

    When you randomly sample the array, you simply compute which bucket the random number is in, and compare with each element of the list until you find the interval that contains it.

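    The cost of a sample is proportional to the expected length of a randomly chosen list, which is less than 2: the ranges cross at most n - 1 interior bucket boundaries in total, so the n lists hold at most 2n - 1 entries between them.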
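
    The steps above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the names `build_buckets` and `sample` are made up for this example, the probabilities are assumed to sum to 1, and the optional `u` argument exists only so the lookup can be checked with a fixed number.

```python
import random
from itertools import accumulate


def build_buckets(probs):
    """Return the CDF and, for each of n equal-width slices of [0, 1),
    the indices of the elements whose range overlaps that slice."""
    n = len(probs)
    cdf = list(accumulate(probs))
    buckets = [[] for _ in range(n)]
    lo = 0.0
    for i, hi in enumerate(cdf):
        if hi > lo:  # zero-probability elements can never be selected
            first = min(int(lo * n), n - 1)
            last = min(int(hi * n), n - 1)
            for b in range(first, last + 1):
                buckets[b].append(i)
        lo = hi
    return cdf, buckets


def sample(cdf, buckets, u=None):
    """Pick an index; u is a number in [0, 1), drawn at random if omitted."""
    n = len(cdf)
    if u is None:
        u = random.random()
    # Each bucket lists its elements in increasing order, so the first
    # element whose CDF value exceeds u is the one whose range contains u.
    for i in buckets[min(int(u * n), n - 1)]:
        if u < cdf[i]:
            return i
    # Only reachable if rounding left the final CDF value slightly below 1.
    return buckets[-1][-1]
```

    Building the buckets is a one-off cost; after that, `sample(cdf, buckets)` only scans the short list belonging to one bucket.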

  • 2020-11-28 03:18

    Ruby solution using the pickup gem:

    require 'pickup'
    
    chances = {0=>80, 1=>20}
    picker = Pickup.new(chances)
    

    Example:

    5.times.collect {
      picker.pick(5)
    }
    

    gave output:

    [[0, 0, 0, 0, 0], 
     [0, 0, 0, 0, 0], 
     [0, 0, 0, 1, 1], 
     [0, 0, 0, 0, 0], 
     [0, 0, 0, 0, 1]]
    
  • 2020-11-28 03:20

    I have found this article to be the most useful for understanding this problem fully. This Stack Overflow question may also be what you're looking for.


    I believe the optimal solution is to use the Alias Method (wikipedia). It requires O(n) time to initialize, O(1) time to make a selection, and O(n) memory.

    Here is the algorithm for generating the result of rolling a weighted n-sided die (from here it is trivial to select an element from a length-n array), as taken from this article. The author assumes you have functions for rolling a fair die (floor(random() * n)) and flipping a biased coin (random() < p).

    Algorithm: Vose's Alias Method

    Initialization:

    1. Create arrays Alias and Prob, each of size n.
    2. Create two worklists, Small and Large.
    3. Multiply each probability by n.
    4. For each scaled probability p_i:
      1. If p_i < 1, add i to Small.
      2. Otherwise (p_i ≥ 1), add i to Large.
    5. While Small and Large are not empty: (Large might be emptied first)
      1. Remove the first element from Small; call it l.
      2. Remove the first element from Large; call it g.
      3. Set Prob[l] = p_l.
      4. Set Alias[l] = g.
      5. Set p_g := (p_g + p_l) − 1. (This is a more numerically stable option.)
      6. If p_g < 1, add g to Small.
      7. Otherwise (p_g ≥ 1), add g to Large.
    6. While Large is not empty:
      1. Remove the first element from Large; call it g.
      2. Set Prob[g] = 1.
    7. While Small is not empty: This is only possible due to numerical instability.
      1. Remove the first element from Small; call it l.
      2. Set Prob[l] = 1.

    Generation:

    1. Generate a fair die roll from an n-sided die; call the side i.
    2. Flip a biased coin that comes up heads with probability Prob[i].
    3. If the coin comes up "heads," return i.
    4. Otherwise, return Alias[i].
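
    A compact Python sketch of the initialization and generation steps listed above. The function names `build_alias` and `alias_sample` are made up for this example, and the probabilities are assumed to sum to 1. The worklists are popped from the end rather than the front, which does not affect the resulting distribution.

```python
import random


def build_alias(probs):
    """Build the Prob and Alias tables for the given probabilities."""
    n = len(probs)
    scaled = [p * n for p in probs]
    prob = [0.0] * n
    alias = [0] * n
    small = [i for i, p in enumerate(scaled) if p < 1.0]
    large = [i for i, p in enumerate(scaled) if p >= 1.0]
    while small and large:
        sm = small.pop()
        lg = large.pop()
        prob[sm] = scaled[sm]
        alias[sm] = lg
        # The large entry donates the mass that fills up the small slot.
        scaled[lg] = (scaled[lg] + scaled[sm]) - 1.0
        (small if scaled[lg] < 1.0 else large).append(lg)
    # Whatever is left over (in Small only because of rounding) keeps
    # its whole slot, so its alias is never consulted.
    for i in large + small:
        prob[i] = 1.0
    return prob, alias


def alias_sample(prob, alias):
    """Roll a fair n-sided die, then flip a coin biased by Prob[i]."""
    i = random.randrange(len(prob))
    return i if random.random() < prob[i] else alias[i]
```

    To select from an array `items`, build the tables once with `build_alias(weights)` and then use `items[alias_sample(prob, alias)]` for each draw.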
  • 2020-11-28 03:20

    This is PHP code I used in production:

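    /**
     * Pick one server at random, weighted by getWeight().
     * Weights are assumed to be positive integers, since mt_rand() works on integers.
     *
     * @return \App\Models\CdnServer
     */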
    protected function selectWeightedServer(Collection $servers)
    {
        if ($servers->count() == 1) {
            return $servers->first();
        }
    
        $totalWeight = 0;
    
        foreach ($servers as $server) {
            $totalWeight += $server->getWeight();
        }
    
        // Select a random server using weighted choice
        $randWeight = mt_rand(1, $totalWeight);
        $accWeight = 0;
    
        foreach ($servers as $server) {
            $accWeight += $server->getWeight();
    
            if ($accWeight >= $randWeight) {
                return $server;
            }
        }
    }
    
  • 2020-11-28 03:23

    An example in Ruby:

    # each element is associated with its probability
    a = {1 => 0.25, 2 => 0.5, 3 => 0.2, 4 => 0.05}
    
    # at some point, convert to cumulative probabilities
    acc = 0
    a.each { |e,w| a[e] = acc+=w }
    
    # to select an element, pick a random number between 0 and 1 and find the
    # first cumulative probability that is greater than the random number
    r = rand
    selected = a.find{ |e,w| w>r }
    
    p selected[0]
    
  • 2020-11-28 03:25

    I would imagine that numbers greater than or equal to 0.8 but less than 1.0 select the third element.

    In other words:

    x is a random number between 0 and 1

    if 0.0 <= x < 0.2 : Item 1

    if 0.2 <= x < 0.8 : Item 2

    if 0.8 <= x < 1.0 : Item 3
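
    The ranges above can be turned into code by accumulating the probabilities and searching for x among the upper bounds. The Python sketch below is only an illustration: the helper name `pick` is made up, and the probabilities 0.2, 0.6 and 0.2 are the ones implied by the ranges in this answer.

```python
import bisect
import random
from itertools import accumulate


def pick(items, probs, x=None):
    """Return the item whose range [lower, upper) contains x."""
    cumulative = list(accumulate(probs))  # upper bound of each range
    if x is None:
        x = random.random()  # uniform in [0, 1)
    idx = bisect.bisect_right(cumulative, x)
    # Clamp in case rounding left the last upper bound slightly below 1.
    return items[min(idx, len(items) - 1)]


items = ["Item 1", "Item 2", "Item 3"]
probs = [0.2, 0.6, 0.2]
print(pick(items, probs))
```

    The search is O(log n) per draw; when many draws are needed, the cumulative list can be computed once and reused. Python's standard library offers the same behaviour directly through `random.choices(items, weights=probs, k=1)[0]`.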
