Weighted random numbers

情书的邮戳 2020-11-22 08:50

I'm trying to implement weighted random numbers. I'm currently just banging my head against the wall and cannot figure this out.

In my project (Hold'em hand-ran

6 answers
  • 2020-11-22 08:58

    Updated answer to an old question. You can easily do this in C++11 with just the standard library:

    #include <iostream>
    #include <random>
    #include <iterator>
    #include <ctime>
    #include <type_traits>
    #include <cassert>
    
    int main()
    {
        // Set up distribution
        double interval[] = {1,   2,   3,   4};
        double weights[] =  {  .90, .56, .04};
        std::piecewise_constant_distribution<> dist(std::begin(interval),
                                                    std::end(interval),
                                                    std::begin(weights));
        // Choose generator
        std::mt19937 gen(std::time(0));  // seed as wanted
        // Demonstrate with N randomly generated numbers
        const unsigned N = 1000000;
        // Collect number of times each random number is generated
        double avg[std::extent<decltype(weights)>::value] = {0};
        for (unsigned i = 0; i < N; ++i)
        {
            // Generate random number using gen, distributed according to dist
            unsigned r = static_cast<unsigned>(dist(gen));
            // Sanity check
            assert(interval[0] <= r && r <= *(std::end(interval)-2));
            // Save r for statistical test of distribution
            avg[r - 1]++;
        }
        // Compute averages for distribution
        for (double* i = std::begin(avg); i < std::end(avg); ++i)
            *i /= N;
        // Display distribution
        for (unsigned i = 1; i <= std::extent<decltype(avg)>::value; ++i)
            std::cout << "avg[" << i << "] = " << avg[i-1] << '\n';
    }
    

    Output on my system:

    avg[1] = 0.600115
    avg[2] = 0.373341
    avg[3] = 0.026544
    

    Note that most of the code above is devoted to displaying and analyzing the output. The actual generation is just a few lines of code. The output demonstrates that the requested "probabilities" have been obtained. To compare, divide each requested weight by 1.5, since that is what the weights add up to: 0.90/1.5 = 0.60, 0.56/1.5 ≈ 0.373 and 0.04/1.5 ≈ 0.027.

  • 2020-11-22 09:00

    There is a straightforward algorithm for picking an item at random, where items have individual weights:

    1) calculate the sum of all the weights

    2) pick a random number that is 0 or greater and is less than the sum of the weights

    3) go through the items one at a time, subtracting their weight from your random number, until you get the item where the random number is less than that item's weight

    Pseudo-code illustrating this:

    int sum_of_weight = 0;
    for(int i=0; i<num_choices; i++) {
       sum_of_weight += choice_weight[i];
    }
    int rnd = random(sum_of_weight);
    for(int i=0; i<num_choices; i++) {
      if(rnd < choice_weight[i])
        return i;
      rnd -= choice_weight[i];
    }
    assert(!"should never get here");
    

    This should be straightforward to adapt to your boost containers and such.
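
    As a minimal sketch, the pseudo-code above could be written in C++11 roughly as follows. The function name pick_weighted and the use of std::vector<int> for the weights are assumptions made for illustration; they are not part of the original answer.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <random>
#include <vector>

// Hypothetical helper: returns an index in [0, weights.size()) with
// probability proportional to weights[i].
// Assumes non-negative integer weights whose sum is positive.
std::size_t pick_weighted(const std::vector<int>& weights, std::mt19937& gen)
{
    // 1) sum of all the weights
    const int sum_of_weight = std::accumulate(weights.begin(), weights.end(), 0);
    assert(sum_of_weight > 0);

    // 2) random integer in [0, sum_of_weight - 1]
    std::uniform_int_distribution<int> dist(0, sum_of_weight - 1);
    int rnd = dist(gen);

    // 3) subtract weights until rnd falls inside the current item's weight
    for (std::size_t i = 0; i < weights.size(); ++i) {
        if (rnd < weights[i])
            return i;
        rnd -= weights[i];
    }
    assert(!"should never get here");
    return weights.size() - 1;
}
```

    Called with weights {90, 56, 4} and a seeded std::mt19937, this should return index 0 most often and index 2 rarely.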


    If your weights rarely change but you often pick an item at random, there is an optimisation. It tends to pay off when the container stores pointers to the objects or is more than a few dozen items long, but you have to profile to know whether it helps or hinders:

    By storing the cumulative weight sum in each item you can use a binary search to pick the item corresponding to the pick weight.
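
    One way this could look is sketched below. It keeps the cumulative sums in a separate vector rather than inside each item, and the class name CumulativePicker is hypothetical. It assumes non-negative weights with a positive total.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <random>
#include <vector>

// Hypothetical helper: precomputes cumulative weights once (O(N)),
// then answers each pick with a binary search (O(log N)).
class CumulativePicker {
public:
    explicit CumulativePicker(const std::vector<double>& weights)
        : cumulative_(weights.size())
    {
        assert(!weights.empty());
        std::partial_sum(weights.begin(), weights.end(), cumulative_.begin());
        assert(cumulative_.back() > 0.0);
    }

    template <class Gen>
    std::size_t pick(Gen& gen) const
    {
        // Pick weight in [0, total)
        std::uniform_real_distribution<double> dist(0.0, cumulative_.back());
        const double r = dist(gen);
        // First entry whose cumulative sum is strictly greater than r
        auto it = std::upper_bound(cumulative_.begin(), cumulative_.end(), r);
        if (it == cumulative_.end())
            --it;  // guard in case rounding yields r == total
        return static_cast<std::size_t>(it - cumulative_.begin());
    }

private:
    std::vector<double> cumulative_;
};
```

    The table has to be rebuilt whenever a weight changes, which is why this only helps when picks are much more frequent than updates.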


    If you do not know the number of items in the list, then there's a very neat algorithm called reservoir sampling that can be adapted to be weighted.
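
    A minimal sketch of the single-item case, assuming non-negative weights: walk the sequence once, keep a running total, and replace the current choice with probability weight / total. The name weighted_reservoir_pick and the weight callback are assumptions for illustration.

```cpp
#include <cassert>
#include <random>

// Hypothetical helper: one pass over [begin, end) without knowing its length
// in advance. Returns end if the range is empty or every weight is zero.
template <class It, class WeightFn, class Gen>
It weighted_reservoir_pick(It begin, It end, WeightFn weight, Gen& gen)
{
    std::uniform_real_distribution<double> unit(0.0, 1.0);
    double total = 0.0;
    It chosen = end;
    for (It i = begin; i != end; ++i) {
        const double w = weight(*i);
        assert(w >= 0.0);
        if (w == 0.0)
            continue;  // zero-weight items can never be chosen
        total += w;
        // Keep the first positive-weight item unconditionally; afterwards
        // replace the current choice with probability w / total.
        if (chosen == end || unit(gen) * total < w)
            chosen = i;
    }
    return chosen;
}
```

    After the pass, each item has been kept with probability equal to its weight divided by the sum of all weights, even though that sum was not known up front.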

  • 2020-11-22 09:18

    Choose a random number in [0,1), which should be the default operator() for a boost RNG. Then choose the first item whose cumulative probability exceeds that number:

    template <class It,class P>
    It choose_p(It begin,It end,P const& p)
    {
        if (begin==end) return end;
        double sum=0.;
        for (It i=begin;i!=end;++i)
            sum+=p(*i);
        double choice=sum*random01();
        for (It i=begin;;) {
            choice -= p(*i);
            It r=i;
            ++i;
            if (choice<0 || i==end) return r;
        }
        return begin; //unreachable
    }
    

    Where random01() returns a double >=0 and <1. Note that the above doesn't require the probabilities to sum to 1; it normalizes them for you.

    p is just a function assigning a probability to an item in the collection [begin,end). You can omit it (or use an identity) if you just have a sequence of probabilities.
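
    To show how p might be supplied, here is a sketch that repeats choose_p so it compiles on its own. The random01() shown is only a stand-in built on <random>, and the Card struct and draw_name function are hypothetical.

```cpp
#include <cassert>
#include <random>
#include <string>
#include <vector>

// Stand-in for random01(): a double >= 0 and < 1 (assumption, built on <random>).
double random01()
{
    static std::mt19937 gen(std::random_device{}());
    static std::uniform_real_distribution<double> dist(0.0, 1.0);
    return dist(gen);
}

// choose_p as given above.
template <class It, class P>
It choose_p(It begin, It end, P const& p)
{
    if (begin == end) return end;
    double sum = 0.;
    for (It i = begin; i != end; ++i)
        sum += p(*i);
    double choice = sum * random01();
    for (It i = begin;;) {
        choice -= p(*i);
        It r = i;
        ++i;
        if (choice < 0 || i == end) return r;
    }
}

// Hypothetical item type: the weight is stored alongside the payload.
struct Card {
    std::string name;
    double weight;
};

// Returns the name of a randomly chosen card, or "" for an empty collection.
std::string draw_name(const std::vector<Card>& cards)
{
    auto it = choose_p(cards.begin(), cards.end(),
                       [](const Card& c) { return c.weight; });
    return it == cards.end() ? std::string() : it->name;
}
```

    Here p is a lambda that reads the weight field, so the weights {9, 1} do not need to be normalized first.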

  • 2020-11-22 09:22

    If your weights change more slowly than they are drawn, C++11's std::discrete_distribution is going to be the easiest:

    #include <ctime>
    #include <iterator>
    #include <random>
    #include <vector>

    int main()
    {
        std::vector<double> weights{90, 56, 4};
        std::discrete_distribution<int> dist(std::begin(weights), std::end(weights));
        std::mt19937 gen;
        // seed from the clock if you want different results from different runs
        gen.seed(static_cast<unsigned>(std::time(0)));
        int N = 100000;
        std::vector<int> samples(N);
        for (auto& i : samples)
            i = dist(gen); // 0, 1 or 2, with probability 90/150, 56/150, 4/150
        // do something with your samples...
    }
    

    Note, however, that the C++11 discrete_distribution computes all of the cumulative sums on initialization. Usually you want that, because it speeds up sampling for a one-time O(N) cost. But for a rapidly changing distribution it incurs a heavy calculation (and memory) cost. For instance, if the weights represent how many items there are and you remove an item every time you draw one, you will probably want a custom algorithm.

    Will's answer https://stackoverflow.com/a/1761646/837451 avoids this overhead but will be slower to draw from than the C++11 distribution because it can't use binary search.

    To see that it does this, look at the relevant lines (/usr/include/c++/5/bits/random.tcc on my Ubuntu 16.04 + GCC 5.3 install):

      template<typename _IntType>
        void
        discrete_distribution<_IntType>::param_type::
        _M_initialize()
        {
          if (_M_prob.size() < 2)
            {
              _M_prob.clear();
              return;
            }
    
          const double __sum = std::accumulate(_M_prob.begin(),
                                               _M_prob.end(), 0.0);
          // Now normalize the probabilites.
          __detail::__normalize(_M_prob.begin(), _M_prob.end(), _M_prob.begin(),
                                __sum);
          // Accumulate partial sums.
          _M_cp.reserve(_M_prob.size());
          std::partial_sum(_M_prob.begin(), _M_prob.end(),
                           std::back_inserter(_M_cp));
          // Make sure the last cumulative probability is one.
          _M_cp[_M_cp.size() - 1] = 1.0;
        }
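
    For the draw-and-remove case mentioned above, a custom algorithm could be as simple as the following sketch. The class name Urn is hypothetical, and it uses a linear scan over the counts, so no cumulative table has to be rebuilt after each draw.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <random>
#include <utility>
#include <vector>

// Hypothetical helper: counts[i] is how many items of kind i remain.
// Each draw picks a kind with probability counts[i] / total and removes one.
class Urn {
public:
    explicit Urn(std::vector<unsigned> counts)
        : counts_(std::move(counts)),
          total_(std::accumulate(counts_.begin(), counts_.end(), 0u))
    {
    }

    bool empty() const { return total_ == 0; }

    template <class Gen>
    std::size_t draw(Gen& gen)
    {
        assert(total_ > 0);
        std::uniform_int_distribution<unsigned> dist(0, total_ - 1);
        unsigned rnd = dist(gen);
        for (std::size_t i = 0; i < counts_.size(); ++i) {
            if (rnd < counts_[i]) {
                --counts_[i];  // remove the drawn item: O(1) weight update
                --total_;
                return i;
            }
            rnd -= counts_[i];
        }
        assert(!"should never get here");
        return counts_.size();
    }

private:
    std::vector<unsigned> counts_;  // declared before total_, which depends on it
    unsigned total_;
};
```

    Each draw costs O(N) here, but updating the weights afterwards is O(1). Whether that trade-off beats rebuilding a discrete_distribution depends on N and is worth profiling.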
    
  • 2020-11-22 09:22

    When I need weighted numbers, I use a second random number to select according to the weights.

    For example, say I need to generate random numbers from 1 to 3 with the following weights:

    • 10% of the numbers should be 1
    • 30% of the numbers should be 2
    • 60% of the numbers should be 3

    Then I use:

    weight = rand() % 10;
    
    switch( weight ) {
    
        case 0:
            randomNumber = 1;
            break;
        case 1:
        case 2:
        case 3:
            randomNumber = 2;
            break;
        case 4:
        case 5:
        case 6:
        case 7:
        case 8:
        case 9:
            randomNumber = 3;
            break;
    }
    

    With this, the result has a 10% chance of being 1, a 30% chance of being 2 and a 60% chance of being 3.

    You can adapt it to your needs.

    Hope this helps, good luck!

  • 2020-11-22 09:22

    Build a bag (or std::vector) of all the items that can be picked.
    Make sure that the count of each item is proportional to its weight.

    Example:

    • 1 60%
    • 2 35%
    • 3 5%

    So have a bag of 100 items with 60 1's, 35 2's and 5 3's.
    Now shuffle the bag (std::random_shuffle; it was deprecated in C++14 and removed in C++17, so use std::shuffle with newer compilers).

    Pick elements from the bag sequentially until it is empty.
    Once it is empty, re-shuffle the bag and start again.
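
    The bag approach could be sketched as follows, using std::shuffle. The class name ShuffleBag and its constructor arguments are assumptions for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <random>
#include <utility>
#include <vector>

// Hypothetical helper implementing the bag described above.
class ShuffleBag {
public:
    // Each pair is (value, number of copies of that value in the bag).
    ShuffleBag(const std::vector<std::pair<int, int>>& counts, unsigned seed)
        : gen_(seed), next_(0)
    {
        for (const auto& c : counts) {
            assert(c.second >= 0);
            bag_.insert(bag_.end(), static_cast<std::size_t>(c.second), c.first);
        }
        assert(!bag_.empty());
        std::shuffle(bag_.begin(), bag_.end(), gen_);
    }

    int next()
    {
        if (next_ == bag_.size()) {
            // Bag exhausted: re-shuffle and start again
            std::shuffle(bag_.begin(), bag_.end(), gen_);
            next_ = 0;
        }
        return bag_[next_++];
    }

private:
    std::vector<int> bag_;
    std::mt19937 gen_;
    std::size_t next_;
};
```

    Unlike independent draws, every run of 100 consecutive picks that starts on a fresh bag contains exactly 60 1's, 35 2's and 5 3's, so the proportions are exact per bag rather than only on average.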
