Question
In order to draw random numbers from a Poisson distribution in C++, it is generally advised to use
RNG_type rng;
std::poisson_distribution<size_t> d(1e-6);
auto r = d(rng);
At each call of the std::poisson_distribution object, an entire sequence of random bits is consumed (e.g. 32 bits with std::mt19937, 64 bits with std::mt19937_64). It strikes me that with such a low mean (mean = 1e-6), the vast majority of the time only a few bits are enough to determine that the value to return is 0. The other bits could then be cached for later use.
Assuming that a sequence of bits set to true is associated with a high returned value from the Poisson distribution, when using a mean of 1e-6, any sequence not starting with 19 trues necessarily returns a zero! Indeed, 1 - 1/2^19 < P(0, 1e-6) < 1 - 1/2^20, where P(n, r) denotes the probability of drawing n from a Poisson distribution with mean r. An algorithm that does not waste bits would use one bit half of the time, two bits a quarter of the time, three bits an eighth of the time, and so on.
Is there an algorithm out there that can improve performance by consuming as few bits as possible when drawing Poisson numbers? Is there another way to improve performance compared to std::poisson_distribution when we consider a low mean?
In response to @Jarod42's comment, who said
Wonder if using fewer bits don't break equiprobability...
I don't think it would break equiprobability. In a vague attempt to test it, I consider the same question with a simple Bernoulli distribution: I sample true with probability 1/2^4 and false with probability 1 - 1/2^4. The function drawWithoutWastingBits stops as soon as the outcome is determined (at the first false, or after four trues), while the function drawWastingBits always consumes 4 bits regardless of their values.
#include <iostream>
#include <vector>
#include <string>
#include <algorithm>
#include <random>
#include <utility>
#include <cassert>
bool drawWithoutWastingBits(std::vector<bool>& cache, size_t& cache_index)
{
    /*
        Get a true with probability 1/2^4 (= 1/16 = 0.0625) and a false otherwise
    */
    size_t nbTrues = 0;
    while (cache[cache_index])
    {
        ++nbTrues;
        ++cache_index;
        if (nbTrues == 4)
        {
            return true;
        }
    }
    ++cache_index;
    return false;
}

bool drawWastingBits(std::vector<bool>& cache, size_t& cache_index)
{
    /*
        Get a true with probability 1/2^4 (= 1/16 = 0.0625) and a false otherwise
    */
    bool isAnyTrue = false;
    for (size_t i = 0; i < 4; ++i)
    {
        if (cache[cache_index])
        {
            isAnyTrue = true;
        }
        ++cache_index;
    }
    return !isAnyTrue; // true iff all four bits were false: probability 1/2^4
}

int main()
{
    /*
        Cache a lot of bits in advance in `cache`. The same sequence of bits will be used by both functions.
        Far more bits are cached than needed, so neither function runs out of bits below.
        The cache holds the same number of zeros and ones, so any deviation is caused by the methodology and not by the RNG.
    */

    // Produce cache
    std::vector<bool> cache;
    size_t nbBitsToCache = 1e7;
    cache.reserve(nbBitsToCache);
    for (size_t i = 0; i < nbBitsToCache / 2; ++i)
    {
        cache.push_back(false);
        cache.push_back(true);
    }

    // Shuffle cache
    {
        std::mt19937 mt(std::random_device{}());
        std::shuffle(cache.begin(), cache.end(), mt);
    }

    // Draw without wasting bits
    {
        size_t nbDraws = 1e6;
        size_t cache_index = 0;
        std::pair<size_t, size_t> outcomes = {0, 0};
        for (size_t r = 0; r < nbDraws; ++r)
        {
            drawWithoutWastingBits(cache, cache_index) ? ++outcomes.first : ++outcomes.second;
            assert(cache_index <= cache.size());
        }
        assert(outcomes.first + outcomes.second == nbDraws);
        std::cout << "Draw Without Wasting Bits: prob true = " << (double)outcomes.first / nbDraws << "\n";
    }

    // Draw wasting bits
    {
        size_t nbDraws = 1e6;
        size_t cache_index = 0;
        std::pair<size_t, size_t> outcomes = {0, 0};
        for (size_t r = 0; r < nbDraws; ++r)
        {
            drawWastingBits(cache, cache_index) ? ++outcomes.first : ++outcomes.second;
            assert(cache_index <= cache.size());
        }
        assert(outcomes.first + outcomes.second == nbDraws);
        std::cout << "Draw With Wasting Bits: prob true = " << (double)outcomes.first / nbDraws << "\n";
    }
}
Possible output
Draw Without Wasting Bits: prob true = 0.062832
Draw With Wasting Bits: prob true = 0.062363
Answer 1:
Devroye's Non-Uniform Random Variate Generation, pp. 505 and 86, mentions an inversion by sequential search algorithm.
Based on that algorithm, if you know the mean is considerably less than 1, then if you generate a uniform random number u in [0, 1], the Poisson variable will be 0 if u <= exp(-mean), and greater than 0 otherwise.
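As a concrete illustration of that observation, the zero test can be sketched in C++ as follows (a minimal sketch, not part of the cited answer; the function name is invented here):

```cpp
#include <cmath>
#include <random>

// Sketch: with a mean far below 1, a single uniform draw settles the
// common case.  Returns true iff the Poisson variate would be 0, which
// happens with probability exp(-mean).
bool poisson_is_zero(std::mt19937_64& rng, double mean)
{
    std::uniform_real_distribution<double> unif(0.0, 1.0);
    return unif(rng) <= std::exp(-mean);
}
```

With mean = 1e-6 this returns true for all but roughly one in a million calls, so a sampler can take a fast path returning 0 and only rarely fall through to a full algorithm.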
If the mean is low and you can tolerate an approximate distribution, then you can use the following approach (see Appendix A of "The Discrete Gaussian for Differential Privacy"):
- Express mean as a rational number of the form numer/denom. For example, if mean is a fixed value, then numer and denom can be precalculated accordingly, such as at compile time.
- Randomly generate a Bernoulli(numer / denom) number (generate 1 with probability numer / denom or 0 otherwise). If 1 was generated this way, repeat this step with Bernoulli(numer / (denom * 2)), Bernoulli(numer / (denom * 3)), and so on until 0 is generated this way. Generate these numbers using an algorithm that minimizes waste of bits, such as the one mentioned in Appendix B of Lumbroso's Fast Dice Roller paper (2013) or the "ZeroToOne" method modified from there and given in my section on Boolean conditions. See also this question.
- If step 2 produced an even number of ones, the Poisson variable is exactly 0.
- If step 2 produced an odd number of ones, the Poisson variable is greater than 0, and a "slower" algorithm is necessary that samples only Poisson variables greater than 0.
For example, say the mean is 1e-6 (1/1000000). Generate a Bernoulli(1/1000000) number, then Bernoulli(1/2000000), and so on until 0 is generated this way. If an even number of ones was generated, the Poisson variable is exactly 0. Otherwise, the Poisson variable is 1 or greater and a "slower" algorithm is necessary.
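The Bernoulli chain in step 2 can be sketched as below. Note that std::bernoulli_distribution is used here as a stand-in for the bit-frugal samplers the answer cites, so this sketch demonstrates only the control flow, not the bit savings, and it assumes numer <= denom:

```cpp
#include <random>

// Sketch of step 2: flip Bernoulli(numer/(denom*k)) coins for k = 1, 2, ...
// until a 0 appears.  An even count of 1s means the Poisson variate is
// exactly 0: the probability of this event works out to exp(-numer/denom).
// std::bernoulli_distribution stands in for a bit-frugal Bernoulli sampler.
bool poisson_is_zero_bernoulli(std::mt19937_64& rng, long numer, long denom)
{
    unsigned long ones = 0;
    for (long k = 1; ; ++k)
    {
        std::bernoulli_distribution coin(
            static_cast<double>(numer) / (static_cast<double>(denom) * k));
        if (!coin(rng))
            break;              // first 0 ends the chain
        ++ones;
    }
    return ones % 2 == 0;       // even number of 1s => variate is exactly 0
}
```

With numer = 1 and denom = 1000000, the very first coin is almost always 0, so the chain stops immediately and reports a zero variate.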
One example is the algorithm below, which is based on the one from pages 505 and 86, but only samples Poisson variables 1 or greater:
METHOD Poisson1OrGreater(mean)
sum=Math.exp(-mean)
prod=sum
u=RNDRANGE(sum, 1)
i=0
while i==0 or u>sum
prod*=mean/(i+1)
sum+=prod
i=i+1
end
return i
END METHOD
This method, though, is not very robust, especially since it uses numbers close to 1 (where the floating-point space is more sparse) rather than numbers close to 0.
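A direct C++ rendering of the pseudocode might look like this (a sketch; RNDRANGE(sum, 1) is taken to mean a uniform draw over (sum, 1), and the caveat about floating-point sparseness near 1 applies here too):

```cpp
#include <cmath>
#include <cstddef>
#include <random>

// Sketch: C++ translation of Poisson1OrGreater.  Samples a Poisson(mean)
// variate conditioned on being >= 1 by inversion restricted to u > P(X = 0).
std::size_t poisson_1_or_greater(std::mt19937_64& rng, double mean)
{
    double sum = std::exp(-mean);   // running CDF, starts at P(X = 0)
    double prod = sum;              // current term P(X = i)
    std::uniform_real_distribution<double> unif(sum, 1.0); // RNDRANGE(sum, 1)
    double u = unif(rng);
    std::size_t i = 0;
    while (i == 0 || u > sum)
    {
        prod *= mean / (i + 1);
        sum += prod;
        ++i;
    }
    return i;
}
```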
EDIT (May 7):
Note that the sum of n independent Poisson(mean) random numbers is Poisson(mean*n) distributed (p. 501). Thus, the discussion above in this answer applies to a sum of n Poisson random numbers as long as n times their mean remains small. For example, to generate a sum of 1000 Poisson random numbers each with a mean of 1e-6, simply generate a single Poisson random number with a mean of 0.001. This will save considerably on calls to the random number generator.
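The aggregation trick can be sketched as follows (the function names are invented for illustration):

```cpp
#include <cstddef>
#include <random>

// Sketch: a sum of n independent Poisson(mean) variates has the same
// distribution as one Poisson(mean * n) variate, so the aggregated draw
// below replaces n generator-consuming draws with a single one.
std::size_t sum_of_small_poissons(std::mt19937_64& rng, double mean, std::size_t n)
{
    std::poisson_distribution<std::size_t> d(mean);
    std::size_t total = 0;
    for (std::size_t i = 0; i < n; ++i)
        total += d(rng);            // n calls to the distribution
    return total;
}

std::size_t single_aggregated_poisson(std::mt19937_64& rng, double mean, std::size_t n)
{
    std::poisson_distribution<std::size_t> d(mean * n);
    return d(rng);                  // one call, same distribution
}
```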
EDIT (May 13): Edited generally.
Answer 2:
You can generate the time to the next event as (-ln U) / λ, where 0 < U ≤ 1 is a uniform random number and λ is the event rate (here, 1e-6).
https://preshing.com/20111007/how-to-generate-random-timings-for-a-poisson-process/
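One way to turn the inter-arrival formula into a Poisson count (a sketch, assuming we count events of rate λ over a window of length t; the function name is invented here):

```cpp
#include <cmath>
#include <cstddef>
#include <limits>
#include <random>

// Sketch: count the events of a rate-lambda Poisson process in a window of
// length t by accumulating exponential inter-arrival times -ln(U)/lambda.
// The resulting count is Poisson(lambda * t) distributed; with lambda = 1e-6
// and t = 1 the loop almost always stops after a single uniform draw.
std::size_t poisson_by_interarrival(std::mt19937_64& rng, double lambda, double t)
{
    // Draw U in (0, 1): the lower bound is shifted away from 0 so log(U) is finite.
    std::uniform_real_distribution<double> unif(
        std::numeric_limits<double>::min(), 1.0);
    double time = 0.0;
    std::size_t count = 0;
    for (;;)
    {
        time += -std::log(unif(rng)) / lambda;  // next inter-arrival time
        if (time > t)
            return count;
        ++count;
    }
}
```

Like the early-exit schemes discussed in Answer 1, this consumes only one uniform draw per variate when the mean λ·t is tiny, since the first inter-arrival time almost always overshoots the window.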
Source: https://stackoverflow.com/questions/61614458/performance-for-drawing-numbers-from-poisson-distribution-with-low-mean