Generating random numbers on open-open interval (0,1) efficiently

Submitted by 时光毁灭记忆、已成空白 on 2019-12-04 08:46:36

You are already there.

The smallest distance between two floats your current generator produces is 1/(2^32).

So, your generator is effectively producing values in [0, 1-1/(2^32)].

1/(2^32) is greater than FLT_MIN.

Thus if you add FLT_MIN to your generator,

float open_open_flt = FLT_MIN + closed_open_flt;

you'll get [FLT_MIN,1-(1/(2^32))+FLT_MIN], which works as a (0,1) generator.
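
A minimal C sketch of this shift, assuming the input really is a float in [0,1); the helper name open_open_from_closed_open is made up for illustration. Note that in single precision FLT_MIN is far below the rounding granularity of any input of ordinary size, so the addition only changes the result when the input is exactly 0 (or of a magnitude comparable to FLT_MIN).

```c
#include <assert.h>
#include <float.h>

/* Hypothetical helper: shift a sample from [0,1) away from zero.
   Assumes closed_open is a float in [0,1). */
static float open_open_from_closed_open(float closed_open)
{
    assert(closed_open >= 0.0f && closed_open < 1.0f);
    /* Storing into a float variable drops any excess precision. */
    float shifted = FLT_MIN + closed_open;
    return shifted;
}
```

For example, an input of 0.0f comes back as FLT_MIN, while 0.5f comes back unchanged because FLT_MIN is lost to rounding.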

Since the probability of actually observing 0 is very small, and checking whether a number is equal to 0 is the least expensive operation (compared with addition or multiplication), I would regenerate the random number repeatedly until it is not equal to 0.
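
A rough C sketch of this regenerate-until-nonzero idea. The xorshift32 routine is only a stand-in for the asker's integer RNG, and keeping the top 24 bits of each draw (rather than all 32) is my assumption, made so that the conversion to float is exact and cannot round up to 1. All names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for the 32-bit integer RNG (xorshift32); state must be nonzero. */
static uint32_t rng_state = 2463534242u;

static uint32_t next_u32(void)
{
    rng_state ^= rng_state << 13;
    rng_state ^= rng_state >> 17;
    rng_state ^= rng_state << 5;
    return rng_state;
}

/* Draw floats in [0,1) and discard exact zeros, leaving (0,1). */
static float open_open_by_rejection(void)
{
    float sample;
    do {
        /* The top 24 bits convert to float exactly, so sample is in [0,1). */
        sample = (float)(next_u32() >> 8) * 0x1p-24f;
    } while (sample == 0.0f);
    assert(sample < 1.0f);
    return sample;
}
```

With 24 bits per draw, a zero turns up about once in 2^24 draws, so the loop body almost always runs exactly once.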

Eric Postpischil

Given a sample x selected randomly from [0, 2^32), I propose using:

0x1.fffffep-33 * x + 0x1p-25

Reasoning:

  • These values are such that the highest x produces slightly less than 1-2^-25 before rounding, so it is rounded to the largest float less than 1, which is 1-2^-24. If we made it any larger, some values would round to 1, which we do not want. If we made it smaller, fewer values would round to 1-2^-24, so it would be less represented than we desire (more on this below).
  • The values are such that the lowest x produces 2^-25. This produces some symmetry: The distribution is compelled to stop at the high side 1-2^-25 before rounding, as explained above, so we make it symmetric on the bottom side, stopping at 0+2^-25. To some extent, it is as if we are binning the real number line in bins of width 2^-24 and then removing the bins centered on 0 and 1 (which extend 2^-25 to either side of those numbers).
  • Each bin that we retain contains about the same number of sample values. However, different float values show up in the bins, because the resolution of float varies. It is finer near 0 and coarser near 1. With this arrangement, each bin is about uniformly represented, but the lower bins will have more samples with lower probability each. The overall distribution remains uniform.
  • We could extend the low end so that it is closer to zero. But then, for most d in (0, ½), there would be more samples in (0, d) than in (1-d, 1), so the distribution would be asymmetric.
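
The mapping can be sketched in C as follows. This is an illustration under two assumptions of mine: the scale factor is (1 - 2^-24) * 2^-32, written 0x1.fffffep-33 in hex-float notation, which is the value that makes the largest x land just below 1-2^-25 as the reasoning requires; and the expression is evaluated in double before the final conversion to float. The function name is made up.

```c
#include <assert.h>
#include <stdint.h>

/* Map a 32-bit sample x to a float strictly inside (0,1).
   The scale 0x1.fffffep-33 equals (1 - 2^-24) * 2^-32. */
static float open_open_from_u32(uint32_t x)
{
    /* Evaluate in double; the conversion to float below performs
       the rounding step that the reasoning describes. */
    double before_rounding = 0x1.fffffep-33 * (double)x + 0x1p-25;
    float result = (float)before_rounding;
    assert(result > 0.0f && result < 1.0f);
    return result;
}
```

The two endpoints can be checked directly: x = 0 gives 2^-25, and x = 2^32 - 1 gives 1-2^-24, the largest float below 1.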

As you can see, the floating-point format forces some irregularities in a distribution from 0 to 1. This issue has been raised in other Stack Overflow questions but never thoroughly discussed, to my knowledge. Whether it suits your purposes to leave these irregularities as described above depends on your application.

Potential variations:

  • Quantize all the samples so they occur at regularly spaced intervals, 2^-24, rather than being finer where the float format is finer.
  • Allow values closer to 1 before rounding but convert them to 1-2^-24 after rounding, and lower the bottom endpoint to match. This reduces the excluded segments around 0 and around 1 at the expense of increasing the number of values clumped into 1-2^-24 because the resolution is not fine enough for more distinction.
  • Switch to double. Then there is a 1-1 map from original x values to floating-point values, and you can likely get as close to 0 and 1 as desired.
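
As one possible reading of the double variation, the sketch below centres each x in its own bin of width 2^-32. The formula (x + 0.5) * 2^-32 and the function name are my assumptions, not something the answer specifies; other endpoints are equally valid.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical double-precision variant: values run from 2^-33
   to 1 - 2^-33, one distinct double per input x. */
static double open_open_double_from_u32(uint32_t x)
{
    /* x + 0.5 needs at most 33 significant bits, well within the
       53 bits of a double, so the whole computation is exact. */
    double result = ((double)x + 0.5) * 0x1p-32;
    assert(result > 0.0 && result < 1.0);
    return result;
}
```

Because every step is exact, neighbouring inputs always differ by exactly 2^-32 in the output, which is the 1-1 map mentioned above.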

Also, contrary to ElKamina’s answer, floating-point comparison (even to zero) is not generally faster than addition. Comparison requires branching on the result, which is an issue in many modern CPUs.

I'm looking for an efficient way to generate random floating-point numbers on the open-open interval (0,1). I currently have an RNG that generates random integers on the closed-closed interval [0, (2^32)-1]. I've already created a half-open floating-point RNG on the interval [0,1) by simply multiplying my result from the integer RNG by 1/((2^32)-1).

This means that your generator 'tries' to produce 2^32 different values. The problem is that the float type is 4 bytes long, so it has fewer than 2^32 distinct defined values overall. To be precise, there can be only 2^23 values on the interval [1/2, 1). Depending on what you need, this may or may not be a problem.
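
The 2^23 figure can be checked with a short C sketch, assuming float is the 32-bit IEEE-754 binary32 format. For positive floats, consecutive values have consecutive bit patterns, so subtracting bit patterns counts the values in between. Both helper names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Return the bit pattern of a float (assumes 32-bit binary32 floats). */
static uint32_t float_bits(float f)
{
    uint32_t bits;
    assert(sizeof bits == sizeof f);
    memcpy(&bits, &f, sizeof bits);
    return bits;
}

/* Count the floats in [lo, hi), assuming 0 < lo <= hi. */
static uint32_t floats_between(float lo, float hi)
{
    return float_bits(hi) - float_bits(lo);
}
```

Here floats_between(0.5f, 1.0f) evaluates to 8388608, which is 2^23, far fewer than the 2^31 integer samples that would be mapped into that half of the interval.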

You may want to use a lagged Fibonacci generator (wiki) with iteration


This already produces numbers from [0,1), given that initial values belong to that interval and may be good enough for your purposes.
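
The iteration formula did not survive in the text above. Purely as an illustration, the C sketch below assumes the common additive form x[n] = (x[n-j] + x[n-k]) mod 1 with the classic lags j = 24 and k = 55. The lags, the xorshift32 seeding routine, and every name here are my assumptions rather than the answer's own choices.

```c
#include <assert.h>
#include <stdint.h>

#define LAG_SHORT 24   /* assumed lag j */
#define LAG_LONG  55   /* assumed lag k */

static double lfg_state[LAG_LONG];  /* circular buffer of the last k outputs */
static int lfg_pos = 0;             /* index of the oldest value, x[n-k] */

/* Fill the state with values in [0,1) taken from a simple stand-in
   integer generator (xorshift32); the seed must be nonzero. */
static void lfg_seed(uint32_t seed)
{
    assert(seed != 0);
    for (int i = 0; i < LAG_LONG; ++i) {
        seed ^= seed << 13;
        seed ^= seed >> 17;
        seed ^= seed << 5;
        lfg_state[i] = (double)(seed >> 8) * 0x1p-24;  /* top 24 bits, in [0,1) */
    }
    lfg_pos = 0;
}

/* One step of x[n] = (x[n-j] + x[n-k]) mod 1. */
static double lfg_next(void)
{
    int short_pos = (lfg_pos + LAG_LONG - LAG_SHORT) % LAG_LONG;  /* x[n-j] */
    double sum = lfg_state[lfg_pos] + lfg_state[short_pos];
    if (sum >= 1.0)
        sum -= 1.0;               /* reduce modulo 1 */
    lfg_state[lfg_pos] = sum;     /* newest value overwrites the oldest */
    lfg_pos = (lfg_pos + 1) % LAG_LONG;
    return sum;
}
```

After a call such as lfg_seed(1), each lfg_next() returns a value in [0,1). An exact 0 is still possible when two state values sum to exactly 1, so this form alone does not give an open interval at the lower end.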