Why do people say there is modulo bias when using a random number generator?

一整个雨季 2020-11-21 05:48

I have seen this question asked a lot but never seen a true concrete answer to it. So I am going to post one here which will hopefully help people understand why exactly the

10 answers

北荒 (original poster)

2020-11-21 06:36
As the accepted answer indicates, "modulo bias" has its roots in the low value of RAND_MAX. That answer uses an extremely small value of RAND_MAX (10) to show that if RAND_MAX were 10 and you tried to generate a number between 0 and 2 using %, the following outcomes would result (rand() returns values from 0 to RAND_MAX inclusive, so there are 11 possible outputs):
```
rand() % 3   // if RAND_MAX were only 10, gives
output of rand()   |   rand()%3
0                  |   0
1                  |   1
2                  |   2
3                  |   0
4                  |   1
5                  |   2
6                  |   0
7                  |   1
8                  |   2
9                  |   0
10                 |   1
```
So there are 4 outputs each of 0 and 1 (4/11 chance each) and only 3 outputs of 2 (3/11 chance).

So it's biased. The lower numbers have a better chance of coming out.
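If you want to check a table like this yourself, here is a minimal sketch that tallies the residues by brute force. The function name `tally_residues` and the parameter `max_value` (standing in for RAND_MAX) are made up for illustration; nothing here calls the real rand().

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Count how many raw outputs in 0..max_value (inclusive) land on each
// residue under "% n". max_value stands in for a hypothetical RAND_MAX.
std::vector<std::size_t> tally_residues(unsigned max_value, unsigned n) {
    assert(n > 0);
    std::vector<std::size_t> counts(n, 0);
    // A wider loop variable avoids wrap-around if max_value is UINT_MAX.
    for (unsigned long long v = 0; v <= max_value; ++v) {
        ++counts[v % n];
    }
    return counts;
}
```

Calling `tally_residues(10, 3)` covers the raw outputs 0 through 10 and yields the counts 4, 4, 3; calling `tally_residues(9, 3)` covers 0 through 9 and yields 4, 3, 3. Either way the counts are unequal, and the low residues are the ones that get the extra hit.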

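But that only shows up so obviously when RAND_MAX is small. Or, more specifically, when the number you are modding by is large compared to RAND_MAX. (Strictly speaking, some bias exists whenever RAND_MAX + 1 is not a multiple of the number you are modding by; it just becomes negligible as RAND_MAX grows.)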
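To put a number on "large compared to RAND_MAX", the size of the bias can be computed directly instead of counted. This is only a sketch: the struct `ModuloBias` and the function `modulo_bias` are hypothetical names, and `range_size` is assumed to be RAND_MAX + 1, i.e. the number of distinct raw outputs.

```cpp
#include <cassert>
#include <cstdint>

// Summary of how "% n" splits range_size equally likely raw outputs.
struct ModuloBias {
    std::uint64_t favored_residues;  // residues 0..favored_residues-1 get one extra hit
    std::uint64_t base_hits;         // hits for each of the remaining residues
};

// range_size is assumed to be RAND_MAX + 1 for the generator in question.
ModuloBias modulo_bias(std::uint64_t range_size, std::uint64_t n) {
    assert(n > 0 && range_size >= n);
    // The remainder tells how many residues are over-represented,
    // the quotient tells how often every other residue occurs.
    return ModuloBias{range_size % n, range_size / n};
}
```

For example, `modulo_bias(11, 3)` reports 2 favored residues on top of 3 base hits, which is a big relative difference (4 versus 3). With `modulo_bias(32768, 10)`, corresponding to RAND_MAX = 32767 (the smallest value the C standard allows), 8 residues occur 3277 times and the other 2 occur 3276 times, so the relative difference is already down to about 0.03%.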

Rather than looping to reject out-of-range values, I think a better solution is to use a PRNG with a much larger output range. The 32-bit Mersenne Twister algorithm has a maximum output of 4,294,967,295. The bias does not vanish completely, because 2^32 is not a multiple of 10: the results 0 to 5 each come from 429,496,730 raw values and the results 6 to 9 from 429,496,729. But for all intents and purposes, MersenneTwister::genrand_int32() % 10 will be equally distributed and the modulo bias effect will all but disappear.
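As a rough sketch of that suggestion in standard C++: I am swapping in `std::mt19937` from `<random>` here, since I can't assume everyone has the `MersenneTwister` class mentioned above. The helper name `draw_mod` is made up for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <random>

// Reduce one 32-bit Mersenne Twister output to the range 0..n-1 with "%".
// std::mt19937 is the standard library's 32-bit Mersenne Twister; it is a
// stand-in for the MersenneTwister class named in the text.
std::uint32_t draw_mod(std::mt19937& gen, std::uint32_t n) {
    assert(n > 0);
    return static_cast<std::uint32_t>(gen() % n);
}
```

Typical use would be to construct one generator, for instance `std::mt19937 gen(std::random_device{}());`, and then call `draw_mod(gen, 10)` wherever you would have written `rand() % 10`. `std::mt19937::max()` is 4,294,967,295, so the small-RAND_MAX problem goes away for small moduli.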