Best practices for seeding random and numpy.random in the same program

庸人自扰 2021-01-14 01:30

In order to make random simulations we run reproducible later, my colleagues and I often explicitly seed the random or numpy.random modules' rando

1 answer
  • 2021-01-14 02:24

    I will discuss some guidelines on how multiple pseudorandom number generators (PRNGs) should be seeded. I assume you're not using random numbers for information security purposes (if you are, only a cryptographic RNG is appropriate and this advice doesn't apply).

    • To reduce the risk of correlated random numbers, you can use PRNG algorithms that support independent "streams" of random numbers, such as SFC and the so-called "counter-based" PRNGs (Salmon et al., "Parallel Random Numbers: As Easy as 1, 2, 3", 2011). There are other strategies as well, and I explain more about this in "Seeding Multiple Processes".
    • If you can use NumPy 1.17 or later, note that version 1.17 introduced a new PRNG system and added SFC (SFC64) to its repertoire of PRNGs. For NumPy-specific advice on parallel random generation, see "Parallel Random Number Generation" in the NumPy documentation.
    • You should avoid seeding PRNGs (especially several at once) with timestamps, since timestamps taken close together yield seeds that are nearly identical or even equal.
    • You mentioned this question in a comment when I started writing this answer. The advice there is not to seed multiple instances of the same kind of PRNG. This advice, however, doesn't apply as much if the seeds are chosen to be unrelated to each other, or if the PRNG has a very big state (such as Mersenne Twister) or gives each seed its own nonoverlapping random number sequence (such as SFC). The accepted answer there (at the time of this writing) demonstrates what happens when multiple instances of .NET's System.Random are used with sequential seeds, but not necessarily what happens with PRNGs of a different design, PRNGs of multiple designs, or PRNGs initialized with unrelated seeds. Moreover, .NET's System.Random is a poor choice for a PRNG precisely because it allows only seeds no more than 32 bits long (so the number of random sequences it can produce is limited), and also because it has implementation bugs (if I understand correctly) that have been preserved for backward compatibility.
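
    As a rough illustration of the guidelines above, here is a minimal sketch that derives unrelated seeds for numpy.random and the standard random module from one recorded root seed instead of a timestamp. It assumes NumPy 1.17 or later. The helper name make_rngs and the 128-bit seed width for random.Random are my own choices for illustration, not something prescribed by either library.

```python
import random

import numpy as np


def make_rngs(root_seed):
    """Return (numpy Generator, random.Random) seeded from one root seed.

    make_rngs is a hypothetical helper name used only for this sketch.
    """
    # SeedSequence mixes the root seed; spawn() yields child sequences
    # that are designed to be unrelated to one another.
    root = np.random.SeedSequence(root_seed)
    numpy_child, stdlib_child = root.spawn(2)

    # NumPy side: a Generator backed by the SFC64 bit generator.
    np_rng = np.random.Generator(np.random.SFC64(numpy_child))

    # Standard-library side: build a 128-bit integer seed from four
    # 32-bit words (the width is an arbitrary choice for this sketch)
    # and use a private Random instance rather than the global one.
    words = stdlib_child.generate_state(4, dtype=np.uint32)
    stdlib_seed = sum(int(w) << (32 * i) for i, w in enumerate(words))
    py_rng = random.Random(stdlib_seed)

    return np_rng, py_rng


# Record the root seed with the results so the run can be repeated.
np_rng, py_rng = make_rngs(20210114)
numpy_draws = np_rng.random(3)
stdlib_draws = [py_rng.random() for _ in range(3)]
```

    Passing explicit generator objects around, rather than calling random.seed or numpy.random.seed on the global state, keeps each part of the program from silently disturbing another part's random sequence.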