I am investigating whether it is possible to have two sets of software agree on a sequence of produced pseudo-random numbers. I am as interested in understanding all the pos
Old question, but maybe useful to some future reader: As alluded to in the comments, your best bet is to implement this yourself and provide interfaces for the different environments, so that a given seed returns the same results in each of them. Why is that necessary? You used "sampling" as an example. There are several steps involved.
Seeding is a non-trivial process. For example, R goes as far as to further scramble the provided seed. So unless your tools use the same seeding method, they will end up with a different internal seed even when the user supplies the same value.
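To make the scrambling concrete, here is a rough sketch of what such a step can look like. The function name `scramble_seed` is made up for illustration, and the constants (multiplier 69069, 50 rounds) are what I believe R's `RNG_Init` uses, so treat them as an assumption and check the R sources before relying on them.

```python
MASK32 = 0xFFFFFFFF  # keep all arithmetic in 32 bits, like a C unsigned int


def scramble_seed(seed: int, rounds: int = 50) -> int:
    """Run the seed through a 32-bit linear congruential step `rounds` times.

    Assumption: multiplier 69069 and 50 rounds mimic R's initial scrambling.
    """
    s = seed & MASK32
    for _ in range(rounds):
        s = (69069 * s + 1) & MASK32
    return s


user_seed = 42
# A tool that seeds with 42 directly and one that scrambles first start
# their generators from different values.
print("raw seed:      ", user_seed)
print("scrambled seed:", scramble_seed(user_seed))
```

A second tool would have to reproduce exactly this step (and whatever follows it when the generator state is filled) to arrive at the same starting state.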
The actual RNG: Even if Mersenne Twister is used in both cases, is it really the same version? R uses a 32-bit MT. Maybe Python uses a 64-bit version?
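One way to check whether two tools really run the same generator is to compare them against a known test vector. Below is a minimal pure-Python sketch of the 32-bit MT19937 with the reference single-integer seeding; the class name `MT19937` and the method `next_u32` are my own. As far as I know, CPython's `random` module also uses the 32-bit core but fills the state with a different seeding routine, so the comparison at the end illustrates that the same algorithm and the same seed still do not guarantee the same output.

```python
import random


class MT19937:
    """32-bit Mersenne Twister with the reference single-integer seeding."""

    N, M = 624, 397

    def __init__(self, seed: int):
        self.mt = [0] * self.N
        self.mt[0] = seed & 0xFFFFFFFF
        for i in range(1, self.N):
            prev = self.mt[i - 1]
            self.mt[i] = (1812433253 * (prev ^ (prev >> 30)) + i) & 0xFFFFFFFF
        self.index = self.N  # force a twist before the first output

    def _twist(self) -> None:
        for i in range(self.N):
            y = (self.mt[i] & 0x80000000) | (self.mt[(i + 1) % self.N] & 0x7FFFFFFF)
            self.mt[i] = self.mt[(i + self.M) % self.N] ^ (y >> 1)
            if y & 1:
                self.mt[i] ^= 0x9908B0DF
        self.index = 0

    def next_u32(self) -> int:
        if self.index >= self.N:
            self._twist()
        y = self.mt[self.index]
        self.index += 1
        # Tempering
        y ^= y >> 11
        y ^= (y << 7) & 0x9D2C5680
        y ^= (y << 15) & 0xEFC60000
        y ^= y >> 18
        return y & 0xFFFFFFFF


# Widely quoted test vector: seed 5489 gives 3499211612 as the first output.
print(MT19937(5489).next_u32())

# Same seed passed to the standard library, which seeds its state differently.
print(random.Random(5489).getrandbits(32))
```

The 64-bit variant (MT19937-64) uses a different word size and different constants, so it cannot reproduce this stream at all.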
Most RNGs give you an unsigned integer (nowadays typically 32 or 64 bits). But you will need random numbers from some distribution, e.g. for sampling you need random integers within a given range. There are many methods to go from the integers produced by the RNG to those needed for sampling. In the case of R, you do not even have access to the raw output value of the RNG. The most fundamental function is R_unif
which returns a double in [0, 1). Again, how to generate such a double is not universally agreed on. And if you need other distribution functions (normal, exponential, ...) you will find quite a few different algorithms for them.
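To illustrate how much room there is for disagreement, here is a small sketch with two common ways to turn raw 32-bit outputs into a double in [0, 1) and two common ways to map a raw output to an integer in [0, n). All function names are my own, and which variant a given tool actually uses is something you would have to look up in its sources.

```python
def unit_from_one_draw(x: int) -> float:
    """One 32-bit draw scaled to [0, 1): 32 bits of resolution."""
    return x / 2**32


def unit_from_two_draws(a: int, b: int) -> float:
    """Two 32-bit draws combined into a 53-bit double in [0, 1)."""
    return ((a >> 5) * 2**26 + (b >> 6)) / 2**53


def bounded_modulo(x: int, n: int) -> int:
    """Integer in [0, n) from the low-order information (slightly biased)."""
    return x % n


def bounded_multiply_shift(x: int, n: int) -> int:
    """Integer in [0, n) from the high-order information (slightly biased)."""
    return (x * n) >> 32


a = b = 0xFFFFFFFF
# Same raw bits, different doubles; the second variant also consumes two draws.
print(unit_from_one_draw(a), unit_from_two_draws(a, b))

x = 0x80000001
# Same raw value, different "random integer below 10".
print(bounded_modulo(x, 10), bounded_multiply_shift(x, 10))
```

Note that the two-draw variant also advances the generator twice per double, so every later value in the stream is shifted relative to a tool that uses one draw.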
Overall, there are too many places where (subtle) differences can creep in.
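If you do go the "implement it yourself" route, the key is to pin down every step: seeding, the generator, and the mapping to the values you need. A minimal sketch of that idea follows. It uses xorshift32 instead of Mersenne Twister purely because it is short enough to port to any language; the class name `PortableRng`, its methods, and the zero-seed replacement constant are all hypothetical choices, not anything R or Python provide.

```python
class PortableRng:
    """xorshift32 with fully specified seeding and range mapping."""

    def __init__(self, seed: int):
        state = seed & 0xFFFFFFFF
        # xorshift must not start at zero; the replacement value is arbitrary
        # but has to be identical in every port.
        self.state = state if state != 0 else 0x9E3779B9

    def next_u32(self) -> int:
        x = self.state
        x ^= (x << 13) & 0xFFFFFFFF
        x ^= x >> 17
        x ^= (x << 5) & 0xFFFFFFFF
        self.state = x
        return x

    def below(self, n: int) -> int:
        """Unbiased integer in [0, n) via rejection; requires 1 <= n <= 2**32."""
        limit = 2**32 - (2**32 % n)
        while True:
            x = self.next_u32()
            if x < limit:
                return x % n

    def sample(self, items, k: int) -> list:
        """Sample k items without replacement (partial Fisher-Yates shuffle)."""
        pool = list(items)
        for i in range(k):
            j = i + self.below(len(pool) - i)
            pool[i], pool[j] = pool[j], pool[i]
        return pool[:k]


print(PortableRng(2024).sample(range(100), 5))
print(PortableRng(2024).sample(range(100), 5))  # same seed, same sample
```

A port to R or any other environment only has to reproduce the 32-bit integer arithmetic and the three methods above to return the same sample for the same seed.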