Erratic seed behavior with rbinom(prob=0.5) in R

野性不改 2021-01-07 16:12

I have found what I would consider erratic behavior (but for which I hope there is a simple explanation) in R's use of seeds in conjunction with rbinom().

2 answers
  •  悲&欢浪女
    2021-01-07 16:47

    I'm going to take a contrarian position on this question and claim that the expectations are not appropriate and are not supported by the documentation. The documentation does not make any claim about what side effects (specifically on .Random.seed) can be expected by calling rbinom, or how those side effects may or may not be the same in various cases.

    rbinom has three parameters: n, size, and prob. Your expectation is that, for a random seed set before calling rbinom, .Random.seed will be the same after calling rbinom for a given n and any values of size and prob (or maybe any finite values of size and prob). You certainly realize that it would be different for different values of n. rbinom doesn't guarantee that or imply that.

    Without knowing the internals of the function, this can't be known; as the other answer showed, the algorithm is different based on the product of size and prob. And the internals may change so these specific details may change.
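
    As a rough illustration of that point, the sketch below records .Random.seed after two calls to rbinom that share n but differ in size * prob. The helper name seed_after and the argument values are hypothetical, chosen only so that one product is small and the other large; whether the two states match depends on the internals of the R version in use.

```r
# Hypothetical helper: reset the seed, call rbinom, return the RNG state afterwards.
seed_after <- function(n, size, prob, seed = 123) {
  set.seed(seed)
  rbinom(n, size = size, prob = prob)
  .Random.seed
}

small <- seed_after(10, size = 10,   prob = 0.5)  # size * prob = 5
large <- seed_after(10, size = 1000, prob = 0.5)  # size * prob = 500

# Same seed and same n, but nothing in the documentation promises this is TRUE.
identical(small, large)
```

    If the comparison returns FALSE, the two calls consumed different amounts of randomness even though they produced the same number of draws.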

    At least, in this case, the resulting .Random.seed will be the same after every call of rbinom which has the same n, size and prob. I can construct a pathological function for which this is not even true:

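    seedtweak <- function() {
      # Draw a random number only when the current tenth of a second is odd,
      # so the RNG state is advanced on some calls and left alone on others.
      if (floor(as.POSIXlt(Sys.time())$sec * 10) %% 2 == 1) {
        runif(1)
      }
      invisible(NULL)
    }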
    

    This function checks whether the tenths digit of the current second is odd or even to decide whether or not to draw a random number. Run this function and .Random.seed may or may not change:

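    rs <- replicate(10, {
      Sys.sleep(0.05)  # spread the calls over several tenths of a second
      set.seed(123)
      seedtweak()
      .Random.seed
    })
    # TRUE only if every run left .Random.seed in the same state
    all(apply(rs, 1, function(x) length(unique(x)) == 1))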
    

    The best you can (should?) hope for is that a given set of code with all the inputs/parameters the same (including the seed) will always give identical results. Expecting identical results when only most (or only some) of the parameters are the same is not realistic unless all the functions called make those guarantees.
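
    A minimal sketch of that baseline guarantee: with the seed and every argument held fixed, two runs produce the same draws and leave the generator in the same state. The function name run_once and the argument values are hypothetical.

```r
# Hypothetical wrapper: fixed seed, fixed arguments; returns the draws and the final RNG state.
run_once <- function() {
  set.seed(42)
  draws <- rbinom(5, size = 20, prob = 0.5)
  list(draws = draws, state = .Random.seed)
}

first  <- run_once()
second <- run_once()

identical(first$draws, second$draws)  # TRUE
identical(first$state, second$state)  # TRUE
```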
