Difference between runif and sample in R?

Submitted by ぃ、小莉子 on 2019-12-03 15:24:55

Consider the following code and output:

> set.seed(1)
> round(runif(10,1,100))
 [1] 27 38 58 91 21 90 95 66 63  7
> set.seed(1)
> sample(1:100, 10, replace=TRUE)
 [1] 27 38 58 91 21 90 95 67 63  7

This strongly suggests that, when asked to do the same thing, the two functions give nearly the same output (interestingly, it is round that seems to match here rather than floor or ceiling). The main differences are in the defaults. If you leave the defaults unchanged, both draw from a uniform distribution, though sample draws from a discrete uniform and, by default, without replacement.

Edit

The more correct comparison is:

> ceiling(runif(10,0,100))
 [1] 27 38 58 91 21 90 95 67 63  7

instead of using round.

We can even step that up a notch:

> set.seed(1)
> tmp1 <- sample(1:100, 1000, replace=TRUE)
> set.seed(1)
> tmp2 <- ceiling(runif(1000,0,100))
> all.equal(tmp1,tmp2)
[1] TRUE
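The transcript above comes from an older R release. Since R 3.6.0, sample() defaults to a "Rejection" sampler, so the same seed no longer yields matching draws out of the box. The sketch below assumes R 3.6.0 or later and temporarily switches back to the older "Rounding" sampler through RNGkind(); the variable names old and same are only illustrative.

```r
# Assumption: R >= 3.6.0, where RNGkind() accepts sample.kind.
# "Rounding" is the pre-3.6.0 sampler; R warns that it is non-uniform,
# hence suppressWarnings().
old <- suppressWarnings(RNGkind(sample.kind = "Rounding"))

set.seed(1)
tmp1 <- sample(1:100, 1000, replace = TRUE)
set.seed(1)
tmp2 <- ceiling(runif(1000, 0, 100))

same <- isTRUE(all.equal(tmp1, tmp2))
print(same)

# Restore whichever sampler was active before (third element of RNGkind()).
invisible(suppressWarnings(RNGkind(sample.kind = old[3])))
```

With the default "Rejection" sampler the two vectors should not be expected to agree element by element, although both are still uniform on 1 to 100.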

Of course, if the prob argument to sample is used (with weights that are not all equal), the result will no longer be uniform.
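As a rough illustration of the weighted case, the sketch below draws from 1:3 with unequal weights and tabulates the observed proportions. The weights, seed, and sample size are arbitrary choices made for this example.

```r
set.seed(123)

# Illustrative weights; any non-negative vector with unequal entries would do.
w <- c(0.7, 0.2, 0.1)
draws <- sample(1:3, 10000, replace = TRUE, prob = w)

# Observed share of each value; these should sit close to w, not to 1/3 each.
freq <- tabulate(draws, nbins = 3) / length(draws)
print(round(freq, 2))
```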

sample draws from a fixed set of values. If the first argument is a single number n (with n >= 1), it draws integers from 1:n instead of returning that number.

On the other hand, runif returns a sample from a real-valued range.

 > sample(c(1,2,3), 1)
 [1] 2
 > runif(1, 1, 3)
 [1] 1.448551
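The length-1 behaviour of sample can be surprising when the input vector happens to shrink to a single number. A minimal sketch follows, using the resample helper shown in the examples of ?sample; the variable names surprise and safe are just for illustration.

```r
# Helper from the examples in ?sample: always indexes into x,
# even when x has length 1.
resample <- function(x, ...) x[sample.int(length(x), ...)]

set.seed(42)
surprise <- sample(10, 5)   # x is a single number, so this draws from 1:10
safe     <- resample(10, 1) # x is treated as the one-element vector c(10)

print(surprise)
print(safe)
```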

sample() runs faster than ceiling(runif()). This is useful to know when running many simulations or bootstrapping.

A crude timing script that compares four equivalent approaches:

n <- 100                     # sample size
m <- 10000                   # number of simulations
system.time(sample(n, size = n * m, replace = TRUE))  # faster than ceiling/runif
system.time(ceiling(runif(n * m, 0, n)))
system.time(ceiling(n * runif(n * m)))
system.time(floor(runif(n * m, 1, n + 1)))

The relative time advantage grows with n and m, but take care not to fill memory!

By the way, don't use round() to convert a uniformly distributed continuous value to a uniformly distributed integer: the two end values are selected only half as often as they should be.
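The end-value problem with round() can be checked empirically. The sketch below (seed, range, and sample size are arbitrary) compares how often the end values 1 and 10 appear relative to the interior values, once with round() and once with ceiling().

```r
set.seed(2024)
N <- 100000

# round() maps [1, 1.5) to 1 and [9.5, 10] to 10: half-width bins at the ends.
rounded <- tabulate(round(runif(N, 1, 10)), nbins = 10)

# ceiling() on (0, 10) gives every integer 1..10 a bin of width 1.
ceiled <- tabulate(ceiling(runif(N, 0, 10)), nbins = 10)

# Frequency of each end value relative to the average interior value.
ratio_rounded <- rounded[c(1, 10)] / mean(rounded[2:9])
ratio_ceiled  <- ceiled[c(1, 10)]  / mean(ceiled[2:9])

print(round(ratio_rounded, 2))
print(round(ratio_ceiled, 2))
```

With round() the ratios should come out near one half, while with ceiling() they should be close to one.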
