Difference between runif and sample in R?

In terms of probability distribution they use? I know that runif gives fractional numbers and sample gives whole numbers, but what I am interested in is if sample also use the 'uniform probability distribution'?

Consider the following code and output:

> set.seed(1)
> round(runif(10,1,100))
 [1] 27 38 58 91 21 90 95 66 63  7
> set.seed(1)
> sample(1:100, 10, replace=TRUE)
 [1] 27 38 58 91 21 90 95 67 63  7

This strongly suggests that when asked to do the same thing, the 2 functions give pretty much the same output (though interestingly it is round that gives the same output rather than floor or ceiling). The main differences are in the defaults and if you don't change those defaults then both would give something called a uniform (though sample would be considered a discrete uniform and by default without replacement).

Edit

The more correct comparison is:

> ceiling(runif(10,0,100))
 [1] 27 38 58 91 21 90 95 67 63  7

instead of using round.

We can even step that up a notch:

> set.seed(1)
> tmp1 <- sample(1:100, 1000, replace=TRUE)
> set.seed(1)
> tmp2 <- ceiling(runif(1000,0,100))
> all.equal(tmp1,tmp2)
[1] TRUE

Of course if the probs argument to sample is used (with not all values equal), then it will no longer be uniform.

sample samples from a fixed set of inputs, and if a length-1 input is passed as the first argument, returns an integer output(s).

On the other hand, runif returns a sample from a real-valued range.

 > sample(c(1,2,3), 1)
 [1] 2
 > runif(1, 1, 3)
 [1] 1.448551

sample() runs faster than ceiling(runif()) This is useful to know if doing many simulations or bootstrapping.

Crude time trial script that time tests 4 equivalent scripts:

n<- 100                     # sample size
m<- 10000                   # simulations
system.time(sample(n, size=n*m, replace =T))  # faster than ceiling/runif 
system.time(ceiling(runif(n*m, 0, n)))
system.time(ceiling(n * runif(n*m)))
system.time(floor(runif(n*m, 1, n+1)))

The proportional time advantage increases with n and m but watch you don't fill memory!

BTW Don't use round() to convert uniformly distributed continuous to uniformly distributed integer since terminal values get selected only half the time they should.

来源：https://stackoverflow.com/questions/26978281/difference-between-runif-and-sample-in-r

标签

random-sample