It seems that, for the same distribution, drawing random samples from numpy.random is faster than doing so from scipy.stats.-.rvs. I was wondering what
I ran into this today and wanted to add some timing details to this question. I saw what joon mentioned: random numbers from the normal distribution in particular were generated much more quickly with numpy than with rvs in scipy.stats. As user333700 mentioned, rvs carries some overhead, but if you are generating an array of random values the gap to numpy narrows. Here is a Jupyter timing example:
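from scipy.stats import norm
import numpy as np

# Frozen standard normal distribution (mean 0, standard deviation 1)
n = norm(0, 1)

# One sample per call
%timeit -n 1000 n.rvs(1)[0]
%timeit -n 1000 np.random.normal(0,1)

# 1000 samples: one rvs call, a Python loop over numpy, one numpy call
%timeit -n 1000 a = n.rvs(1000)
%timeit -n 1000 a = [np.random.normal(0,1) for i in range(0, 1000)]
%timeit -n 1000 a = np.random.randn(1000)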
On my run, with numpy version 1.11.1 and scipy 0.17.0, this outputs:
1000 loops, best of 3: 46.8 µs per loop
1000 loops, best of 3: 492 ns per loop
1000 loops, best of 3: 115 µs per loop
1000 loops, best of 3: 343 µs per loop
1000 loops, best of 3: 61.9 µs per loop
So generating a single random sample with rvs was almost 100x slower than using numpy directly (46.8 µs versus 492 ns). However, if you are generating an array of values, then the gap narrows to under 2x (115 versus 61.9 microseconds).
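For anyone who wants to check the arithmetic behind those two ratios, here is a small sketch that recomputes them from the timings reported above. The variable names are just labels I made up for the five measurements.

```python
# Timings copied from the run above, converted to seconds.
single_rvs = 46.8e-6     # n.rvs(1)[0]
single_numpy = 492e-9    # np.random.normal(0, 1)
array_rvs = 115e-6       # n.rvs(1000)
array_numpy = 61.9e-6    # np.random.randn(1000)

# How many times slower rvs is in each scenario.
single_ratio = single_rvs / single_numpy
array_ratio = array_rvs / array_numpy

print(f"single sample: rvs is {single_ratio:.1f}x slower")  # about 95x
print(f"1000 samples:  rvs is {array_ratio:.2f}x slower")   # about 1.86x
```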
If you can avoid it, you probably shouldn't call rvs many times in a loop to get one random value at a time.
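As a rough illustration of that advice, here is a minimal sketch that runs outside Jupyter using the standard timeit module. The function names one_at_a_time and batched are mine, not from scipy or numpy, and the actual numbers will depend on your machine and library versions.

```python
import timeit

import numpy as np
from scipy.stats import norm

dist = norm(0, 1)   # frozen standard normal, as in the timings above
n_samples = 1000


def one_at_a_time():
    # Pattern to avoid: pays the rvs call overhead once per sample.
    return np.array([dist.rvs() for _ in range(n_samples)])


def batched():
    # Single call: the rvs overhead is paid once for the whole array.
    return dist.rvs(size=n_samples)


repeats = 5
loop_time = timeit.timeit(one_at_a_time, number=repeats) / repeats
batch_time = timeit.timeit(batched, number=repeats) / repeats

print(f"loop over rvs(): {loop_time * 1e6:.1f} µs per {n_samples} samples")
print(f"rvs(size=...):   {batch_time * 1e6:.1f} µs per {n_samples} samples")
```

Both functions return an array of 1000 draws, so if the surrounding code can work with a whole array at once, the batched form should be a drop-in replacement.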