Difference between random draws from scipy.stats…rvs and numpy.random

夕颜 2021-02-02 16:40

It seems that, for the same distribution, drawing random samples with numpy.random is faster than with scipy.stats.-.rvs. I was wondering what

2 Answers
  •  南方客
    南方客 (original poster)
    2021-02-02 17:06

    I ran into this today and wanted to add some timing details to this question. I saw what joon mentioned: random numbers from the normal distribution, in particular, are generated much more quickly with numpy than with rvs in scipy.stats. As user333700 mentioned, rvs carries some overhead, but if you generate an array of random values in a single call, the gap to numpy narrows. Here is a Jupyter timing example:

    from scipy.stats import norm
    import numpy as np

    n = norm(0, 1)  # frozen standard normal distribution

    # One sample per call
    %timeit -n 1000 n.rvs(1)[0]
    %timeit -n 1000 np.random.normal(0,1)

    # 1000 samples: one rvs call, a Python loop over numpy, one numpy call
    %timeit -n 1000 a = n.rvs(1000)
    %timeit -n 1000 a = [np.random.normal(0,1) for i in range(0, 1000)]
    %timeit -n 1000 a = np.random.randn(1000)
    

    This, on my run with numpy version 1.11.1 and scipy 0.17.0, outputs:

    1000 loops, best of 3: 46.8 µs per loop
    1000 loops, best of 3: 492 ns per loop
    1000 loops, best of 3: 115 µs per loop
    1000 loops, best of 3: 343 µs per loop
    1000 loops, best of 3: 61.9 µs per loop
    

    So generating a single random sample with rvs was almost 100x slower than using numpy directly (46.8 µs versus 492 ns). However, if you generate an array of values in one call, then the gap closes to a bit under 2x (115 versus 61.9 microseconds).

    If you can avoid it, probably don't call rvs to get one random value a ton of times in a loop.
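
For anyone who wants to try this outside Jupyter, here is a minimal sketch of the same comparison using the standard-library timeit module. The function names, the batch size N_SAMPLES and the repeat count are my own choices for illustration, not part of the original measurement, and the timings will differ by machine and by numpy/scipy version.

```python
import timeit

import numpy as np
from scipy.stats import norm

N_SAMPLES = 1000  # hypothetical batch size, chosen to match the 1000-sample runs
REPEATS = 5       # hypothetical repeat count, kept small so the script runs quickly

frozen = norm(0, 1)  # frozen standard normal distribution


def draw_one_at_a_time(n):
    """Slow pattern: pay the rvs call overhead once per sample."""
    return np.array([frozen.rvs() for _ in range(n)])


def draw_batch_scipy(n):
    """One rvs call for the whole array, so the overhead is paid once."""
    return frozen.rvs(size=n)


def draw_batch_numpy(n):
    """Direct numpy draw from the same standard normal distribution."""
    return np.random.normal(0, 1, size=n)


if __name__ == "__main__":
    for fn in (draw_one_at_a_time, draw_batch_scipy, draw_batch_numpy):
        seconds = timeit.timeit(lambda: fn(N_SAMPLES), number=REPEATS)
        print(f"{fn.__name__}: {seconds / REPEATS * 1e6:.1f} µs per call")
```

If the per-sample loop shows up in a profile, replacing it with a single rvs(size=...) call or a direct numpy call is usually the simplest change to try first.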
