Difference between random draws from scipy.stats…rvs and numpy.random

夕颜 2021-02-02 16:40

It seems that, for the same distribution, drawing random samples from numpy.random is faster than drawing them from scipy.stats.-.rvs. I was wondering what

2 answers
  •  走了就别回头了
    2021-02-02 17:22

    scipy.stats.uniform actually uses numpy. Here is the corresponding function in stats (mtrand is an alias for numpy.random):

    class uniform_gen(rv_continuous):
        def _rvs(self):
            return mtrand.uniform(0.0,1.0,self._size)
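
One way to check the delegation described above is to seed two generators identically and compare the draws. This is only a sketch: the seed value and the variable names are arbitrary choices made for the illustration, and it assumes a SciPy version whose rvs accepts a random_state argument.

```python
import numpy as np
from scipy import stats

seed = 12345  # arbitrary seed, chosen only for this illustration

# Five uniforms drawn through the scipy.stats interface.
scipy_draws = stats.uniform.rvs(size=5, random_state=np.random.RandomState(seed))

# Five uniforms drawn directly from an identically seeded NumPy generator.
numpy_draws = np.random.RandomState(seed).uniform(0.0, 1.0, 5)

# If scipy.stats.uniform delegates to NumPy, both arrays should agree.
same_stream = np.allclose(scipy_draws, numpy_draws)
print(same_stream)
```

If the two arrays agree, the only extra work on the scipy.stats side is the argument checking and the loc/scale handling around the NumPy call.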
    

    scipy.stats adds a bit of overhead for error checking and for making the interface more flexible. The speed difference should be minimal as long as you don't call uniform.rvs in a loop for each draw. Instead, you can get all the random draws at once, for example 10 million:

    >>> from scipy import stats
    >>> rvs = stats.uniform.rvs(size=(10000, 1000))
    >>> rvs.shape
    (10000, 1000)
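
To see the per-call overhead for yourself, a rough timing comparison along the following lines may help. It is a sketch, not a benchmark: the helper names draw_in_loop and draw_at_once and the sample count are made up for the example, and the timings will vary by machine and SciPy version.

```python
import timeit

import numpy as np
from scipy import stats

n = 2000  # kept small so the slow variant finishes quickly


def draw_in_loop(n):
    # One rvs call per draw: the interface overhead is paid n times.
    return np.ravel([stats.uniform.rvs() for _ in range(n)])


def draw_at_once(n):
    # One rvs call for all draws: the interface overhead is paid once.
    return stats.uniform.rvs(size=n)


loop_seconds = timeit.timeit(lambda: draw_in_loop(n), number=1)
batch_seconds = timeit.timeit(lambda: draw_at_once(n), number=1)
print(f"loop:  {loop_seconds:.4f} s")
print(f"batch: {batch_seconds:.4f} s")
```

Both functions return an array of n draws; only the number of rvs calls differs.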
    

    Here is the long answer, which I wrote a while ago:

    The basic random numbers in scipy/numpy are created by the Mersenne Twister PRNG in numpy.random. The random number generators for the distributions in numpy.random are written in Cython/Pyrex and are pretty fast.

    scipy.stats doesn't have a random number generator of its own; random numbers are obtained in one of three ways:

    • directly from numpy.random, e.g. normal, t, ...; this is pretty fast

    • by transformation of other random numbers that are available in numpy.random; this is also pretty fast, because it operates on entire arrays of numbers

    • generic: the only generic way of generating random numbers is to use the ppf (inverse cdf) to transform uniform random numbers. This is relatively fast if there is an explicit expression for the ppf, but it can be very slow if the ppf has to be calculated indirectly. For example, if only the pdf is defined, then the cdf is obtained through numerical integration and the ppf is obtained through an equation solver. So a few distributions are very slow.
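
The generic route in the last bullet can be sketched with a toy distribution. Everything named here is hypothetical and made up for the illustration: the density f(x) = 2x on [0, 1] and the classes PdfOnly and WithPpf are not part of scipy.stats. The first class defines only the pdf, so rvs has to fall back on numerical integration plus root finding; the second supplies the closed-form cdf x**2 and ppf sqrt(q).

```python
import numpy as np
from scipy import stats


class PdfOnly(stats.rv_continuous):
    # Hypothetical distribution on [0, 1] with density f(x) = 2x.
    # Only the pdf is given, so cdf and ppf use the generic numerical fallbacks.
    def _pdf(self, x):
        return 2.0 * x


class WithPpf(stats.rv_continuous):
    # Same density, plus closed forms: cdf F(x) = x**2 and ppf F^-1(q) = sqrt(q).
    def _pdf(self, x):
        return 2.0 * x

    def _cdf(self, x):
        return x ** 2

    def _ppf(self, q):
        return np.sqrt(q)


slow_dist = PdfOnly(a=0.0, b=1.0, name="pdf_only")
fast_dist = WithPpf(a=0.0, b=1.0, name="with_ppf")

# Keep the pdf-only sample small: each draw needs a root search over an integral.
slow_draws = slow_dist.rvs(size=100, random_state=0)
fast_draws = fast_dist.rvs(size=100_000, random_state=0)

# Both sample means should be near the theoretical mean of 2/3.
print(slow_draws.mean(), fast_draws.mean())
```

Timing the two rvs calls (for instance with timeit) should show the pdf-only version costing far more per draw, which is the slowdown the answer warns about.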
