SciPy: generating custom random variable from PMF

Question

I'm trying to generate random variables according to a certain ugly distribution in Python. I have an explicit expression for the PMF, but it involves products that make it unpleasant to obtain and invert the CDF (see the code below for the explicit form of the PMF).

In essence, I'm trying to define a random variable in Python by its PMF and then have built-in code do the hard work of sampling from the distribution. I know how to do this if the support of the RV is finite, but here the support is countably infinite.

The code I am currently trying to run as per @askewchan's advice below is:

import scipy as sp
import numpy as np

class limitrv_gen(sp.stats.rv_discrete):
    def _pmf(self,k,param):
        num = np.arange(1+param, k+param, 1)
        denom = np.arange(3+2*param, k+3+2*param, 1)

        p = (2+param)*(np.prod(num)/np.prod(denom))

        return p

pa_limit = limitrv_gen()
print pa_limit.rvs(alpha,n=1)

However, running this raises the following error:

File "limiting_sim.py", line 42, in _pmf
    num = np.arange(1+param, k+param, 1)
TypeError: only length-1 arrays can be converted to Python scalars

Basically, it seems that the np.arange() list isn't working somehow inside the def _pmf() function. I'm at a loss to see why. Can anyone enlighten me here and/or point out a fix?

EDIT 1: cleared up some questions by askewchan, edits reflected above.

EDIT 2: askewchan suggested an interesting approximation using the factorial function, but I'm looking for an exact solution, such as the one that I'm trying to get to work with np.arange.

Answer 1:

You should be able to subclass rv_discrete like so:

from scipy.stats import rv_discrete

class mydist_gen(rv_discrete):
    def _pmf(self, n, param):
        return yourpmf(n, param)

Then you can create a distribution instance with:

mydist = mydist_gen()

And generate samples with:

mydist.rvs(param, size=1000)

Or you can then create a frozen distribution object with:

mydistp = mydist(param)

And finally generate samples with:

mydistp.rvs(1000)
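
As a rough, self-contained illustration of the steps above, here is a minimal sketch that uses a simple geometric-style PMF as a stand-in for yourpmf. The class name shifted_geom_gen, the parameter p and the choice a=1 (support starting at k = 1) are assumptions made only for this example and do not come from the question:

```python
import numpy as np
from scipy import stats


class shifted_geom_gen(stats.rv_discrete):
    """Illustrative distribution on k = 1, 2, ... with PMF p * (1 - p)**(k - 1)."""

    def _pmf(self, k, p):
        # k can arrive as a NumPy array, so stick to vectorized operations
        return p * (1.0 - p) ** (k - 1)


# a=1 sets the lower end of the support; the upper end defaults to infinity
shifted_geom = shifted_geom_gen(a=1, name="shifted_geom")

# Option 1: pass the shape parameter on every call
samples = shifted_geom.rvs(0.3, size=1000, random_state=0)

# Option 2: freeze the parameter once and reuse the frozen object
frozen = shifted_geom(0.3)
frozen_samples = frozen.rvs(size=1000, random_state=0)
```

Only _pmf is defined here, so SciPy falls back on generic routines that sum the PMF to get the CDF and then search it numerically. That is convenient, but it can be slow for large sample sizes.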

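With your example, this should work, since factorial broadcasts over arrays automatically. But it can fail for large enough alpha, because the factorials overflow to inf in double precision once their argument exceeds 170: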

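import numpy as np
from scipy import stats
from scipy.special import factorial  # scipy.misc.factorial was removed in later SciPy releases

class limitrv_gen(stats.rv_discrete):
    def _pmf(self, k, alpha):
        # equivalent to np.prod(np.arange(1+alpha, k+alpha)), but works on array k
        num = factorial(k+alpha-1) / factorial(alpha)
        # equivalent to np.prod(np.arange(3+2*alpha, k+3+2*alpha))
        denom = factorial(k + 2 + 2*alpha) / factorial(2 + 2*alpha)

        return (2+alpha) * num / denom

# a=1: the support starts at k = 1 (at k = 0 the formula is not a probability)
pa_limit = limitrv_gen(a=1)
# kept small here: the float factorials overflow once k + 2 + 2*alpha exceeds 170
alpha = 10
pa_limit.rvs(alpha, size=10)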
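
On the original TypeError: rv_discrete can hand _pmf a whole array of k values (for instance when it sums the PMF to build the CDF), while np.arange needs scalar endpoints. One possible way to keep the exact products and still accept arrays is to write each product as a ratio of gamma functions and evaluate it in log space with scipy.special.gammaln, which also avoids overflow for large alpha. This is a sketch and a substitution of mine, not part of the answer above. It assumes the support starts at k = 1 (at k = 0 the expression evaluates to 2 + alpha, which is not a probability), and pmf_by_product is a hypothetical helper used only to cross-check against the products from the question:

```python
import numpy as np
from scipy import stats
from scipy.special import gammaln


class limitrv_gen(stats.rv_discrete):
    def _pmf(self, k, alpha):
        # log of prod(arange(1+alpha, k+alpha)) = log Gamma(k+alpha) - log Gamma(1+alpha)
        log_num = gammaln(k + alpha) - gammaln(1 + alpha)
        # log of prod(arange(3+2*alpha, k+3+2*alpha))
        log_denom = gammaln(k + 3 + 2 * alpha) - gammaln(3 + 2 * alpha)
        return (2 + alpha) * np.exp(log_num - log_denom)


def pmf_by_product(k, alpha):
    """Scalar-k reference that uses the explicit products from the question."""
    num = np.prod(np.arange(1 + alpha, k + alpha))
    denom = np.prod(np.arange(3 + 2 * alpha, k + 3 + 2 * alpha))
    return (2 + alpha) * num / denom


# Assumption: the support is k = 1, 2, ..., hence a=1
pa_limit = limitrv_gen(a=1, name="limitrv")
alpha = 100.0
samples = pa_limit.rvs(alpha, size=10, random_state=0)
```

Because gammaln works elementwise on arrays and the subtraction happens before exponentiating, the intermediate values stay finite even when the individual products would overflow a float.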

Source: https://stackoverflow.com/questions/23276631/scipy-generating-custom-random-variable-from-pmf

Tags

python

numpy

scipy