Why does scipy.integrate.quad fail for some intervals of this integral?

Question

To reproduce:

# Use scipy to create a random variable with p.d.f. f(x) = 2x for x in [0,1] and 0 otherwise
import numpy as np  # needed for np.inf below
from scipy.integrate import quad
from scipy.stats import rv_continuous

class custom_rv(rv_continuous):
    "custom distribution"
    def _pdf(self, x):
        # quad passes scalars, so a plain if/else is fine here
        if x >= 0.0 and x <= 1.0:
            return 2*x
        else:
            return 0.0

rv = custom_rv(name='2x')
# Integrate the p.d.f. over three intervals that all contain [0,1]
print(quad(rv._pdf, -10.0, 10.0))
print(quad(rv._pdf, -5.0, 5.0))
print(quad(rv._pdf, -np.inf, np.inf))

Output:

(0.0, 0.0) # for [-10,10]
(1.0, 1.1102230246251565e-15) # for [-5,5]
(1.0, 2.5284034865791227e-09) # for [-inf, inf]

Context:

I'm trying to create a random variable with a custom p.d.f.: f(x) = 2*x if x is in [0,1], and f(x) = 0 otherwise.

This random variable didn't work, so I tried to debug it by checking the integral of the p.d.f. using quad.

What I found was that the integral was not consistent. For some intervals like (-inf, inf) and (-5,5), it's 1. However, for intervals like (-10,10), it's evaluated to be zero, which is quite unexpected.

Any idea what went wrong here?

Answer 1:

Have a look at the quad function documentation. If you go all the way to the bottom, you will read:

Be aware that pulse shapes and other sharp features as compared to the size of the integration interval may not be integrated correctly using this method. A simplified example of this limitation is integrating a y-axis reflected step function with many zero values within the integrals bounds.

The example provided is:

>>> y = lambda x: 1 if x<=0 else 0
>>> integrate.quad(y, -1, 1)
(1.0, 1.1102230246251565e-14)
>>> integrate.quad(y, -1, 100)
(1.0000000002199108, 1.0189464580163188e-08)
>>> integrate.quad(y, -1, 10000)
(0.0, 0.0)
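
One way to see where the exact zeros come from is to check where a first pass of sample points would land. The sketch below is an illustration and does not call quad itself. It assumes that, on a finite interval, the first pass uses the 21-point Gauss-Kronrod rule (as the underlying QUADPACK routine is understood to do). The node values are rounded, and the helper names first_pass_nodes and nonzero_samples are made up for this example.

```python
# Positive abscissae of the 21-point Gauss-Kronrod rule on [-1, 1] (rounded).
# Assumption: quad's first pass on a finite interval samples at these nodes.
KRONROD_21 = [
    0.9956571630, 0.9739065285, 0.9301574914, 0.8650633667,
    0.7808177266, 0.6794095683, 0.5627571347, 0.4333953941,
    0.2943928627, 0.1488743390,
]

def first_pass_nodes(a, b):
    """Map the 21 reference nodes from [-1, 1] onto [a, b] (hypothetical helper)."""
    center, half = (a + b) / 2.0, (b - a) / 2.0
    offsets = [0.0] + KRONROD_21 + [-x for x in KRONROD_21]
    return sorted(center + half * x for x in offsets)

def step(x):
    """Step function from the documentation example."""
    return 1 if x <= 0 else 0

def pdf(x):
    """Density from the question: 2x on [0, 1], 0 elsewhere."""
    return 2 * x if 0.0 <= x <= 1.0 else 0.0

def nonzero_samples(f, a, b):
    """Count first-pass sample points where f is nonzero (hypothetical helper)."""
    return sum(1 for x in first_pass_nodes(a, b) if f(x) != 0)

for f, a, b in [(step, -1, 1), (step, -1, 100), (step, -1, 10000),
                (pdf, -5, 5), (pdf, -10, 10)]:
    print(f.__name__, (a, b), nonzero_samples(f, a, b))
```

Under that assumption, the intervals that returned (0.0, 0.0) are the ones where no sample point sees a nonzero value, so the integrator has no evidence that the integrand is anything but zero and stops with a zero error estimate.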

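So the idea is that your function is not "smooth" enough: it is nonzero only on a small part of the integration interval, and quad evaluates it at a finite set of points that can miss that part entirely. That's why you can get surprising results.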
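
Two possible workarounds, sketched here as suggestions that are not part of the original answer: pass the locations of the kinks through quad's points argument (which only works with finite limits), or integrate only over the support of the density. The standalone function pdf below is a stand-in for rv._pdf from the question.

```python
from scipy.integrate import quad

def pdf(x):
    """Stand-in for rv._pdf: f(x) = 2x on [0, 1], 0 elsewhere."""
    return 2.0 * x if 0.0 <= x <= 1.0 else 0.0

# Option 1: tell quad where the integrand changes behaviour.
# `points` may only be used when both limits are finite.
value_points, err_points = quad(pdf, -10.0, 10.0, points=[0.0, 1.0])

# Option 2: integrate only over [0, 1], where the integrand is smooth.
value_support, err_support = quad(pdf, 0.0, 1.0)

print(value_points, value_support)
```

Both values should come out close to 1, since each subinterval now contains either a smooth piece of the function or only zeros.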

Source: https://stackoverflow.com/questions/61005609/why-does-scipy-integrate-quad-fail-for-some-interval-of-this-integral

Tags

python

random

scipy

probability-density