Question
I want to integrate a kernel density estimate in order to get a kernel estimate of the cdf.
This is my code:
set.seed(1)
z <- rnorm(250)
pdf <- approxfun(density(z, bw = "SJ"), yleft = 0, yright = 0)
cdf <- function(b) {
  integrate(pdf, -Inf, b)$value
}
x <- seq(-20, 20, 0.1)
plot(x, sapply(x, cdf), type = "l", xlab = "x", ylab = "CDF", ylim = c(0, 1))
This produces a plot of the estimated CDF.
In the plot, the CDF drops to zero at around x = 18, which clearly should not happen.
Why does this happen and how can I avoid it?
Answer 1:
Use a large finite number for the lower integration limit instead of -Inf:
cdf <- function(b)
{
  integrate(pdf, -20, b)$value
}
x <- seq(-20, 20, 0.1)
plot(x, sapply(x, cdf), type = "l", xlab = "x", ylab = "CDF", ylim = c(0, 1))
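This happens because R's numerical integration routine, integrate(), isn't that sophisticated and sometimes fails when infinite endpoints are supplied. (The help page says that integrating over an infinite interval explicitly is usually better than using a large finite endpoint. In this case, that advice doesn't work.) The same help page also warns that if the integrand is approximately constant (in particular, zero) over nearly all of its range, the result and error estimate may be seriously wrong. That is likely what happens here, since pdf is exactly zero everywhere outside the grid returned by density().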
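If you would rather not pick a finite endpoint such as -20 by hand, another option is to skip integrate() altogether and accumulate the density over the grid that density() already returns. This is only a minimal sketch, not something from the original answer: the names d, cum and kcdf are made up for illustration, and rescaling so that the curve ends at exactly 1 is an assumption about what you want from the estimate.

```r
set.seed(1)
z <- rnorm(250)
d <- density(z, bw = "SJ")   # d$x is an equally spaced grid, d$y the estimated density

# Cumulative trapezoidal rule over the grid: area of each slice, then a running sum
dx  <- diff(d$x)
cum <- c(0, cumsum(dx * (head(d$y, -1) + tail(d$y, -1)) / 2))

# Assumption: rescale so the estimate ends at exactly 1 (the raw total is only close to 1)
cum <- cum / cum[length(cum)]

# Interpolate between grid points; 0 to the left of the grid, 1 to the right
kcdf <- approxfun(d$x, cum, yleft = 0, yright = 1)

x <- seq(-20, 20, 0.1)
plot(x, kcdf(x), type = "l", xlab = "x", ylab = "CDF", ylim = c(0, 1))
```

Because kcdf is vectorised, there is no need for sapply(), and the curve cannot fall back to zero in the right tail since it is built from a running sum of non-negative slices.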
Source: https://stackoverflow.com/questions/45000231/kernel-cdf-estimate-integral-drops-to-zero