Generating a sample using kernel density estimates in R

Submitted by 社会主义新天地 on 2021-02-04 16:34:05

Question


I need to generate a sample from existing data using kernel density estimates in R. My data contain no negative values (and cannot contain any), but the generated sample does contain negative values.

library(ks)
set.seed(1)
par(mfrow=c(2,1))

x<-rlnorm(100)
hist(x, col="red", freq=F)

y <- rkde(fhat=kde(x=x, h=hpi(x)), n=100)
hist(y, col="green", freq=F)

How can I limit the range of the KDE and of the generated sample?


Answer 1:


rkde has a positive argument:

y <- rkde(
  fhat = kde(x=x, h=hpi(x)), 
  n    = 100, 
  positive = TRUE
)

An alternative would be to transform the data (e.g., with a logarithm) before the estimation, to make it unconstrained, and transform it back after the random number generation.

x2 <- log(x)
y2 <- rkde(fhat=kde(x=x2, h=hpi(x2)), n=100)
y <- exp(y2)
hist(y, col="green", freq=F)
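To make the transform-and-back idea concrete, here is a minimal base R sketch that does not depend on ks. It relies on the fact that a draw from a Gaussian KDE is a resampled data point plus Gaussian noise whose standard deviation equals the bandwidth. The bandwidth selector bw.SJ is an assumption made for this illustration; it is not the same selector as hpi, so the result will not match the ks output exactly.

```r
# Sketch: sample from a Gaussian KDE fitted on the log scale (base R only).
set.seed(1)
x <- rlnorm(100)               # strictly positive data, as in the question

log_x <- log(x)                # move to an unconstrained scale
bw <- bw.SJ(log_x)             # Sheather-Jones bandwidth (assumed choice, not hpi)

n <- 100
# A draw from a Gaussian KDE is a resampled data point plus N(0, bw^2) noise.
centres <- sample(log_x, size = n, replace = TRUE)
y_log <- rnorm(n, mean = centres, sd = bw)

y <- exp(y_log)                # back-transform, so no value can be negative
range(y)
```

Because the noise is added on the log scale, the implied density on the original scale is a mixture of log-normal kernels, which is why it places no mass below zero.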



Answer 2:


If you can accept a density estimate that is not a KDE, then look at the logspline package. It estimates the density in a different way, and it has arguments to set a lower and/or upper bound so that the resulting estimate does not extend beyond the bound and behaves sensibly near it.

Here is a basic example:

set.seed(1)
x<-rlnorm(100)
hist(x, prob=TRUE)

lines(density(x), col='red')

library(ks)
tmp <- kde(x=x, h=hpi(x))
lines(tmp$eval.points, tmp$estimate, col='green')

library(logspline)
lsfit <- logspline(x, lbound=0)
curve( dlogspline(x,lsfit), add=TRUE, col='blue' )

curve( dlnorm, add=TRUE, col='orange' )


You can generate new data points from the fitted density using the rlogspline function and there are also plogspline and qlogspline functions.
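As a rough sketch of how those functions might be used with a fit that has a lower bound of 0, assuming the logspline package is installed (the variable names y_new, p_below_1 and med are only illustrative):

```r
library(logspline)   # assumed to be installed

set.seed(1)
x <- rlnorm(100)
lsfit <- logspline(x, lbound = 0)      # density estimate bounded below at 0

y_new <- rlogspline(100, lsfit)        # new sample from the fitted density
p_below_1 <- plogspline(1, lsfit)      # estimated P(X <= 1)
med <- qlogspline(0.5, lsfit)          # estimated median

range(y_new)                           # should not fall below the bound
```

Since the bound is part of the fitted density itself, no truncation or rejection step is needed after sampling.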



Source: https://stackoverflow.com/questions/16102048/generation-sample-using-kernel-density-estimates-in-r
