Confusion about 2-dimensional kernel density estimation in R

大兔子大兔子 submitted on 2019-12-04 20:31:16

What is a kernel density estimator? Essentially, it places a little normal density curve over every data point (centred on that point) and then adds up all of these little normal densities, scaled so that the total area is 1, to obtain the kernel density estimate.

For the sake of illustration I will add an image of a 1 dimensional kernel density estimator from one of your links.
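The 1-dimensional idea can be written out by hand in a few lines. This is only a minimal sketch: the sample `x`, the evaluation grid `grid`, and the choice of `bw.nrd0()` as bandwidth are assumptions made for illustration, not part of the question.

```r
set.seed(1)
x <- rnorm(50)    # hypothetical sample data
h <- bw.nrd0(x)   # rule-of-thumb bandwidth (the default of density())

# Points at which the estimate is evaluated
grid <- seq(min(x) - 3 * h, max(x) + 3 * h, length.out = 200)

# One little normal curve per data point, each centred on that point.
# Rows correspond to grid points, columns to data points.
bumps <- sapply(x, function(xi) dnorm(grid, mean = xi, sd = h))

# Averaging the curves gives the kernel density estimate on the grid
kde_manual <- rowMeans(bumps)
```

Plotting `kde_manual` against `grid` should give roughly the same curve as `density(x, bw = h)`, which uses a faster approximation of the same sum.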

What about 2 dimensional kernel densities?

library(MASS)  # provides kde2d()
b <- log10(rgamma(1000, 6, 3))
a <- log10(rweibull(1000, 8, 2))
# a and b contain 1000 values each.

density <- kde2d(a,b,n=100) 

The function creates a grid from min(a) to max(a) and from min(b) to max(b), here with n = 100 points along each axis. Instead of fitting a tiny 1D normal density over every value in a or b, kde2d fits a tiny 2D normal density over every observed pair (a[i], b[i]). Just as in the 1-dimensional case, it then adds up all of these densities, and it evaluates that sum at every point of the grid.
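To make this concrete, here is a sketch that rebuilds the 2D estimate by hand and compares it with kde2d. It assumes, as far as I understand the MASS implementation, that kde2d uses a product of two normal kernels whose standard deviations are `bandwidth.nrd()` divided by 4. The names `gx`, `gy`, `hx`, `hy`, `kx`, `ky` and `z_manual` are made up for this illustration.

```r
library(MASS)
set.seed(42)
b <- log10(rgamma(1000, 6, 3))
a <- log10(rweibull(1000, 8, 2))

n  <- 100
gx <- seq(min(a), max(a), length.out = n)  # grid along a
gy <- seq(min(b), max(b), length.out = n)  # grid along b

# Assumed default bandwidths of kde2d (normal reference rule, divided by 4)
hx <- bandwidth.nrd(a) / 4
hy <- bandwidth.nrd(b) / 4

# Kernel values: rows are grid points, columns are data points
kx <- outer(gx, a, function(g, ai) dnorm(g, mean = ai, sd = hx))
ky <- outer(gy, b, function(g, bi) dnorm(g, mean = bi, sd = hy))

# Sum over all data points of the product kernel, divided by the sample size
z_manual <- kx %*% t(ky) / length(a)

density <- kde2d(a, b, n = n)
all.equal(z_manual, density$z)
```

Each column of `kx` and `ky` belongs to one data point, so the matrix product adds up one 2D bump per observation at every grid location.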

What do the colours mean? As @cel pointed out in the comments: the estimated density depends on two variables, so we have three axes now (a, b and the estimated density). One way to visualize 3 axes is by using iso-probability contours, that is, lines along which the estimated density is constant. This sounds fancy, but it is basically the same as the high/low pressure images we know from the weather forecast.

You are using

filled.contour(density, 
    color.palette = colorRampPalette(c('white', 'blue', 'yellow', 'red', 'darkred')))

So from low to high, the plot will be coloured white, blue, yellow, red and eventually darkred for the highest values of the estimated density, with intermediate shades interpolated between these colours. This results in the following plot:
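The mapping from density values to colours can be inspected directly. The sketch below mirrors what I believe are the defaults of filled.contour (`levels = pretty(zlim, nlevels)` with `nlevels = 20`, and one colour per band between consecutive levels). The names `pal`, `levels` and `cols` are only illustrative.

```r
library(MASS)
set.seed(42)
b <- log10(rgamma(1000, 6, 3))
a <- log10(rweibull(1000, 8, 2))
density <- kde2d(a, b, n = 100)

# colorRampPalette() returns a function that interpolates n colours
pal <- colorRampPalette(c('white', 'blue', 'yellow', 'red', 'darkred'))

# Contour levels spanning the range of the estimated density
levels <- pretty(range(density$z), 20)

# One colour per band: the first band is white, the last one darkred
cols <- pal(length(levels) - 1)

# Lower and upper bound of each band, with the colour used to fill it
data.frame(from = head(levels, -1), to = tail(levels, -1), colour = cols)
```

Grid cells whose estimated density falls between `from` and `to` are filled with the colour in the same row, which is why the peak of the distribution ends up dark red.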
