How does dnorm work?

Posted by 泄露秘密 on 2019-12-03 15:01:37

dnorm returns a density, a number that does not by itself translate directly into a probability. Instead it gives the height of a curve that, drawn over the full range of possible values, has a total area underneath it of 1.
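To make the "height of a curve" idea concrete, here is a minimal sketch that writes out the standard normal density formula by hand and compares it with dnorm. The helper name normal_density and the test points in x0 are illustrative choices, not part of base R.

```r
# Normal density written out by hand.
# `normal_density` is a hypothetical helper name used only for this comparison.
normal_density <- function(x, mean = 0, sd = 1) {
  exp(-(x - mean)^2 / (2 * sd^2)) / (sd * sqrt(2 * pi))
}

# A few arbitrary points at which to compare the two
x0 <- c(-4, 0, 1.3)

# TRUE: the hand-written formula matches dnorm up to floating-point error
all.equal(normal_density(x0, 0, 2.5), dnorm(x0, 0, 2.5))

# A density is a height, not a probability: with a small sd it exceeds 1
dnorm(0, mean = 0, sd = 0.1)  # about 3.99
```

The last line is a quick way to see that a density value cannot be read as a probability, since probabilities never exceed 1.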

Consider this. Suppose I make a vector x of evenly spaced numbers from -7.5 to 7.5, 0.1 apart, and compute the density of a normal variable with mean 0 and standard deviation 2.5 at each value of x:

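# The upper limit 7.55 sits just past 7.5, so the grid ends at 7.5
x <- seq(from = -7.5, to = 7.55, by = 0.1)
# Density of a normal variable (mean 0, sd 2.5) at each grid point
y <- dnorm(x, 0, 2.5)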

Summing those densities (which I have stored as y), each multiplied by the distance between points (0.1), approximates the area under the curve, and the result is nearly 1:

> sum(y * 0.1)
[1] 0.9974739

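If you did this properly with calculus, integrating over the whole real line rather than approximating with a finite sum, it would be exactly one. The sum here falls a little short mainly because the grid stops at ±7.5, three standard deviations either side of the mean.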
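As a rough check on that claim, base R's integrate() can do the integration numerically. This is only a sketch: the variable names total and within are chosen here for illustration, and integrate() returns a numerical approximation with a reported error bound rather than a symbolic result.

```r
# Numerical integration of the density over the whole real line.
# Extra arguments (mean, sd) are passed through to dnorm.
total <- integrate(dnorm, lower = -Inf, upper = Inf, mean = 0, sd = 2.5)
total$value   # 1, up to the reported numerical error

# The same integral restricted to -7.5..7.5 (three sd either side of the mean)
within <- integrate(dnorm, lower = -7.5, upper = 7.5, mean = 0, sd = 2.5)
within$value  # about 0.9973
```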

Why is this useful? The area under a section of the curve gives the probability of the variable falling anywhere in that range, even though, as one of your sources points out, the chance of any exact value is technically zero for a continuous variable.
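For instance, the probability of landing within a range can be read off as a difference of two cumulative probabilities from pnorm, and it agrees with the area obtained by summing density heights. The bounds lo and hi below (one standard deviation either side of the mean) and the other variable names are assumptions made purely for illustration.

```r
# Example bounds chosen for illustration: one sd either side of the mean
lo <- -2.5
hi <- 2.5

# Probability of falling between lo and hi, as a difference of cumulative probabilities
p_range <- pnorm(hi, mean = 0, sd = 2.5) - pnorm(lo, mean = 0, sd = 2.5)
p_range  # about 0.683

# The same area approximated by summing density heights times their spacing,
# evaluating dnorm at the midpoint of each of n narrow slices
n <- 5000
step <- (hi - lo) / n
mid <- lo + (seq_len(n) - 0.5) * step
p_sum <- sum(dnorm(mid, 0, 2.5) * step)
p_sum
```

The finer the slices, the closer the summed area gets to the pnorm answer, which is the same idea as the 0.1-wide sum above with a much smaller step.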

Consider this graphic. The area of the shaded space shows the chance of a variable from your normal distribution (mean zero, standard deviation 2.5) being between -7.5 and 4. This leads to many useful applications.

Made with:

library(ggplot2)

# One row per grid point: x and its density y
d <- data.frame(x, y)

ggplot(d, aes(x = x, y = y)) +
  geom_line() +
  geom_point() +
  # Shade the area between the curve and zero for x up to 4
  geom_ribbon(fill = "steelblue", aes(ymax = y), ymin = 0, alpha = 0.5, data = subset(d, x <= 4)) +
  annotate("text", x = -4, y = 0.13, label = "Each point is an individual density\nestimate of dnorm(x, 0, 2.5)") +
  annotate("text", x = -.3, y = 0.02, label = "Filled area under the curve shows the cumulative probability\nof getting a number as high as a given x, in this case 4") +
  ggtitle("Density of a random normal variable with mean zero and standard deviation 2.5")