Question
I have a number of snowfall observations:
x <- c(98.044, 107.696, 146.050, 102.870, 131.318, 170.434, 84.836, 154.686,
162.814, 101.854, 103.378, 16.256)
and I was told that they follow a normal distribution with known standard deviation 25.4 but unknown mean mu. I have to make inference on mu using Bayes' formula.
This is the information on the prior of mu:
mean of snow | 50.8 | 76.2 | 101.6 | 127.0 | 152.4 | 177.8
------------------------------------------------------------
probability  | 0.1  | 0.15 | 0.25  | 0.25  | 0.15  | 0.1
The following is what I have tried so far, but the final line computing post does not work correctly: the resulting plot just gives a horizontal line.
library(LearnBayes)
midpts <- c(seq(50.8, 177.8, 30))
prob <- c(0.1, 0.15, 0.25, 0.25, 0.15, 0.1)
p <- seq(50, 180, length = 40000)
histp <- histprior(p, midpts, prob)
plot(p, histp, type = "l")
# posterior density
post <- round(histp * dnorm(x, 115, 42) / sum(histp * dnorm(x, 115, 42)), 3)
plot(p, post, type = "l")
Answer 1:
My first suggestion is: make sure you understand the statistics behind this. When I saw your
post <- round(histp * dnorm(x, 115, 42) / sum(histp * dnorm(x, 115, 42)), 3)
I reckoned you had mixed up several concepts. This appears to be Bayes' formula, but your code for the likelihood is wrong. The correct likelihood function is
## likelihood function: `L(obs | mu)`
## standard deviation is known (to keep the problem easy) at 25.4
Lik <- function (obs, mu) prod(dnorm(obs, mu, 25.4))
Note, mu is unknown, so it should be an argument of this function; also, the likelihood is the product of the individual probability densities at the observations. Now we can evaluate the likelihood, for example at mu = 100, by
Lik(x, 100)
# [1] 6.884842e-30
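As a side note: with only 12 observations the likelihood is already on the order of 1e-30, so with many more observations the product can underflow to zero. If you ever hit that, a log-scale sketch of the same likelihood avoids it (the name loglik below is just for illustration):
## log-likelihood: sum of log densities instead of a product of densities
loglik <- function (obs, mu) sum(dnorm(obs, mu, 25.4, log = TRUE))
loglik(x, 100)  ## exp() of this recovers Lik(x, 100), up to floating-point error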
For a successful R implementation, we need a vectorized version of the function Lik, that is, a function that can evaluate a vector input for mu rather than just a scalar input. I will just use sapply for vectorization:
vecLik <- function (obs, mu) sapply(mu, Lik, obs = obs)
Let's try
vecLik(x, c(80, 90, 100))
# [1] 6.248416e-34 1.662366e-31 6.884842e-30
Now it is time to obtain the prior distribution for mu. In principle this is a continuous function, but it looks like we want a discrete approximation to it, using histprior from the R package LearnBayes.
## prior distribution for `mu`: `prior(mu)`
midpts <- seq(50.8, 177.8, by = 25.4)  ## the 6 midpoints listed in the prior table
prob <- c(0.1, 0.15, 0.25, 0.25, 0.15, 0.1)
mu_grid <- seq(50, 180, length = 40000) ## a grid of `mu` for discretization
library(LearnBayes)
prior_mu_grid <- histprior(mu_grid, midpts, prob) ## discrete prior density
plot(mu_grid, prior_mu_grid, type = "l")
Before applying Bayes' formula, we first work out the normalizing constant NC in the denominator. This is the integral of Lik(obs | mu) * prior(mu) over mu. But as we have a discrete approximation to prior(mu), we use a Riemann sum to approximate this integral.
delta <- mu_grid[2] - mu_grid[1] ## division size
NC <- sum(vecLik(x, mu_grid) * prior_mu_grid * delta) ## Riemann sum
# [1] 2.573673e-28
Great, with everything ready, we can use Bayes' formula:
posterior(mu | obs) = Lik(obs | mu) * prior(mu) / NC
Again, as prior(mu) is discretized, posterior(mu) is discretized, too.
post_mu <- vecLik(x, mu_grid) * prior_mu_grid / NC
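One extra check worth doing at this point: since NC was computed with the same Riemann sum, the discretized posterior integrates to 1 on the grid, whatever scale histprior puts the prior on.
## sanity check: the Riemann sum of the posterior should be 1 (up to floating-point error)
sum(post_mu * delta)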
Haha, let's sketch the posterior of mu to see the inference result:
plot(mu_grid, post_mu, type = "l")
Wow, this is beautiful!!
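If you also want numerical summaries of the posterior rather than just a picture, the same Riemann-sum idea works. The following is only a sketch (the names post_mean, post_cdf and ci95 are for illustration, and the interval is an approximate equal-tailed one read off the grid):
## posterior mean via Riemann sum
post_mean <- sum(mu_grid * post_mu * delta)
## discretized posterior CDF, then an approximate equal-tailed 95% credible interval
post_cdf <- cumsum(post_mu * delta)
ci95 <- c(mu_grid[which.min(abs(post_cdf - 0.025))],
          mu_grid[which.min(abs(post_cdf - 0.975))])
post_mean
ci95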
Source: https://stackoverflow.com/questions/40189329/toy-r-code-on-bayesian-inference-for-mean-of-a-normal-distribution-data-of-snow