Question
Suppose I have two data.frame objects:
df1 <- data.frame(x = 1:100)
df1$y <- 20 + 0.3 * df1$x + rnorm(100)
df2 <- data.frame(x = 1:200000)
df2$y <- 20 + 0.3 * df2$x + rnorm(200000)
I want to estimate the parameters by maximum likelihood (MLE). With df1 everything works:
LL1 <- function(a, b, mu, sigma) {
  R <- dnorm(df1$y - a - b * df1$x, mu, sigma)
  -sum(log(R))
}
library(stats4)
mle1 <- mle(LL1, start = list(a = 20, b = 0.3, sigma = 0.5),
            fixed = list(mu = 0))
> mle1
Call:
mle(minuslogl = LL1, start = list(a = 20, b = 0.3, sigma = 0.5),
fixed = list(mu = 0))
Coefficients:
a b mu sigma
23.89704180 0.07408898 0.00000000 3.91681382
But if I do the same with df2, I get an error:
LL2 <- function(a, b, mu, sigma) {
  R <- dnorm(df2$y - a - b * df2$x, mu, sigma)
  -sum(log(R))
}
mle2 <- mle(LL2, start = list(a = 20, b = 0.3, sigma = 0.5),
            fixed = list(mu = 0))
Error in optim(start, f, method = method, hessian = TRUE, ...) :
initial value in 'vmmin' is not finite
How can I overcome it?
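Answer 1:
The value of R becomes zero at some point: dnorm underflows to exactly 0 when a residual lies many standard deviations from mu, and log(0) is -Inf. The function to be minimized is then non-finite, and optim stops with an error.
Passing log = TRUE to dnorm handles this better, because the log-density is computed directly instead of taking the log of an underflowed value; see the function LL3 below. The following gives some warnings, but a result is returned, with parameter estimates close to the true parameters.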
require(stats4)
set.seed(123)
e <- rnorm(200000)
x <- 1:200000
df3 <- data.frame(x)
df3$y <- 20 + 0.3 * df3$x + e
LL3 <- function(a, b, mu, sigma) {
  -sum(dnorm(df3$y - a - b * df3$x, mu, sigma, log = TRUE))
}
mle3 <- mle(LL3, start = list(a = 20, b = 0.3, sigma = 0.5),
            fixed = list(mu = 0))
Warning messages:
1: In dnorm(df3$y - a - b * df3$x, mu, sigma, log = TRUE) : NaNs produced
2: In dnorm(df3$y - a - b * df3$x, mu, sigma, log = TRUE) : NaNs produced
3: In dnorm(df3$y - a - b * df3$x, mu, sigma, log = TRUE) : NaNs produced
4: In dnorm(df3$y - a - b * df3$x, mu, sigma, log = TRUE) : NaNs produced
5: In dnorm(df3$y - a - b * df3$x, mu, sigma, log = TRUE) : NaNs produced
6: In dnorm(df3$y - a - b * df3$x, mu, sigma, log = TRUE) : NaNs produced
7: In dnorm(df3$y - a - b * df3$x, mu, sigma, log = TRUE) : NaNs produced
8: In dnorm(df3$y - a - b * df3$x, mu, sigma, log = TRUE) : NaNs produced
> mle3
Call:
mle(minuslogl = LL3, start = list(a = 20, b = 0.3, sigma = 0.5),
fixed = list(mu = 0))
Coefficients:
a b mu sigma
19.999166 0.300000 0.000000 1.001803
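The underflow described above can be reproduced in isolation. This is a minimal sketch, assuming a single standardized residual of 40 (an illustrative value, not taken from the data in the question); the variable names are made up for the example.

```r
# Illustrative residual, 40 standard deviations from the mean (assumed value).
z <- 40

# The density underflows to exactly 0 in double precision, so its log is -Inf.
naive <- log(dnorm(z))

# With log = TRUE the log-density is computed directly and stays finite.
stable <- dnorm(z, log = TRUE)

print(c(naive = naive, stable = stable))
```

Here naive is -Inf, while stable is about -800.92, which is why summing log-densities obtained with log = TRUE keeps the objective finite in cases where log(dnorm(...)) does not.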
Answer 2:
I had the same problem when minimizing a log-likelihood function. After some debugging I found that the problem was in my starting values. They caused one specific matrix to have a determinant of 0, which caused an error when the log of that determinant was taken. optim therefore could not find any "finite" value, but only because the function itself failed at the starting values.
Bottom line: check whether your function returns an error or a non-finite value when you evaluate it at the starting values.
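One way to follow this advice is to evaluate the negative log-likelihood once at the starting values before calling mle. The sketch below assumes a small simulated data set; nll and check_start are hypothetical names introduced for illustration and are not part of stats4 or of the original posts.

```r
# Small simulated data set (assumed, for illustration only).
set.seed(1)
x <- 1:100
y <- 20 + 0.3 * x + rnorm(100)

# Negative log-likelihood of the linear model, computed on the log scale.
nll <- function(a, b, sigma) {
  -sum(dnorm(y - a - b * x, mean = 0, sd = sigma, log = TRUE))
}

# Hypothetical helper: evaluate fn at the starting values and report
# whether the result is a single finite number. An error in fn counts as
# a failed check instead of stopping the script.
check_start <- function(fn, start) {
  value <- tryCatch(do.call(fn, start), error = function(e) NA_real_)
  length(value) == 1 && is.finite(value)
}

# Sensible starting values: the objective is finite.
ok <- check_start(nll, list(a = 20, b = 0.3, sigma = 0.5))

# Deliberately invalid start (sigma = 0): the objective is Inf.
bad <- check_start(nll, list(a = 20, b = 0.3, sigma = 0))

print(c(ok = ok, bad = bad))
```

If the check fails for your own starting values, optim would report the same "initial value in 'vmmin' is not finite" error, so it is worth fixing the start or the objective first.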
PS.: Marius Hofert is completely right. Never suppress warnings.
Source: https://stackoverflow.com/questions/24383746/mle-error-in-r-initial-value-in-vmmin-is-not-finite