Question
This question relates to my previous question here and the data set presented in the paper A New Generalization of Linear Exponential Distribution: Theory and Application. For this data, adapting the code proposed by Ben Bolker, we have
library(stats4)
library(bbmle)
x <- scan(textConnection("115 181 255 418 441 461 516 739 743 789
807 865 924 983 1024 1062 1063 1165 1191 1222 1222 1251 1277 1290 1357 1369 1408 1455 1478 1549 1578 1578 1599 1603 1605 1696 1735 1799 1815 1852"))
dd <- data.frame(x)
dLE <- function(x,lambda,theta,log=TRUE){
r <- log(lambda+theta*x)-(lambda*x+(theta/2)*x^2)
if (log) return(r) else return(exp(r))
}
svec <- list(lambda=0.0009499,theta=0.000002)
m1 <- mle2( x ~ dLE(lambda,theta),
data=dd,
start=svec,
control=list(parscale=unlist(svec)))
coef(m1)
which returns several warnings (NaNs produced) and values for the MLEs that are quite different from those given in Table 2 of the paper. Why is this so, and how can it be rectified?
Answer 1:
After some exploration, my opinion is that the paper simply has incorrect results. The fit I get from optim() looks much better than the one reported in the paper. I could always be missing something; I would suggest that you contact the corresponding author.
(The warnings are not necessarily a problem - they mean that the optimizer has tried some combinations that lead to taking logs of negative numbers along the way, which doesn't mean the final result is wrong - but I agree that it's always a good idea to resolve warnings in case they're somehow messing up the result.)
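To see where the NaN warnings come from, here is a small sketch using the density from the question. The three x values and the negative trial value of lambda are picked purely for illustration; they are not values the optimizer is known to have visited.

```r
## Log-density of the linear exponential distribution (same as dLE in the question)
dLE <- function(x, lambda, theta, log = TRUE) {
  r <- log(lambda + theta * x) - (lambda * x + (theta / 2) * x^2)
  if (log) r else exp(r)
}

## three observations from the data set, for illustration only
x_demo <- c(115, 1024, 1852)

## hypothetical trial point with lambda < 0: lambda + theta*x is negative
## for the smallest x, so log() returns NaN there (normally with a warning)
bad <- suppressWarnings(dLE(x_demo, lambda = -0.001, theta = 2e-6))

## with both parameters non-negative (and not both zero) the log is defined
ok <- dLE(x_demo, lambda = 9.499e-4, theta = 2e-6)

print(bad)
print(ok)
```

An unconstrained optimizer is free to step into the region where lambda + theta*x < 0 for some observation, which is why bounding the parameters below by zero removes the warnings.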
preliminaries
library(bbmle)
## load data, in a format as similar to original table
## as possible (looking for typos)
x <- scan(textConnection("115 181 255 418 441 461 516 739 743 789
807 865 924 983 1024 1062 1063 1165 1191 1222
1222 1251 1277 1290 1357 1369 1408 1455 1478 1549
1578 1578 1599 1603 1605 1696 1735 1799 1815 1852"))
dd <- data.frame(x)
## parameters listed in table 2
svec <- list(lambda=9.499e-4,theta=2e-6)
functions
## PDF (as above)
dLE <- function(x,lambda,theta,log=TRUE){
r <- log(lambda+theta*x)-(lambda*x+(theta/2)*x^2)
if (log) return(r) else return(exp(r))
}
## CDF (for checking)
pLE <- function(x,lambda,theta) {
1-exp(-(lambda*x+(theta/2)*x^2))
}
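As a quick sanity check that the PDF and CDF above belong together, one can integrate the density numerically and compare it with the closed-form CDF. This is only a sketch: the evaluation point q = 1000 is an arbitrary choice, and the parameters are the ones listed in Table 2.

```r
## PDF and CDF as defined above
dLE <- function(x, lambda, theta, log = TRUE) {
  r <- log(lambda + theta * x) - (lambda * x + (theta / 2) * x^2)
  if (log) r else exp(r)
}
pLE <- function(x, lambda, theta) {
  1 - exp(-(lambda * x + (theta / 2) * x^2))
}

lam <- 9.499e-4   # parameters from Table 2 of the paper
th  <- 2e-6
q   <- 1000       # arbitrary evaluation point (assumption for this check)

## numerically integrate the density from 0 to q
area <- integrate(function(t) dLE(t, lam, th, log = FALSE),
                  lower = 0, upper = q)$value
cdf  <- pLE(q, lam, th)

## the two numbers should agree up to integration error
print(c(integral = area, cdf = cdf))
```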
fit model
I used method="L-BFGS-B", because it makes it easier to set lower bounds on the parameters (which avoids the warnings).
m1 <- mle2( x ~ dLE(lambda,theta),
data=dd,
start=svec,
control=list(parscale=unlist(svec)),
method="L-BFGS-B",
lower=c(0,0))
results
coef(m1)
## lambda theta
## 0.000000e+00 1.316733e-06
-logLik(m1) ## 305.99 (much better than 335, reported in the paper)
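The theta estimate can be cross-checked without an optimizer. If lambda is held at 0 (the boundary where the fit ended up), the log-likelihood reduces to sum(log(theta*x)) - (theta/2)*sum(x^2), and setting its derivative n/theta - sum(x^2)/2 to zero gives theta = 2n/sum(x^2). The sketch below evaluates that expression and compares the negative log-likelihood at this point with the one at the Table 2 parameters. It only checks theta conditional on lambda = 0; it does not by itself show that the boundary is optimal.

```r
## data as in the question
x <- c(115, 181, 255, 418, 441, 461, 516, 739, 743, 789,
       807, 865, 924, 983, 1024, 1062, 1063, 1165, 1191, 1222,
       1222, 1251, 1277, 1290, 1357, 1369, 1408, 1455, 1478, 1549,
       1578, 1578, 1599, 1603, 1605, 1696, 1735, 1799, 1815, 1852)

dLE <- function(x, lambda, theta, log = TRUE) {
  r <- log(lambda + theta * x) - (lambda * x + (theta / 2) * x^2)
  if (log) r else exp(r)
}

## closed-form maximizer of the likelihood in theta when lambda = 0
n <- length(x)
theta_hat <- 2 * n / sum(x^2)

## negative log-likelihood helper (name chosen here for illustration)
nll <- function(lambda, theta) -sum(dLE(x, lambda, theta, log = TRUE))

nll_fit   <- nll(0, theta_hat)        # at the boundary solution
nll_paper <- nll(9.499e-4, 2e-6)      # at the Table 2 parameters

print(theta_hat)
print(c(fit = nll_fit, paper = nll_paper))
```

The closed-form theta should agree with the optimizer's value to within its convergence tolerance, which is a useful sign that the L-BFGS-B result is not a numerical artifact.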
graph
Let's double-check by seeing if we can replicate this figure from the paper:
png("SO55032275.png")
par(las=1)
plot(ecdf(dd$x),col="red")
with(svec,curve(pLE(x,lambda,theta),add=TRUE,col=1))
with(as.list(coef(m1)),curve(pLE(x,lambda,theta),add=TRUE,col=3,lty=2))
legend("topleft",col=c(2,1,3),lty=c(NA,1,2),pch=c(16,NA,NA),
c("ecdf","paper (lam=9e-4, th=2e-6)","ours (lam=0, th=1.3e-6)"))
dev.off()
The ecdf and the CDF drawn with the parameters from the paper match the figure in the paper; the CDF drawn with the parameters estimated here fits the data much better (in fact it looks better, and has a lower negative log-likelihood, than the KLE fit reported in the paper). I conclude there's something badly wrong with the fits in the paper.
Source: https://stackoverflow.com/questions/55032275/nan-errors-with-bbmle