How do I plot a mle2 fit of a model in ggplot2, along with the data?

问题

I created a log likelihood function for the model, and use this with starting values in mle2() to fit the model (see blow), but can't figure out how to plot this model fit over top of the data in ggplot2. I've never posted on this site before, so I'm not sure where to put the data file, but I have one for reproducibility if needed.

I have spent days trying to find an example of specifically what I need to do, and can't find anything relevant. Apparently stat_smooth has most fitting options except mle, none of which I can use for this model. This is a fisheries Ricker stock-recruitment model which is fit with mle assuming log-normal errors.

LL function:

Ricker.LL <- function(a,b) {
  wf<-read.csv("wf_SR data.csv",sep=",",header=T)
  s <- wf$Adult.CPUE.t.1
  r <- wf$YOY.CPUE
  model.pred <- a*s*exp(-(b)*s)
  ndata <- length(s)
  NLL <- -sum(dlnorm(x=s,meanlog=model.pred,sdlog=1,log=TRUE))
  return(NLL)
}

mle2 fit:

mle2(minuslogl=Ricker.LL,start=list(a=0.4515,b=0.2665),method="Nelder-Mead",lower=-Inf,upper=Inf)

Then, I tried to assign predicted values to a new df in order to plot these with geom_line, but got the error:

dat <- predict(fit)

Error : object of type 'symbol' is not subsettable
Error in gfun(object, newdata = newdata, location = location, op = "predict") : 
  can only use predict() if formula specified

So, I tried to include the formula in mle2() before calling predict():

fit<-mle2(YOY.CPUE~a*Adult.CPUE.t.1*exp(-(b)*Adult.CPUE.t.1),data=wf,start=list(a=0.4515,b=0.2665))

and got the error: 'Error in '*' (x=c(....):operator needs one or two arguments.

I just want a plot of the data (s & r), with the associated fit overlain. I have had no problem using nls() and stat_smooth() but must use mle to fit this.

回答1:

Preliminaries:

library(bbmle)
library(ggplot2); theme_set(theme_bw())
rickerfun <- function(x,a,b) {
    a*x*exp(-b*x)
}

There are a variety of ways to do this. The main difference in the predictions turns out to be whether we predict the median of the response distribution (equivalent to the geometric mean in the case of the log-Normal) or its mean ...

As a direct maximum likelihod estimate, with the log-mean equal to Ricker(A(t),a,b) [the sdlog parameter is estimated on the log scale; use lower bounds on a and b to avoid nuisance warning messages]

m1 <- mle2(YOY.CPUE ~ dlnorm(meanlog=log(rickerfun(Adult.CPUE.t.1,a,b)),
                         sdlog=exp(logsdlog)),
       method="L-BFGS-B",
       lower=c(a=1e-2,b=1e-2,logsdlog=-10),
       start=list(a=1,b=1,logsdlog=0),
       data=dat)

Also needed to get predictions from mle2():

slnorm <- function(meanlog, sdlog) {
   list(title="Log-normal",
        median=exp(meanlog),
        mean=exp(meanlog+sdlog^2/2))
}

As a log-linear model (if log(Y) = loga + log(A) - b*A, exponentiating both sides shows Y = a*A*exp(-b*A); Normal errors on the log scale correspond to log-Normal errors on the original scale)

 m2 <- lm(log(YOY.CPUE) ~ Adult.CPUE.t.1+ offset(log(Adult.CPUE.t.1)),
          data=dat)

As a generalized linear model with a log link and a Gamma response [the Gamma distribution has the same mean-variance relationship as the log-Normal, and is often an adequate approximation]

 m3 <- glm(YOY.CPUE ~ Adult.CPUE.t.1+ offset(log(Adult.CPUE.t.1)),
      data=dat,
      family=Gamma(link="log"))

For comparison, the nls() fit (this should also be equivalent to m3 with family=gaussian(link="log"):

m4 <- nls(YOY.CPUE ~ a*Adult.CPUE.t.1*exp(-b*Adult.CPUE.t.1),
      start=list(a=0.45, b=0.27),
      data=dat)

Compute predictions for all models:

predframe <- data.frame(Adult.CPUE.t.1=seq(0,5.5,length=51))
predframe$mle2 <- predict(m1,newdata=predframe)
predframe$mle_med <- predict(m1,newdata=predframe,location="median")
predframe$loglm <- exp(predict(m2,newdata=predframe))
predframe$glm <- predict(m3,newdata=predframe,type="response")
predframe$nls <- predict(m4,newdata=predframe)

Melt for ggplot convenience:

predframe_m <- reshape2::melt(predframe,id.var="Adult.CPUE.t.1",
                              variable.name="model",
                              value.name="YOY.CPUE")

Plot:

library(ggplot2)
ggplot(dat,aes(Adult.CPUE.t.1,YOY.CPUE))+ geom_point() +
    geom_smooth(method="glm",
                formula=y~x + offset(log(x)),
                method.args=list(family=Gamma(link="log")))+
    geom_point(data=predframe_m,aes(colour=model,shape=model))

Take-home messages:

as stated above, the biggest difference in predictions is between predictions of the mean ("mle2") and of the median/geometric mean ("mle2_med" and "loglm"). The size of the difference surprised me: going from median to mean is equivalent to multiplying all predictions by exp(sdlog^2/2)
the "mle2_med" and "loglm" predictions are identical (it's hard to see, but the green squares are exactly on top of the yellow triangles)
the GLM prediction and the built-in geom_smooth() prediction are identical (they should be!)
the coefficients of the mle2 and loglm models are identical, up to transformations:

all.equal(coef(m1)[["a"]],exp(coef(m2)[["(Intercept)"]]), tol=1e-5)
all.equal(coef(m1)[["b"]],-coef(m2)[["Adult.CPUE.t.1"]], tol=1e-4)

回答2:

Here is the model fit with age-diversity considered. I had to remove the lower bounds to get it to fit, and I get a 'longer object length not multiple of shorter object length' error, but it still seems to work. Thanks again for your help.

rickerH <- function(x,z,a,b,f) {
  a*x*exp(-b*x)*exp(f*z)
}

fit<-mle2(YOY.CPUE~dlnorm(meanlog=log(rickerH(Adult.CPUE.t.1,H.t.1,a,b,f)),
sdlog=exp(logsdlog)),
          # method="L-BFGS-B",lower=c(a=0.01,b=0.01,f=0.01,logsdlog=-10),
          start=list(a=1,b=1,f=1,logsdlog=0),
          data=wf)
coef(fit)
slnorm <- function(meanlog,sdlog) {
  list(title="Log-normal",
       median=exp(meanlog),
       mean=exp(meanlog+sdlog^2/2))
}

# predframe <- data.frame(Adult.CPUE.t.1=seq(0,5.5,length=51))
predframe$mle2_f <- predict(fit,newdata=predframe)
# predframe$mle_med <- predict(fit,newdata=predframe,location="median")

predframe_melt <- reshape2::melt(predframe,id.var="Adult.CPUE.t.1",variable.name="model",
value.name="YOY.CPUE")

ggplot(wf,aes(Adult.CPUE.t.1,YOY.CPUE))+
  geom_point()+
  # geom_smooth(method="glm",formula=y~x+offset(log(x)),
method.args=list(family=Gamma(link="log")))+
  geom_line(data=predframe_melt,aes(color=model,shape=model),size=2)+
  theme_bw()

SR model w/ age diversity metric

来源：https://stackoverflow.com/questions/57153916/how-do-i-plot-a-mle2-fit-of-a-model-in-ggplot2-along-with-the-data

标签

ggplot2

mle