I am trying to figure out how to plot the profile likelihood curve of a GLM parameter with 95% pLCIs on the same plot. The example I have been trying with is below. The plot
Regenerate your example:
clotting <- data.frame(
u = c(5,10,15,20,30,40,60,80,100),
lot1 = c(118,58,42,35,27,25,21,19,18),
lot2 = c(69,35,26,21,18,16,13,12,12))
glm2 <- glm(lot2 ~ log(u), data=clotting, family=Gamma)
The profile.glm function actually lives in the MASS package:
library(MASS)
prof<-profile(glm2)
To figure out what profile.glm and plot.profile are doing, see ?profile.glm and ?plot.profile. To dig into the profile object itself, it may also be useful to examine the code of MASS:::profile.glm and MASS:::plot.profile. Basically, what these tell you is that profile returns the signed square root of (deviance - minimum deviance)/dispersion, with the sign taken from the direction in which the parameter was moved away from its estimate. This is done so that a perfectly quadratic profile appears as a straight line (it's much easier to detect deviations from a straight line than from a parabola by eye).
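To make that definition concrete, here is a minimal sketch that recomputes one profile point by hand. It assumes the fit from the example above, fixes the slope at one of the profiled values by moving it into an offset, refits the intercept, and rebuilds the signed square root. The names b1, fit0, tau_manual and tau_profile are illustrative only, and the statistic is read from the first column of the profile data frame by position rather than by name.

```r
library(MASS)  # the answer above locates the glm profile method here

clotting <- data.frame(
  u = c(5,10,15,20,30,40,60,80,100),
  lot2 = c(69,35,26,21,18,16,13,12,12))
glm2 <- glm(lot2 ~ log(u), data = clotting, family = Gamma)
prof <- profile(glm2)

disp <- attr(prof, "summary")$dispersion
mindev <- attr(prof, "original.fit")$deviance

# First profiled value of the slope
p <- prof[["log(u)"]]
b1 <- p$par.vals[1, "log(u)"]

# Refit with the slope held fixed at b1 (it enters as an offset),
# so only the intercept is re-estimated
fit0 <- glm(lot2 ~ 1 + offset(b1 * log(u)), data = clotting, family = Gamma)

# Signed square root of the scaled deviance difference
tau_manual <- sign(b1 - coef(glm2)[["log(u)"]]) *
  sqrt((deviance(fit0) - mindev) / disp)
tau_profile <- p[[1]][1]

# The two values should agree up to fitting tolerance
c(manual = tau_manual, profile = tau_profile)
```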
The other thing that may be useful to know is how the profile is stored. Basically, it's a list of data frames (one for each parameter profiled), except that the individual data frames are a little bit weird (containing one vector component and one matrix component).
> str(prof)
List of 2
$ (Intercept):'data.frame': 12 obs. of 3 variables:
..$ tau : num [1:12] -3.557 -2.836 -2.12 -1.409 -0.702 ...
..$ par.vals: num [1:12, 1:2] -0.0286 -0.0276 -0.0267 -0.0258 -0.0248 ...
.. ..- attr(*, "dimnames")=List of 2
.. .. ..$ : NULL
.. .. ..$ : chr [1:2] "(Intercept)" "log(u)"
..$ dev : num [1:12] 0.00622 0.00753 0.00883 0.01012 0.0114 ...
$ log(u) :'data.frame': 12 obs. of 2 variables:
..$ tau : num [1:12] -3.516 -2.811 -2.106 -1.403 -0.701 ...
..$ par.vals: num [1:12, 1:2] -0.0195 -0.0204 -0.0213 -0.0222 -0.023 ...
.. ..- attr(*, "dimnames")=List of 2
It also contains attributes summary and original.fit that you can use to recover the dispersion and minimum deviance:
disp <- attr(prof,"summary")$dispersion
mindev <- attr(prof,"original.fit")$deviance
Now reverse the transformation for parameter 1:
dev1 <- prof[[1]]$tau^2
dev2 <- dev1*disp+mindev
Plot:
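# x axis: profiled values of parameter 1 (column 1 of the data frame is tau, not the parameter)
plot(prof[[1]]$par.vals[,1],dev2,type="b")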
(This is the plot of the deviance. You can multiply by 0.5 to get the negative log-likelihood, or by -0.5 to get the log-likelihood, in both cases up to an additive constant ...)
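Since the question asks for the 95% interval on the same plot, here is a hedged sketch of one way to add it. It assumes that confint(glm2) returns the profile-likelihood interval (as the MASS method does) and that this interval corresponds to |tau| = qnorm(0.975), which on the deviance scale is a horizontal line at mindev + disp * qnorm(0.975)^2. The names par_vals, dev_vals, ci and cutoff are illustrative.

```r
library(MASS)

clotting <- data.frame(
  u = c(5,10,15,20,30,40,60,80,100),
  lot2 = c(69,35,26,21,18,16,13,12,12))
glm2 <- glm(lot2 ~ log(u), data = clotting, family = Gamma)
prof <- profile(glm2)

disp <- attr(prof, "summary")$dispersion
mindev <- attr(prof, "original.fit")$deviance

# Profiled values of parameter 1 and the corresponding deviance
par_vals <- prof[[1]]$par.vals[, 1]
dev_vals <- prof[[1]][[1]]^2 * disp + mindev

# 95% profile-likelihood interval and the matching deviance cutoff
ci <- confint(glm2, level = 0.95)
cutoff <- mindev + disp * qnorm(0.975)^2

plot(par_vals, dev_vals, type = "b",
     xlab = names(prof)[1], ylab = "deviance")
abline(h = cutoff, lty = 2)    # deviance level defining the interval
abline(v = ci[1, ], lty = 3)   # interval limits for parameter 1
```

The vertical lines should cross the curve roughly where it meets the horizontal line; small differences come from the spline interpolation used to compute the interval.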
edit: some more general functions to transform the profile into a useful format for lattice/ggplot plotting ...
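# Convert one parameter's profile (x) into a data frame labelled with its name (n);
# disp and mindev are the values extracted above
tmpf <- function(x,n) {
  data.frame(par=n,tau=x$tau,
             deviance=x$tau^2*disp+mindev,
             x$par.vals,check.names=FALSE)
}
pp <- do.call(rbind,mapply(tmpf,prof,names(prof),SIMPLIFY=FALSE))
library(reshape2)
# Long format: one row per (profile point, coefficient)
pp2 <- melt(pp,id.vars=1:3)
# Keep only the rows where the coefficient is the one being profiled
pp3 <- subset(pp2,par==variable,select=-variable)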
Now plot it with lattice:
library(lattice)
xyplot(deviance~value|par,type="b",data=pp3,
scales=list(x=list(relation="free")))
Or with ggplot2:
library(ggplot2)
ggplot(pp3,aes(value,deviance))+geom_line()+geom_point()+
  facet_wrap(~par,scales="free_x")
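If reshape2 is not available, the same long-format table can be built in base R. This is a sketch under the same assumptions as above: profile_long and pp3_base are hypothetical names, and the statistic is taken from the first column of each profile by position, so the code does not depend on what that column is called.

```r
library(MASS)

clotting <- data.frame(
  u = c(5,10,15,20,30,40,60,80,100),
  lot2 = c(69,35,26,21,18,16,13,12,12))
glm2 <- glm(lot2 ~ log(u), data = clotting, family = Gamma)
prof <- profile(glm2)

disp <- attr(prof, "summary")$dispersion
mindev <- attr(prof, "original.fit")$deviance

# One profile -> data frame holding only the profiled parameter's own values
profile_long <- function(x, n) {
  stat <- x[[1]]  # signed root statistic (first column of the profile)
  data.frame(par = n,
             tau = stat,
             deviance = stat^2 * disp + mindev,
             value = x$par.vals[, n],
             stringsAsFactors = FALSE)
}

pp3_base <- do.call(rbind, Map(profile_long, prof, names(prof)))
rownames(pp3_base) <- NULL
head(pp3_base)
```

The result has the same columns as pp3 (par, tau, deviance, value), so either plotting call above can be pointed at it.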
FYI, for fun, I took the above and whipped it together into a single function using purrr::imap_dfr as I couldn't find a package that implements the above.
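get_profile_glm <- function(aglm){
  # Load the MASS namespace so its glm profile method is registered, then use the generic
  requireNamespace("MASS", quietly = TRUE)
  prof <- profile(aglm)
  disp <- attr(prof,"summary")$dispersion
  # The first column of each profile holds the signed root statistic; it is named
  # z or tau depending on the family, so take it by position
  purrr::imap_dfr(prof, .f = ~data.frame(par = .y,
      deviance=.x[[1]]^2*disp+aglm$deviance,
      values = as.data.frame(.x$par.vals)[[.y]],
      stringsAsFactors = FALSE))
}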
Works great!
counts <- c(18,17,15,20,10,20,25,13,12)
outcome <- gl(3,1,9)
treatment <- gl(3,3)
print(d.AD <- data.frame(treatment, outcome, counts))
glm.D93 <- glm(counts ~ outcome + treatment, family = poisson())
library(ggplot2)
ggplot(get_profile_glm(glm.D93), aes(x = values, y = deviance)) +
  geom_point() +
  geom_line() +
  facet_wrap(~par, scales = "free_x")