Question
I am trying to fit cumulative link mixed models with the ordinal package, but there is something I do not understand about obtaining the predicted probabilities. I use the following example from the ordinal package:
library(ordinal)
data(soup)
## More manageable data set:
dat <- subset(soup, as.numeric(as.character(RESP)) <= 24)
dat$RESP <- dat$RESP[drop=TRUE]
m1 <- clmm2(SURENESS ~ PROD, random = RESP, data = dat, link = "logistic", Hess = TRUE, doFit = TRUE)
summary(m1)
str(dat)
Now I am trying to get predictions of probabilities for a new dataset
newdata1=data.frame(PROD=factor(c("Ref", "Ref")), SURENESS=factor(c("6","6")))
with
predict(m1, newdata=newdata1)
but I am getting the following error
Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) :
contrasts can be applied only to factors with 2 or more levels
Why am I getting this error? Is there something wrong in my syntax for predict.clmm2()? More generally, which probabilities does predict.clmm2() output: Pr(J<j) or Pr(J=j)? Could someone point me to material (sites, books) on fitting ordinal categorical mixed models specifically with R? From my search of the literature and the web, most researchers fit these kinds of models with SAS.
Answer 1:
You did not say what you corrected, but when I use this, I get no error:
newdata1=data.frame(PROD=factor(c("Test", "Test"), levels=levels(dat$PROD)),
SURENESS=factor(c("1","1")) )
predict(m1, newdata=newdata1)
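The key difference from the failing call is the levels= argument. The following base R sketch illustrates the mechanism without the ordinal package. It assumes that PROD in the fitted data has the two levels "Ref" and "Test", and it uses model.matrix() as a stand-in for whatever predict.clmm2() does internally to build the design matrix, which is an assumption suggested by the error message rather than something confirmed here.

```r
# Assumption: PROD in the fitted data has the two levels "Ref" and "Test".
train_levels <- c("Ref", "Test")

# Built from the observed values only: the factor ends up with a single level.
bad <- factor(c("Ref", "Ref"))
# Built with the training levels: both levels are kept even if one is unused.
good <- factor(c("Ref", "Ref"), levels = train_levels)

nlevels(bad)   # 1
nlevels(good)  # 2

# Capture the error message instead of stopping the script.
msg <- tryCatch({
  model.matrix(~ PROD, data = data.frame(PROD = bad))
  "no error"
}, error = function(e) conditionMessage(e))
msg

# With the full set of levels, the design matrix can be built.
mm <- model.matrix(~ PROD, data = data.frame(PROD = good))
colnames(mm)
```

A one-level factor cannot be given contrasts, so treatment coding fails before any prediction is computed. Passing levels=levels(dat$PROD) avoids this.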
The output from predict.clmm2 with a newdata argument will not make much sense unless all the factor levels are aligned so that they agree with the input data:
> newdata1=data.frame(
PROD=factor(c("Ref", "Test"), levels=levels(dat$PROD)),
SURENESS=factor(c("1","1")) )
> predict(m1, newdata=newdata1)
[1] 1 1 1 1 1 1 1 1 1 1 1 1
Not very interesting. Because SURENESS in newdata has only one level, the prediction is that the outcome falls in that level with probability 1. (A vacuous prediction.) Recreating the structure of the original ordered outcome is more meaningful:
> newdata1=data.frame(
PROD=factor(c("Ref", "Test"), levels=levels(dat$PROD)),
SURENESS=factor(c("1","1"), levels=levels(dat$SURENESS)) )
> predict(m1, newdata=newdata1)
[1] 0.20336975 0.03875713
You can answer the question in the comments by assembling all the predictions for various levels:
> sapply(as.character(1:6), function(x){
newdata1=data.frame(PROD=factor(c("Ref", "Test"), levels=levels(dat$PROD)),
SURENESS=factor(c(x,x), levels=levels(dat$SURENESS)) )
predict(m1, newdata=newdata1)})
              1          2          3          4         5         6
[1,] 0.20336975 0.24282083 0.10997039 0.07010327 0.1553313 0.2184045
[2,] 0.03875713 0.07412618 0.05232823 0.04405965 0.1518367 0.6388921
> out <- .Last.value
> rowSums(out)
[1] 1 1
The probabilities are Pr(J=j|X=x & Random=all), i.e. the probability of each individual response category rather than a cumulative probability.
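If cumulative probabilities are wanted instead, they can be obtained by summing the category probabilities. The sketch below uses only base R and the numbers copied from the output above. The row labels Ref and Test are added here for readability, following the order of PROD in newdata1.

```r
# Pr(J = j) copied from the sapply() output above;
# rows correspond to PROD = "Ref" and PROD = "Test".
p <- rbind(
  Ref  = c(0.20336975, 0.24282083, 0.10997039, 0.07010327, 0.1553313, 0.2184045),
  Test = c(0.03875713, 0.07412618, 0.05232823, 0.04405965, 0.1518367, 0.6388921)
)
colnames(p) <- as.character(1:6)

# Cumulative probabilities Pr(J <= j): running sum along each row.
cum <- t(apply(p, 1, cumsum))
round(cum, 4)

# Differencing the cumulative probabilities recovers Pr(J = j).
back <- t(apply(cum, 1, function(r) diff(c(0, r))))
all.equal(unname(back), unname(p))
```

The last column of cum is 1 up to rounding in the printed values, which is the same check as rowSums(out) above.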
Source: https://stackoverflow.com/questions/17491503/probability-predictions-with-cumulative-link-mixed-models