Probability predictions with cumulative link mixed models

Question

I am trying to fit cumulative link mixed models with the ordinal package, but there is something I do not understand about obtaining the predicted probabilities. I use the following example from the ordinal package:

library(ordinal)
data(soup)
## More manageable data set:
dat <- subset(soup, as.numeric(as.character(RESP)) <= 24)
dat$RESP <- dat$RESP[drop = TRUE]
m1 <- clmm2(SURENESS ~ PROD, random = RESP, data = dat, link = "logistic", Hess = TRUE, doFit = TRUE)
summary(m1)
str(dat)

Now I am trying to get predictions of probabilities for a new dataset

newdata1=data.frame(PROD=factor(c("Ref", "Ref")), SURENESS=factor(c("6","6")))

with

predict(m1, newdata=newdata1)

but I am getting the following error

Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) : 
  contrasts can be applied only to factors with 2 or more levels

Why am I getting this error? Is there something wrong in my syntax for predict.clmm2()? More generally, which probabilities does predict.clmm2() output: Pr(J<j) or Pr(J=j)? Could someone point me to material (sites, books) on fitting ordinal mixed models specifically with R? From my search of the literature and the net, most researchers fit this kind of model with SAS.

Answer 1:

You did not say what you corrected, but when I use this, I get no error:

newdata1=data.frame(PROD=factor(c("Test", "Test"), levels=levels(dat$PROD)), 
                    SURENESS=factor(c("1","1")) )
predict(m1, newdata=newdata1)
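
The error message in the question can be reproduced without fitting any model, which may help show where it comes from. The following is a minimal sketch in base R only; it assumes that PROD has the two levels "Ref" and "Test" (as the calls above suggest), and the names one_level, aligned, err and mm are made up for illustration.

```r
# A factor built only from "Ref" ends up with a single level, so R cannot
# construct contrasts for it when the design matrix is built.
one_level <- data.frame(PROD = factor(c("Ref", "Ref")))
err <- tryCatch(
  model.matrix(~ PROD, data = one_level),
  error = function(e) conditionMessage(e)
)
print(err)

# Declaring the full set of levels (assumed here to be "Ref" and "Test")
# keeps the factor at two levels even though only "Ref" is observed.
aligned <- data.frame(PROD = factor(c("Ref", "Ref"), levels = c("Ref", "Test")))
mm <- model.matrix(~ PROD, data = aligned)
print(mm)
```

Passing levels=levels(dat$PROD), as in the call above, has the same effect without hard-coding the level names.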

The output from predict.clmm2 with a newdata argument will not make much sense unless all the factor levels are aligned so that they agree with the data used to fit the model:

> newdata1=data.frame(
                PROD=factor(c("Ref", "Test"), levels=levels(dat$PROD)), 
                SURENESS=factor(c("1","1")) )
> predict(m1, newdata=newdata1)
 [1] 1 1 1 1 1 1 1 1 1 1 1 1

Not very interesting. Because SURENESS has only one level here, the prediction is that the outcome falls in that single level with probability 1 (a vacuous prediction). Recreating the structure of the original ordered outcome is more meaningful:

> newdata1=data.frame(
             PROD=factor(c("Ref", "Test"), levels=levels(dat$PROD)), 
             SURENESS=factor(c("1","1"), levels=levels(dat$SURENESS)) )
> predict(m1, newdata=newdata1)
[1] 0.20336975 0.03875713

You can answer the question in the comments by assembling all the predictions for various levels:

> sapply(as.character(1:6), function(x) {
      newdata1 <- data.frame(
          PROD = factor(c("Ref", "Test"), levels = levels(dat$PROD)),
          SURENESS = factor(c(x, x), levels = levels(dat$SURENESS)))
      predict(m1, newdata = newdata1)
  })
              1          2          3          4         5         6
[1,] 0.20336975 0.24282083 0.10997039 0.07010327 0.1553313 0.2184045
[2,] 0.03875713 0.07412618 0.05232823 0.04405965 0.1518367 0.6388921
> out <- .Last.value
> rowSums(out)
[1] 1 1

Each row sums to 1, so the probabilities are category probabilities, Pr(J=j|X=x & Random=all), not cumulative probabilities.
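
If cumulative probabilities Pr(J<=j) are wanted instead, they can be obtained by accumulating the category probabilities across each row. A minimal sketch, with the numbers copied from the output above rather than recomputed from the model (the name cum is made up for illustration):

```r
# Per-category probabilities copied from the printed output above
# (row 1: PROD = "Ref", row 2: PROD = "Test").
out <- rbind(
  c(0.20336975, 0.24282083, 0.10997039, 0.07010327, 0.1553313, 0.2184045),
  c(0.03875713, 0.07412618, 0.05232823, 0.04405965, 0.1518367, 0.6388921)
)
colnames(out) <- as.character(1:6)

# apply() returns one column per input row, so transpose back to 2 x 6.
cum <- t(apply(out, 1, cumsum))
print(round(cum, 4))
```

The last column is 1 up to rounding of the printed values, which is consistent with the row sums shown above.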

Source: https://stackoverflow.com/questions/17491503/probability-predictions-with-cumulative-link-mixed-models

Tags

regression

ordinal

mixed-models