R missing levels in a model.matrix

遇见更好的自我 2021-01-20 06:13

I am trying to convert a data frame with categorical variables to a model.matrix but am losing levels of variables.

Here's my code:

df1 <- data         


        
1 answer
  •  被撕碎了的回忆
    2021-01-20 06:20

    The model matrix is correct. For each factor, the model matrix contains one column fewer than the factor has levels: the omitted level is already encoded in the (Intercept) column. Your matrix has no (Intercept) column because you specified +0 in your model formula. Try this:

    mm2 <- model.matrix(~., df1)
    head(mm2)
    

    You will now see the (Intercept) column, which encodes the "default" information, and the first level of var1 is now also missing from the column names. The (Intercept) represents an observation at the "reference level", which is the combination of the first level of each categorical variable. Any deviation from this reference level is encoded in the var*??? columns, and since your model assumes no interactions between the variables, you get (4 - 1) * 3 var*??? columns plus the (Intercept) column (which is replaced by var1abc in your initial model matrix).
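
    To make the column bookkeeping concrete, here is a minimal sketch on made-up data. The data frame df_toy, its level names, and the variable names mm_no_int and mm_int are assumptions for illustration only, not the asker's data.

    ```r
    # Hypothetical toy data: two factors with 3 levels each (not the asker's data)
    df_toy <- data.frame(
      var1 = factor(c("abc", "def", "ghi", "abc")),
      var2 = factor(c("x", "y", "z", "y"))
    )

    # Without an intercept (+0), the first factor keeps all of its levels,
    # while every later factor still drops its first level
    mm_no_int <- model.matrix(~ . + 0, df_toy)
    colnames(mm_no_int)

    # With the intercept, the first level of every factor becomes the reference
    mm_int <- model.matrix(~ ., df_toy)
    colnames(mm_int)

    # Both matrices have 1 + (3 - 1) + (3 - 1) = 5 columns;
    # only the first column differs: var1abc versus (Intercept)
    ncol(mm_no_int) == ncol(mm_int)
    ```

    In both versions the first level of var2 (here "x") has no column of its own, which is the "missing level" effect described in the question.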

    Unfortunately I lack the precise terms to describe this. Anyone help me out?
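
    On terminology: this coding is usually called treatment contrasts (contr.treatment, R's default for unordered factors), also known as dummy coding. If you really need one indicator column for every level of every factor, one common approach is to pass full, non-reduced contrast matrices via contrasts.arg. The sketch below assumes every column of the data frame is a factor; df_toy, full_contrasts and mm_full are hypothetical names on made-up data.

    ```r
    # Hypothetical toy data: two factors with 3 levels each (not the asker's data)
    df_toy <- data.frame(
      var1 = factor(c("abc", "def", "ghi", "abc")),
      var2 = factor(c("x", "y", "z", "y"))
    )

    # contrasts(f, contrasts = FALSE) returns an identity matrix over the levels,
    # so no level is dropped for that factor
    full_contrasts <- lapply(df_toy, contrasts, contrasts = FALSE)

    # One indicator column per level of every factor, no intercept
    mm_full <- model.matrix(~ . + 0, df_toy, contrasts.arg = full_contrasts)
    colnames(mm_full)
    ```

    Note that such a matrix is rank-deficient (the indicator columns of each factor sum to 1), so it is better suited to regularized models or plain one-hot encoding than to an unpenalized lm() fit.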
