Wrong labels in rpart tree

Submitted by 限于喜欢 on 2020-04-16 01:59:45

Question


I am running into a labeling issue when using rpart in R.

Here's my situation.

I'm working on a dataset with categorical variables. Here's an extract of my data:

head(Dataset)
Entity  IL  CP  TD  Budget 
  2      1   3   2     250
  5      2   2   1     663
  6      1   2   3     526 
  2      3   1   2     522

When I plot my decision tree and add the labels using

plot(tree) 
text(tree)

I get wrong labels: for Entity, I get "abcd".

Why do I get that, and how can I fix it?

Thank you for your help


Answer 1:


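By default, the text method for rpart trees (text.rpart, called after plot.rpart) labels the levels of factor variables with letters: the first level is a, the second is b, and so on. A label such as "abcd" on an Entity split therefore refers to the first four levels of Entity. Example: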

library(rpart)
library(ggplot2) #for the data

data("diamonds")    
df <- diamonds[1:2000,]

fit <- rpart(price ~ color + cut + clarity, data = df)
plot(fit)
text(fit)
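
If you would rather stay with the base plotting functions, the letters can be decoded or replaced with the full level names. The following is a minimal sketch, assuming a made-up data frame shaped like the extract in the question (the values are hypothetical, not the asker's data). It relies on the pretty and minlength arguments documented for text.rpart and labels.rpart, and on the "xlevels" attribute that rpart stores on the fitted object.

```r
library(rpart)

# Hypothetical data shaped like the question's extract (values are made up).
set.seed(1)
n <- 200
Dataset <- data.frame(
  Entity = factor(sample(c(2, 5, 6), n, replace = TRUE)),
  IL     = factor(sample(1:3, n, replace = TRUE))
)
# Budget depends on Entity so that the tree has something to split on.
Dataset$Budget <- 300 + 150 * (Dataset$Entity == "5") +
  100 * (Dataset$Entity == "6") + rnorm(n, sd = 20)

tree <- rpart(Budget ~ Entity + IL, data = Dataset)

# Letters map to factor levels in order: a = first level, b = second, ...
entity_levels <- attr(tree, "xlevels")$Entity
letter_key <- setNames(entity_levels, letters[seq_along(entity_levels)])
print(letter_key)

# Split labels as text() would draw them: abbreviated vs. full level names.
short_labels <- labels(tree)                 # default minlength = 1: letters
full_labels  <- labels(tree, minlength = 0)  # no abbreviation

plot(tree, margin = 0.1)
text(tree, pretty = 0)  # pretty = 0 is equivalent to minlength = 0
```

With pretty = 0 the splits are drawn with the actual level names, at the cost of longer labels that may overlap on large trees.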

In my opinion, rather than customizing this plot, it is easier to use the dedicated plotting package for rpart:

library(rpart.plot)
prp(fit)

It has many customization options, for example:

prp(fit,
    type = 4,
    extra = 101,
    fallen.leaves = TRUE,
    box.palette = colorRampPalette(c("red", "white", "green3"))(10),
    round = 2,
    branch.lty = 2,
    branch.lwd = 1,
    space = -1,
    varlen = 0,   # 0 = do not abbreviate variable names
    faclen = 0)   # 0 = show full factor level names instead of letters

Another option is:

library(rattle)
fancyRpartPlot(fit,
               type = 4)

which uses prp internally with different defaults.



Source: https://stackoverflow.com/questions/50602548/wrong-labels-in-rpart-tree
