Question
I am running into a labeling issue when using rpart in R.
Here's my situation.
I'm working on a dataset with categorical variables. Here's an extract of my data:
head(Dataset)
Entity IL CP TD Budget
2 1 3 2 250
5 2 2 1 663
6 1 2 3 526
2 3 1 2 522
When I plot my decision tree and add the labels using
plot(tree)
text(tree)
I get the wrong labels: for Entity, I get "abcd".
Why does this happen, and how can I fix it?
Thank you for your help
Answer 1:
By default, text.rpart (which text(fit) dispatches to after plot(fit)) labels the levels of factor variables with single letters: the first level is a, the second is b, and so on. Example:
library(rpart)
library(ggplot2) #for the data
data("diamonds")
df <- diamonds[1:2000,]
fit <- rpart(price ~ color + cut + clarity, data = df)
plot(fit)
text(fit)
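If you want to keep the base plot, here is a minimal sketch of two ways to read or replace the letters. It assumes the same diamonds subset as above; color_key is just an illustrative name. The letters follow the order of levels(), and the pretty argument of text.rpart controls the abbreviation (pretty = 0 requests full level names).

```r
library(rpart)
library(ggplot2)  # only needed for the diamonds data

data("diamonds")
df <- diamonds[1:2000, ]
fit <- rpart(price ~ color + cut + clarity, data = df)

# Build a lookup from letter to level: a = first level, b = second, ...
# (color_key is a hypothetical helper name, not part of rpart)
color_key <- setNames(levels(df$color), letters[seq_along(levels(df$color))])
print(color_key)

# Redraw the tree with full level names instead of letters
plot(fit, margin = 0.1)           # extra margin so long labels are not clipped
text(fit, pretty = 0, cex = 0.7)  # pretty = 0 turns off the abbreviation
```

With many levels the full names can overlap, so a smaller cex or a larger plotting device may be needed.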
In my opinion, instead of customizing this plot it is better to use rpart.plot, the dedicated plotting package for rpart:
library(rpart.plot)
prp(fit)
It has many customization options, for example:
prp(fit,
    type = 4,
    extra = 101,
    fallen.leaves = TRUE,
    box.palette = colorRampPalette(c("red", "white", "green3"))(10),
    round = 2,
    branch.lty = 2,
    branch.lwd = 1,
    space = -1,
    varlen = 0,   # 0 = do not abbreviate variable names
    faclen = 0)   # 0 = do not abbreviate factor level names
Another option is:
library(rattle)
fancyRpartPlot(fit, type = 4)
which uses prp internally with different defaults.
Source: https://stackoverflow.com/questions/50602548/wrong-labels-in-rpart-tree