How could I create a table that includes the percentages for each node in the plot below?
library(rpart)
library(rattle)
library(rpart.plot)
library(RColorBrewer)
fit <- rpart(Species ~ ., data=iris, method="class")
fancyRpartPlot(fit)
It results in this plot:
I would like to output a table with species as the first column and the associated percent at each node in a second column. A second iteration of the table would exclude the first node (100%) and also remove duplicates by retaining the row that contains a higher percentage.
After picking through the "rpart" documentation I'm still unable to figure out how to create this table. Please let me know what you think.
Thank you for your time.
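The where element of the rpart object is not the predicted class. For each observation it gives the row of fit$frame for the terminal node (leaf) that the observation falls into, so the values 2, 4 and 5 below are row indices into fit$frame, not node numbers. You can cross-tabulate it against Species with: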
> iris$where <- fit$where
> with(iris, table(Species, where))
            where
Species       2  4  5
  setosa     50  0  0
  versicolor  0 49  1
  virginica   0  5 45
I'm guessing you want the column sums divided by the total counts?
> 100*colSums(with(iris, table(Species, where)) )/150
       2        4        5
33.33333 36.00000 30.66667
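The percentages above cover only the terminal nodes. If what you want is one row per node, including the internal ones shown in the plot, a minimal sketch is to read them from fit$frame instead. This assumes that the percentage printed in each box is the node's share of all observations and that the species label is the node's majority class; node_tab, no_root and best are just illustrative names. The fit is rebuilt from datasets::iris so that the where column added above does not end up as a predictor.

```r
library(rpart)

# Refit on the untouched dataset (iris may now carry the extra `where` column).
fit <- rpart(Species ~ ., data = datasets::iris, method = "class")

# fit$frame has one row per node (internal and terminal); the first row is the root.
frame <- fit$frame
node_tab <- data.frame(
  Species = attr(fit, "ylevels")[frame$yval],  # majority class at the node
  Percent = 100 * frame$n / frame$n[1],        # share of all observations
  Node    = as.integer(rownames(frame))        # rpart node number
)
print(node_tab)

# Second table: drop the root (node 1), then keep the
# highest-percent row for each species.
no_root <- node_tab[node_tab$Node != 1, ]
no_root <- no_root[order(-no_root$Percent), ]
best    <- no_root[!duplicated(no_root$Species), ]
print(best[, c("Species", "Percent")])
```

Sorting by descending Percent before calling duplicated() is what makes the retained row the one with the higher percentage. Drop the Node column if you only need the two columns you described.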
Source: https://stackoverflow.com/questions/27727149/how-to-get-percentages-from-decision-tree-for-each-node