Question
I think I came pretty close while researching this question. I am looking for something like this for the C5.0 package.
The method provided in the SO answer works with a party object. However, the C5.0 package does not support as.party. On further research I found this comment saying that the maintainer of the C5.0 package has already written the function but did not export it.
I thought this should work, but unfortunately the suggested call C50:::as.party.C5.0(mod1) throws the error:
Error in as.data.frame.default(x[[i]], optional = TRUE) :
  cannot coerce class "function" to a data.frame
Any suggestions for solving this error are appreciated. Let's use the following example:
library(C50)
p = iris[1:4]
t = factor(iris$Species)
model = C50::C5.0(p,t)
#summary(model)
modParty = C50:::as.party.C5.0(model)
Answer 1:
The problem seems to occur when using the default method of C5.0() as opposed to the formula method. If you use the latter, then the as.party() conversion works successfully and you can apply all methods for party objects:
model <- C5.0(Species ~ ., data = iris)
modParty <- C50:::as.party.C5.0(model)
modParty
## Model formula:
## Species ~ Sepal.Length + Sepal.Width + Petal.Length + Petal.Width
##
## Fitted party:
## [1] root
## | [2] Petal.Length <= 1.9: setosa (n = 50, err = 0.0%)
## | [3] Petal.Length > 1.9
## | | [4] Petal.Width <= 1.7
## | | | [5] Petal.Length <= 4.9: versicolor (n = 48, err = 2.1%)
## | | | [6] Petal.Length > 4.9: virginica (n = 6, err = 33.3%)
## | | [7] Petal.Width > 1.7: virginica (n = 46, err = 2.2%)
##
## Number of inner nodes: 3
## Number of terminal nodes: 4
And then a selection of predicted paths, using the pathpred() function from the other discussion you linked:
pathpred(modParty)[c(1, 51, 101), ]
## response prob.setosa prob.versicolor prob.virginica
## 1 setosa 1.00000000 0.00000000 0.00000000
## 51 versicolor 0.00000000 0.97916667 0.02083333
## 101 virginica 0.00000000 0.02173913 0.97826087
## rule
## 1 Petal.Length <= 1.9
## 51 Petal.Length > 1.9 & Petal.Width <= 1.7 & Petal.Length <= 4.9
## 101 Petal.Length > 1.9 & Petal.Width > 1.7
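Note that pathpred() is not exported by C50 or partykit, so it has to be defined before the call above. A minimal sketch of such a helper, written along the lines of the linked discussion, might look as follows. It assumes that the unexported partykit:::.list.rules.party() is available in the installed partykit version and that the party object supports predict() with type "response", "prob" and "node"; neither is guaranteed by this answer.

```r
library(C50)
library(partykit)

# Sketch of a helper that returns, for each observation, the predicted
# class, the class probabilities and the rule leading to its leaf.
pathpred <- function(object, ...) {
  # coerce to a "party" object if necessary
  if (!inherits(object, "party")) object <- as.party(object)

  # standard predictions (response and probabilities) in one data frame
  rval <- data.frame(response = predict(object, type = "response", ...))
  rval$prob <- predict(object, type = "prob", ...)

  # rules for all nodes, named by node id
  # (assumption: unexported partykit function, may change between versions)
  rls <- partykit:::.list.rules.party(object)

  # pick the rule of the terminal node each observation falls into
  rval$rule <- rls[as.character(predict(object, type = "node", ...))]
  rval
}

model <- C5.0(Species ~ ., data = iris)
modParty <- C50:::as.party.C5.0(model)
res <- pathpred(modParty)
```

Since the helper returns one row per observation, the rule and probabilities per leaf can be obtained by keeping one row for each distinct value of res$rule.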
I'm not sure why the method does not work for the default interface. Probably it is more difficult to set up the required model frame in that case. You might consider asking the C50 maintainer about this, though.
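If the predictors and the outcome only exist as separate objects, as in the question, one possible workaround is to combine them into a single data frame and fit through the formula interface. This is a minimal sketch, assuming that as.party.C5.0 is still present as an unexported function in the installed C50 version; the column name Species is chosen here purely for illustration.

```r
library(C50)

p <- iris[1:4]               # predictors, as in the question
y <- factor(iris$Species)    # outcome (called t in the question)

# combine predictors and outcome so the formula interface can be used
dat <- data.frame(p, Species = y)

model <- C5.0(Species ~ ., data = dat)
modParty <- C50:::as.party.C5.0(model)
```

Naming the outcome y rather than t also avoids masking base R's t() function in the workspace.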
Source: https://stackoverflow.com/questions/37393329/r-c5-0-get-rule-and-probability-for-every-leaf