R C5.0 get rule and probability for every leaf

Submitted by 十年热恋 on 2019-12-13 02:11:42

Question


While researching this problem I think I came fairly close to a solution. I am looking for a way to get the rule and probability for every leaf, like this, but for the C5.0 package.

The method provided in the SO answer works with a party object. However, the C50 package does not export an as.party method for C5.0 models. On further research I found a comment saying that the maintainer of the C50 package has already written the function but did not export it.

I thought that should work, but unfortunately the suggested call C50:::as.party.C5.0(mod1) throws this error:

error in as.data.frame.default(x[[i]], optional = TRUE) : 
    cannot coerce class ""function"" to a data.frame

Any suggestions to solve this error appreciated. Let's use the following example:

library(C50)
p = iris[1:4]
t = factor(iris$Species)
model = C50::C5.0(p,t)
#summary(model)

modParty = C50:::as.party.C5.0(model)

Answer 1:


The problem seems to occur when using the default (x/y) method of C5.0() as opposed to the formula method. If you use the latter, the as.party() conversion works and you can apply all the methods available for party objects:

model <- C5.0(Species ~ ., data = iris)
modParty <- C50:::as.party.C5.0(model)
modParty
## Model formula:
## Species ~ Sepal.Length + Sepal.Width + Petal.Length + Petal.Width
## 
## Fitted party:
## [1] root
## |   [2] Petal.Length <= 1.9: setosa (n = 50, err = 0.0%)
## |   [3] Petal.Length > 1.9
## |   |   [4] Petal.Width <= 1.7
## |   |   |   [5] Petal.Length <= 4.9: versicolor (n = 48, err = 2.1%)
## |   |   |   [6] Petal.Length > 4.9: virginica (n = 6, err = 33.3%)
## |   |   [7] Petal.Width > 1.7: virginica (n = 46, err = 2.2%)
## 
## Number of inner nodes:    3
## Number of terminal nodes: 4

And then a selection of predicted paths, using the pathpred() helper from the other discussion you linked:

pathpred(modParty)[c(1, 51, 101), ]
##       response prob.setosa prob.versicolor prob.virginica
## 1       setosa  1.00000000      0.00000000     0.00000000
## 51  versicolor  0.00000000      0.97916667     0.02083333
## 101  virginica  0.00000000      0.02173913     0.97826087
##                                                              rule
## 1                                             Petal.Length <= 1.9
## 51  Petal.Length > 1.9 & Petal.Width <= 1.7 & Petal.Length <= 4.9
## 101                        Petal.Length > 1.9 & Petal.Width > 1.7
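Note that pathpred() is not a function from C50 or partykit; it is a helper defined in the linked discussion. A minimal sketch of such a helper follows, assuming the installed partykit still provides the unexported .list.rules.party() function and that predict() on a party object accepts type = "response", "prob", and "node". It is a reconstruction for illustration, not necessarily identical to the original helper.

```r
library(C50)
library(partykit)

# Fit through the formula interface so the party conversion works.
model <- C5.0(Species ~ ., data = iris)
modParty <- C50:::as.party.C5.0(model)

# Sketch of a pathpred()-style helper: for every observation, return the
# predicted class, the class probabilities, and the rule of its leaf.
pathpred <- function(object, ...) {
  # Coerce to a "party" object if necessary.
  if (!inherits(object, "party")) object <- as.party(object)

  # Predicted response and class probabilities, one row per observation.
  rval <- data.frame(response = predict(object, type = "response", ...))
  rval$prob <- predict(object, type = "prob", ...)

  # Rules for all nodes, named by node id.
  # Assumption: .list.rules.party() is an unexported partykit internal.
  rls <- partykit:::.list.rules.party(object)

  # Look up the rule of the terminal node each observation falls into.
  rval$rule <- rls[as.character(predict(object, type = "node", ...))]
  rval
}

paths <- pathpred(modParty)
print(paths[c(1, 51, 101), ])
```

Because the helper relies on an unexported internal, it may break when partykit is updated.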

I'm not sure why the method does not work for the default interface, but it is probably more difficult to set up the required model frame there. You might consider asking the C50 maintainer about this, though.
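If the predictors and the response are held in separate objects, as in the question, one possible workaround is to combine them into a single data frame and fit through the formula interface. The sketch below reuses the question's p and t; the names dat and Species are chosen here for illustration.

```r
library(C50)

# Predictors and response held separately, as in the question.
p <- iris[1:4]
t <- factor(iris$Species)

# Combine them into one data frame; the response column name is arbitrary.
dat <- data.frame(p, Species = t)

# Fit with the formula method instead of the default x/y method.
model <- C5.0(Species ~ ., data = dat)

# as.party.C5.0 is not exported, hence the triple colon.
modParty <- C50:::as.party.C5.0(model)
print(modParty)
```

This only sidesteps the error; it does not explain why the x/y interface fails.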



Source: https://stackoverflow.com/questions/37393329/r-c5-0-get-rule-and-probability-for-every-leaf
