Why do results using caret::train(..., method = "rpart") differ from rpart::rpart(...)?

Posted by 拜拜、爱过 on 2019-12-04 03:12:11

caret actually does quite a bit more under the hood. In particular, it uses resampling to tune the model's hyperparameters. In your case, it tries three values of cp (print modFit and you'll see the accuracy results for each value), whereas rpart just uses cp = 0.01 unless you tell it otherwise (see ?rpart.control). The resampling also makes training take longer, especially since caret's default is the bootstrap (25 resamples) rather than k-fold cross-validation.
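
As a rough illustration of the tuning described above, the sketch below fits a tree on the built-in iris data and prints what caret stores for each candidate cp. The iris data and the name fit_tuned are assumptions made for this example, since the question's own training data is not shown.

```r
library(caret)   # the rpart package must also be installed

set.seed(1)  # resampling is random, so fix the seed for repeatability

# Default settings: bootstrap resampling over a small grid of cp values.
# iris and fit_tuned are stand-ins chosen for this sketch.
fit_tuned <- train(Species ~ ., data = iris, method = "rpart")

# One row per candidate cp, with the resampled Accuracy and Kappa.
print(fit_tuned$results)

# The cp value caret selected for the final model.
print(fit_tuned$bestTune)

# The resampling scheme that was used ("boot" unless overridden).
print(fit_tuned$control$method)
```

The cp that caret settles on need not be 0.01, which on its own can be enough to make the final tree differ from a plain rpart() fit.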

To get similar results, you need to disable resampling and specify cp:

# method = "none" skips resampling; tuneGrid must then hold exactly one row
modFit <- caret::train(y ~ ., method = "rpart", data = training,
                       trControl = caret::trainControl(method = "none"),
                       tuneGrid = data.frame(cp = 0.01))

In addition, you should set the same random seed (for example with set.seed()) before fitting each model.
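
To check whether the two approaches then agree, a minimal sketch along these lines may help. It again assumes the built-in iris data in place of the question's training set, and the names fit_caret, fit_rpart and agreement are hypothetical.

```r
library(caret)
library(rpart)

# Fit through caret with resampling disabled and cp fixed at 0.01.
set.seed(42)
fit_caret <- train(Species ~ ., data = iris, method = "rpart",
                   trControl = trainControl(method = "none"),
                   tuneGrid = data.frame(cp = 0.01))

# Fit with rpart directly; cp = 0.01 is its default, stated here for clarity.
set.seed(42)
fit_rpart <- rpart(Species ~ ., data = iris,
                   control = rpart.control(cp = 0.01))

# Compare the class predictions of the two fits on the same rows.
pred_caret <- predict(fit_caret, newdata = iris)
pred_rpart <- predict(fit_rpart, newdata = iris, type = "class")

# Share of rows where both models predict the same class.
agreement <- mean(as.character(pred_caret) == as.character(pred_rpart))
print(agreement)
```

If the two fits match, the agreement should be at or near 1. When the predictors include factors, some difference can remain, because the formula interface of train() expands factors into dummy variables while rpart() splits on the factor directly.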

That said, the extra functionality that caret provides is a Good Thing, and you should probably just go with caret. If you want to learn more, it's well-documented, and the author has a stellar book, Applied Predictive Modeling.
