Improving model training speed in caret (R)

轮回少年 2021-01-31 12:30

I have a dataset consisting of 20 features and roughly 300,000 observations. I'm using caret to train a model with doParallel and four cores. Even training on 10% of my data ta

3 answers

栀梦 (original poster)

2021-01-31 12:37
@phiver hits the nail on the head but, for this situation, there are a few things to suggest:
- Make sure that you are not exhausting your system memory by using parallel processing. With X workers, you are making X extra copies of the data in memory.
- With a class imbalance, additional sampling can help. Downsampling might improve performance and take less time.
- Use different libraries: ranger instead of randomForest, and xgboost or C5.0 instead of gbm. Keep in mind that ensemble methods fit a ton of constituent models and are bound to take a while to fit.
- caret has a racing-type algorithm for tuning parameters in less time.
- The development version on GitHub has random search methods for the models with a lot of tuning parameters.
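As a rough illustration of the first three suggestions, here is a minimal sketch that is not part of the original answer. It assumes the caret, doParallel and ranger packages are installed. The simulated data from `twoClassSim()`, the choice of two workers, and the values of `number`, `tuneLength` and `num.trees` are placeholders picked for illustration, not recommendations.

```r
library(caret)
library(doParallel)  # also attaches the parallel package

set.seed(1)
# Hypothetical stand-in for the real data: a simulated, imbalanced two-class problem
dat <- twoClassSim(2000, intercept = -20, linearVars = 20)

# Fewer workers than cores: each worker holds its own copy of the data
cl <- makePSOCKcluster(2)
registerDoParallel(cl)

ctrl <- trainControl(
  method = "cv",
  number = 5,
  sampling = "down",              # down-sample the majority class inside each resample
  classProbs = TRUE,
  summaryFunction = twoClassSummary
)

fit <- train(
  Class ~ ., data = dat,
  method = "ranger",              # ranger in place of randomForest
  metric = "ROC",
  tuneLength = 3,
  trControl = ctrl,
  num.trees = 200                 # passed through to ranger()
)

stopCluster(cl)
registerDoSEQ()                   # return foreach to sequential execution
print(fit)
```

Watching memory use while varying the number of workers is one way to see whether the extra data copies are what slows things down.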
Max
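The racing-type tuning and random search mentioned in the answer can be sketched as follows. This is an assumption-laden example rather than the answerer's own code: it assumes a caret release that includes `method = "adaptive_cv"` and `search = "random"` in `trainControl()`, plus the ranger package. The simulated data and the settings in `adaptive` are illustrative only.

```r
library(caret)

set.seed(2)
# Hypothetical stand-in data, as in any small experiment before scaling up
dat <- twoClassSim(1000, intercept = -20, linearVars = 20)

adaptive_ctrl <- trainControl(
  method = "adaptive_cv",         # racing-style adaptive resampling
  number = 10,
  adaptive = list(
    min = 5,                      # resamples evaluated before any candidate is dropped
    alpha = 0.05,                 # confidence level used to discard poor candidates
    method = "gls",
    complete = TRUE               # finish all resamples for the winning candidate
  ),
  search = "random",              # random tuning-parameter candidates instead of a grid
  classProbs = TRUE,
  summaryFunction = twoClassSummary
)

fit_adaptive <- train(
  Class ~ ., data = dat,
  method = "ranger",
  metric = "ROC",
  tuneLength = 8,                 # number of random candidates to try
  trControl = adaptive_ctrl,
  num.trees = 200
)

print(fit_adaptive$bestTune)
```

The saving comes from dropping clearly inferior tuning-parameter candidates after a few resamples instead of evaluating every candidate on every fold.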