R: set.seed() results don't match if caret package loaded

Submitted by 大城市里の小女人 on 2019-12-23 09:38:33

Question


I am using createFolds() from the caret package in R (version 3.3.0) to create train/test partitions. To make the results reproducible, I called set.seed() with a seed value of 10. As expected, the generated folds were reproducible.

However, when I loaded the caret package just after setting the seed and then called createFolds(), the created folds were different (although still reproducible from run to run).

Specifically, the created folds differ in the following two cases:

Case 1:

library(caret)
set.seed(10)
folds <- createFolds(y, k = 5, returnTrain = TRUE)

Case 2:

set.seed(10)
library(caret)
folds <- createFolds(y, k = 5, returnTrain = TRUE)

where y is a vector.

Why could this be happening?


Answer 1:


The culprit is ggplot2, which is attached when you load caret. It defines an .onAttach function: https://github.com/hadley/ggplot2/blob/master/R/zzz.r

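This function is called when the package is attached; see help("ns-hooks"). Within it, runif() is called, which advances the state of the RNG. In Case 2 the seed is set before library(caret), so the RNG state has already moved on by the time createFolds() runs. In Case 1 the seed is set after the package is attached, so createFolds() starts from the freshly seeded state.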
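
The effect can be reproduced without caret or ggplot2. The following is a minimal sketch in base R, where the extra runif(1) call is only a stand-in (an assumption for illustration) for the draw made inside the package's .onAttach hook:

```r
# Base R only. The extra runif(1) below is a hypothetical stand-in for
# the random draw that happens while a package is being attached.

set.seed(10)
x1 <- runif(5)             # five draws straight after seeding

set.seed(10)
invisible(runif(1))        # one draw consumed in between (stand-in for .onAttach)
x2 <- runif(5)             # five draws from the advanced RNG state

identical(x1, x2)              # FALSE: the two sequences differ
identical(x1[2:5], x2[1:4])    # TRUE: x2 is the same stream, shifted by one draw
```

Both sequences are still fully reproducible; they simply start at different points of the same random stream. In practice this suggests attaching all packages first and calling set.seed() immediately before createFolds(), as in Case 1.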



Source: https://stackoverflow.com/questions/38465460/r-set-seed-results-dont-match-if-caret-package-loaded
