Difference between using train_test_split and cross_val_score in sklearn.cross_validation

抹茶落季 2021-02-04 10:40

I have a matrix with 20 columns. The last column contains the 0/1 labels.

The link to the data is here.

I am trying to run random forest on the dataset, using cross valid

2 answers
  •  清歌不尽
    2021-02-04 11:10

    The answer is what @KCzar pointed out. I just want to note that the easiest way I found to shuffle the data (X and y with the same index permutation) is as follows:

    import numpy as np
    # Draw one permutation of the row indices and apply it to both arrays
    p = np.random.permutation(len(X))
    X, y = X[p], y[p]
    

    source: Better way to shuffle two numpy arrays in unison
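To check that the trick above keeps every row paired with its own label, here is a minimal sketch on made-up data. The toy arrays, the seed, and the use of `np.random.default_rng` (the newer NumPy generator API, in place of the `np.random.permutation` call from the answer) are assumptions for illustration, not part of the original post.

```python
import numpy as np

# Hypothetical toy data: 6 rows of 3 features, with 0/1 labels sorted by class.
# Row i starts with the value 3 * i, so the original row index can be recovered.
X = np.arange(18).reshape(6, 3)
y = np.array([0, 0, 0, 1, 1, 1])

rng = np.random.default_rng(0)  # seeded so the shuffle is reproducible
p = rng.permutation(len(X))     # one random ordering of the row indices

# Indexing both arrays with the same permutation keeps rows and labels paired.
X_shuffled, y_shuffled = X[p], y[p]

# Recover each shuffled row's original index and compare its original label.
original_index = X_shuffled[:, 0] // 3
aligned = bool(np.all(y[original_index] == y_shuffled))
print(aligned)
```

This matters for the question in the title because, as far as I know, `train_test_split` shuffles by default while the default K-fold splitters behind `cross_val_score` do not, so data sorted by label can give very different scores between the two. `sklearn.utils.shuffle(X, y)` is an alternative that shuffles several arrays in unison.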
