I have a matrix with 20 columns. The last column contains the 0/1 labels.
The link to the data is here.
I am trying to run random forest on the dataset, using cross valid
The answer is what @KCzar pointed out. I just want to note that the easiest way I found to randomize the data (X and y shuffled with the same index) is as follows:
p = np.random.permutation(len(X))
X, y = X[p], y[p]
source: Better way to shuffle two numpy arrays in unison
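
For completeness, here is a minimal self-contained sketch of the same idea, assuming X and y are NumPy arrays with the same number of rows. The helper name shuffle_in_unison, the seed argument, and the toy 6 x 20 matrix are only illustrative and are not from the original question's data.

```python
import numpy as np


def shuffle_in_unison(X, y, seed=None):
    """Reorder X and y with one shared random permutation (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    p = rng.permutation(len(X))  # one permutation of the row indices
    return X[p], y[p]            # the same index order is applied to both arrays


# Toy stand-in for the real data: 6 rows, 19 feature columns, last column is a 0/1 label.
data = np.arange(6 * 20).reshape(6, 20)
data[:, -1] = np.array([0, 1, 0, 1, 0, 1])

X, y = data[:, :-1], data[:, -1]
X_shuf, y_shuf = shuffle_in_unison(X, y, seed=0)
```

Because both arrays are indexed with the same p, each row of X_shuf still lines up with its label in y_shuf. Passing a fixed seed just makes the shuffle reproducible between runs.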