I have a pandas dataframe and I wish to divide it to 3 separate sets. I know that using train_test_split from sklearn.cross_validation
, one can divide the data
def train_val_test_split(X, y, train_size, val_size, test_size):
X_train_val, X_test, y_train_val, y_test = train_test_split(X, y, test_size = test_size)
relative_train_size = train_size / (val_size + train_size)
X_train, X_val, y_train, y_val = train_test_split(X_train_val, y_train_val,
train_size = relative_train_size, test_size = 1-relative_train_size)
return X_train, X_val, X_test, y_train, y_val, y_test
Here we split data 2 times with sklearn's train_test_split