Can anyone tell me why we set random_state to zero when splitting the data into train and test sets?
X_train, X_test, y_train, y_test = \
train_test_split(X, y, test_size
We use the random_state parameter for reproducibility: it seeds the shuffling that train_test_split applies to the data before splitting it into train and test sets.
If you don't specify random_state in your code, then every time you run your code a new random seed is used, and the train and test datasets will contain different samples each time.
However, if a fixed value is assigned, like random_state = 0 or 1 or 42, then no matter how many times you execute your code the result will be the same, i.e., the same samples end up in the train and test datasets.
random_state is None by default, which means that every time you run your program you will get a different output, because the split between train and test varies from run to run.
random_state = any int value means that every time you run your program you will get the same output, because the split between train and test does not vary.
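
To see this for yourself, here is a minimal sketch. The toy X and y arrays and test_size=0.3 are made up for illustration and are not taken from the question's code. It calls train_test_split twice with the same seed and checks that both calls give the same split.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy data (assumption, for illustration only): 10 samples, 2 features
X = np.arange(20).reshape(10, 2)
y = np.arange(10)

# Two calls with the same fixed seed
_, X_test_a, _, y_test_a = train_test_split(X, y, test_size=0.3, random_state=0)
_, X_test_b, _, y_test_b = train_test_split(X, y, test_size=0.3, random_state=0)

# The same samples land in the test set both times
same_split = np.array_equal(X_test_a, X_test_b) and np.array_equal(y_test_a, y_test_b)
print(same_split)  # True

# With random_state left at its default (None), the split is seeded
# differently on each call, so the test set usually changes between runs
_, _, _, y_test_c = train_test_split(X, y, test_size=0.3)
print(y_test_a, y_test_c)
```

The particular number (0, 1, 42, ...) has no special meaning; what matters is that you keep it the same if you want to compare results across runs.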