scikit-learn random state in splitting dataset

无人共我 2020-12-05 09:12

Can anyone tell me why we set random_state to zero when splitting a dataset into train and test sets?

X_train, X_test, y_train, y_test = \
    train_test_split(X, y, test_size


        
9 Answers
  • 2020-12-05 10:10

    We use the random_state parameter to make the shuffling that train_test_split applies to the data before splitting reproducible.

  • 2020-12-05 10:11

    If you don't specify random_state in your code, a different seed is used every time you run it, so the train and test datasets contain different samples on each run.

    However, if you assign a fixed value such as random_state = 0, 1 or 42, the result is the same no matter how many times you execute your code, i.e., the train and test datasets contain the same samples every time.
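A minimal sketch of that behaviour, assuming a small toy dataset built with NumPy. The arrays `X` and `y` and the helper `split_test_labels` are made up for illustration and are not from the question.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical toy data: 20 samples with 2 features each, labels 0..19.
X = np.arange(40).reshape(20, 2)
y = np.arange(20)


def split_test_labels(seed):
    """Return the test-set labels that train_test_split yields for a seed."""
    _, _, _, y_test = train_test_split(X, y, test_size=0.25, random_state=seed)
    return y_test.tolist()


first = split_test_labels(0)
second = split_test_labels(0)

# Both calls used the same integer seed, so they select the same rows
# in the same order.
print(first == second)
print(first)
```

The particular integer carries no meaning. Values such as 0 or 42 are only conventions, and what matters is that the same value is used on every run.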

  • 2020-12-05 10:13

    random_state is None by default, which means that every time you run your program you get a different output, because the split between train and test varies from run to run.

    Setting random_state to any int value means that every time you run your program you get the same output, because the split between train and test does not vary.
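The contrast between the default `None` and an integer seed can be checked by repeating the split a few times and counting how many distinct test sets come out. This is a rough sketch on invented data. The helper `held_out_labels` and the seed value 7 are arbitrary choices for illustration.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical toy data: 100 samples with 2 features each, labels 0..99.
X = np.arange(200).reshape(100, 2)
y = np.arange(100)


def held_out_labels(random_state=None):
    """Return the test-set labels for one call to train_test_split."""
    _, _, _, y_test = train_test_split(
        X, y, test_size=0.25, random_state=random_state
    )
    return tuple(y_test.tolist())


# Default (random_state=None): each call shuffles differently.
unseeded = {held_out_labels() for _ in range(3)}

# Fixed integer seed: each call reproduces the same shuffle.
seeded = {held_out_labels(random_state=7) for _ in range(3)}

print(len(unseeded))  # number of distinct test sets without a seed
print(len(seeded))    # number of distinct test sets with a fixed seed
```

With a fixed seed the set collapses to a single split. Without one, repeated calls are expected to give different splits, since a collision is possible in principle but extremely unlikely on 100 samples.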
