What is “random_state” in the sklearn.model_selection.train_test_split example?

鱼传尺愫 2020-12-14 10:47

I am really new to machine learning, and I was going through some examples on sklearn.

Can someone explain to me what "random_state" really means in the example below?

6 answers
  • 2020-12-14 11:02

    If the random_state is always fixed (42), doesn't that go against the machine learning perspective, in that it is supposed to use randomness to help it discover the best possible outcomes?

    I understand using a fixed seed for debugging. But when doing the "real" training, should we use a truly random seed?

  • 2020-12-14 11:07

    If you don't specify random_state in the code, then every time you run your code a new random seed is used, and the train and test datasets will contain different samples each time.

    However, if you assign a fixed value such as random_state = 0, 1, 42, or any other integer, then no matter how many times you execute your code the result will be the same, i.e. the same samples end up in the train and test datasets.
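
A minimal sketch of the behaviour described above. The toy arrays `X` and `y` and the seed value 42 are illustrative assumptions, not part of the original question:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy data (assumed for illustration): 10 samples, 2 features each.
X = np.arange(20).reshape(10, 2)
y = np.arange(10)

# Same integer seed on both calls, so both calls produce the same split.
_, X_test_a, _, y_test_a = train_test_split(X, y, test_size=0.3, random_state=42)
_, X_test_b, _, y_test_b = train_test_split(X, y, test_size=0.3, random_state=42)

same_split = np.array_equal(y_test_a, y_test_b)
print(same_split)  # True

# No seed given: the chosen test samples can change from run to run.
_, _, _, y_test_c = train_test_split(X, y, test_size=0.3)
print(y_test_c)
```

Running the script several times should print `True` every time, while the last line will usually show a different set of labels on each run.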

  • 2020-12-14 11:10

    Random state ensures that the splits that you generate are reproducible. Scikit-learn uses random permutations to generate the splits. The random state that you provide is used as a seed for the random number generator. With the same seed, the generator produces the same sequence of random numbers, and therefore the same permutation and the same split.
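
The seeding idea can be sketched with NumPy alone. This is only an illustration of how a seed fixes a permutation, not a copy of scikit-learn's internal code; the seed values and the array length of 10 are arbitrary assumptions:

```python
import numpy as np

# Two generators seeded with the same integer emit the same permutation.
perm_a = np.random.RandomState(42).permutation(10)
perm_b = np.random.RandomState(42).permutation(10)

# A generator with a different seed will almost certainly emit another order.
perm_c = np.random.RandomState(7).permutation(10)

print(perm_a)
print(perm_c)
print(np.array_equal(perm_a, perm_b))  # True
```

If a split takes, say, the first few shuffled indices as the test set, then fixing the permutation fixes the split as well.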

  • 2020-12-14 11:14

    The random state is like a lot number for a set that some operation generates randomly. Whenever we want the same set again, we specify the same lot number.

  • 2020-12-14 11:15

    When random_state is not set in the code, the training data changes on every run, so the accuracy might change on every run too. When random_state is set to a constant integer, the training data stays the same on every run, which makes debugging easier.
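
A rough way to see this effect is to score the same model on splits made with different seeds. The iris dataset, the k-nearest-neighbours classifier, and the helper name `accuracy_for_seed` are assumptions chosen for illustration only:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

def accuracy_for_seed(seed):
    """Split with the given seed, fit a k-NN classifier, return test accuracy."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=seed
    )
    model = KNeighborsClassifier().fit(X_train, y_train)
    return model.score(X_test, y_test)

# Different seeds give different splits, so the accuracy may differ.
scores = {seed: accuracy_for_seed(seed) for seed in range(5)}
for seed, score in scores.items():
    print(f"random_state={seed}: accuracy={score:.3f}")

# Repeating a seed reproduces the split and therefore the score.
repeat = accuracy_for_seed(0)
print(repeat == scores[0])  # True
```

Looking at the spread of scores across several seeds gives a more honest picture of model quality than any single split, while each individual run stays reproducible.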

  • 2020-12-14 11:19

    Isn't that obvious? 42 is the Answer to the Ultimate Question of Life, the Universe, and Everything.

    On a serious note, random_state simply sets the seed of the random number generator, so that your train-test splits are deterministic. If you don't set a seed, the split is different each time.

    Relevant documentation:

    random_state : int, RandomState instance or None, optional (default=None)
    If int, random_state is the seed used by the random number generator; If RandomState instance, random_state is the random number generator; If None, the random number generator is the RandomState instance used by np.random.
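
The three accepted forms can be explored with `sklearn.utils.check_random_state`, the utility scikit-learn provides for turning a `random_state` argument into a generator. The sketch below is an illustration with arbitrary variable names and seed, not a description of everything `train_test_split` does internally:

```python
import numpy as np
from sklearn.utils import check_random_state

# int: a new RandomState seeded with that integer is created.
rng_from_int = check_random_state(0)

# RandomState instance: the very same object is returned and used.
shared = np.random.RandomState(0)
rng_from_instance = check_random_state(shared)

# None: the global RandomState behind np.random is used.
rng_from_none = check_random_state(None)

print(isinstance(rng_from_int, np.random.RandomState))   # True
print(rng_from_instance is shared)                       # True
print(isinstance(rng_from_none, np.random.RandomState))  # True
```

One practical consequence: an integer gives the same result on every call, whereas a shared RandomState instance advances its internal state each time it is used, so repeated calls with the same instance generally give different splits.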
