When scaling data, why does the training set use 'fit' and 'transform', but the test set only 'transform'?

悲&欢浪女 2021-02-01 03:32

When scaling data, why do we call 'fit' and 'transform' on the training set, but only 'transform' on the test set?

SAMPLE_COUNT = 5000
TEST_COUNT = 20000
see         


        
7 answers
  •  挽巷 (original poster)
    2021-02-01 04:10

    We call fit_transform() on the training data to learn the scaling parameters from it and scale it in the same step. We call only transform() on the test data because we reuse the scaling parameters learned from the training data to scale the test data.

    This is the standard scaling procedure. You always learn the scaling parameters on the training set and then apply them to the test set. Here is an article that explains it very well: https://sebastianraschka.com/faq/docs/scale-training-test.html
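
The procedure described above can be sketched roughly as follows. This is a minimal illustration, assuming scikit-learn's StandardScaler is the scaler in use (the question does not say which one) and using synthetic random data in place of the real dataset. SAMPLE_COUNT and TEST_COUNT come from the question; every other name is hypothetical.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

SAMPLE_COUNT = 5000   # training-set size, taken from the question
TEST_COUNT = 20000    # test-set size, taken from the question

# Synthetic stand-in data (an assumption; the real data is not shown).
rng = np.random.default_rng(0)
X_train = rng.normal(loc=10.0, scale=3.0, size=(SAMPLE_COUNT, 2))
X_test = rng.normal(loc=10.0, scale=3.0, size=(TEST_COUNT, 2))

scaler = StandardScaler()

# fit_transform: learn mean and standard deviation from the training data,
# then scale the training data with them.
X_train_scaled = scaler.fit_transform(X_train)

# transform only: reuse the training mean and standard deviation,
# so no statistic of the test set influences the scaling.
X_test_scaled = scaler.transform(X_test)

# The learned parameters are exactly the training-set statistics.
train_mean = X_train.mean(axis=0)
train_std = X_train.std(axis=0)
manual_test_scaled = (X_test - train_mean) / train_std
```

Here manual_test_scaled should match X_test_scaled, which shows that transform() only applies what fit() learned. Calling fit_transform() on the test set instead would scale it with its own mean and standard deviation, so the model would see test data on a slightly different scale than the data it was trained on.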
