When scaling data, why does the training set use 'fit' and 'transform', but the test set only 'transform'?


悲&欢浪女 2021-02-01 03:32

When scaling the data, why is 'fit' followed by 'transform' used on the training dataset, while only 'transform' is used on the test dataset?

SAMPLE_COUNT = 5000
TEST_COUNT = 20000
see


      
      
        
7 answers

被撕碎了的回忆 (original poster) 2021-02-01 04:08

There are two possible approaches:
1st approach: fit and transform the scaler on the training data, then only transform the test data.
2nd approach: fit and transform the whole set (train + test).

Think about how the model will handle scaling once it goes live: when new data arrives, it will behave just like the unseen test data in your backtest.

In the 1st case, new data is simply transformed with the already fitted scaler, and the scaled values from your backtest remain unchanged.
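
A minimal sketch of the 1st approach, using only the Python standard library. `SimpleStandardScaler` is a hypothetical, hand-rolled stand-in for a fit/transform style scaler (single feature, population standard deviation), not the API of any particular library, and the numbers are made up for illustration.

```python
from statistics import fmean, pstdev


class SimpleStandardScaler:
    """Hypothetical single-feature scaler with a fit/transform interface."""

    def fit(self, values):
        # Learn the scaling parameters from the training data only.
        self.mean_ = fmean(values)
        self.std_ = pstdev(values) or 1.0  # avoid dividing by zero
        return self

    def transform(self, values):
        # Apply the stored parameters; nothing is re-estimated here.
        return [(v - self.mean_) / self.std_ for v in values]


train = [10.0, 12.0, 14.0, 16.0, 18.0]  # illustrative values
test = [20.0, 22.0]

scaler = SimpleStandardScaler().fit(train)  # fit on train only
train_scaled = scaler.transform(train)
test_scaled = scaler.transform(test)        # transform only

# Live data later reuses the same stored mean and std,
# so the scaled training values stay exactly as they were.
live_scaled = scaler.transform([24.0])
```

Because `transform` never touches `mean_` or `std_`, a model trained on `train_scaled` keeps seeing inputs on the same scale.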

In the 2nd case, however, when new data arrives you need to fit and transform the whole dataset again. The scaled backtest values will then no longer be the same, so you need to re-train the model. If that can be done quickly, I guess it is OK, but the 1st case requires less work.
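
To see why refitting on the whole dataset shifts the earlier scaled values, here is a small sketch. The helper `standardize` and the sample numbers are assumptions made up for illustration; it standardizes with the population standard deviation.

```python
from statistics import fmean, pstdev


def standardize(values, reference):
    """Scale `values` with the mean and std estimated from `reference`."""
    mean = fmean(reference)
    std = pstdev(reference) or 1.0  # avoid dividing by zero
    return [(v - mean) / std for v in values]


history = [10.0, 12.0, 14.0, 16.0, 18.0]  # data used in the backtest
new_data = [30.0]                         # arrives after going live

# Scaled history when the parameters come from the history alone.
before = standardize(history, reference=history)

# Scaled history after refitting on history + new data.
after = standardize(history, reference=history + new_data)

# The same raw values now map to different scaled values.
changed = any(abs(b - a) > 1e-9 for b, a in zip(before, after))
print(changed)
```

The raw history is identical in both calls, yet its scaled representation differs, which is why a model trained on the earlier scaling would need to be re-trained.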

And if the scaling parameters differ greatly between train and test, the data is probably non-stationary, and ML is probably not a good idea.
    
             
                                                        
            
            
              
                