I'm using the MinMaxScaler in sklearn to normalize the features of my training data:
import numpy as np

training_set = np.random.rand(4, 4) * 10
training_set
# array([[ ...
A cleaner approach is to chain the scaler and the estimator in a single pipeline, then persist the whole pipeline:

import joblib  # sklearn.externals.joblib is deprecated; use the standalone joblib package
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler

# YOUR_ML_MODEL is a placeholder for whichever estimator you use
pipeline = make_pipeline(MinMaxScaler(), YOUR_ML_MODEL())
model = pipeline.fit(X_train, y_train)

joblib.dump(model, 'filename.mod')
model = joblib.load('filename.mod')
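A nice side effect: the reloaded pipeline re-applies the stored MinMaxScaler parameters before calling the estimator, so you can predict on raw, unscaled features directly. A minimal sketch, assuming an X_test array:

# the pipeline scales X_test with the fitted MinMaxScaler
# before handing it to the estimator
predictions = model.predict(X_test)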
You can use pickle to save the scaler:
import pickle

scalerfile = 'scaler.sav'
with open(scalerfile, 'wb') as f:
    pickle.dump(scaler, f)
Load it back:
import pickle

scalerfile = 'scaler.sav'
with open(scalerfile, 'rb') as f:
    scaler = pickle.load(f)

test_scaled_set = scaler.transform(test_set)
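Note that the scaler must already be fitted before you save it; a minimal sketch of the step these snippets assume:

from sklearn.preprocessing import MinMaxScaler

# fit on the training data only, so the test set is scaled
# with the training set's min/max
scaler = MinMaxScaler()
scaler.fit(training_set)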
Even better than pickle (which creates much larger files than this method), you can use sklearn's built-in tool:
import joblib

scaler_filename = "scaler.save"
joblib.dump(scaler, scaler_filename)

# And now to load...
scaler = joblib.load(scaler_filename)
Note: sklearn.externals.joblib is deprecated; install and use the standalone joblib package instead, as the import above does.
I'm not an expert on this, but from a bit of research and a few helpful links, I think pickle and joblib are going to be your friends here.
The pickle package lets you save models, or "dump" them to a file.
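For example, a quick sketch (clf stands in for whatever fitted model or scaler you want to persist):

import pickle

# write the fitted object to disk...
with open('my_model.pkl', 'wb') as f:
    pickle.dump(clf, f)

# ...and read it back later
with open('my_model.pkl', 'rb') as f:
    clf = pickle.load(f)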
I think this link is also helpful; it talks about model persistence. Something that you're going to want to try is:
# could use: import pickle... however let's do something else
import joblib  # formerly sklearn.externals.joblib, which is now deprecated

# joblib is more efficient than pickle for objects that carry
# large numpy arrays internally, as sklearn models often do.
# Then just 'dump' your model:
joblib.dump(clf, 'my_dope_model.pkl')
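And to load it back later:

import joblib

# read the model back from disk
clf = joblib.load('my_dope_model.pkl')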
Here is where you can learn more about the sklearn externals.
Let me know if that doesn't help or I'm not understanding something about your model.
Just a note that sklearn.externals.joblib has been deprecated and is superseded by plain old joblib, which can be installed with pip install joblib:
import joblib
joblib.dump(my_scaler, 'scaler.gz')
my_scaler = joblib.load('scaler.gz')
Note that the file extension can be anything, but if it is one of ['.z', '.gz', '.bz2', '.xz', '.lzma'] then the corresponding compression protocol will be used. See the docs for the joblib.dump() and joblib.load() methods.
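For example, a quick sketch of both forms (the filenames and compression level here are just illustrative):

import joblib

# saving to a .gz filename triggers gzip compression automatically
joblib.dump(my_scaler, 'scaler.gz')

# or request a compressor and level explicitly, regardless of extension
joblib.dump(my_scaler, 'scaler.joblib', compress=('gzip', 3))

my_scaler = joblib.load('scaler.joblib')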