How to avoid overfitting on a simple feed forward network

渐次进展 2021-02-02 18:32

Using the Pima Indians Diabetes dataset, I'm trying to build an accurate model with Keras. I've written the following code:

# Visualize training history
from k         
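
The rest of the code is cut off above. For context only, here is a rough reconstruction of the kind of model the answers below are discussing; the layer sizes, the tanh hidden layer, and the TensorBoard callback are inferred from the answers, not copied from the original post:

# Sketch reconstructed from the answers below -- not the poster's exact code
import numpy as np
from keras.models import Sequential
from keras.layers import Dense, Dropout
from keras.callbacks import TensorBoard

# load the Pima Indians Diabetes dataset (8 features, 1 binary label)
dataset = np.loadtxt("pima-indians-diabetes.csv", delimiter=",")
X, Y = dataset[:, 0:8], dataset[:, 8]

model = Sequential()
model.add(Dense(12, input_dim=8, activation='relu', name='first_input'))
model.add(Dense(500, activation='tanh', name='first_hidden'))
model.add(Dropout(0.5, name='dropout_1'))
model.add(Dense(8, activation='relu', name='second_hidden'))
model.add(Dense(1, activation='sigmoid', name='output_layer'))

model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

tb = TensorBoard(log_dir='./logs')
history = model.fit(X, Y, validation_split=0.33, epochs=1000,
                    batch_size=10, verbose=0, callbacks=[tb])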


        
3 Answers
  • 2021-02-02 18:41

    First, try adding some regularization (https://keras.io/regularizers/), as in this code:

    from keras import regularizers

    # L2 penalty on the weights and L1 penalty on the layer activations
    # (the Pima dataset has 8 input features, so input_dim=8)
    model.add(Dense(12, input_dim=8,
                kernel_regularizer=regularizers.l2(0.01),
                activity_regularizer=regularizers.l1(0.01)))
    

    Also, decrease your network size: you don't need a hidden layer of 500 neurons. Try removing it to reduce the model's representational power, and maybe drop another layer if it is still overfitting. Use only relu activations. You could also try raising your dropout rate to something like 0.75 (although 0.5 is already high). Finally, you probably don't need to run for so many epochs; given long enough, the model will simply begin to overfit. A sketch of a smaller model along these lines follows below.
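
    A minimal sketch of a slimmed-down model along these lines (the exact layer sizes, the 0.75 dropout rate, and the 150-epoch budget are illustrative choices, not tuned values; X and Y are assumed to be the 8-feature inputs and 0/1 labels already loaded from the dataset):

     from keras.models import Sequential
     from keras.layers import Dense, Dropout
     from keras import regularizers

     # much smaller network: no 500-neuron layer, relu only
     model = Sequential()
     model.add(Dense(12, input_dim=8, activation='relu',
                     kernel_regularizer=regularizers.l2(0.01)))
     model.add(Dropout(0.75))   # aggressive dropout, as suggested above
     model.add(Dense(1, activation='sigmoid'))

     model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
     model.fit(X, Y, validation_split=0.33, epochs=150, batch_size=10, verbose=0)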

  • 2021-02-02 18:42

    For a dataset like the diabetes one, you can use a much simpler network. Try reducing the number of neurons in your second layer. (Is there a specific reason why you chose tanh as the activation there?)

    In addition, you can simply add an EarlyStopping callback to your training (https://keras.io/callbacks/); a short example follows.
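
    A minimal sketch of wiring in the callback (it assumes the model has already been built and compiled as in the question; the patience value is illustrative):

     from keras.callbacks import EarlyStopping

     # stop training once the validation loss has not improved for 5 epochs
     early_stop = EarlyStopping(monitor='val_loss', patience=5, verbose=1)

     model.fit(X, Y, validation_split=0.33, epochs=1000, batch_size=10,
               verbose=0, callbacks=[early_stop])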

  • 2021-02-02 18:54

    The first example gave a validation accuracy of > 75% and the second gave < 65%. If you compare the losses for epochs below 100, the first is < 0.5 while the second is > 0.6. So how is the second case better?

    To me, the second one is a case of under-fitting: the model doesn't have enough capacity to learn. The first case has a problem of over-fitting, because training was not stopped once overfitting started (no early stopping). If training had been stopped at, say, epoch 100, it would be by far the better model of the two.

    The goal is to obtain a small prediction error on unseen data, and for that you increase the capacity of the network up to the point beyond which overfitting starts to happen.

    So how do you avoid over-fitting in this particular case? Adopt early stopping.

    CODE CHANGES: add early stopping and input scaling.

     # imports needed for the additions below
     from sklearn.preprocessing import StandardScaler
     from keras.callbacks import EarlyStopping
     from keras.models import Sequential
     from keras.layers import Dense, Dropout

     # input scaling
     scaler = StandardScaler()
     X = scaler.fit_transform(X)

     # early stopping: halt training once val_loss stops improving
     early_stop = EarlyStopping(monitor='val_loss', min_delta=0, patience=3, verbose=1, mode='auto')

     # create model - almost the same code
     model = Sequential()
     model.add(Dense(12, input_dim=8, activation='relu', name='first_input'))
     model.add(Dense(500, activation='relu', name='first_hidden'))
     model.add(Dropout(0.5, name='dropout_1'))
     model.add(Dense(8, activation='relu', name='second_hidden'))
     model.add(Dense(1, activation='sigmoid', name='output_layer'))

     model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

     # 'tb' is presumably the TensorBoard callback from the original code
     history = model.fit(X, Y, validation_split=0.33, epochs=1000, batch_size=10, verbose=0,
                         callbacks=[tb, early_stop])
    

    The accuracy and loss graphs: [plots omitted]
