How to avoid overfitting on a simple feed forward network

后端未结

关注

 3  507

渐次进展 2021-02-02 18:32

Using the pima indians diabetes dataset I\'m trying to build an accurate model using Keras. I\'ve written the following code:

# Visualize training history
from k


      
      
        
          3条回答        

        
                    
            
            
                         
                
              
              
                
                   -上瘾入骨i
                                             
                
                
                (楼主)
            
              
              
                2021-02-02 18:54
              

            
            
                        


The first example gave a validation accuracy > 75% and the second one gave an accuracy of < 65% and if you compare the losses for epochs below 100, its  less than < 0.5 for the first one and the second one was > 0.6. But how is the second case better?.  

The second one to me is a case of under-fitting:  the model doesnt have enough capacity to learn. While the first case has a problem of over-fitting because its training was not stopped when overfitting started (early stopping).  If the training was stopped at say 100 epoch, it would be a far better model compared between the two. 

The goal should be to obtain small prediction error in unseen data and for that you increase the capacity of the network till a point beyond which overfitting starts to happen. 

So how to avoid over-fitting in this particular case? Adopt early stopping.

CODE CHANGES: To include early stopping and input scaling.

 # input scaling
 scaler = StandardScaler()
 X = scaler.fit_transform(X)

 # Early stopping  
 early_stop = EarlyStopping(monitor='val_loss', min_delta=0, patience=3, verbose=1, mode='auto')

 # create model - almost the same code
 model = Sequential()
 model.add(Dense(12, input_dim=8, activation='relu', name='first_input'))
 model.add(Dense(500, activation='relu', name='first_hidden'))
 model.add(Dropout(0.5, name='dropout_1'))
 model.add(Dense(8, activation='relu', name='second_hidden'))
 model.add(Dense(1, activation='sigmoid', name='output_layer')))

 history = model.fit(X, Y, validation_split=0.33, epochs=1000, batch_size=10, verbose=0, callbacks=[tb, early_stop])


The Accuracy and loss graphs:


    
             
                                                        
            
            
              
                
                0
              
                   
                
               讨论(0)
              
                                                  
              
              
                          
             
       
          
              
                                       
     查看其它3个回答


            
                         
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
                              			
        
        
        
          
            
            
              
              
            
    


                                 
              
            
                          
    

        
         
                验证码
                
                  
                
                
                   看不清?
                
              
                                  
                    
   
                 
             
              提交回复