How to get training & validation loss of Keras scikit-learn wrapper in cross validation?

前端 未结 1 2064
心在旅途
心在旅途 2021-01-21 20:33

I know that model.fit in keras returns a callbacks.History object where we can get loss and other metrics from it as follows.

...
train_history = mo         


        
相关标签:
1条回答
  • 2021-01-21 21:27

    As mentioned explicitly in the documentation, cross_val_score includes a scoring argument, which is

    Similar to cross_validate but only a single metric is permitted.

    hence it cannot be used for returning all the loss & metric info of Keras model.fit().

    The scikit-learn wrapper of Keras is meant as a convenience, provided that you are not really interested in all the underlying details (such as training & validation loss and accuracy). If this is not the case, you should revert to using Keras directly. Here is how you could do that using the example you have linked to and elements of this answer of mine:

    import numpy as np
    from keras import models, layers
    from sklearn.datasets import make_classification
    from sklearn.model_selection import KFold
    
    np.random.seed(0)
    
    # Number of features
    number_of_features = 100
    
    # Generate features matrix and target vector
    features, target = make_classification(n_samples = 10000,
                                           n_features = number_of_features,
                                           n_informative = 3,
                                           n_redundant = 0,
                                           n_classes = 2,
                                           weights = [.5, .5],
                                           random_state = 0)
    
    def create_network():
        network = models.Sequential()
        network.add(layers.Dense(units=16, activation='relu', input_shape=(number_of_features,)))
        network.add(layers.Dense(units=16, activation='relu'))
        network.add(layers.Dense(units=1, activation='sigmoid'))
    
        network.compile(loss='binary_crossentropy', 
                        optimizer='rmsprop', 
                        metrics=['accuracy']) 
    
        return network
    
    n_splits = 3
    kf = KFold(n_splits=n_splits, shuffle=True)
    
    loss = []
    acc = []
    val_loss = []
    val_acc = []
    
    # cross validate:
    for train_index, val_index in kf.split(features):
        model = create_network()
        hist = model.fit(features[train_index], target[train_index],
                         epochs=10,
                         batch_size=100,
                         validation_data = (features[val_index], target[val_index]),
                         verbose=0)
        loss.append(hist.history['loss'])
        acc.append(hist.history['acc'])
        val_loss.append([hist.history['val_loss']])
        val_acc.append(hist.history['val_acc'])
    

    After which, for example loss will be:

    [[0.7251979386058971,
      0.6640552306833333,
      0.6190941931069023,
      0.5602273066015956,
      0.48771809028534785,
      0.40796665995284814,
      0.33154681897220617,
      0.2698465999525444,
      0.227492357244586,
      0.1998490962115201],
     [0.7109123742507104,
      0.674812126485093,
      0.6452083222258479,
      0.6074533335751673,
      0.5627432800365635,
      0.51291748379345,
      0.45645068427406726,
      0.3928780094229408,
      0.3282097149542538,
      0.26993170230619656],
     [0.7191790426458682,
      0.6618405645963258,
      0.6253172250296091,
      0.5855853647883192,
      0.5438901918195831,
      0.4999895181964501,
      0.4495182811042725,
      0.3896359298090465,
      0.3210068798340545,
      0.25932698793518183]]
    

    i.e. a list of n_splits lists (here 3), each one of which contains the training loss for each epoch (here 10). Similarly for the other lists...

    0 讨论(0)
提交回复
热议问题