How can I compare weights of different Keras models?

攒了一身酷 2021-02-10 16:20

I've saved a number of models in .h5 format. I want to compare their characteristics, such as weights. I don't have any idea how I can appropriately compare them, especially in the …

1 Answer

  •  独厮守ぢ 2021-02-10 17:10

    Weight introspection is a fairly advanced endeavor and requires model-specific treatment. Visualizing weights is largely a technical challenge, but what you do with that information is a different matter - I'll address mostly the former, and touch upon the latter.

    Update: I also recommend See RNN for visualizing weights, gradients, and activations.


    Visualizing weights: one approach is as follows:

    1. Retrieve weights of layer of interest. Ex: model.layers[1].get_weights()
    2. Understand weight roles and dimensionality. Ex: LSTMs have three sets of weights - kernel, recurrent, and bias - each serving a different purpose. Within each weight matrix are per-gate weights: input, forget, cell, output (see the LSTM sketch after this list). For Conv layers, the distinction is between filters (dim0), kernels, and strides.
    3. Organize weight matrices for visualization in a meaningful manner per (2). Ex: for Conv, unlike LSTM, feature-specific treatment isn't really necessary; we can simply flatten kernel and bias weights and visualize them in a histogram.
    4. Select a visualization method: histogram, heatmap, scatterplot, etc. - for flattened data, a histogram is the best bet.
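
    For step (2), here's a minimal sketch of pulling an LSTM's weights apart per gate. Keras concatenates the four gates along the last axis, in the order input, forget, cell, output; the layer sizes here are toy choices:

    from tensorflow.keras.layers import Input, LSTM
    from tensorflow.keras.models import Model
    
    ipt = Input(shape=(20, 8))
    out = LSTM(16)(ipt)
    model = Model(ipt, out)
    
    kernel, recurrent, bias = model.layers[1].get_weights()
    # kernel (8, 64), recurrent (16, 64), bias (64,) - 64 = 4 gates * 16 units
    units = 16
    for i, gate in enumerate(['input', 'forget', 'cell', 'output']):
        print(gate, kernel[:, i*units:(i+1)*units].shape,
                    recurrent[:, i*units:(i+1)*units].shape)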

    Interpreting weights: a few approaches are:

    • Sparsity: if the weight norm ("average") is low, the model is sparse. This may or may not be beneficial.
    • Health: if too many weights are zero or near-zero, it's a sign of too many dead neurons; this can be useful for debugging, as once a layer is in such a state it usually does not revert, so training should be restarted (a quick stats sketch follows this list)
    • Stability: if weights are changing greatly and quickly, or if there are many high-valued weights, it may indicate impaired gradient performance, remedied by e.g. gradient clipping or weight constraints
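
    A quick way to eyeball sparsity and health is per-layer magnitude stats; the 1e-3 near-zero threshold below is an arbitrary choice of mine, not a standard:

    import numpy as np
    
    def weight_stats(model, thresh=1e-3):
        # print mean/max magnitude and near-zero fraction per weight array
        for layer in model.layers:
            for w in layer.get_weights():
                w = np.abs(w.ravel())
                print(f"{layer.name}: mean|w|={w.mean():.4f}, "
                      f"max|w|={w.max():.4f}, near-zero={np.mean(w < thresh):.1%}")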

    Model comparison: there isn't a way to simply look at two weights from separate models side-by-side and decide "this is the better one"; analyze each model separately, for example as above, then decide which one's upsides outweigh its downsides.

    The ultimate tiebreaker, however, will be validation performance - and it's also the more practical one. It goes as follows (a loading-and-evaluating sketch comes after the list):

    1. Train model for several hyperparameter configurations
    2. Select one with best validation performance
    3. Fine-tune that model (e.g. via further hyperparameter configs)
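
    Since the models are saved as .h5, step (2) can be as simple as loading each and evaluating on the same held-out set. The file names and X_val, Y_val below are hypothetical placeholders:

    from tensorflow.keras.models import load_model
    
    results = {}
    for path in ['model_a.h5', 'model_b.h5']:  # hypothetical paths
        m = load_model(path)
        results[path] = m.evaluate(X_val, Y_val, verbose=0)  # your validation data
    print(min(results, key=results.get))  # lowest loss wins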

    Weight visualization should mainly be kept as a debugging or logging tool - put simply, even with our best current understanding of neural networks, one cannot tell how well a model will generalize just by looking at its weights.


    Suggestion: also visualize layer outputs - see this answer and its sample output at the bottom; a minimal output-extraction sketch follows the visual example below.


    Visual example:

    import numpy as np
    import matplotlib.pyplot as plt
    from tensorflow.keras.layers import Input, Conv2D, Dense, Flatten
    from tensorflow.keras.models import Model
    
    ipt = Input(shape=(16, 16, 16))
    x   = Conv2D(12, 8, 1)(ipt)  # 12 filters, 8x8 kernel, stride 1
    x   = Flatten()(x)
    out = Dense(16)(x)
    
    model = Model(ipt, out)
    model.compile('adam', 'mse')
    
    X = np.random.randn(10, 16, 16, 16)  # toy data
    Y = np.random.randn(10, 16)  # toy labels
    for _ in range(10):
        model.train_on_batch(X, Y)
    
    def get_weights_print_stats(layer):
        W = layer.get_weights()
        print(len(W))          # number of weight arrays (kernel, bias)
        for w in W:
            print(w.shape)     # each array's dimensionality
        return W
    
    def hist_weights(weights, bins=500):
        for weight in weights:
            plt.hist(np.ndarray.flatten(weight), bins=bins)
    
    W = get_weights_print_stats(model.layers[1])
    # 2
    # (8, 8, 16, 12)
    # (12,)
    
    hist_weights(W)
    plt.show()
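
    And, as suggested above, the Conv2D layer's outputs can be grabbed with a sub-model - a minimal sketch continuing the example (the shape follows from the toy model above):

    outs_model  = Model(model.input, model.layers[1].output)
    layer_outs  = outs_model.predict(X)  # (10, 9, 9, 12): 12 feature maps per sample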
    


    Conv1D outputs visualization: (source)
