I am interested in updating existing layer parameters in Keras (not removing a layer and inserting a new one instead, rather just modifying existing parameters).
If you want to create a new model whose architecture is based on an existing model, but with some modifications, you can use the to_json() and model_from_json() functions. Here is an example:
from keras.models import Sequential
from keras.layers import Conv2D

model = Sequential()
model.add(Conv2D(10, (3,3), input_shape=(100,100,3)))
model.add(Conv2D(40, (3,3)))
Model summary:
Layer (type)          Output Shape          Param #
conv2d_12 (Conv2D)    (None, 98, 98, 10)    280
conv2d_13 (Conv2D)    (None, 96, 96, 40)    3640
Total params: 3,920
Trainable params: 3,920
Non-trainable params: 0
Now we modify the number of filters of the first layer and create a new model based on the modified architecture:
from keras.models import model_from_json
model.layers[0].filters *= 2
new_model = model_from_json(model.to_json())
New model summary:
Layer (type)          Output Shape          Param #
conv2d_12 (Conv2D)    (None, 98, 98, 20)    560
conv2d_13 (Conv2D)    (None, 96, 96, 40)    7240
Total params: 7,800
Trainable params: 7,800
Non-trainable params: 0
You can also modify the output of model.to_json() directly, without modifying the model instance.
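For what it's worth, here is a minimal sketch of editing the JSON directly. The helper name scale_filters and the hand-written demo_json are my own illustrations, not part of Keras; demo_json is a heavily simplified stand-in for what model.to_json() returns. The exact JSON layout differs between Keras versions (the layer list may sit directly under "config" or under "config" -> "layers"), so the helper accepts both.

```python
import json

def scale_filters(model_json, layer_name, factor):
    """Return a copy of the architecture JSON with one layer's `filters` scaled.

    Hypothetical helper: only the `filters` entry of the named layer is changed.
    """
    arch = json.loads(model_json)
    config = arch["config"]
    # Older Keras versions store a plain list here, newer ones a dict with "layers".
    layers = config["layers"] if isinstance(config, dict) else config
    for layer in layers:
        layer_config = layer["config"]
        if layer_config.get("name") == layer_name:
            layer_config["filters"] = int(layer_config["filters"] * factor)
    return json.dumps(arch)

# Simplified stand-in for model.to_json(); a real dump has many more keys.
demo_json = json.dumps({
    "class_name": "Sequential",
    "config": {"name": "sequential", "layers": [
        {"class_name": "Conv2D", "config": {"name": "conv2d_12", "filters": 10}},
        {"class_name": "Conv2D", "config": {"name": "conv2d_13", "filters": 40}},
    ]},
})

edited = json.loads(scale_filters(demo_json, "conv2d_12", 2))
print(edited["config"]["layers"][0]["config"]["filters"])
```

With a real model you would pass model.to_json() instead of demo_json and feed the result to model_from_json(). Depending on the Keras version, the JSON may carry other shape-dependent entries that also need updating; I have not checked that against every release.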
You can use the get_weights() method to get the current weights of the convolution layer. It returns a list of two NumPy arrays: the first holds the filter weights and the second holds the bias parameters. You can then use the set_weights() method to set the new weights:
conv_layer = model.layers[random_conv_index]
weights = conv_layer.get_weights()
weights[0] *= factor  # multiply filter weights by `factor`
conv_layer.set_weights(weights)  # write the modified weights back to the layer
As a side note, the filters attribute of a convolution layer, which you have used in your code, is the number of filters in that layer, not their weights.
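To make the difference between filters and the actual weights concrete, here is a small self-contained sketch. It assumes TensorFlow 2.x with its bundled Keras; the tiny one-layer model and the value of factor are made up for illustration.

```python
import numpy as np
from tensorflow import keras

# Toy model, assumed only for this illustration.
model = keras.Sequential([
    keras.Input(shape=(100, 100, 3)),
    keras.layers.Conv2D(10, (3, 3)),
])
conv_layer = model.layers[0]
factor = 0.5  # arbitrary scaling factor

# `filters` is just an integer: the number of filters.
print(type(conv_layer.filters), conv_layer.filters)

# get_weights() returns NumPy copies: [kernel, bias] when use_bias=True.
kernel, bias = conv_layer.get_weights()
print(kernel.shape, bias.shape)  # kernel is (height, width, in_channels, filters)

# Scale only the kernel and write both arrays back to the layer.
conv_layer.set_weights([kernel * factor, bias])
new_kernel = conv_layer.get_weights()[0]
```

Because get_weights() returns copies, changing the arrays has no effect on the layer until set_weights() is called.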
Another solution is to set the layer's attributes directly. For instance, if you want to change the kernel initializer of the convolutional layers, here is a small example:
import tensorflow as tf

img_input = tf.keras.Input(shape=(256,256,1))
x = tf.keras.layers.Conv2D(64, (7, 7), padding='same', use_bias=False, kernel_initializer=None,name='conv')(img_input)
model = tf.keras.Model(inputs=[img_input], outputs=[x], name='resnext')
for layer in model.layers:
    print(layer.get_config())
{'batch_input_shape': (None, 256, 256, 1), 'dtype': 'float32', 'sparse': False, 'name': 'input_1'}
{'name': 'conv2d', 'trainable': True, 'dtype': 'float32', 'filters': 64, 'kernel_size': (7, 7), 'strides': (1, 1), 'padding': 'same', 'data_format': 'channels_last', 'dilation_rate': (1, 1), 'activation': 'linear', 'use_bias': False, 'kernel_initializer': None, 'bias_initializer': {'class_name': 'Zeros', 'config': {'dtype': 'float32'}}, 'kernel_regularizer': None, 'bias_regularizer': None, 'activity_regularizer': None, 'kernel_constraint': None, 'bias_constraint': None}
After setting the initializer:
init1 = tf.keras.initializers.TruncatedNormal()
for layer in model.layers:
    if hasattr(layer, 'kernel_initializer'):
        setattr(layer, 'kernel_initializer', init1)
for layer in model.layers:
    print(layer.get_config())
{'batch_input_shape': (None, 256, 256, 1), 'dtype': 'float32', 'sparse': False, 'name': 'input_1'}
{'name': 'conv2d', 'trainable': True, 'dtype': 'float32', 'filters': 64, 'kernel_size': (7, 7), 'strides': (1, 1), 'padding': 'same', 'data_format': 'channels_last', 'dilation_rate': (1, 1), 'activation': 'linear', 'use_bias': False, 'kernel_initializer': {'class_name': 'TruncatedNormal', 'config': {'mean': 0.0, 'stddev': 0.05, 'seed': None, 'dtype': 'float32'}}, 'bias_initializer': {'class_name': 'Zeros', 'config': {'dtype': 'float32'}}, 'kernel_regularizer': None, 'bias_regularizer': None, 'activity_regularizer': None, 'kernel_constraint': None, 'bias_constraint': None}
The kernel initializer is now set in the layer's config.
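One caveat worth checking: as far as I know, assigning a new initializer only changes what get_config() reports. Weights that were already created are not re-drawn, so the new initializer takes effect only when the model is rebuilt from the updated config. A minimal sketch, assuming TensorFlow 2.x; the small input shape, filter count, and seed are arbitrary choices of mine:

```python
import numpy as np
from tensorflow import keras

# Small functional model, assumed only for this illustration.
img_input = keras.Input(shape=(8, 8, 1))
x = keras.layers.Conv2D(4, (3, 3), padding="same", use_bias=False, name="conv")(img_input)
model = keras.Model(inputs=img_input, outputs=x)

layer = model.get_layer("conv")
before = layer.get_weights()[0]

# Swap the initializer on the already-built layer.
layer.kernel_initializer = keras.initializers.TruncatedNormal(stddev=0.05, seed=0)
after = layer.get_weights()[0]
print(np.array_equal(before, after))  # existing kernel values are left as they were

# Rebuilding from the updated config creates fresh weights with the new initializer.
rebuilt = keras.Model.from_config(model.get_config())
rebuilt_layer = rebuilt.get_layer("conv")
rebuilt_kernel = rebuilt_layer.get_weights()[0]
rebuilt_init = rebuilt_layer.get_config()["kernel_initializer"]
print(rebuilt_init["class_name"])
```

Note that the rebuilt model starts from freshly initialized weights, so any trained weights from the other layers would have to be copied over with get_weights() and set_weights().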