Question
I'm trying to copy pre-trained BN weights from a PyTorch model to its equivalent Keras model, but I keep getting different outputs.
I read the Keras and PyTorch BN documentation, and I think the difference lies in the way they calculate the "mean" and "var".
PyTorch:
The mean and standard-deviation are calculated per-dimension over the mini-batches
source: PyTorch BatchNorm
Thus, they average over samples.
Keras:
axis: Integer, the axis that should be normalized (typically the features axis). For instance, after a Conv2D layer with data_format="channels_first", set axis=1 in BatchNormalization.
source: Keras BatchNorm
So here they seem to average over the features (channels).
What's the right way? How do I transfer BN weights between the models?
Answer 1:
You can retrieve moving_mean and moving_variance from the running_mean and running_var attributes of the PyTorch module:
# PyTorch weight, bias, running_mean, running_var correspond to
# Keras gamma, beta, moving_mean, moving_variance (in that order).
# detach() is needed because weight and bias are parameters that require grad.
weights = torch_module.weight.detach().cpu().numpy()
bias = torch_module.bias.detach().cpu().numpy()
running_mean = torch_module.running_mean.cpu().numpy()
running_var = torch_module.running_var.cpu().numpy()
keras_module.set_weights([weights, bias, running_mean, running_var])
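If the outputs still differ after copying, two things may be worth checking. First, both libraries keep one mean and one variance per channel; the Keras axis argument names the axis that is kept, and the statistics are reduced over all the other axes, which is the same set of values PyTorch averages over. Second, the default epsilon is not the same (1e-5 in PyTorch, 1e-3 in Keras), so it probably has to be set explicitly on the Keras layer. The NumPy sketch below illustrates both points for inference mode only; batchnorm_inference and all the arrays are made up for illustration and are not part of either library.

```python
import numpy as np

def batchnorm_inference(x, gamma, beta, mean, var, eps, channel_axis):
    """Inference-mode batch norm using stored per-channel statistics."""
    # Reshape the (C,) vectors so they broadcast along channel_axis.
    shape = [1] * x.ndim
    shape[channel_axis] = x.shape[channel_axis]
    gamma, beta, mean, var = (p.reshape(shape) for p in (gamma, beta, mean, var))
    return gamma * (x - mean) / np.sqrt(var + eps) + beta

rng = np.random.default_rng(0)
x_nchw = rng.normal(size=(8, 3, 4, 4))        # PyTorch layout: (N, C, H, W)
x_nhwc = np.transpose(x_nchw, (0, 2, 3, 1))   # channels_last layout: (N, H, W, C)

# Batch statistics: one value per channel, reduced over every other axis.
mean_channels_first = x_nchw.mean(axis=(0, 2, 3))   # keeps axis 1
mean_channels_last = x_nhwc.mean(axis=(0, 1, 2))    # keeps axis -1
same_stats = np.allclose(mean_channels_first, mean_channels_last)

# Hypothetical "pre-trained" parameters, one entry per channel.
gamma = np.array([1.0, 0.5, 2.0])
beta = np.array([0.0, 0.1, -0.1])
running_mean = np.array([0.0, 0.5, -0.5])
running_var = np.array([0.05, 0.02, 0.1])
params = (gamma, beta, running_mean, running_var)

out_torch = batchnorm_inference(x_nchw, *params, eps=1e-5, channel_axis=1)
out_same_eps = batchnorm_inference(x_nhwc, *params, eps=1e-5, channel_axis=-1)
out_default_eps = batchnorm_inference(x_nhwc, *params, eps=1e-3, channel_axis=-1)

def to_nchw(a):
    # Move channels back to axis 1 so the two layouts can be compared.
    return np.transpose(a, (0, 3, 1, 2))

same_eps_match = np.allclose(out_torch, to_nchw(out_same_eps))
default_eps_gap = np.max(np.abs(out_torch - to_nchw(out_default_eps)))

print("per-channel statistics agree:", same_stats)
print("outputs agree with equal eps:", same_eps_match)
print("max gap with eps 1e-3 vs 1e-5:", default_eps_gap)
```

Under these assumptions, creating the Keras layer with epsilon=1e-5 (and axis=1 when the data is channels_first) and running the PyTorch module in eval mode should bring the two outputs into agreement up to floating-point tolerance.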
Source: https://stackoverflow.com/questions/54650587/pytorch-batchnorm-layer-different-from-keras-batchnorm