PyTorch BatchNorm layer different from Keras BatchNorm

Question

I'm trying to copy pre-trained BN weights from a PyTorch model to its equivalent Keras model, but I keep getting different outputs.

I read the Keras and PyTorch BN documentation, and I think the difference lies in the way they calculate the "mean" and "var".

PyTorch:

The mean and standard-deviation are calculated per-dimension over the mini-batches

source: PyTorch BatchNorm

Thus, they average over samples.

Keras:

axis: Integer, the axis that should be normalized (typically the features axis). For instance, after a Conv2D layer with data_format="channels_first", set axis=1 in BatchNormalization.

source: Keras BatchNorm

So here they seem to average over the features (channels).

Which is the right way? How do I transfer BN weights between the models?

Answer 1:

You can fill the Keras moving_mean and moving_variance from the running_mean and running_var attributes of the PyTorch module:

# PyTorch weight, bias, running_mean, running_var correspond to
# Keras gamma, beta, moving_mean, moving_variance (in that order).
# detach() is needed because weight and bias are Parameters that require grad.

weights = torch_module.weight.detach().cpu().numpy()
bias = torch_module.bias.detach().cpu().numpy()
running_mean = torch_module.running_mean.cpu().numpy()
running_var = torch_module.running_var.cpu().numpy()

keras_module.set_weights([weights, bias, running_mean, running_var])
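
On the question of which axes are averaged: as far as I can tell, both layers keep one mean and one variance per channel, and those statistics are reduced over the batch and spatial axes. The Keras `axis` argument names the channel axis that is kept, not the axis that is averaged away. The NumPy sketch below is only an illustration of that formula, not either library's implementation; `batchnorm_inference` is a hypothetical helper written for this answer. It also shows one likely cause of small output mismatches after copying weights: the default epsilon is 1e-5 in PyTorch's BatchNorm2d and 1e-3 in Keras's BatchNormalization, so it is worth setting them equal when building the Keras layer.

```python
import numpy as np


def batchnorm_inference(x, gamma, beta, mean, var, eps, axis=1):
    """Inference-mode batch norm with per-channel parameters along `axis`.

    Hypothetical helper for illustration; it is not part of PyTorch or Keras.
    """
    # Reshape the 1-D per-channel vectors so they broadcast along `axis`.
    shape = [1] * x.ndim
    shape[axis] = x.shape[axis]
    gamma, beta, mean, var = (p.reshape(shape) for p in (gamma, beta, mean, var))
    return gamma * (x - mean) / np.sqrt(var + eps) + beta


rng = np.random.default_rng(0)
x = rng.normal(size=(4, 3, 5, 5)).astype(np.float32)  # NCHW: 3 channels

# Per-channel statistics: reduce over batch and spatial axes, keep the channels.
mean = x.mean(axis=(0, 2, 3))
var = x.var(axis=(0, 2, 3))
gamma = np.array([1.0, 2.0, 0.5], dtype=np.float32)
beta = np.array([0.0, -1.0, 3.0], dtype=np.float32)

# Same weights, different epsilon (PyTorch default vs. Keras default).
out_eps_torch = batchnorm_inference(x, gamma, beta, mean, var, eps=1e-5)
out_eps_keras = batchnorm_inference(x, gamma, beta, mean, var, eps=1e-3)
max_diff = float(np.abs(out_eps_torch - out_eps_keras).max())

# channels_last layout (the Keras default, axis=-1) gives the same result
# as channels_first once the data is transposed accordingly.
x_nhwc = x.transpose(0, 2, 3, 1)
out_nhwc = batchnorm_inference(x_nhwc, gamma, beta, mean, var, eps=1e-5, axis=-1)

print("stats shape:", mean.shape)
print("max difference caused by epsilon alone:", max_diff)
```

If the outputs still differ after matching epsilon, check that both models are in inference mode (`model.eval()` in PyTorch, `training=False` in Keras) and that the Keras layer's `axis` points at the channel axis of your data layout.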

Source: https://stackoverflow.com/questions/54650587/pytorch-batchnorm-layer-different-from-keras-batchnorm

Tags

python

keras

deep-learning

pytorch

batch-normalization
