Question
I trained a regression network using ResNet50 as the backbone. The input is an image of size 224*224*3, and the output is a single value ranging from 0 to 1.
However, the network does not converge, no matter whether I use sigmoid or ReLU as the output layer's activation, or MAE or MSE as the loss function.
For example, with ResNet50 as the backbone, MAE as the loss function, sigmoid as the output layer's activation, and SGD as the optimizer, the training loss looks like:
Epoch 1 training loss is 0.4900, val_loss is 0.4797
Epoch 2 training loss is 0.4923, val_loss is 0.4794
Epoch 3 training loss is 0.4923, val_loss is 0.4783
...
Epoch 35 training loss is 0.4923, val_loss is 0.4771
The training loss does not change; it stays constant at 0.4923, and the val_loss is always around 0.47. I have tried different optimizers and learning rates, but the network still does not converge.
When I use VGG16 or MobileNet as the backbone, the network converges. Could anyone give me some suggestions on how to fix this problem?
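For reference, a minimal sketch of the setup described in the question. The post never names the framework, so tf.keras, the learning rate, and the variable names here are assumptions, not the asker's actual code:

```python
# Minimal sketch of the described setup: ResNet50 backbone, 224x224x3 input,
# single sigmoid output, MAE loss, SGD optimizer. The framework (tf.keras) and
# the learning rate are assumptions, not taken from the post.
import tensorflow as tf
from tensorflow.keras import layers, models, optimizers

backbone = tf.keras.applications.ResNet50(
    include_top=False,
    weights="imagenet",
    input_shape=(224, 224, 3),
    pooling="avg",          # global average pooling after the last conv block
)

# Single regression output in [0, 1] via sigmoid, as described in the question.
output = layers.Dense(1, activation="sigmoid")(backbone.output)
model = models.Model(inputs=backbone.input, outputs=output)

model.compile(optimizer=optimizers.SGD(learning_rate=1e-3), loss="mae")
# model.fit(train_images, train_targets, validation_data=(val_images, val_targets))
```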
Answer 1:
Can you somehow validate that the ResNet50 backbone is correctly implemented? Maybe try training it on MNIST and see if it works in general.
It kinda seems to me that the ResNet variant just outputs some mean value instead of learning the actual problem.
Can you give some more information on what you want to achieve, what your regression looks like, and what input the backbone expects? You might also want to look at similar work (if it exists) and see which architectures and hyperparameters they used.
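One possible way to run the MNIST sanity check suggested above, again assuming tf.keras. Resizing to 32x32 and repeating the grayscale channel are workarounds for ResNet50's input requirements; the subset size, optimizer, and epoch count are arbitrary choices for a quick test:

```python
# Hypothetical sanity check: train the same ResNet50 backbone as a classifier
# on MNIST and verify that it can learn at all. MNIST images are resized to
# 32x32 and repeated to 3 channels because ResNet50 expects >=32x32 RGB input.
import tensorflow as tf
from tensorflow.keras import layers, models

(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train, y_train = x_train[:10000], y_train[:10000]    # small subset is enough
x_test, y_test = x_test[:2000], y_test[:2000]

def prepare(x):
    x = x[..., None].astype("float32") / 255.0          # add channel dim, scale to [0, 1]
    x = tf.image.resize(x, (32, 32))                     # ResNet50 minimum spatial size
    return tf.repeat(x, 3, axis=-1)                      # grayscale -> 3 channels

x_train, x_test = prepare(x_train), prepare(x_test)

backbone = tf.keras.applications.ResNet50(
    include_top=False, weights=None, input_shape=(32, 32, 3), pooling="avg"
)
model = models.Model(
    inputs=backbone.input,
    outputs=layers.Dense(10, activation="softmax")(backbone.output),
)
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# If accuracy climbs well above chance within a few epochs, the backbone itself
# is probably fine and the problem lies in the regression setup or the data.
model.fit(x_train, y_train, validation_data=(x_test, y_test),
          epochs=3, batch_size=128)
```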
Source: https://stackoverflow.com/questions/59656204/resnet50-does-not-converge-vgg16-works-fine