Test accuracy cannot improve when learning ZFNet on ILSVRC12

Posted by 岁酱吖の on 2019-12-12 03:35:02

Question


I've implemented a home-brewed ZFNet (prototxt) for my research. After 20k iterations with this definition, the test accuracy stays at ~0.001 (i.e., 1/1000, chance level), the test loss at ~6.9, and the training loss at ~6.9, which suggests the net just keeps playing guessing games among the 1,000 classes. I've thoroughly checked the whole definition and tried changing some of the hyper-parameters and restarting training, but to no avail; the same results show up on the screen.

Could anyone shed some light on this? Thanks in advance!


The hyper-parameters in the prototxt are derived from the paper [1]. All the inputs and outputs of the layers seem correct, as Fig. 3 in the paper suggests.

The tweaks are:

  • the crop size of the input for both training and testing is set to 225 instead of 224, as discussed in #33;

  • one-pixel zero paddings for conv3, conv4, and conv5 to make the sizes of the blobs consistent [1];

  • filler types for all learnable layers changed from constant in [1] to gaussian with std: 0.01;

  • weight_decay: changed from 0.0005 to 0.00025, as suggested by @sergeyk in PR #33.

[1] Zeiler, M. and Fergus, R. Visualizing and Understanding Convolutional Networks, ECCV 2014.

And as for the problematic part..., I pasted it here
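For reference, the padding and filler tweaks described above would look roughly like this in a prototxt layer definition (a hypothetical sketch; the layer names, blob names, and output counts are illustrative, not copied from the actual prototxt in question):

```protobuf
layer {
  name: "conv3"
  type: "Convolution"
  bottom: "pool2"
  top: "conv3"
  convolution_param {
    num_output: 384
    kernel_size: 3
    pad: 1                                    # one-pixel zero padding to keep blob sizes consistent
    weight_filler { type: "gaussian" std: 0.01 }  # instead of the constant filler in [1]
    bias_filler { type: "constant" value: 0 }
  }
}
```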


Answer 1:


A few suggestions:

  1. Change the weight initialization from "gaussian" to "xavier".
  2. Use "PReLU" activations instead of "ReLU". Once your net converges, you can fine-tune to remove them.
  3. Try reducing base_lr by an order of magnitude (or even two orders).
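Suggestions 1 and 2 could be sketched in prototxt as follows (a hypothetical fragment, assuming an illustrative first conv layer; adapt the names and shapes to the actual net):

```protobuf
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  convolution_param {
    num_output: 96
    kernel_size: 7
    stride: 2
    weight_filler { type: "xavier" }   # instead of gaussian with std: 0.01
    bias_filler { type: "constant" value: 0 }
  }
}
layer {
  name: "relu1"
  type: "PReLU"                        # instead of "ReLU"; negative slopes are learned
  bottom: "conv1"
  top: "conv1"
}
```

For suggestion 3, the corresponding change lives in the solver prototxt, e.g. lowering `base_lr: 0.01` to `base_lr: 0.001` (the exact values depend on the solver file being used).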


Source: https://stackoverflow.com/questions/39663506/test-accuracy-cannot-improve-when-learning-zfnet-on-ilsvrc12
