Why does my CIFAR 100 CNN model mainly predict two classes?

不思量自难忘° 2021-01-17 00:06

I am currently trying to get a decent score (> 40% accuracy) with Keras on CIFAR-100. However, I'm seeing strange behaviour from my CNN model: it tends to predict some c

4 answers
  •  南笙 (original poster)
    2021-01-17 00:47

    I have doubts about this part of the code:

    model.add(Dense(1024))
    model.add(Activation('tanh'))
    model.add(Dropout(0.5))
    model.add(Dense(nb_classes))
    model.add(Activation('softmax'))
    

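    The rest of the model uses ReLU activations throughout, but this layer uses tanh.

    tanh saturates at -1 and 1 for inputs of large magnitude, and in those regions its gradient is close to zero, so the layer can effectively stop learning. That might be why two classes dominate your predictions.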
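
    To make the saturation point concrete, here is a minimal sketch in plain Python (no Keras needed) that compares the derivative of tanh with the derivative of ReLU at a few positive inputs. The helper names `tanh_grad` and `relu_grad` are made up for this illustration and are not part of any library.

```python
import math


def tanh_grad(x):
    """Derivative of tanh at x: 1 - tanh(x)^2."""
    return 1.0 - math.tanh(x) ** 2


def relu_grad(x):
    """Derivative of ReLU at x: 1 for positive inputs, 0 otherwise."""
    return 1.0 if x > 0 else 0.0


# As the pre-activation grows, the tanh gradient shrinks towards zero,
# while the ReLU gradient stays at 1 for any positive input.
for x in (0.5, 2.0, 5.0, 10.0):
    print(f"x={x:5.1f}  tanh'={tanh_grad(x):.2e}  relu'={relu_grad(x):.1f}")
```

    Whether the dense layer in your network actually ends up in the saturated region depends on the weights and the scale of the incoming features, so treat this as an illustration of the mechanism rather than a diagnosis.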

    The Keras CIFAR-10 example uses essentially the same architecture (the dense-layer sizes may differ), but it uses a ReLU there as well, with no tanh anywhere. The same goes for this external Keras-based CIFAR-100 code.
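
    As a sketch of the change I would try first, this is the same classifier head with the tanh swapped for a ReLU. It assumes the `tensorflow.keras` import path. The function name `build_classifier_head` and the `input_dim=512` default are placeholders of mine, since the convolutional part of your model is not shown here. In your own code you would only change the one `Activation` line.

```python
def build_classifier_head(nb_classes=100, input_dim=512):
    """Dense head from the question, with 'tanh' replaced by 'relu'.

    input_dim is a hypothetical size for the flattened conv features.
    """
    # Imported inside the function so the module loads without Keras installed.
    from tensorflow.keras.models import Sequential
    from tensorflow.keras.layers import Input, Dense, Activation, Dropout

    model = Sequential()
    model.add(Input(shape=(input_dim,)))
    model.add(Dense(1024))
    model.add(Activation('relu'))      # was Activation('tanh')
    model.add(Dropout(0.5))
    model.add(Dense(nb_classes))
    model.add(Activation('softmax'))   # class probabilities
    return model
```

    If the predictions still collapse onto a couple of classes after this change, the activation was probably not the only cause.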
