I am currently trying to get a decent score (> 40% accuracy) with Keras on CIFAR-100. However, I'm seeing some weird behaviour from the CNN model: it tends to predict a couple of classes far more often than the others.
I don't have a good feeling about this part of the code:
model.add(Dense(1024))          # fully-connected layer
model.add(Activation('tanh'))   # tanh activation
model.add(Dropout(0.5))
model.add(Dense(nb_classes))    # nb_classes = 100 for CIFAR-100
model.add(Activation('softmax'))
The rest of the model is full of relus, but here there is a tanh.
tanh saturates at -1 and 1, so its gradients can vanish or explode, which might be what leads to the over-prediction of those two classes you are seeing.
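As a rough sketch (not your exact model; the convolutional stack, optimizer and layer sizes are only assumptions, with nb_classes taken as 100 for CIFAR-100 and Keras 2 layer names), the same dense head with the tanh swapped for a relu could look like this:

from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten
from keras.layers import Dense, Dropout, Activation

nb_classes = 100  # CIFAR-100

model = Sequential()
# placeholder convolutional stack (assumed, replace with your own layers)
model.add(Conv2D(32, (3, 3), padding='same', input_shape=(32, 32, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Flatten())

# dense head from your snippet, with relu instead of tanh
model.add(Dense(1024))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(nb_classes))
model.add(Activation('softmax'))

model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])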
The Keras CIFAR-10 example uses basically the same architecture (the dense-layer sizes might differ), but it also uses a relu there (no tanh at all). The same goes for this external Keras-based CIFAR-100 code.