I am currently trying to get a decent score (> 40% accuracy) with Keras on CIFAR 100. However, I'm experiencing a weird behaviour of a CNN model: it tends to predict some classes much more often than others.
I don't have a good feeling about this part of the code:
model.add(Dense(1024))
model.add(Activation('tanh'))
model.add(Dropout(0.5))
model.add(Dense(nb_classes))
model.add(Activation('softmax'))
The remaining model is full of relus, but here there is a tanh. tanh saturates at -1 and 1, where its gradient vanishes, which might lead to the two-class over-importance you are seeing.
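As a quick check, here is a sketch of the same classifier head with the tanh swapped for a relu (Dense, Activation, Dropout, and nb_classes as in your code):

model.add(Dense(1024))
model.add(Activation('relu'))  # relu instead of tanh: no saturation at -1/1
model.add(Dropout(0.5))
model.add(Dense(nb_classes))
model.add(Activation('softmax'))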
The Keras CIFAR-10 example basically uses the same architecture (the dense-layer sizes might differ), but it also uses a relu there (no tanh at all). The same goes for this external Keras-based CIFAR-100 code.
If you get good accuracy during training and validation but not when testing, make sure you do exactly the same preprocessing on your dataset in both cases. During training you have:
X_train /= 255
X_val /= 255
X_test /= 255
But there is no such code when predicting for your confusion matrix. Adding the same scaling to the test step:
X_val /= 255.
gives the following nice-looking confusion matrix:
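Concretely, the test-time path should mirror the training preprocessing. A minimal sketch, assuming a trained Sequential model and integer labels y_test (sklearn's confusion_matrix is used here purely for illustration):

from sklearn.metrics import confusion_matrix  # illustration only, not from the original code

# apply exactly the same preprocessing as at training time
X_test = X_test.astype('float32')
X_test /= 255.

y_pred = model.predict_classes(X_test)  # Keras 1.x Sequential API
print(confusion_matrix(y_test, y_pred))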
I don't see you doing mean-centering, even in datagen; I suspect this is the main cause. To do mean-centering with ImageDataGenerator, set featurewise_center = 1 (and call datagen.fit(X_train) so the mean can actually be computed). Another way is to subtract the ImageNet mean from each RGB pixel; the mean vector to subtract is [103.939, 116.779, 123.68].
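A minimal sketch of the first option, assuming the X_train/y_train, model, and nb_epoch from your code (Keras 1.x API):

from keras.preprocessing.image import ImageDataGenerator

# featurewise_center=True makes the generator subtract the dataset mean
datagen = ImageDataGenerator(featurewise_center=True)
datagen.fit(X_train)  # computes the mean over the training set

model.fit_generator(datagen.flow(X_train, y_train, batch_size=32),
                    samples_per_epoch=len(X_train), nb_epoch=nb_epoch)

Note that the test data then has to go through the same fitted generator (e.g. datagen.flow(X_test, shuffle=False)) so that the same mean is subtracted there too.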
Make all activations relus, unless you have a specific reason for the single tanh.
Remove the two dropouts of 0.25 and see what happens. If you want to apply dropout to convolution layers, it is better to use SpatialDropout2D. It has somehow been removed from the online Keras documentation, but you can find it in the source.
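A sketch of how one convolution block might use it in place of a plain Dropout (Keras 1.x layer names; the import path is taken from the source):

from keras.layers import SpatialDropout2D

model.add(Convolution2D(64, 3, 3, border_mode='same'))
model.add(Activation('relu'))
model.add(SpatialDropout2D(0.25))  # drops whole feature maps rather than individual units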
You have two conv layers with same padding and two with valid. There is nothing wrong with this, but it would be simpler to keep all conv layers with same padding and control the feature-map size with max-pooling alone.
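For illustration, a sketch of such a stack where every conv layer uses same padding and only the poolings shrink the feature maps (tf dim ordering assumed, so the CIFAR input is 32x32x3):

from keras.models import Sequential
from keras.layers import Convolution2D, MaxPooling2D, Activation

model = Sequential()
# border_mode='same' keeps the spatial size, so it goes 32 -> 16 -> 8 via pooling only
model.add(Convolution2D(32, 3, 3, border_mode='same', input_shape=(32, 32, 3)))
model.add(Activation('relu'))
model.add(Convolution2D(32, 3, 3, border_mode='same'))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Convolution2D(64, 3, 3, border_mode='same'))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))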
One important part of the problem was that my ~/.keras/keras.json was:
{
    "image_dim_ordering": "th",
    "epsilon": 1e-07,
    "floatx": "float32",
    "backend": "tensorflow"
}
Hence I had to change image_dim_ordering to tf.
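For reference, the corrected ~/.keras/keras.json (only image_dim_ordering changes):

{
    "image_dim_ordering": "tf",
    "epsilon": 1e-07,
    "floatx": "float32",
    "backend": "tensorflow"
}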
This gives an accuracy of 12.73%. Obviously, there is still a problem, as the validation history gave 45.1% accuracy.