Convolutional neural network outputting equal probabilities for all labels

花落未央 2021-01-24 12:17

I am currently training a CNN on MNIST, and the output probabilities (softmax) are giving [0.1,0.1,...,0.1] as training goes on. The initial values aren't uniform, so I can't

1 answer

隐瞒了意图╮ (original poster)

2021-01-24 12:47
There are several issues with your code, including elementary ones. I strongly suggest you first go through the TensorFlow step-by-step tutorials for MNIST, "MNIST For ML Beginners" and "Deep MNIST for Experts".

In short, regarding your code:

First, your final layer fc2 should not have a ReLU activation.
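
To illustrate how a ReLU on the final layer can lead to exactly this symptom, here is a minimal NumPy sketch. The logit values are made up for illustration and `softmax` is a helper defined here, not something from your code. The assumption is that all pre-activations of fc2 end up negative for a sample; ReLU then clips them all to zero, and the softmax of ten equal values is uniform:

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over the last axis."""
    z = z - np.max(z, axis=-1, keepdims=True)
    e = np.exp(z)
    return e / np.sum(e, axis=-1, keepdims=True)

# Hypothetical pre-activations (logits) of the final layer for one sample.
# They are all negative here, which is the assumed failure case.
logits = np.array([-3.2, -0.5, -7.1, -1.8, -4.4, -2.0, -0.9, -6.3, -5.5, -2.7])

with_relu = softmax(np.maximum(logits, 0.0))  # ReLU clips every logit to 0
without_relu = softmax(logits)                # raw logits keep their ordering

print(with_relu)              # every entry is 0.1
print(without_relu.argmax())  # 1, the index of the largest logit
```

Without the ReLU, the differences between the logits survive and softmax can still rank the classes.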

Second, the way you build your batches, i.e.
```
indices = np.random.randint(len(tr_data),size=[200])
```
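is to just grab 200 random samples (with replacement) in each iteration, so nothing guarantees that every training sample is seen once per epoch. The usual approach is to shuffle the data once per epoch and then step through it in consecutive batches.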
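
A more conventional scheme could be sketched as follows. This is only an illustration under assumptions: `iterate_minibatches` is a hypothetical helper, and the toy arrays stand in for your `tr_data` and its labels. It shuffles the indices once per epoch (without replacement) and then walks through them in slices of the batch size:

```python
import numpy as np

def iterate_minibatches(data, labels, batch_size, rng):
    """Yield (data, labels) batches that cover every sample exactly once."""
    order = rng.permutation(len(data))  # shuffle without replacement
    for start in range(0, len(data), batch_size):
        idx = order[start:start + batch_size]
        yield data[idx], labels[idx]

# Toy stand-ins for the training arrays (hypothetical shapes, not the real data).
rng = np.random.default_rng(0)
toy_data = np.arange(1000, dtype=np.float32).reshape(1000, 1)
toy_labels = np.arange(1000)

# One epoch: collect the labels of all batches and check the coverage.
seen = np.concatenate(
    [y for _, y in iterate_minibatches(toy_data, toy_labels, 200, rng)]
)
print(len(seen), len(np.unique(seen)))  # 1000 1000: each sample seen exactly once
```

Calling the generator again at the start of each epoch gives a fresh shuffle.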

Third, the data you feed into the network are not normalized to the range [0, 1], as they should be:
```
np.max(tr_data[0]) # get the max value of your first training sample
# 255.0
```
The third point was initially puzzling for me, too, since the aforementioned TensorFlow tutorials don't seem to normalize the data either. But close inspection revealed the reason: if you import the MNIST data through the TensorFlow-provided utility functions (instead of the scikit-learn ones, as you do here), they come already normalized to [0, 1], something that is nowhere hinted at:
```
from tensorflow.examples.tutorials.mnist import input_data
import tensorflow as tf
import numpy as np

mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)
np.max(mnist.train.images[0])
# 0.99607849
```
This is admittedly a strange design decision. As far as I am aware, in all other similar cases and tutorials, normalizing the input data is an explicit part of the pipeline (see e.g. the Keras example), and with good reason: it is something you will certainly be expected to do yourself later, when using your own data.
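
Doing the normalization yourself is a one-liner. A minimal sketch, assuming 8-bit grayscale pixels with a maximum of 255 (the toy array below is a stand-in for your `tr_data`, not the real data):

```python
import numpy as np

# Toy stand-in for tr_data: 4 flattened 28x28 images with raw values in 0..255.
rng = np.random.default_rng(0)
raw = rng.integers(0, 256, size=(4, 784)).astype(np.float32)

scaled = raw / 255.0  # map [0, 255] to [0, 1]

print(scaled.min() >= 0.0, scaled.max() <= 1.0)  # True True
```

Dividing by the fixed constant 255 rather than by the maximum of each image keeps the scaling identical across samples.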