Convolutional neural network outputting equal probabilities for all labels

花落未央 2021-01-24 12:17

I am currently training a CNN on MNIST, and the output probabilities (softmax) are giving [0.1,0.1,...,0.1] as training goes on. The initial values aren't uniform, so I can't

1 answer
  • 2021-01-24 12:47

    There are several issues with your code, including elementary ones. I strongly suggest you first go through the TensorFlow step-by-step MNIST tutorials, MNIST For ML Beginners and Deep MNIST for Experts.

    In short, regarding your code:

    First, your final layer fc2 should not have a ReLU activation.
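
To illustrate why a ReLU on the output layer can produce exactly this symptom, here is a minimal NumPy sketch. The `raw` values are made-up pre-activations for a hypothetical `fc2`, not taken from the question's code, and this shows one plausible mechanism rather than a confirmed diagnosis: if every pre-activation is negative, ReLU clips all of them to zero and softmax then returns a uniform distribution.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a 1-D array of logits."""
    z = z - np.max(z)          # shift so the largest logit is 0
    e = np.exp(z)
    return e / e.sum()

# Hypothetical pre-activations of the last layer for one sample (all negative).
raw = np.array([-3.2, -0.5, -1.7, -4.0, -2.1, -0.9, -6.3, -1.1, -2.8, -0.2])

# With a ReLU on the final layer, every negative value is clipped to 0 ...
relu_logits = np.maximum(raw, 0.0)
probs_relu = softmax(relu_logits)   # ... so all ten classes get the same probability

# Without the ReLU, the logits keep their ordering and softmax stays informative.
probs_raw = softmax(raw)

print(probs_relu)
print(probs_raw)
```

Once the clipped outputs are all zero, the gradient through the ReLU is zero as well, so the layer has a hard time recovering. Feeding the raw (linear) outputs of the last layer to the softmax avoids this.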

    Second, the way you build your batches, i.e.

    indices = np.random.randint(len(tr_data),size=[200])
    

    just draws 200 random indices (with replacement) in each iteration, which is far from the correct way of doing it: some samples can be picked repeatedly while others are never seen.
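
The answer does not spell out the alternative, so the following is only a sketch of one common convention: shuffle the indices once per epoch and then walk through them in consecutive slices, so that every sample is used exactly once per epoch. The helper name `iterate_minibatches` and the stand-in arrays are assumptions made for illustration; they are not part of the question's code.

```python
import numpy as np

def iterate_minibatches(data, labels, batch_size, rng):
    """Yield (data, labels) batches covering every sample once, in shuffled order."""
    order = rng.permutation(len(data))            # one shuffle per epoch
    for start in range(0, len(data), batch_size):
        idx = order[start:start + batch_size]     # last batch may be smaller
        yield data[idx], labels[idx]

# Stand-in data: 1000 "samples" whose single feature is their own index.
rng = np.random.default_rng(0)
tr_data = np.arange(1000, dtype=np.float32).reshape(1000, 1)
tr_labels = np.arange(1000) % 10

seen = []
for batch_x, batch_y in iterate_minibatches(tr_data, tr_labels, 200, rng):
    seen.append(batch_x[:, 0])
seen = np.concatenate(seen)

# Every sample appears exactly once in the epoch.
print(len(seen), len(np.unique(seen)))
```

Calling the generator again for the next epoch draws a fresh permutation, so the batch composition changes between epochs.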

    Third, the data you feed into the network are not normalized to [0, 1], as they should be:

    np.max(tr_data[0]) # get the max value of your first training sample
    # 255.0
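
A minimal sketch of the missing normalization step, assuming the pixel values lie in the range 0 to 255 as the check above indicates. The small `tr_data` array here is a stand-in, not the real MNIST data.

```python
import numpy as np

# Stand-in for raw image data with pixel values in 0..255.
tr_data = np.array([[0, 128, 255],
                    [64, 255, 0]], dtype=np.uint8)

# Convert to float first, then scale into [0, 1].
tr_data_scaled = tr_data.astype(np.float32) / 255.0

print(tr_data_scaled.min(), tr_data_scaled.max())
```

The same scaling has to be applied to any validation or test data before it is fed to the network.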
    

    The third point initially puzzled me, too, since the aforementioned TensorFlow tutorials don't seem to normalize the data either. But close inspection revealed the reason: if you import the MNIST data through the TensorFlow-provided utility functions (instead of the scikit-learn ones, as you do here), they come already normalized to [0, 1], something that is nowhere hinted at:

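    # Note: tensorflow.examples.tutorials is a TensorFlow 1.x module;
    # it is not part of the TensorFlow 2.x packages.
    from tensorflow.examples.tutorials.mnist import input_data
    import tensorflow as tf
    import numpy as np
    
    mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)
    np.max(mnist.train.images[0])  # max pixel value of the first training image
    # 0.99607849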
    

    This is an admittedly strange design decision. As far as I am aware, in all other similar cases/tutorials normalizing the input data is an explicit part of the pipeline (see e.g. the Keras example), and with good reason: it is something you will certainly be expected to do yourself later, when using your own data.
