CNN train accuracy gets better during training, but test accuracy stays around 40%

轻奢々 asked on 2021-01-03 11:32

Over the past few months I've been learning a lot about neural networks with TensorFlow and Keras, so I wanted to try building a model for the CIFAR-10 dataset (code below).

1 Answer
  • 2021-01-03 11:33

    You haven't included how you prepare the data; here's one addition that made this network learn much better:

    x_train = x_train.astype('float32')
    x_test = x_test.astype('float32')
    x_train /= 255
    x_test /= 255
    

    If you do data normalization like that, then your network is fine: it hits ~65-70% test accuracy after 5 epochs, which is a good result. Note that 5 epochs is just a start; it would need around 30-50 epochs to really learn the data well and show a result close to state of the art.
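
    If you do train longer, an early-stopping callback is a convenient way to pick the number of epochs automatically. A minimal sketch (the patience value is just a starting point to tune, and restore_best_weights needs a reasonably recent Keras):

    from keras.callbacks import EarlyStopping

    # stop once validation loss hasn't improved for 5 epochs and
    # roll back to the best weights seen so far
    early_stop = EarlyStopping(monitor='val_loss',
                               patience=5,
                               restore_best_weights=True)

    model.fit(x_train, y_train,
              batch_size=500,
              epochs=50,
              verbose=1,
              validation_data=(x_test, y_test),
              callbacks=[early_stop])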

    Below are some minor improvements I noticed that can get you a few extra points of accuracy:

    • Since you're using a ReLU-based network, the he_normal initializer works better than glorot_uniform (the default in Conv2D).
    • It is strange to decrease the number of filters as you go deeper in the network. You should do the opposite. I changed 256 -> 64 and 128 -> 256, and the accuracy improved.
    • I decreased the dropout slightly, 0.5 -> 0.4.
    • Kernel size 3x3 is more common than 2x2. I think you should try it for the second conv layer as well. In fact, you can play with all the hyper-parameters to find the best combination; a brute-force sketch follows this list.
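
    To illustrate that last point, here's a minimal brute-force search over two of the hyper-parameters. The candidate values and the 5000-image validation split are just assumptions for the sketch; a real search would also vary the learning rate, filter counts, and so on:

    from keras.models import Sequential
    from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout

    def build_model(kernel_size, dropout):
        # same architecture as the final code below, with the second
        # kernel size and the dropout rate left configurable
        model = Sequential()
        model.add(Conv2D(64, (3, 3), activation='relu',
                         kernel_initializer='he_normal',
                         input_shape=(32, 32, 3)))
        model.add(MaxPooling2D((2, 2)))
        model.add(Conv2D(256, kernel_size, activation='relu',
                         kernel_initializer='he_normal'))
        model.add(MaxPooling2D((2, 2)))
        model.add(Flatten())
        model.add(Dense(1024, activation='relu'))
        model.add(Dropout(dropout))
        model.add(Dense(10, activation='softmax'))
        model.compile(optimizer='adam', loss='categorical_crossentropy',
                      metrics=['accuracy'])
        return model

    # hold out the last 5000 training images so the test set
    # isn't used for model selection
    x_tr, y_tr = x_train[:-5000], y_train[:-5000]
    x_val, y_val = x_train[-5000:], y_train[-5000:]

    best_acc, best_params = 0.0, None
    for kernel_size in [(2, 2), (3, 3)]:
        for dropout in [0.3, 0.4, 0.5]:
            model = build_model(kernel_size, dropout)
            model.fit(x_tr, y_tr, batch_size=500, epochs=5, verbose=0)
            _, acc = model.evaluate(x_val, y_val, verbose=0)
            if acc > best_acc:
                best_acc, best_params = acc, (kernel_size, dropout)
    print('best (kernel_size, dropout):', best_params, 'val accuracy:', best_acc)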

    Here's the final code:

    from keras.datasets import cifar10
    from keras.models import Sequential
    from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout
    from keras.optimizers import Adam
    from keras.utils import to_categorical

    (x_train, y_train), (x_test, y_test) = cifar10.load_data()
    print('x_train shape:', x_train.shape)
    print(x_train.shape[0], 'train samples')
    print(x_test.shape[0], 'test samples')
    
    y_train = to_categorical(y_train, 10)
    y_test = to_categorical(y_test, 10)
    
    model = Sequential()
    model.add(Conv2D(filters=64,
                     kernel_size=(3, 3),
                     activation='relu',
                     kernel_initializer='he_normal',
                     input_shape=(32, 32, 3)))
    model.add(MaxPooling2D((2, 2)))
    model.add(Conv2D(filters=256,
                     kernel_size=(2, 2),
                     kernel_initializer='he_normal',
                     activation='relu'))
    model.add(MaxPooling2D((2, 2)))
    model.add(Flatten())
    model.add(Dense(1024, activation='relu'))
    model.add(Dropout(0.4))
    model.add(Dense(10, activation='softmax'))
    
    model.compile(optimizer=Adam(),
                  loss='categorical_crossentropy',
                  metrics=['accuracy'])
    
    # scale pixel values from [0, 255] to [0, 1]
    x_train = x_train.astype('float32')
    x_test = x_test.astype('float32')
    x_train /= 255
    x_test /= 255
    
    model.fit(x_train, y_train,
              batch_size=500,
              epochs=5,
              verbose=1,
              validation_data=(x_test, y_test))
    
    loss, accuracy = model.evaluate(x_test, y_test)
    print('loss: ', loss, '\naccuracy: ', accuracy)
    

    The result after 5 epochs:

    loss:  0.822134458447 
    accuracy:  0.7126
    

    By the way, you might be interested in comparing your approach with the Keras example CIFAR-10 convnet.
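
    That example also trains with data augmentation, which typically adds a few points of test accuracy on CIFAR-10. A minimal sketch with ImageDataGenerator, using shift and flip settings similar to that example's (treat them as starting points):

    from keras.preprocessing.image import ImageDataGenerator

    # randomly shift images by up to 10% and flip them horizontally
    datagen = ImageDataGenerator(width_shift_range=0.1,
                                 height_shift_range=0.1,
                                 horizontal_flip=True)

    # train on batches streamed from the augmenting generator
    model.fit_generator(datagen.flow(x_train, y_train, batch_size=500),
                        steps_per_epoch=len(x_train) // 500,
                        epochs=30,
                        validation_data=(x_test, y_test))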
