> In the past few months I've been learning a lot about neural networks with TensorFlow and Keras, and I wanted to try to make a model for the CIFAR-10 dataset (code below).
You haven't included how you prepare the data; here's one addition that made this network learn much better:
```python
# Convert pixel values to floats and scale them from [0, 255] to [0, 1]
x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255
x_test /= 255
```
If you do data normalization like that, your network is fine: it hits ~65-70% test accuracy after 5 epochs, which is a good result. Note that 5 epochs is just a start; it would need around 30-50 epochs to really learn the data well and show a result close to the state of the art.
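A side note on the epoch count: rather than hard-coding 30-50 epochs, one common Keras pattern (my addition, not part of the original fix) is to set a generous epoch budget and stop once the validation metric stalls. A minimal sketch, reusing `model`, `x_train`, etc. from the final code below:

```python
from keras.callbacks import EarlyStopping

# Stop when validation accuracy hasn't improved for 5 epochs.
# Note: newer Keras/tf.keras versions name this metric 'val_accuracy'.
early_stop = EarlyStopping(monitor='val_acc', patience=5)

model.fit(x_train, y_train,
          batch_size=500,
          epochs=50,  # generous upper bound; training may stop sooner
          verbose=1,
          validation_data=(x_test, y_test),
          callbacks=[early_stop])
```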
Below are some minor improvements that I noticed that can get you extra performance points:

- The `he_normal` initializer is better than `glorot_uniform` (which is the default in `Conv2D`).
- I changed the numbers of filters (256 -> 64 and 128 -> 256) and the accuracy improved.
- Dropout: 0.5 -> 0.4.
- A `3x3` kernel is more common than `2x2`; I think you should try it for the second conv layer as well. In fact, you can play with all hyper-parameters to find the best combination (a small search sketch is included after the results below).

Here's the final code:
```python
from keras.datasets import cifar10
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout
from keras.optimizers import Adam
from keras.utils import to_categorical

# Load the CIFAR-10 images and labels
(x_train, y_train), (x_test, y_test) = cifar10.load_data()
print('x_train shape:', x_train.shape)
print(x_train.shape[0], 'train samples')
print(x_test.shape[0], 'test samples')

# One-hot encode the 10 class labels
y_train = to_categorical(y_train, 10)
y_test = to_categorical(y_test, 10)

model = Sequential()
model.add(Conv2D(filters=64,
                 kernel_size=(3, 3),
                 activation='relu',
                 kernel_initializer='he_normal',
                 input_shape=(32, 32, 3)))
model.add(MaxPooling2D((2, 2)))
model.add(Conv2D(filters=256,
                 kernel_size=(2, 2),
                 kernel_initializer='he_normal',
                 activation='relu'))
model.add(MaxPooling2D((2, 2)))
model.add(Flatten())
model.add(Dense(1024, activation='relu'))
model.add(Dropout(0.4))
model.add(Dense(10, activation='softmax'))

model.compile(optimizer=Adam(),
              loss='categorical_crossentropy',
              metrics=['accuracy'])

# Scale pixel values from [0, 255] to [0, 1]
x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255
x_test /= 255

model.fit(x_train, y_train,
          batch_size=500,
          epochs=5,
          verbose=1,
          validation_data=(x_test, y_test))

loss, accuracy = model.evaluate(x_test, y_test)
print('loss: ', loss, '\naccuracy: ', accuracy)
```
The result after 5 epochs:

```
loss: 0.822134458447
accuracy: 0.7126
```
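As a concrete way to "play with all hyper-parameters", here is a minimal sketch (my addition, not part of the original code) of a grid loop over a few of the choices discussed above. The `build_model` helper and the value grids are hypothetical; it reuses `x_train`, `y_train`, `x_test`, `y_test` and the imports from the final code:

```python
from itertools import product

# Hypothetical helper: builds the model above with the given settings
def build_model(filters2, kernel2, dropout_rate):
    m = Sequential()
    m.add(Conv2D(64, (3, 3), activation='relu',
                 kernel_initializer='he_normal',
                 input_shape=(32, 32, 3)))
    m.add(MaxPooling2D((2, 2)))
    m.add(Conv2D(filters2, kernel2,
                 kernel_initializer='he_normal',
                 activation='relu'))
    m.add(MaxPooling2D((2, 2)))
    m.add(Flatten())
    m.add(Dense(1024, activation='relu'))
    m.add(Dropout(dropout_rate))
    m.add(Dense(10, activation='softmax'))
    m.compile(optimizer=Adam(),
              loss='categorical_crossentropy',
              metrics=['accuracy'])
    return m

best = None
for filters2, kernel2, rate in product([128, 256], [(2, 2), (3, 3)], [0.4, 0.5]):
    m = build_model(filters2, kernel2, rate)
    m.fit(x_train, y_train, batch_size=500, epochs=5, verbose=0,
          validation_data=(x_test, y_test))
    _, acc = m.evaluate(x_test, y_test, verbose=0)
    print(filters2, kernel2, rate, '->', acc)
    if best is None or acc > best[0]:
        best = (acc, filters2, kernel2, rate)
print('best:', best)
```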
By the way, you might be interested to compare your approach with the Keras example CIFAR-10 convnet.
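One thing that example adds on top of a network like this is data augmentation via `ImageDataGenerator`. A minimal sketch of that idea (the specific shift/flip values here are my illustrative guesses, not necessarily the example's), again reusing the model and data from the final code:

```python
from keras.preprocessing.image import ImageDataGenerator

# Randomly shift and flip training images so the network sees
# slightly different variants of each image every epoch
datagen = ImageDataGenerator(width_shift_range=0.1,
                             height_shift_range=0.1,
                             horizontal_flip=True)

model.fit_generator(datagen.flow(x_train, y_train, batch_size=500),
                    steps_per_epoch=len(x_train) // 500,
                    epochs=30,
                    validation_data=(x_test, y_test))
```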