Question
I'm currently working on online signature verification. The dataset has a variable shape of (x, 7) where x is the number of points a person used to sign their signature. I have the following model:
from keras.models import Sequential
from keras.layers import Conv1D, MaxPooling1D, Masking, LSTM, Dense
from keras.optimizers import Adam

model = Sequential()
# CNN
model.add(Conv1D(filters=64, kernel_size=3, activation='sigmoid', input_shape=(None, 7)))
model.add(MaxPooling1D(pool_size=3))
model.add(Conv1D(filters=64, kernel_size=2, activation='sigmoid'))
# RNN
model.add(Masking(mask_value=0.0))
model.add(LSTM(8))
model.add(Dense(2, activation='softmax'))

opt = Adam(lr=0.0001)
model.compile(optimizer=opt, loss='categorical_crossentropy', metrics=['accuracy'])
model.summary()

model.fit(x_train, y_train, epochs=100, verbose=2, batch_size=50)
score, accuracy = model.evaluate(x_test, y_test, verbose=2)
print(score, accuracy)
I know it may not be the best model, but this is the first time I'm building a neural network. I have to use a CNN and an RNN, as that is required for my honours project. At the moment, the highest training accuracy I reach is 0.5142, and the testing accuracy is 0.54. I have tried increasing the number of epochs, changing the activation function, adding more layers, moving the layers around, changing the learning rate and changing the optimizer.
Please share some advice on changing my model or dataset. Any help is much appreciated.
Answer 1:
For CNN-RNN, some promising things to try:
- Conv1D layers: activation='relu', kernel_initializer='he_normal'
- LSTM layer: activation='tanh', and recurrent_dropout=.1, .2, or .3
- Optimizer: Nadam, lr=2e-4 (Nadam may significantly outperform all other optimizers for RNNs)
- batch_size: lower it. Unless you have 200+ batches in total, set batch_size=32; a lower batch size better exploits the stochastic mechanism of the optimizer and can improve generalization
- Dropout: right after the second Conv1D, with a rate of .1 or .2; alternatively after the first Conv1D, with a rate of .25 or .3, but only if you use SqueezeExcite (see below), else MaxPooling won't work as well
- SqueezeExcite: shown to enhance CNN performance across a large variety of tasks; a Keras implementation you can use is given below
- BatchNormalization: while your model isn't large, it's still deep, and may benefit from one BN layer right after the second Conv1D
- L2 weight decay: on the first Conv1D, to prevent it from memorizing the input; try 1e-5 or 1e-4, e.g. kernel_regularizer=l2(1e-4) # from keras.regularizers import l2
- Preprocessing: make sure all data is normalized (or standardized, if it is a time series), and that batches are shuffled each epoch; a short sketch follows this list
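A small sketch of the preprocessing idea, under the assumption that the raw training data is a Python list called signatures holding one (x_i, 7) float array per signature (the name is hypothetical, not from the question): standardize each of the 7 features with training-set statistics, then zero-pad to a common length so a Masking(mask_value=0.0) layer can skip the padding.

import numpy as np
from keras.preprocessing.sequence import pad_sequences

# `signatures`: hypothetical list of (x_i, 7) float arrays, one per training signature
train_stack = np.concatenate(signatures, axis=0)        # every recorded point, shape (N, 7)
mean = train_stack.mean(axis=0)
std = train_stack.std(axis=0) + 1e-8                    # avoid division by zero

standardized = [(s - mean) / std for s in signatures]   # per-feature standardization
x_train = pad_sequences(standardized, dtype='float32',
                        padding='post', value=0.0)       # zero padding, to be masked later

# Keras' fit() already reshuffles the training samples every epoch (shuffle=True by default).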
# Squeeze-and-Excitation block for 1D feature maps
from keras.layers import GlobalAveragePooling1D, Reshape, Dense, multiply

def SqueezeExcite(_input):
    filters = _input._keras_shape[-1]  # channel count; on tf.keras use _input.shape[-1]
    se = GlobalAveragePooling1D()(_input)
    se = Reshape((1, filters))(se)
    se = Dense(filters // 16, activation='relu',
               kernel_initializer='he_normal', use_bias=False)(se)
    se = Dense(filters, activation='sigmoid',
               kernel_initializer='he_normal', use_bias=False)(se)
    return multiply([_input, se])

# Example usage (functional API): place after EACH Conv1D
x = Conv1D(filters=64, kernel_size=4, activation='relu', kernel_initializer='he_normal')(x)
x = SqueezeExcite(x)
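Putting several of the suggestions together, a minimal sketch of the resulting model might look like this; it is an illustration rather than the original poster's code, it uses the functional API because SqueezeExcite is a functional-style block, and the rates, sizes and learning rate are just the starting values mentioned above.

from keras.models import Model
from keras.layers import (Input, Conv1D, MaxPooling1D, Dropout,
                          BatchNormalization, Masking, LSTM, Dense)
from keras.optimizers import Nadam
from keras.regularizers import l2

inp = Input(shape=(None, 7))                        # variable-length signatures, 7 features
x = Conv1D(64, 3, activation='relu', kernel_initializer='he_normal',
           kernel_regularizer=l2(1e-4))(inp)        # L2 weight decay on the first Conv1D
x = SqueezeExcite(x)                                # after each Conv1D
x = MaxPooling1D(pool_size=3)(x)
x = Conv1D(64, 2, activation='relu', kernel_initializer='he_normal')(x)
x = SqueezeExcite(x)
x = BatchNormalization()(x)                         # one BN layer after the second Conv1D
x = Dropout(0.2)(x)                                 # dropout after the second Conv1D
x = Masking(mask_value=0.0)(x)                      # as in the question; conv/BN outputs of padded steps are usually nonzero, so this may mask little
x = LSTM(8, activation='tanh', recurrent_dropout=0.2)(x)
out = Dense(2, activation='softmax')(x)

model = Model(inp, out)
model.compile(optimizer=Nadam(lr=2e-4),
              loss='categorical_crossentropy', metrics=['accuracy'])
model.fit(x_train, y_train, epochs=100, batch_size=32, verbose=2)  # fit() shuffles each epoch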
Source: https://stackoverflow.com/questions/58043002/training-and-testing-accuracy-not-increasing-for-a-cnn-followed-by-a-rnn-for-sig