Training and Testing accuracy not increasing for a CNN followed by a RNN for signature verification

Posted by 孤街浪徒 on 2020-01-15 05:32:45

Question


I'm currently working on online signature verification. The dataset has a variable shape of (x, 7), where x is the number of points a person used to sign their signature. I have the following model:

    # imports assumed from standalone Keras (use tensorflow.keras equivalents if that is what you run)
    from keras.models import Sequential
    from keras.layers import Conv1D, MaxPooling1D, Masking, LSTM, Dense
    from keras.optimizers import Adam

    model = Sequential()
    #CNN
    model.add(Conv1D(filters=64, kernel_size=3, activation='sigmoid', input_shape=(None, 7)))
    model.add(MaxPooling1D(pool_size=3))
    model.add(Conv1D(filters=64, kernel_size=2, activation='sigmoid'))

    #RNN
    model.add(Masking(mask_value=0.0))
    model.add(LSTM(8))
    model.add(Dense(2, activation='softmax'))

    opt = Adam(lr=0.0001)
    model.compile(optimizer=opt, loss='categorical_crossentropy', metrics=['accuracy'])
    model.summary()

    print(model.fit(x_train, y_train, epochs=100, verbose=2, batch_size=50))

    score, accuracy = model.evaluate(x_test, y_test, verbose=2)
    print(score, accuracy)

I know it may not be the best model, but this is the first time I'm building a neural network. I have to use a CNN and an RNN, as it is required for my honours project. At the moment, I achieve 0.5142 as the highest training accuracy and 0.54 as the testing accuracy. I have tried increasing the number of epochs, changing the activation function, adding more layers, moving the layers around, changing the learning rate and changing the optimizer.

Please share some advice on changing my model or dataset. Any help is much appreciated.


Answer 1:


For CNN-RNN, some promising things to try:

  • Conv1D layers: activation='relu', kernel_initializer='he_normal'
  • LSTM layer: activation='tanh', and recurrent_dropout=.1, .2, .3
  • Optimizer: Nadam, lr=2e-4 (Nadam may significantly outperform all other optimizers for RNNs)
  • batch_size: lower it. Unless you have 200+ batches in total, set batch_size=32; a lower batch size better exploits the stochastic mechanism of the optimizer and can improve generalization
  • Dropout: right after the second Conv1D, with a rate of 0.1 or 0.2 - or after the first Conv1D, with a rate of 0.25 or 0.3, but only if you use SqueezeExcite (see below), else MaxPooling won't work as well
  • SqueezeExcite: shown to enhance CNN performance across a large variety of tasks; a Keras implementation you can use is given below
  • BatchNormalization: while your model isn't large, it's still deep, and may benefit from one BN layer right after the second Conv1D
  • L2 weight decay: on the first Conv1D, to prevent it from memorizing the input; try 1e-5 or 1e-4, e.g. kernel_regularizer=l2(1e-4) # from keras.regularizers import l2
  • Preprocessing: make sure all data is normalized (or standardized if time-series), and batches are shuffled each epoch (a minimal standardization sketch follows this list)
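
For the preprocessing point, here is a minimal sketch, assuming the signatures are held as a Python list of (x, 7) NumPy float arrays; the helper name pad_and_standardize and the zero-padding scheme are illustrative, not part of the original question:

import numpy as np

def pad_and_standardize(signatures, max_len):
    # per-feature statistics from the real points only (padding excluded)
    all_points = np.concatenate(signatures, axis=0)
    mean = all_points.mean(axis=0)
    std = all_points.std(axis=0) + 1e-8

    # zero-pad every signature to a common length after standardizing
    padded = np.zeros((len(signatures), max_len, 7), dtype='float32')
    for i, sig in enumerate(signatures):
        n = min(len(sig), max_len)
        padded[i, :n] = (sig[:n] - mean) / std
    return padded

As for shuffling, Keras' model.fit already reshuffles the training data each epoch by default (shuffle=True), so it mainly matters that you don't turn it off.
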
# Squeeze-and-Excitation block for 1D feature maps
from keras import backend as K
from keras.layers import GlobalAveragePooling1D, Reshape, Dense, multiply

def SqueezeExcite(_input):
    # number of channels in the incoming feature map
    # (K.int_shape replaces the older _input._keras_shape for compatibility)
    filters = K.int_shape(_input)[-1]

    # squeeze: global average pool over the time dimension
    se = GlobalAveragePooling1D()(_input)
    se = Reshape((1, filters))(se)
    # excite: bottleneck (reduction 16) followed by channel-wise sigmoid gates
    se = Dense(filters // 16, activation='relu',
               kernel_initializer='he_normal', use_bias=False)(se)
    se = Dense(filters, activation='sigmoid',
               kernel_initializer='he_normal', use_bias=False)(se)

    # scale the input feature map by the learned channel weights
    return multiply([_input, se])

# Example usage
x = Conv1D(filters=64, kernel_size=4, activation='relu', kernel_initializer='he_normal')(x)
x = SqueezeExcite(x) # place after EACH Conv1D
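
Pulling the suggestions together, here is a minimal sketch of how they might be combined using the functional API; it keeps the question's Conv -> pool -> Conv -> LSTM structure (the Masking layer is left out for brevity), and the hyperparameters are illustrative, not tuned:

from keras.models import Model
from keras.layers import (Input, Conv1D, MaxPooling1D, Dropout,
                          BatchNormalization, LSTM, Dense)
from keras.regularizers import l2
from keras.optimizers import Nadam

inp = Input(shape=(None, 7))

# CNN front-end: relu + he_normal, L2 on the first conv, SqueezeExcite after each conv
x = Conv1D(64, kernel_size=3, activation='relu', kernel_initializer='he_normal',
           kernel_regularizer=l2(1e-4))(inp)
x = SqueezeExcite(x)
x = MaxPooling1D(pool_size=3)(x)
x = Conv1D(64, kernel_size=2, activation='relu', kernel_initializer='he_normal')(x)
x = SqueezeExcite(x)
x = BatchNormalization()(x)
x = Dropout(0.2)(x)

# RNN back-end
x = LSTM(8, activation='tanh', recurrent_dropout=0.2)(x)
out = Dense(2, activation='softmax')(x)

model = Model(inp, out)
model.compile(optimizer=Nadam(lr=2e-4), loss='categorical_crossentropy',
              metrics=['accuracy'])
model.fit(x_train, y_train, epochs=100, batch_size=32, verbose=2)
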


Source: https://stackoverflow.com/questions/58043002/training-and-testing-accuracy-not-increasing-for-a-cnn-followed-by-a-rnn-for-sig
