Training and Testing accuracy not increasing for a CNN followed by a RNN for signature verification

Posted by 孤街浪徒 on 2020-01-15 05:32:45

Question


I'm currently working on online signature verification. The dataset has a variable shape of (x, 7), where x is the number of points a person used to sign their signature. I have the following model:

    # imports assumed from standalone Keras (use tensorflow.keras equivalents if that is what you run)
    from keras.models import Sequential
    from keras.layers import Conv1D, MaxPooling1D, Masking, LSTM, Dense
    from keras.optimizers import Adam

    model = Sequential()
    #CNN
    model.add(Conv1D(filters=64, kernel_size=3, activation='sigmoid', input_shape=(None, 7)))
    model.add(MaxPooling1D(pool_size=3))
    model.add(Conv1D(filters=64, kernel_size=2, activation='sigmoid'))

    #RNN
    model.add(Masking(mask_value=0.0))
    model.add(LSTM(8))
    model.add(Dense(2, activation='softmax'))

    opt = Adam(lr=0.0001)
    model.compile(optimizer=opt, loss='categorical_crossentropy', metrics=['accuracy'])
    model.summary()

    print(model.fit(x_train, y_train, epochs=100, verbose=2, batch_size=50))

    score, accuracy = model.evaluate(x_test, y_test, verbose=2)
    print(score, accuracy)

I know it may not be the best model, but this is the first time I'm building a neural network. I have to use a CNN and an RNN, as it is required for my honours project. At the moment, I achieve 0.5142 as the highest training accuracy and 0.54 as the testing accuracy. I have tried increasing the number of epochs, changing the activation function, adding more layers, moving the layers around, changing the learning rate and changing the optimizer.

Please share some advice on changing my model or dataset. Any help is much appreciated.


Answer 1:


For CNN-RNN, some promising things to try:

  • Conv1D layers: activation='relu', kernel_initializer='he_normal'
  • LSTM layer: activation='tanh', and recurrent_dropout=.1, .2, .3
  • Optimizer: Nadam, lr=2e-4 (Nadam may significantly outperform all other optimizers for RNNs)
  • batch_size: lower it. Unless you have 200+ batches in total, set batch_size=32; a lower batch size better exploits the stochastic mechanism of the optimizer and can improve generalization
  • Dropout: right after the second Conv1D, with a rate of 0.1 or 0.2 - or after the first Conv1D, with a rate of 0.25 or 0.3, but only if you use SqueezeExcite (see below), else MaxPooling won't work as well
  • SqueezeExcite: shown to enhance CNN performance across a large variety of tasks; a Keras implementation you can use is given below
  • BatchNormalization: while your model isn't large, it's still deep, and may benefit from one BN layer right after the second Conv1D
  • L2 weight decay: on the first Conv1D, to prevent it from memorizing the input; try 1e-5 or 1e-4, e.g. kernel_regularizer=l2(1e-4) # from keras.regularizers import l2
  • Preprocessing: make sure all data is normalized (or standardized if time-series), and batches are shuffled each epoch (a minimal standardization sketch follows this list)
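
For the preprocessing point, here is a minimal sketch, assuming the signatures are held as a Python list of (x, 7) NumPy float arrays; the helper name pad_and_standardize and the zero-padding scheme are illustrative, not part of the original question:

import numpy as np

def pad_and_standardize(signatures, max_len):
    # per-feature statistics from the real points only (padding excluded)
    all_points = np.concatenate(signatures, axis=0)
    mean = all_points.mean(axis=0)
    std = all_points.std(axis=0) + 1e-8

    # zero-pad every signature to a common length after standardizing
    padded = np.zeros((len(signatures), max_len, 7), dtype='float32')
    for i, sig in enumerate(signatures):
        n = min(len(sig), max_len)
        padded[i, :n] = (sig[:n] - mean) / std
    return padded

As for shuffling, Keras' model.fit already reshuffles the training data each epoch by default (shuffle=True), so it mainly matters that you don't turn it off.
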
# Squeeze-and-Excitation block for 1D feature maps
from keras import backend as K
from keras.layers import GlobalAveragePooling1D, Reshape, Dense, multiply

def SqueezeExcite(_input):
    # number of channels in the incoming feature map
    # (K.int_shape replaces the older _input._keras_shape for compatibility)
    filters = K.int_shape(_input)[-1]

    # squeeze: global average pool over the time dimension
    se = GlobalAveragePooling1D()(_input)
    se = Reshape((1, filters))(se)
    # excite: bottleneck (reduction 16) followed by channel-wise sigmoid gates
    se = Dense(filters // 16, activation='relu',
               kernel_initializer='he_normal', use_bias=False)(se)
    se = Dense(filters, activation='sigmoid',
               kernel_initializer='he_normal', use_bias=False)(se)

    # scale the input feature map by the learned channel weights
    return multiply([_input, se])

# Example usage
x = Conv1D(filters=64, kernel_size=4, activation='relu', kernel_initializer='he_normal')(x)
x = SqueezeExcite(x) # place after EACH Conv1D
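
Pulling the suggestions together, here is a minimal sketch of how they might be combined using the functional API; it keeps the question's Conv -> pool -> Conv -> LSTM structure (the Masking layer is left out for brevity), and the hyperparameters are illustrative, not tuned:

from keras.models import Model
from keras.layers import (Input, Conv1D, MaxPooling1D, Dropout,
                          BatchNormalization, LSTM, Dense)
from keras.regularizers import l2
from keras.optimizers import Nadam

inp = Input(shape=(None, 7))

# CNN front-end: relu + he_normal, L2 on the first conv, SqueezeExcite after each conv
x = Conv1D(64, kernel_size=3, activation='relu', kernel_initializer='he_normal',
           kernel_regularizer=l2(1e-4))(inp)
x = SqueezeExcite(x)
x = MaxPooling1D(pool_size=3)(x)
x = Conv1D(64, kernel_size=2, activation='relu', kernel_initializer='he_normal')(x)
x = SqueezeExcite(x)
x = BatchNormalization()(x)
x = Dropout(0.2)(x)

# RNN back-end
x = LSTM(8, activation='tanh', recurrent_dropout=0.2)(x)
out = Dense(2, activation='softmax')(x)

model = Model(inp, out)
model.compile(optimizer=Nadam(lr=2e-4), loss='categorical_crossentropy',
              metrics=['accuracy'])
model.fit(x_train, y_train, epochs=100, batch_size=32, verbose=2)
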


Source: https://stackoverflow.com/questions/58043002/training-and-testing-accuracy-not-increasing-for-a-cnn-followed-by-a-rnn-for-sig
