I am using GRU and LSTM networks to analyse a network traffic dataset, and I often find that the GRU takes longer to train than the LSTM.
Is this normal?
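
For context, the usual expectation is the opposite: a GRU layer has three weight blocks (update gate, reset gate, candidate state) where an LSTM has four (input, forget and output gates plus the cell candidate), so for the same sizes it has about 25% fewer parameters. Below is a minimal sketch of that count, assuming the textbook formulation with one bias vector per block. Real frameworks may store extra bias vectors, so their reported totals can differ slightly. The function names and the sizes used are hypothetical, chosen only for illustration.

```python
def lstm_param_count(input_size: int, hidden_size: int) -> int:
    """Trainable parameters in one LSTM layer (textbook form, one bias per block)."""
    # Each block has an input weight matrix, a recurrent weight matrix and a bias.
    per_block = hidden_size * (input_size + hidden_size) + hidden_size
    # Four blocks: input gate, forget gate, output gate, cell candidate.
    return 4 * per_block


def gru_param_count(input_size: int, hidden_size: int) -> int:
    """Trainable parameters in one GRU layer (textbook form, one bias per block)."""
    per_block = hidden_size * (input_size + hidden_size) + hidden_size
    # Three blocks: update gate, reset gate, candidate state.
    return 3 * per_block


if __name__ == "__main__":
    # Hypothetical sizes, not taken from my actual dataset.
    input_size, hidden_size = 32, 64
    lstm = lstm_param_count(input_size, hidden_size)
    gru = gru_param_count(input_size, hidden_size)
    print(f"LSTM parameters: {lstm}")
    print(f"GRU parameters:  {gru}")
    print(f"GRU / LSTM ratio: {gru / lstm:.2f}")
```

A smaller parameter count does not guarantee a shorter wall-clock training time. That also depends on which kernels the framework selects, the batch size and sequence length, and how many epochs each model needs to converge, so it is worth timing both under identical settings.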