Out of curiosity, I compared a stacked LSTM network fed a single time step against an MLP with tanh activations, expecting the two to perform the same. The arc