I am using an LSTM network to train my machine translation model. Please help me understand how the choice of maximum sequence length affects the training process and the final