I have a multi-class sequence labeling problem where the number of time steps varies within samples. To use LSTM with variable-length input, I applied zero padding and maski