I want to use an LSTM to predict Y (size = [batch_size, time_length=1, feature_dim=3]) from X (size = [batch_size, time_length=42, feature_dim=12]). The problem is using data (