I\'m trying to implement the recurrent neural network architecture proposed in this paper (https://arxiv.org/abs/1611.03824), where the authors use a LSTM to minimize a blac