I have a QA system model implemented in Torch. The dataset has 880 questions, and each question has a single-word answer. When I run the program with batch_size=1, it w