I'm having a hard time figuring out why my simple DQN isn't learning correctly. I have a gridworld environment with 12 states as a toy example. I notice two problems: