I am attempting to teach a Double DQN agent to run a gridworld where there is one seeker (the agent) who will try to collect all the hiders which are randomly spawned. Every ste