I'm currently working on a deep Q reinforcement learning model in TF-Agents based on an OpenAI Gym environment. The game played by the agent is a multi-agent card game with 4