I am wondering how best to feed the changes my DQN agent makes to its environment back to the agent itself.
I have a battery model whereby an agent can observe a time-series