I am trying to solve a control problem with DDPG. The problem is simple enough that I can run value function iteration on its discretized version, and thus I have the "p