Can tf.agent policy return probability vector for all actions?
问题 I am trying to train a Reinforcement Learning agent using TF-Agent TF-Agent DQN Tutorial. In my application, I have 1 action containing 9 possible discrete values (labeled from 0 to 8). Below is the output from env.action_spec() BoundedTensorSpec(shape=(), dtype=tf.int64, name='action', minimum=array(0, dtype=int64), maximum=array(8, dtype=int64)) I would like to get the probability vector contains all actions calculated by the trained policy, and do further processing in other application