What is a policy in reinforcement learning? [closed]


悲哀的现实 2021-01-30 07:23

3 answers

伪装坚强ぢ (OP)

2021-01-30 07:54

In plain words, in the simplest case, a policy π is a function that takes a state s as input and returns an action a. That is: π(s) → a

The agent uses the policy to decide which action a to perform when it is in a given state s.
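
As a minimal sketch of the deterministic case, a policy can be written as a plain lookup from states to actions. The state and action names below (`low_battery`, `recharge`, and so on) are hypothetical and chosen only for illustration; they do not come from any particular environment or library.

```python
from typing import Dict

# Hypothetical toy states and actions, used only to illustrate pi(s) -> a.
POLICY_TABLE: Dict[str, str] = {
    "low_battery": "recharge",
    "high_battery": "search",
}


def policy(state: str) -> str:
    """Deterministic policy: every state maps to exactly one action."""
    return POLICY_TABLE[state]


if __name__ == "__main__":
    # The agent asks the policy what to do in its current state.
    print(policy("low_battery"))
```

In practice the table is often replaced by a learned function (for example a neural network), but the interface stays the same: state in, action out.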

Sometimes the policy is stochastic instead of deterministic. In that case, instead of returning a single action a, the policy returns a probability distribution over the available actions, usually written π(a | s), and the agent samples its action from that distribution.
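
A stochastic policy can be sketched the same way, assuming a small discrete action set. Here each state maps to a dictionary of action probabilities, and the agent draws one action from it. The states, actions, and probabilities are made up for illustration.

```python
import random
from typing import Dict

# Hypothetical toy distributions: for each state, probabilities over actions.
STOCHASTIC_POLICY: Dict[str, Dict[str, float]] = {
    "low_battery": {"recharge": 0.9, "search": 0.1},
    "high_battery": {"recharge": 0.2, "search": 0.8},
}


def action_distribution(state: str) -> Dict[str, float]:
    """Return the probability of each action in the given state."""
    return STOCHASTIC_POLICY[state]


def sample_action(state: str, rng: random.Random) -> str:
    """Draw one action according to the policy's distribution for this state."""
    dist = action_distribution(state)
    actions = list(dist.keys())
    weights = list(dist.values())
    return rng.choices(actions, weights=weights, k=1)[0]


if __name__ == "__main__":
    rng = random.Random(0)  # seeded only to make the demo repeatable
    print(action_distribution("low_battery"))
    print(sample_action("low_battery", rng))
```

A deterministic policy is the special case in which one action has probability 1 in every state.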

In general, the goal of an RL algorithm is to learn an optimal policy, that is, one that achieves a specific goal, usually expressed as maximizing the expected cumulative reward.
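
One common way to make "optimal" precise, assuming the usual discounted-return setting, is shown below. The reward r and discount factor γ are not mentioned in the answer above; they are standard notation introduced here as an assumption.

```latex
\pi^{*} \;=\; \arg\max_{\pi}\; \mathbb{E}_{\pi}\!\left[\, \sum_{t=0}^{\infty} \gamma^{t}\, r_{t+1} \right],
\qquad 0 \le \gamma < 1
```

Here the expectation is taken over the trajectories generated when the agent picks its actions according to π.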
