What is a policy in reinforcement learning? [closed]


悲哀的现实 2021-01-30 07:23

3 answers

余生分开走 (OP)

2021-01-30 07:39
Here is a succinct answer: a policy is the 'thinking' of the agent. It is the mapping that answers the question: when the agent is in some state s, which action a should it take now? You can think of a policy as a lookup table:
```
state----action----probability/'goodness' of taking the action
  1         1                     0.6
  1         2                     0.4
  2         1                     0.3
  2         2                     0.7
```
Assuming a greedy strategy (always pick the highest-valued action), you'd pick action 1 in state 1 (0.6 > 0.4) and action 2 in state 2 (0.7 > 0.3).
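To make the lookup-table picture concrete, here is a minimal Python sketch. The names `policy`, `greedy_action`, and `sample_action` are made up for illustration and the numbers are just the ones from the table above. The sampling variant assumes you read the last column as probabilities rather than as a 'goodness' score.

```python
import random

# The lookup table from above: policy[state][action] = probability / 'goodness'.
policy = {
    1: {1: 0.6, 2: 0.4},
    2: {1: 0.3, 2: 0.7},
}

def greedy_action(state):
    """Return the action with the highest value in the given state."""
    scores = policy[state]
    return max(scores, key=scores.get)

def sample_action(state, rng=random):
    """Assumption: the values are probabilities, so draw an action from them."""
    actions = list(policy[state])
    weights = list(policy[state].values())
    return rng.choices(actions, weights=weights, k=1)[0]

print(greedy_action(1))  # 1
print(greedy_action(2))  # 2
```

`greedy_action` is the deterministic reading of the table; `sample_action` would give a stochastic policy that picks action 1 in state 1 about 60% of the time.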