In reinforcement learning, what is the difference between policy iteration and value iteration?
As much as I understand, in value iteration, you use t
The basic difference is this:
In policy iteration, you start with an arbitrary policy and compute the value function corresponding to it (policy evaluation), then derive a new, improved policy that is greedy with respect to that value function (policy improvement), and repeat. This process converges to the optimal policy.
In value iteration, you start with an arbitrary value function and repeatedly improve it in an iterative process until it converges to the optimal value function; you then derive the optimal policy from that optimal value function.
Policy iteration works on the principle "policy evaluation -> policy improvement".
Value iteration works on the principle "optimal value function -> optimal policy".
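The two loops above can be sketched side by side. Below is a minimal illustration on a made-up 2-state, 2-action MDP (the transition matrix `P`, reward matrix `R`, and discount `gamma` are arbitrary values chosen for the example, not from the question): value iteration sweeps the Bellman optimality backup until the values converge, while policy iteration alternates exact policy evaluation (a linear solve) with greedy improvement. Both recover the same optimal policy.

```python
import numpy as np

# A tiny hypothetical MDP, invented for illustration.
# P[s, a, s'] = transition probability, R[s, a] = expected immediate reward.
P = np.array([
    [[0.9, 0.1], [0.2, 0.8]],   # transitions from state 0 under actions 0, 1
    [[0.0, 1.0], [0.5, 0.5]],   # transitions from state 1 under actions 0, 1
])
R = np.array([
    [1.0, 0.0],
    [0.0, 2.0],
])
gamma = 0.9
n_states, n_actions = R.shape

def value_iteration(tol=1e-8):
    # Repeatedly apply the Bellman optimality backup until the value
    # function stops changing, then read off the greedy (optimal) policy.
    V = np.zeros(n_states)
    while True:
        Q = R + gamma * P @ V          # Q[s, a] = R[s, a] + gamma * sum_s' P[s,a,s'] V[s']
        V_new = Q.max(axis=1)
        if np.abs(V_new - V).max() < tol:
            return V_new, Q.argmax(axis=1)
        V = V_new

def policy_iteration():
    # Alternate policy evaluation (solve the linear system for V^pi exactly)
    # with greedy policy improvement, until the policy stops changing.
    policy = np.zeros(n_states, dtype=int)
    while True:
        P_pi = P[np.arange(n_states), policy]   # transitions under the current policy
        R_pi = R[np.arange(n_states), policy]   # rewards under the current policy
        V = np.linalg.solve(np.eye(n_states) - gamma * P_pi, R_pi)  # policy evaluation
        new_policy = (R + gamma * P @ V).argmax(axis=1)             # policy improvement
        if np.array_equal(new_policy, policy):
            return V, policy
        policy = new_policy

V_vi, pi_vi = value_iteration()
V_pi, pi_pi = policy_iteration()
print(pi_vi, pi_pi)   # both methods yield the same optimal policy
```

Note the structural difference this makes concrete: policy iteration does expensive but exact evaluation of each intermediate policy, so it typically needs few outer iterations, while value iteration never evaluates a policy exactly and instead makes many cheap backup sweeps.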