What is the difference between value iteration and policy iteration?

陌清茗 2021-01-29 17:44

In reinforcement learning, what is the difference between policy iteration and value iteration?

As much as I understand, in value iteration, you use t

5 answers
  •  悲哀的现实
    2021-01-29 18:16

    The basic difference is this:

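    In policy iteration, you start with an arbitrary policy and compute the value function corresponding to it (policy evaluation). You then derive a new, improved policy by acting greedily with respect to that value function (policy improvement), and repeat. For a finite MDP, the loop ends when the policy stops changing, at which point it is optimal.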
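The loop described above can be sketched roughly as follows. This is a minimal illustration, not a reference implementation: the two-state MDP `P`, the discount `GAMMA`, and the helper names `q_value`, `evaluate_policy`, and `policy_iteration` are all made up for this example.

```python
# Hypothetical toy MDP (an assumption for illustration):
# P[state][action] = [(probability, next_state, reward), ...]
# Action 0 means "stay", action 1 means "move to the other state".
P = {
    0: {0: [(1.0, 0, 0.0)], 1: [(1.0, 1, 1.0)]},
    1: {0: [(1.0, 1, 2.0)], 1: [(1.0, 0, 0.0)]},
}
GAMMA = 0.9  # discount factor, chosen arbitrarily


def q_value(P, V, s, a, gamma):
    """Expected return of taking action a in state s, then following V."""
    return sum(p * (r + gamma * V[s2]) for p, s2, r in P[s][a])


def evaluate_policy(P, policy, gamma, tol=1e-10):
    """Policy evaluation: sweep until the values of the fixed policy settle."""
    V = {s: 0.0 for s in P}
    while True:
        delta = 0.0
        for s in P:
            v_new = q_value(P, V, s, policy[s], gamma)
            delta = max(delta, abs(v_new - V[s]))
            V[s] = v_new
        if delta < tol:
            return V


def policy_iteration(P, gamma):
    # Start from an arbitrary policy: the lowest-numbered action everywhere.
    policy = {s: min(P[s]) for s in P}
    while True:
        V = evaluate_policy(P, policy, gamma)          # policy evaluation
        improved = {                                    # policy improvement
            s: max(P[s], key=lambda a: q_value(P, V, s, a, gamma))
            for s in P
        }
        if improved == policy:                          # policy is stable
            return policy, V
        policy = improved


policy, V = policy_iteration(P, GAMMA)
```

On this toy MDP the loop settles on "move" in state 0 and "stay" in state 1, with values close to 19 and 20. Note that every outer step contains a full inner evaluation loop; that nesting is what sets policy iteration apart.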

    In value iteration, you start with an arbitrary value function and repeatedly improve it in an iterative process until it converges to the optimal value function. You then derive the optimal policy by acting greedily with respect to that optimal value function.
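As a rough sketch of that process, under the assumption of a small tabular MDP: the two-state MDP `P`, the discount `GAMMA`, and the helper names `q_value` and `value_iteration` below are invented for illustration only.

```python
# Hypothetical toy MDP (an assumption for illustration):
# P[state][action] = [(probability, next_state, reward), ...]
# Action 0 means "stay", action 1 means "move to the other state".
P = {
    0: {0: [(1.0, 0, 0.0)], 1: [(1.0, 1, 1.0)]},
    1: {0: [(1.0, 1, 2.0)], 1: [(1.0, 0, 0.0)]},
}
GAMMA = 0.9  # discount factor, chosen arbitrarily


def q_value(P, V, s, a, gamma):
    """Expected return of taking action a in state s, then following V."""
    return sum(p * (r + gamma * V[s2]) for p, s2, r in P[s][a])


def value_iteration(P, gamma, tol=1e-10):
    # Start from an arbitrary value function (all zeros here).
    V = {s: 0.0 for s in P}
    while True:
        delta = 0.0
        for s in P:
            # Bellman optimality update: back up the best action's value.
            v_new = max(q_value(P, V, s, a, gamma) for a in P[s])
            delta = max(delta, abs(v_new - V[s]))
            V[s] = v_new
        if delta < tol:
            break
    # Only now is a policy extracted, greedily from the converged values.
    policy = {
        s: max(P[s], key=lambda a: q_value(P, V, s, a, gamma))
        for s in P
    }
    return policy, V


policy, V = value_iteration(P, GAMMA)
```

There is no explicit policy anywhere inside the loop; the `max` over actions folds the improvement step into each update. On this toy MDP the result is "move" in state 0 and "stay" in state 1, with values close to 19 and 20.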

    Policy iteration works on the principle of "policy evaluation -> policy improvement", repeated until the policy is stable.

    Value iteration works on the principle of "optimal value function -> optimal policy".
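Written as update rules, the two principles look roughly like this. The notation is an assumption, not something from the question: $P(s' \mid s, a)$ is the transition probability, $R(s, a, s')$ the reward, and $\gamma$ the discount factor.

```latex
% Policy iteration: evaluate the current policy \pi, then improve it.
V^{\pi}_{k+1}(s) = \sum_{s'} P(s' \mid s, \pi(s)) \left[ R(s, \pi(s), s') + \gamma V^{\pi}_{k}(s') \right]

\pi'(s) = \arg\max_{a} \sum_{s'} P(s' \mid s, a) \left[ R(s, a, s') + \gamma V^{\pi}(s') \right]

% Value iteration: a single update that takes the max over actions directly.
V_{k+1}(s) = \max_{a} \sum_{s'} P(s' \mid s, a) \left[ R(s, a, s') + \gamma V_{k}(s') \right]
```

The only structural difference is where the maximization sits: policy iteration applies it in a separate improvement step after evaluating a fixed policy, while value iteration applies it inside every value update.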
