Q-Learning values get too high


忘掉有多难 2021-01-06 12:30

I've recently made an attempt to implement a basic Q-Learning algorithm in Golang. Note that I'm new to Reinforcement Learning and AI in general, so the error may very wel

2 Answers

不思量自难忘° (original poster)

2021-01-06 12:43

If I've understood correctly, your Q-learning update rule uses both the current reward and the previous reward. However, the Q-learning rule uses only one reward (x are states and u are actions):
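
In this notation the standard Q-learning update can be written as follows, where α (learning rate) and γ (discount factor) are symbol names assumed here rather than taken from the question:

$$
Q(x_t, u_t) \leftarrow Q(x_t, u_t) + \alpha \left[ r_{t+1} + \gamma \max_{u} Q(x_{t+1}, u) - Q(x_t, u_t) \right]
$$

Only the single reward r_{t+1} observed after taking action u_t in state x_t appears in the update.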

On the other hand, you are assuming that the current reward is the same as the Qmax value, which is not true. So you are probably misunderstanding the Q-learning algorithm.
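
A minimal Go sketch of one tabular Q-learning update may make the distinction concrete. It is not the code from the question: `QTable`, `maxQ`, `update`, `alpha` and `gamma` are hypothetical names chosen for illustration, and the table layout (one row per state, one column per action) is an assumption. The point it shows is that the reward `r` and the maximum Q value of the next state are two separate terms.

```go
package main

import "fmt"

// QTable holds one row per state and one column per action.
// Hypothetical layout, assumed for this illustration only.
type QTable [][]float64

// maxQ returns the largest action value stored for state s.
// It assumes every state has at least one action.
func maxQ(q QTable, s int) float64 {
	best := q[s][0]
	for _, v := range q[s][1:] {
		if v > best {
			best = v
		}
	}
	return best
}

// update applies a single Q-learning step for the transition
// (s, a) -> (r, next). Only one reward is used, and it is added to
// the discounted maximum of the next state rather than replacing it.
func update(q QTable, s, a int, r float64, next int, alpha, gamma float64) {
	target := r + gamma*maxQ(q, next)
	q[s][a] += alpha * (target - q[s][a])
}

func main() {
	// Two states with two actions each; state 1 already has estimates.
	q := QTable{{0, 0}, {1, 3}}

	// Take action 1 in state 0, receive reward 2, land in state 1.
	update(q, 0, 1, 2, 1, 0.5, 0.5)

	fmt.Println(q[0][1]) // prints 1.75
}
```

Here the target is 2 + 0.5 * 3 = 3.5, and the stored value moves halfway from 0 toward it, giving 1.75. Treating the reward itself as the Qmax term, or adding a second reward, changes the target and can make the values drift upward.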
