Q-learning algorithm doesn't work because scikit-learn SGDRegressor predicts imprecise values


I implemented a Q-learning algorithm that uses a scikit-learn SGDRegressor for function approximation. However, the algorithm is not working properly. My theory at the moment is