python binning data openAI gym

匿名 (未验证) 提交于 2019-12-03 01:36:02

问题:

I am attempting to create a custom environment for reinforcement learning with openAI gym. I need to represent all possible values that the environment will see in a variable called observation_space. There are 3 possible actions for the agent to use called action_space

To be more specific the observation_space is a temperature sensor which will see possible ranges from 50 to 150 degrees and I think I can represent all of this by:

EDIT, I had the action_space numpy array wrong

import numpy as np action_space = np.array([ 0,  1,  2]) observation_space = np.arange(50,150,1) 

Is there a better method that I could use for the observation_space where I could bin the data? IE, make 20 bins 50-55, 55-60, 60-65, etc...

I think what I have will work but seems sort of cumbersome... And I am sure there is a better practice as there is not a lot of wisdom on my end this subject. This will print out a Q table:

action_size = action_space.shape[0] state_size = observation_space.shape[0]  qtable = np.zeros((state_size, action_size)) print(qtable) 

回答1:

This is not really related to programming, so maybe on stats.stackexchange you may get better answers. Anyway, it just depends on how much accuracy you want. I guess you want to change the temperature (increase, decrease, don't change) according to the sensor readings. Is there much different (in terms of optimal action) between 50 and 51? If not, then you can discretize the state space every 2 degrees. And so on.

More generally, doing so you are using what in RL are called "features". A discretization over an interval of the state space is called tile coding and usually works well.

If you are new to RL, I really advise to read this book, or at least Chapters 1,3,4 which are related to what you are doing.



标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!