Question:

I am attempting to create a custom environment for reinforcement learning with OpenAI Gym. I need to represent all possible values that the environment will see in a variable called observation_space. There are 3 possible actions for the agent to use, stored in a variable called action_space.

To be more specific, the observation_space is a temperature sensor that will see values ranging from 50 to 150 degrees, and I think I can represent all of this by:

EDIT: I had the action_space NumPy array wrong.

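    import numpy as np

    # Three discrete actions, indexed 0, 1, 2
    action_space = np.array([0, 1, 2])
    # One state per degree; np.arange excludes the stop value, so this covers 50..149
    observation_space = np.arange(50, 150, 1)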

Is there a better method that I could use for the observation_space where I could bin the data? I.e., make 20 bins: 50-55, 55-60, 60-65, etc.

I think what I have will work, but it seems sort of cumbersome, and I am sure there is a better practice, as I do not have much experience with this subject. This will print out a Q table:

    action_size = action_space.shape[0]
    state_size = observation_space.shape[0]

    qtable = np.zeros((state_size, action_size))
    print(qtable)

Answer 1:

This is not really a programming question, so you may get better answers on stats.stackexchange. Anyway, it depends on how much accuracy you want. I guess you want to change the temperature (increase, decrease, or leave unchanged) according to the sensor readings. Is there much difference, in terms of the optimal action, between 50 and 51? If not, you can discretize the state space every 2 degrees, and so on for coarser bins.
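As a rough illustration of that kind of discretization, here is a minimal sketch using `np.digitize`, assuming the 50 to 150 degree range and the 5-degree bins mentioned in the question. The names `bin_edges` and `temperature_to_state` are hypothetical and not part of Gym.

```python
import numpy as np

# Assumed values taken from the question: sensor range 50-150, 5-degree bins.
LOW, HIGH, BIN_WIDTH = 50, 150, 5

# Interior bin edges 55, 60, ..., 145: 19 edges split the range into 20 bins.
bin_edges = np.arange(LOW + BIN_WIDTH, HIGH, BIN_WIDTH)


def temperature_to_state(temperature):
    """Map a raw sensor reading to a bin index in [0, 19] (hypothetical helper)."""
    # Readings outside the expected range are clipped into the first or last bin.
    clipped = np.clip(temperature, LOW, HIGH)
    return int(np.digitize(clipped, bin_edges))


n_states = len(bin_edges) + 1   # 20 bins
n_actions = 3                   # the three actions from the question
qtable = np.zeros((n_states, n_actions))

# Look up the Q-values for a reading of 72.3 degrees (falls in the 70-75 bin).
state = temperature_to_state(72.3)
print(state, qtable[state])
```

Changing `BIN_WIDTH` is the only thing needed to trade accuracy for table size. Inside a Gym environment the two spaces would probably be declared with `gym.spaces.Discrete` rather than raw arrays, but that part is left out here to keep the sketch dependent on NumPy only.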

More generally, by doing so you are using what in RL are called "features". Discretizing an interval of the state space like this is a simple form of tile coding (a single tiling, also known as state aggregation) and usually works well.
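Tile coding in the fuller sense uses several overlapping tilings, each shifted by a fraction of a tile width, so that nearby temperatures share some but not all active tiles. The following is only a sketch under assumed settings (4 tilings, 5-degree tiles, a linear weight table). `active_tiles` and `q_values` are hypothetical helpers, not a library API.

```python
import numpy as np

LOW, HIGH = 50.0, 150.0   # sensor range from the question
N_TILINGS = 4             # assumed number of overlapping tilings
TILE_WIDTH = 5.0          # assumed tile width in degrees
N_ACTIONS = 3

# Each tiling needs one extra tile so it still covers the range once shifted.
TILES_PER_TILING = int(np.ceil((HIGH - LOW) / TILE_WIDTH)) + 1

# Tiling k is shifted by k / N_TILINGS of a tile width.
offsets = np.arange(N_TILINGS) * TILE_WIDTH / N_TILINGS


def active_tiles(temperature):
    """Return one active tile index per tiling (hypothetical helper)."""
    x = np.clip(temperature, LOW, HIGH)
    tile_in_tiling = np.floor((x - LOW + offsets) / TILE_WIDTH).astype(int)
    # Give every tiling its own block of indices in the weight table.
    return tile_in_tiling + np.arange(N_TILINGS) * TILES_PER_TILING


# One weight per (tile, action); Q(s, a) is the sum over the active tiles.
weights = np.zeros((N_TILINGS * TILES_PER_TILING, N_ACTIONS))


def q_values(temperature):
    """Approximate Q-values for all actions at a given temperature."""
    return weights[active_tiles(temperature)].sum(axis=0)


print(active_tiles(50.0), active_tiles(52.0))
```

With a single tiling and no offset this reduces to the plain binning described above, which may well be enough for a one-dimensional temperature reading.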

If you are new to RL, I really recommend reading this book, or at least Chapters 1, 3, and 4, which are related to what you are doing.

Source: python binning data openAI gym

Tags

gym

python