reinforcement-learning | 易学教程

Are Q-learning and SARSA with greedy selection equivalent?

Question: The difference between Q-learning and SARSA is that Q-learning compares the current state with the best possible next state, whereas SARSA compares the current state with the actual next state. If a greedy selection policy is used, that is, the action with the highest action value is selected 100% of the time, are SARSA and Q-learning then identical?

Answer 1: Well, not actually. A key difference between SARSA and Q-learning is that SARSA is an on-policy algorithm (it follows the policy that is
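The answer above is cut off, so the following is only a minimal sketch of one commonly cited reason the two algorithms can still diverge under greedy selection, not a reconstruction of the answer. It assumes the textbook tabular update rules; the table values, `alpha`, `gamma`, and the helper names are all hypothetical. When the next action is greedy, both targets are numerically equal, but SARSA commits to its next action before the update, while Q-learning chooses its next action after the update. If the agent transitions back into the same state, the update can change which action is greedy.

```python
def q_learning_target(Q, reward, next_state, gamma):
    """Off-policy target: bootstrap from the best action in the next state."""
    return reward + gamma * max(Q[next_state])


def sarsa_target(Q, reward, next_state, next_action, gamma):
    """On-policy target: bootstrap from the action that will actually be taken."""
    return reward + gamma * Q[next_state][next_action]


def greedy(Q, state):
    """Index of the highest-valued action (ties go to the lowest index)."""
    values = Q[state]
    return max(range(len(values)), key=lambda a: values[a])


gamma = 0.9

# Hypothetical table: 2 states x 2 actions. With a greedy next action the
# two targets coincide.
Q = [[0.0, 1.0], [0.5, 0.2]]
next_action = greedy(Q, 1)
targets_equal = (q_learning_target(Q, 1.0, 1, gamma)
                 == sarsa_target(Q, 1.0, 1, next_action, gamma))

# Hypothetical self-transition: action 1 in state 0 leads back to state 0.
Q2 = [[0.0, 1.0]]
alpha, r = 0.5, -5.0
s, a, s_next = 0, 1, 0

a_next_sarsa = greedy(Q2, s_next)  # SARSA picks its next action before the update
Q2[s][a] += alpha * (sarsa_target(Q2, r, s_next, a_next_sarsa, gamma) - Q2[s][a])
a_next_q = greedy(Q2, s_next)      # Q-learning picks its next action after the update

print(targets_equal, a_next_sarsa, a_next_q)
```

In this sketch the value update itself is the same for both algorithms; what differs is the action taken on the following step.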

TypeError: len is not well defined for symbolic Tensors. (activation_3/Identity:0) Please call `x.shape` rather than `len(x)` for shape information

Question: I am trying to implement a DQL model for one of the OpenAI Gym games, but it gives me the following error:

    TypeError: len is not well defined for symbolic Tensors. (activation_3/Identity:0) Please call `x.shape` rather than `len(x)` for shape information.

Creating a gym environment:

    ENV_NAME = 'CartPole-v0'
    env = gym.make(ENV_NAME)
    np.random.seed(123)
    env.seed(123)
    nb_actions = env.action_space.n

My model looks like this:

    model = Sequential()
    model.add(Flatten(input_shape=(1,) + env.observation_space

Reinforcement Learning doesn't work for this VERY EASY game, why? Q Learning

Question: I programmed a very easy game that works the following way: given a 4x4 field of squares, a player can move up, right, down or left. Moving onto a square the agent has never visited before gives a reward of 1. Stepping on the "dead-field" is rewarded with -5, and then the game is reset. Moving onto a field that was already visited is rewarded with -1. Moving onto the "win-field" (there is exactly one) gives a reward of 5, and the game is reset as well. Now I want an AI to learn to play that
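The rules in the question can be sketched as a small environment class. This is a minimal illustration, not the asker's code: the class name `GridGame`, the start, dead-field and win-field positions, the single dead-field, and the behaviour at the grid edge are all assumptions, since the excerpt does not specify them.

```python
class GridGame:
    """4x4 grid with the reward scheme described in the question."""

    MOVES = {"up": (-1, 0), "right": (0, 1), "down": (1, 0), "left": (0, -1)}

    def __init__(self, start=(0, 0), dead=(1, 1), win=(3, 3), size=4):
        # start, dead and win are hypothetical positions chosen for illustration.
        self.start, self.dead, self.win, self.size = start, dead, win, size
        self.reset()

    def reset(self):
        """Put the agent back on the start square and forget visited squares."""
        self.pos = self.start
        self.visited = {self.start}
        return self.pos

    def step(self, action):
        """Apply one move and return (position, reward, done)."""
        dr, dc = self.MOVES[action]
        # Assumption: a move off the grid leaves the agent on the edge square.
        row = min(max(self.pos[0] + dr, 0), self.size - 1)
        col = min(max(self.pos[1] + dc, 0), self.size - 1)
        self.pos = (row, col)
        if self.pos == self.dead:
            self.reset()
            return self.pos, -5, True
        if self.pos == self.win:
            self.reset()
            return self.pos, 5, True
        if self.pos in self.visited:
            return self.pos, -1, False
        self.visited.add(self.pos)
        return self.pos, 1, False


env = GridGame()
first = env.step("right")   # new square
second = env.step("left")   # back onto the start square, already visited
third = env.step("down")    # new square
fourth = env.step("right")  # the hypothetical dead-field at (1, 1)
```

Note that in this sketch the reward depends on the set of visited squares as well as on the agent's position.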

How to accumulate and apply gradients for Async n-step DQNetwork update in Tensorflow?

Question: I am trying to implement Asynchronous Methods for Deep Reinforcement Learning, and one of the steps requires accumulating the gradient over several steps and then applying it. What is the best way to achieve this in TensorFlow? I have got as far as accumulating the gradient, but I don't think it is the fastest way to do it (lots of transfers from TensorFlow to Python and back). Any suggestions are welcome. This is my code for a toy NN. It does not model or compute anything, it just exercises the
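The accumulate-then-apply pattern the question describes can be sketched without any framework. The example below is plain Python, not TensorFlow, and does not address the transfer overhead the asker mentions; the one-parameter model, the function names, and the learning rate are all hypothetical. It only illustrates that the parameter stays fixed while gradients from several steps are summed, and that a single update is applied at the end.

```python
def grad(w, x, y):
    """Gradient of the squared error 0.5 * (w * x - y) ** 2 with respect to w."""
    return (w * x - y) * x


def n_step_update(w, batch, lr):
    """Sum the gradient over every step in `batch`, then apply it once."""
    accumulated = 0.0
    for x, y in batch:
        accumulated += grad(w, x, y)  # w is held fixed while accumulating
    return w - lr * accumulated


# Two hypothetical (input, target) steps, applied as one update.
w = n_step_update(0.0, [(1.0, 2.0), (2.0, 4.0)], lr=0.1)
print(w)
```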

Is it possible to modify OpenAI environments?

Question: There are some things that I would like to modify in the OpenAI environments. With the CartPole example we can edit things that are in the class init function, but with environments that use Box2D it doesn't seem to be as straightforward. For example, consider the BipedalWalker environment. In this case, how would I edit things like the SPEED_HIP or SPEED_KNEE variables? Thanks

Answer 1: Yes, you can modify or create new environments in gym. The simplest (but not recommended) way is to

Pygame and Open AI implementation

Question: My classmate and I have decided to try to implement an AI agent in our own game. My friend has done most of the code, based on previous projects, and I was wondering how PyGame and OpenAI would work together. I have tried to do some research but can't really find any useful information about this specific topic. Some have said that it is hard to implement, but some say it works. Either way, I'd like your opinion on this project and how you'd approach it if it were you. The game is very
