q-learning | 易学教程

Reward function for learning to play Curve Fever game with DQN

Question: I've made a simple version of Curve Fever, also known as "Achtung Die Kurve". I want the machine to figure out how to play the game optimally. I copied and slightly modified an existing DQN from some Atari game examples built with Google's TensorFlow. I'm trying to figure out an appropriate reward function. Currently, I use this reward setup: 0.1 for every frame in which it does not crash, and -500 for every crash. Is this the right approach? Do I need to tweak the values? Or do I need a completely
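To make the scale of that reward setup concrete, here is a minimal Python sketch. It only encodes the two values quoted in the question; the function names and the discount factor of 0.99 are assumptions for illustration, not taken from the asker's code.

```python
def step_reward(crashed: bool,
                survive_reward: float = 0.1,
                crash_penalty: float = -500.0) -> float:
    """Per-frame reward as described in the question (hypothetical helper)."""
    return crash_penalty if crashed else survive_reward


def discounted_return(rewards, gamma: float = 0.99) -> float:
    """Discounted sum of a reward sequence, accumulated from the last step back."""
    total = 0.0
    for r in reversed(rewards):
        total = r + gamma * total
    return total


def max_survival_return(survive_reward: float = 0.1, gamma: float = 0.99) -> float:
    """Upper bound on the discounted return from surviving forever (geometric series)."""
    return survive_reward / (1.0 - gamma)


if __name__ == "__main__":
    # Surviving 100 frames and then crashing, under the assumed gamma.
    episode = [step_reward(False)] * 100 + [step_reward(True)]
    print(discounted_return(episode))
    print(max_survival_return())
```

Under the assumed discount factor, the best possible survival return is about 10, so a single -500 crash is roughly fifty times larger in magnitude. Whether that ratio helps or hurts training depends on details the excerpt does not show, such as reward clipping or the loss function.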

Low GPU utilisation when running Tensorflow

Question: I've been doing deep reinforcement learning using TensorFlow and OpenAI Gym. My problem is low GPU utilisation. Googling this issue, I understood that it's wrong to expect much GPU utilisation when training small networks (e.g. when training on MNIST). But my neural network is not so small, I think. The architecture is similar to the one given in the original DeepMind paper (more or less). The architecture of my network is summarized below: Convolution layer 1 (filters=32, kernel_size=8x8, strides=4)

How does DQN work in an environment where reward is always -1

Question: Given that the OpenAI Gym environment MountainCar-v0 ALWAYS returns -1.0 as a reward (even when the goal is achieved), I don't understand how DQN with experience replay converges, yet I know it does, because I have working code that proves it. By working, I mean that when I train the agent, it quickly (within 300-500 episodes) learns how to solve the MountainCar problem. Below is an example from my trained agent. It is my understanding that ultimately there needs to be a "sparse reward"
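One way to see why a constant -1 reward can still carry a learning signal is to compare the returns of episodes of different lengths. The sketch below assumes a discount factor of 0.99 and that no value is bootstrapped from the terminal state; the function name is hypothetical, and this is a possible explanation rather than the answer given on the original page.

```python
def constant_reward_return(steps: int,
                           gamma: float = 0.99,
                           reward: float = -1.0) -> float:
    """Discounted return of an episode that pays `reward` on each of `steps` steps
    and then terminates (nothing is added after the terminal state)."""
    return sum(reward * gamma ** t for t in range(steps))


if __name__ == "__main__":
    # An episode that reaches the goal after 100 steps versus one that runs
    # for 200 steps, the step limit of MountainCar-v0.
    print(constant_reward_return(100))
    print(constant_reward_return(200))
```

Because every extra step adds another negative term, the shorter episode has the higher (less negative) return. An agent that maximises return is therefore pushed toward reaching the goal sooner, even though no single step is rewarded differently.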

QLearning network in a custom environment is choosing the same action every time, despite the heavy negative reward

Question: I plugged QLearningDiscreteDense into a dots-and-boxes game I made, and created a custom MDP environment for it. The problem is that it chooses action 0 every time. The first time that works, but after that the action is no longer available, so it's an illegal move. I give illegal moves a reward of Integer.MIN_VALUE, but it doesn't change anything. Here's the MDP class: public class testEnv implements MDP<testState, Integer, DiscreteSpace> { final private int maxStep; DiscreteSpace actionSpace =
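A workaround often used with value-based agents is to restrict action selection to the currently legal moves instead of relying on a very large penalty. The sketch below is in Python rather than the asker's Java, and every name in it is hypothetical; it is not the QLearningDiscreteDense API, only an illustration of the masking idea.

```python
from typing import Sequence


def masked_greedy_action(q_values: Sequence[float],
                         legal_actions: Sequence[int]) -> int:
    """Return the legal action with the highest Q-value.

    q_values[a] is the estimated value of action a; legal_actions lists the
    indices that are still available in the current state.
    """
    if not legal_actions:
        raise ValueError("no legal actions available")
    return max(legal_actions, key=lambda a: q_values[a])


if __name__ == "__main__":
    q = [5.0, 1.0, 3.0]
    # Action 0 has the highest estimate but is no longer available.
    print(masked_greedy_action(q, legal_actions=[1, 2]))
```

With masking, an illegal move can never be selected, so no penalty is needed for it. A reward as extreme as Integer.MIN_VALUE may also swamp every other training target, though the excerpt does not show enough of the setup to confirm that this is the cause here.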

incompatible array types are mixed in the forward input (LinearFunction) in machine learning

Source: https://stackoverflow.com/questions/64794996/incompatible-array-types-are-mixed-in-the-forward-input-linearfunction-in-mach

Deep Q Learning WITHOUT OpenAI Gym

Source: https://stackoverflow.com/questions/61526437/deep-q-learning-without-openai-gym