Reinforcement Learning With Variable Actions

后端未结


All the reinforcement learning algorithms I've read about are usually applied to a single agent that has a fixed number of actions. Are there any reinforcement learning algorit


3 Answers

面向向阳花

2021-02-05 21:02

What you describe is nothing unusual. Reinforcement learning is a way of estimating the value function of a Markov Decision Process, and from it a good policy. In an MDP, every state can have its own set of available actions. To apply reinforcement learning, you first have to define clearly what the states, actions, and rewards are in your problem.
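To make the point about per-state action sets concrete, here is a minimal sketch of tabular Q-learning where each state lists its own actions. The toy MDP (`ACTIONS`, `step`), the state and action names, and the hyperparameters are all made up for illustration; they are not from the question.

```python
import random
from collections import defaultdict

# Hypothetical toy MDP: every state has its own list of available actions.
ACTIONS = {
    "start": ["left", "right"],
    "hall": ["forward", "back", "wait"],
    "goal": [],  # terminal state, no actions
}


def step(state, action):
    """Hypothetical deterministic transition: returns (next_state, reward)."""
    if state == "start":
        return ("hall", 0.0) if action == "right" else ("start", 0.0)
    if state == "hall":
        if action == "forward":
            return ("goal", 1.0)
        if action == "back":
            return ("start", 0.0)
        return ("hall", 0.0)
    raise ValueError("no actions in a terminal state")


def q_learning(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    q = defaultdict(float)  # Q-values keyed by (state, action)
    for _ in range(episodes):
        state = "start"
        for _ in range(50):  # cap on episode length
            actions = ACTIONS[state]
            if not actions:  # terminal state reached
                break
            if rng.random() < epsilon:
                action = rng.choice(actions)  # explore
            else:
                action = max(actions, key=lambda a: q[(state, a)])  # exploit
            next_state, reward = step(state, action)
            # The max runs only over the actions available in next_state.
            best_next = max(
                (q[(next_state, a)] for a in ACTIONS[next_state]), default=0.0
            )
            q[(state, action)] += alpha * (
                reward + gamma * best_next - q[(state, action)]
            )
            state = next_state
    return q


def greedy_action(q, state):
    """Best action among those available in the given state."""
    return max(ACTIONS[state], key=lambda a: q[(state, a)])
```

The only change from the fixed-action textbook version is that both the action choice and the `max` in the update iterate over `ACTIONS[state]` rather than a global action list.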

一生所求

2021-02-05 21:02
If you have a number of actions for each soldier that are available or not depending on some conditions, then you can still model this as selection from a fixed set of actions. For example:
- Create a "utility value" for each of the full set of actions for each soldier
- Choose the highest valued action, ignoring those actions that are not available at a given time
If you have multiple possible targets, the same principle applies, except that your utility function now takes the target as an additional parameter and you run the evaluation once for each target. You then pick the target with the highest "attack utility".
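A rough sketch of this idea follows: score every (action, target) pair, skip the pairs that are unavailable, and take the best one. The availability rules, the hand-written `utility` function, and the soldier/target dictionaries are hypothetical; in a learning setting the utility would typically be a learned value estimate instead.

```python
def best_action(soldier, actions, targets, utility, is_available):
    """Return the available (action, target) pair with the highest utility, or None."""
    candidates = [
        (action, target)
        for action in actions
        for target in targets
        if is_available(soldier, action, target)  # mask out unavailable choices
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda pair: utility(soldier, *pair))


# Hypothetical rules and hand-written utilities, for illustration only.
def distance(soldier, target):
    return abs(soldier["pos"] - target["pos"])


def is_available(soldier, action, target):
    if action == "shoot":
        return soldier["ammo"] > 0
    if action == "charge":
        return distance(soldier, target) <= 2
    return False


def utility(soldier, action, target):
    base = {"shoot": 10.0, "charge": 6.0}[action]
    return base - distance(soldier, target)
```

The full action set stays fixed (`"shoot"`, `"charge"`), and the availability check does the work of varying what each soldier can actually do at a given moment.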
甜味超标

2021-02-05 21:05

In continuous action spaces, the policy network often outputs the mean and/or the variance of a distribution (commonly a Gaussian), and you then sample the action from that distribution.
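As a minimal sketch of that sampling step, assuming a one-dimensional Gaussian policy: `policy_outputs` below is a hypothetical linear stand-in for the policy network, and predicting the log of the standard deviation (rather than the variance directly) is one common convention, not the only one.

```python
import math
import random


def policy_outputs(state, weights):
    """Hypothetical stand-in for a policy network: linear map to (mean, log_std)."""
    mean = sum(w * s for w, s in zip(weights["mean"], state))
    log_std = sum(w * s for w, s in zip(weights["log_std"], state))
    return mean, log_std


def sample_action(mean, log_std, rng):
    """Draw a continuous action from N(mean, std^2)."""
    std = math.exp(log_std)  # exponentiating keeps std strictly positive
    return rng.gauss(mean, std)


def log_prob(action, mean, log_std):
    """Log-density of the action under the Gaussian, as used in policy-gradient losses."""
    std = math.exp(log_std)
    return (
        -0.5 * ((action - mean) / std) ** 2
        - log_std
        - 0.5 * math.log(2 * math.pi)
    )
```

Because the action is a real number drawn from the distribution, there is no enumerated action set at all; the network only has to describe where in the continuous range the good actions are likely to be.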
