All the reinforcement learning algorithms I've read about are usually applied to a single agent that has a fixed number of actions. Are there any reinforcement learning algorit
If each soldier has a number of actions that are available or unavailable depending on some conditions, you can still model this as selection from a fixed set of actions: the set itself never changes, and the unavailable actions are simply excluded when the choice is made. For example:
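As a minimal sketch of this idea (not a definitive implementation), the snippet below keeps one fixed action list and masks out whatever is currently unavailable before picking the best-valued action. The action names, the state fields (`ammo`, `max_ammo`, `enemy_in_range`, `cover_nearby`) and the availability conditions are all hypothetical, chosen only for illustration; the values in `q_values` stand in for whatever your learning algorithm estimates.

```python
import random

# Hypothetical fixed action set; the list never changes size.
ACTIONS = ["move", "attack", "reload", "take_cover"]


def available_mask(state):
    """Return one boolean per entry in ACTIONS (illustrative conditions only)."""
    return [
        True,                                            # move: always possible
        state["ammo"] > 0 and state["enemy_in_range"],   # attack
        state["ammo"] < state["max_ammo"],               # reload
        state["cover_nearby"],                           # take_cover
    ]


def select_action(q_values, mask, epsilon=0.0, rng=random):
    """Epsilon-greedy choice restricted to the actions whose mask entry is True."""
    allowed = [i for i, ok in enumerate(mask) if ok]
    if not allowed:
        raise ValueError("no action is available in this state")
    if rng.random() < epsilon:
        return rng.choice(allowed)                   # explore among allowed actions
    return max(allowed, key=lambda i: q_values[i])   # exploit the best allowed one


state = {"ammo": 0, "max_ammo": 5, "enemy_in_range": True, "cover_nearby": False}
q_values = [0.1, 0.9, 0.4, 0.7]   # one estimated value per action in ACTIONS
mask = available_mask(state)
chosen = ACTIONS[select_action(q_values, mask)]
```

Here "attack" has the highest estimated value, but it is masked out because the soldier has no ammunition, so the choice falls to the best of the remaining actions.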
If you have multiple possible targets, the same principle applies, except that this time your utility function takes the target as an additional parameter, and you run the evaluation once for each target. You then pick the target with the highest "attack utility".
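The per-target evaluation can be sketched as follows. This is an illustration under stated assumptions: `attack_utility` is a hypothetical hand-written scoring rule (it prefers weak, nearby targets) standing in for a learned value function, and the dictionary fields `x`, `y`, `health` and `name` are invented for the example. Because the same function is applied to each candidate in turn, the number of targets can vary from one step to the next.

```python
def attack_utility(soldier, target):
    """Hypothetical utility: higher for weak, nearby targets.

    Assumes target["health"] is a fraction in [0, 1].
    """
    dx = soldier["x"] - target["x"]
    dy = soldier["y"] - target["y"]
    distance = (dx * dx + dy * dy) ** 0.5
    return (1.0 - target["health"]) * 10.0 - distance


def best_target(soldier, targets, utility=attack_utility):
    """Evaluate the utility once per target and return (target, score)."""
    if not targets:
        return None, float("-inf")
    scored = [(utility(soldier, t), t) for t in targets]
    best_score, best = max(scored, key=lambda pair: pair[0])
    return best, best_score


soldier = {"x": 0.0, "y": 0.0}
targets = [
    {"name": "A", "x": 3.0, "y": 4.0, "health": 0.2},
    {"name": "B", "x": 1.0, "y": 0.0, "health": 0.9},
    {"name": "C", "x": 6.0, "y": 8.0, "health": 0.0},
]
target, score = best_target(soldier, targets)
```

The resulting score can then be compared against the values of the soldier's other actions, so "attack the best target" competes with them as a single entry in the fixed action set.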