Genetic algorithm for optimization in game playing agent heuristic evaluation function

Question

This follows up on an answer to the question "How to create a good evaluation function for a game?", specifically the one by @David (it is the first answer).

Background: I am using a genetic algorithm to optimize the hyperparameters of a game-playing agent that uses minimax with alpha-beta pruning (and iterative deepening). In particular, I would like to optimize the parameters of the heuristic (evaluation) function. The evaluation function I use is:

f(w) = w * num_my_moves - (1-w) * num_opponent_moves

The only parameter to optimize is w in [0,1].
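As a minimal sketch, the evaluation function above could be written as follows. The function name `evaluate` and the range check are assumptions added for illustration; only the formula comes from the question.

```python
def evaluate(w: float, num_my_moves: int, num_opponent_moves: int) -> float:
    """Score a position: reward own mobility, penalize the opponent's.

    `w` is the single tunable weight, assumed to lie in [0, 1].
    """
    if not 0.0 <= w <= 1.0:
        raise ValueError("w must lie in [0, 1]")
    return w * num_my_moves - (1.0 - w) * num_opponent_moves
```

With w = 0.5 both move counts matter equally; w = 1 ignores the opponent's mobility entirely.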

Here's how I programmed the genetic algorithm:

Create a random population of, say, 100 agents.
Let them play 1000 games at random with replacement.
Let the parents be the top-performing agents, with some poorer-performing agents mixed in for genetic diversity.
Randomly breed some parents to create children.
  * Breeding process: a child is defined as the average of its parents' weights, i.e. childWeight = 0.5 * (father.w + mother.w)
The new population is formed by the parents and the newly created children.
Randomly mutate 1% of the population as follows: newWeight = agent.w + random.uniform(-0.01, 0.01), handling the trivial border cases (less than zero or greater than one) appropriately.
Evolve 10 times (i.e. repeat for each new population).
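The steps above can be sketched roughly as below. This is not my actual code: `evolve`, `clamp`, `elite_frac` and `lucky_frac` are hypothetical names, each agent is reduced to its weight `w`, border cases are handled by clamping (one possible choice), and the `fitness` callable is a stand-in for "play games and count wins", which is the expensive part in practice.

```python
import random

def clamp(x, lo=0.0, hi=1.0):
    """Keep a weight inside [lo, hi] (one way to handle the border cases)."""
    return max(lo, min(hi, x))

def evolve(fitness, pop_size=100, generations=10, elite_frac=0.2,
           lucky_frac=0.05, mutation_rate=0.01, rng=None):
    """Evolve a population of weights; `fitness` maps a weight to a score."""
    rng = rng or random.Random()
    population = [rng.random() for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        n_elite = max(2, int(elite_frac * pop_size))
        parents = ranked[:n_elite]
        # Mix in a few weaker agents for genetic diversity.
        rest = ranked[n_elite:]
        parents += rng.sample(rest, min(len(rest), int(lucky_frac * pop_size)))
        # Breed: each child is the average of two randomly chosen parents.
        children = []
        while len(parents) + len(children) < pop_size:
            father, mother = rng.sample(parents, 2)
            children.append(0.5 * (father + mother))
        population = parents + children
        # Mutate: each agent is perturbed with probability `mutation_rate`,
        # which approximates "mutate 1% of the population".
        population = [
            clamp(w + rng.uniform(-0.01, 0.01)) if rng.random() < mutation_rate else w
            for w in population
        ]
    return max(population, key=fitness)

# Toy stand-in fitness with a known optimum at w = 0.7 (an assumption for
# illustration; the real fitness would come from game results).
best = evolve(lambda w: -(w - 0.7) ** 2, rng=random.Random(0))
```

Because real game results are noisy, the same weight can score differently from one evaluation to the next, so a real version may want to average over several games per agent.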

My question: Please evaluate the steps above, in particular the breeding and mutation steps. Does anyone have a better way to breed (rather than trivially averaging the parent weights), and does anyone have a better way to mutate, rather than just adding random.uniform(-0.01, 0.01)?

Answer 1:

It looks like you're not actually applying a genetic algorithm to your agents, but rather just simple evolution directly on the phenotype/weights. I suggest you try introducing a genetic representation of your weights and evolving this genome instead. An example would be to represent your weights as a binary string and apply evolution to each bit of the string, meaning each bit has some probability of being mutated. These are called point mutations. There are many other mutations you can apply, but this will do as a start.

What you will notice is that your agents don't get stuck in local minima as much because sometimes a small genetic change can vastly change the phenotype/weights.

That might sound complicated, but it really isn't. Let me give you an example:

Say you have a weight of 42 in base 10. This would be 101010 in binary. Now you have implemented a 1% mutation rate on each bit of the binary representation. Let's say the last bit is flipped. Then we have 101011 in binary, or 43 in decimal. Not such a big change. Flipping the second bit (from the left) instead gives you 111010 in binary, or 58 in decimal. Notice the big jump. This is what we want: it lets your agent population search a larger part of the solution space faster.
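Here is a minimal sketch of point mutation on a bit string, reproducing the 42 → 43 and 42 → 58 flips from the example. The helper names `point_mutate` and `flip_bit` are made up for illustration, and the genome is held as a Python string of '0'/'1' characters purely for readability.

```python
import random

def flip_bit(genome: str, index: int) -> str:
    """Return a copy of `genome` with the bit at `index` flipped."""
    bit = "1" if genome[index] == "0" else "0"
    return genome[:index] + bit + genome[index + 1:]

def point_mutate(genome: str, rate: float = 0.01, rng=None) -> str:
    """Flip each bit independently with probability `rate`."""
    rng = rng or random.Random()
    for i in range(len(genome)):
        if rng.random() < rate:
            genome = flip_bit(genome, i)
    return genome

genome = format(42, "06b")           # '101010'
print(int(flip_bit(genome, 5), 2))   # last bit flipped   -> 43
print(int(flip_bit(genome, 1), 2))   # second bit flipped -> 58
```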

With regard to breeding, you can try crossover. Let's assume you have many weights, each with a genetic encoding. If you represent the whole genome (all the binary data) as one long binary string, you can combine sections of the two parents' genomes. Another example: the following are the "father" and "mother" genomes and phenotypes:

Weight Name:          W1     W2     W3     W4     W5
Father Phenotype:     43     15     34     17     14
Father Genome:    101011 001111 100010 010001 001110
Mother Genome:    100110 100111 011001 010100 101000
Mother Phenotype:     38     39     25     20     40

What you can do is draw arbitrary lines through both genomes at the same place, and assign the segments arbitrarily to the child. This is a version of crossover.

Weight Name:          W1     W2     W3     W4     W5
Father Genome:    101011 00.... ...... .....1 001110
Mother Genome:    ...... ..0111 011001 01010. ......
Child Genome:     101011 000111 011001 010101 001110
Child Phenotype:      43      7     25     21     14

Here the first 8 and the last 7 bits come from the father, and the middle comes from the mother. Notice how weights W1 and W5 are entirely from the father and W3 is entirely from the mother, while W2 and W4 are combinations. W4 has hardly changed, while W2 has changed drastically.
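The crossover above can be reproduced with a short sketch like the following. The function names (`encode`, `decode`, `two_point_crossover`) are hypothetical, the cut points 8 and 23 are the ones used in the example, and in a real run you would typically draw the cut points at random.

```python
BITS = 6  # bits per weight, as in the example

def encode(weights, bits=BITS):
    """Concatenate each weight's fixed-width binary form into one genome."""
    return "".join(format(w, f"0{bits}b") for w in weights)

def decode(genome, bits=BITS):
    """Split the genome into fixed-width chunks and read each as an integer."""
    return [int(genome[i:i + bits], 2) for i in range(0, len(genome), bits)]

def two_point_crossover(father, mother, cut1, cut2):
    """Take father's bits outside [cut1, cut2) and mother's bits inside it."""
    return father[:cut1] + mother[cut1:cut2] + father[cut2:]

father = encode([43, 15, 34, 17, 14])
mother = encode([38, 39, 25, 20, 40])
child = two_point_crossover(father, mother, 8, 23)
print(decode(child))  # [43, 7, 25, 21, 14]
```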

I hope this gives you some insight into how to do genetic algorithms. That said, I recommend using a modern library instead of implementing it yourself, unless you are doing it to learn.

Edit: More on handling the weights/binary representation:

If you need fractions, you can get them by treating the numerator and denominator as separate weights, or by keeping one of them constant, e.g., 42 and 10 gives 4.2.
A greater-than-or-equal-to-0 constraint comes for free, since an unsigned bit string never decodes to a negative number. To actually get negative numbers you need to negate your weights.
A less-than-or-equal-to-1 constraint you can get by dividing the weight by the maximum possible value for that bit-string length. In the examples above you have 6 bits, which gives a maximum of 63. If after mutation you get a binary string of 101010, or 42 in base 10, you compute 42/63 = 0.667, and the value can never exceed 1.0, which is reached at 63/63.
Two weights that must sum to 1? If you get 101010 and 001000 for W1 and W2, that gives 42 and 8, and you can set W1_scaled = W1 / (W1 + W2) = 0.84 and W2_scaled = W2 / (W1 + W2) = 0.16. This gives W1_scaled + W2_scaled = 1 whenever W1 + W2 is nonzero.
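The two scaling tricks above might look like this in code. The helper names are made up for illustration, and the fallback for an all-zero genome in `normalize` is my own assumption, since the division is undefined in that case.

```python
def to_unit_interval(genome: str) -> float:
    """Map a bit string to [0, 1] by dividing by its maximum value."""
    return int(genome, 2) / (2 ** len(genome) - 1)

def normalize(raw):
    """Scale non-negative weights so that they sum to 1."""
    total = sum(raw)
    if total == 0:
        # Assumption: fall back to equal weights when every gene is zero.
        return [1.0 / len(raw)] * len(raw)
    return [r / total for r in raw]

print(round(to_unit_interval("101010"), 3))              # 0.667
print(normalize([int("101010", 2), int("001000", 2)]))   # [0.84, 0.16]
```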

Answer 2:

Since I was mentioned.

Rather than averaging the parent weights, I picked a random number using the parent weights as the min and max. I also found I had to widen the range slightly to resist the pull toward the average. In principle this compensates for the reduction in standard deviation you get when averaging two uniform random numbers, a factor of sqrt(2), but I probably wasn't exact. Otherwise the population converges toward the average and can't escape.

So if the parents' weights were 0.1 and 0.2, it might pick a random number between 0.08 and 0.22 for the child weight.
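A rough sketch of that idea is below. The name `blend` and the `widen` parameter are hypothetical; widen=0.2 is simply the value that reproduces the 0.08 to 0.22 range above, not a tuned constant. This resembles what the literature sometimes calls blend crossover (BLX-alpha), though I didn't implement it from a reference. The final clamp to [0, 1] is an assumption for the weight in the question.

```python
import random

def blend(father_w: float, mother_w: float, widen: float = 0.2, rng=None) -> float:
    """Draw a child weight uniformly from the parents' range, widened a bit."""
    rng = rng or random.Random()
    lo, hi = min(father_w, mother_w), max(father_w, mother_w)
    margin = widen * (hi - lo)   # 0.02 for parents 0.1 and 0.2
    child = rng.uniform(lo - margin, hi + margin)
    return max(0.0, min(1.0, child))  # keep the weight inside [0, 1]

child_w = blend(0.1, 0.2, rng=random.Random(42))  # somewhere in [0.08, 0.22]
```

Note that identical parents give a zero-width range, so some separate mutation step is still needed to keep diversity.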

Late edit: A more accepted, studied, understood approach that I didn't know at the time is something called "Differential Evolution".

Source: https://stackoverflow.com/questions/45201979/genetic-algorithm-for-optimization-in-game-playing-agent-heuristic-evaluation-fu

Tags

optimization

artificial-intelligence

genetic-algorithm

depth-first-search

game-theory