The
system plays checkers using a standard minimax search algorithm and uses alpha-beta
pruning to reduce the size of the search tree, thereby increasing the number
of moves it can look ahead in a reasonable amount of time. As it is implemented,
the minimax search is cut off at an arbitrary, uniform depth. Using forward
pruning might significantly improve performance by allowing the search to
progress deeper on the more "promising" paths; combining this with a form
of iterative deepening would allow the computer to play in time-limited games
and always make the best move it has found so far.
[Look at a minimax search example]
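As a concrete illustration of the search, here is a minimal alpha-beta sketch over a toy, hand-built game tree. This is illustrative only: the real program generates moves from checkers positions rather than walking an explicit tree, and falls back to the utility function at the depth cutoff.

```java
// Minimal alpha-beta minimax over an explicit game tree.
// A real checkers search would generate successor boards from a game
// state; here the "game" is a hand-built tree of leaf utilities.
final class Node {
    final int utility;          // meaningful only at leaves / cutoff
    final Node[] children;      // empty array = leaf
    Node(int utility, Node... children) {
        this.utility = utility;
        this.children = children;
    }
}

public class AlphaBeta {
    // Returns the minimax value of `node`, cutting the search off
    // uniformly at `depth` plies.
    public static int search(Node node, int depth, int alpha, int beta, boolean maximizing) {
        if (depth == 0 || node.children.length == 0) {
            return node.utility;  // depth cutoff: use the utility estimate
        }
        if (maximizing) {
            int best = Integer.MIN_VALUE;
            for (Node child : node.children) {
                best = Math.max(best, search(child, depth - 1, alpha, beta, false));
                alpha = Math.max(alpha, best);
                if (beta <= alpha) break;  // beta cutoff: MIN avoids this branch
            }
            return best;
        } else {
            int best = Integer.MAX_VALUE;
            for (Node child : node.children) {
                best = Math.min(best, search(child, depth - 1, alpha, beta, true));
                beta = Math.min(beta, best);
                if (beta <= alpha) break;  // alpha cutoff: MAX avoids this branch
            }
            return best;
        }
    }

    public static void main(String[] args) {
        // Classic textbook tree: one MAX ply over three MIN nodes.
        Node tree = new Node(0,
            new Node(0, new Node(3), new Node(12), new Node(8)),
            new Node(0, new Node(2), new Node(4), new Node(6)),
            new Node(0, new Node(14), new Node(5), new Node(2)));
        System.out.println(search(tree, 2, Integer.MIN_VALUE, Integer.MAX_VALUE, true));
    }
}
```

Note that the pruning never changes the value returned; it only avoids expanding branches that cannot affect the result, which is what buys the extra lookahead depth.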
The utility function is a simple linear combination of a set of factors, which are taken from A. L. Samuel's checkers program of 1959. Most of the factors return a small integer value that is a function of the number and distribution of the pieces on the game board. Samuel's program was successful for its time, and the factors he used were based on discussions with checkers masters, research, and his own program's evaluation of their importance. Using these pre-tested factors removes some variability from the problem being studied.
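The shape of such a utility function can be sketched as a weighted sum. The feature names in the comments below (piece advantage, mobility) are illustrative stand-ins for Samuel-style terms, not the program's exact factor list:

```java
// A utility function as a linear combination of board factors.
// Each factor maps a board to a small integer; the board itself is
// abstracted away here and a precomputed feature vector is scored.
public class Utility {
    public static double score(double[] weights, int[] factors) {
        if (weights.length != factors.length) {
            throw new IllegalArgumentException("one weight per factor");
        }
        double total = 0.0;
        for (int i = 0; i < weights.length; i++) {
            total += weights[i] * factors[i];  // weighted sum of the factors
        }
        return total;
    }

    public static void main(String[] args) {
        double[] w = {1.0, 1.0, 1.0};      // coefficients start at 1 (see below)
        int[] f = {2, 0, -1};              // e.g. +2 piece advantage, -1 mobility
        System.out.println(score(w, f));   // 2*1 + 0*1 + (-1)*1 = 1.0
    }
}
```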
In Samuel's original program, the coefficients of the terms of the utility function could assume any value between -2^18 and +2^18, and in his final results, he reports coefficients ranging from -2^18 to +2^16, with values distributed fairly evenly in between. In this program, the coefficients can have any value within the range of a Java double. For this reason, the standard genetic algorithm is modified slightly. Instead of creating the initial population at random, the coefficients of the utility function are always initialized to 1 at the start of the learning process. Ultimately, this should not have any effect on the ability of the system to learn, but it does introduce some consistency that makes evaluation of the system easier.
To determine the fitnesses of the
population members, and hence their probability of reproduction, the members
are randomly paired and play against one another in a single-elimination
tournament. One win in the tournament is worth 10 fitness points. In case
of a tie, the player with more pieces is awarded 5 fitness points. When the
tournament is complete, each individual's score is divided by the total number
of fitness points that were awarded.
[Look at a fitness tournament
example]
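The tournament scoring can be sketched as follows. Since a standalone example cannot play actual games of checkers, a toy "strength" number decides each match here (an assumption for illustration); the pairing, scoring, and normalization follow the scheme described above. The toy game cannot tie, so the 5-point tie rule is noted only in a comment:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Fitness assignment via a single-elimination tournament.
// A real match pits two utility functions in a game of checkers; here
// a per-player "strength" value stands in for the game's outcome.
public class Tournament {
    public static double[] fitnesses(double[] strength, long seed) {
        int n = strength.length;
        double[] points = new double[n];
        List<Integer> round = new ArrayList<>();
        for (int i = 0; i < n; i++) round.add(i);
        Collections.shuffle(round, new Random(seed));  // random initial pairing

        while (round.size() > 1) {
            List<Integer> winners = new ArrayList<>();
            for (int i = 0; i + 1 < round.size(); i += 2) {
                int a = round.get(i), b = round.get(i + 1);
                // Stand-in for a game; a tie in real checkers would instead
                // award 5 points to the side with more pieces remaining.
                int winner = strength[a] >= strength[b] ? a : b;
                points[winner] += 10.0;                 // 10 points per win
                winners.add(winner);
            }
            if (round.size() % 2 == 1) {
                winners.add(round.get(round.size() - 1));  // odd player gets a bye
            }
            round = winners;
        }

        double total = 0.0;
        for (double p : points) total += p;
        double[] fitness = new double[n];
        for (int i = 0; i < n; i++) {
            fitness[i] = points[i] / total;  // divide by total points awarded
        }
        return fitness;
    }
}
```

With four players the tournament awards 30 points in all (three games), so the overall winner, with two wins, ends up with fitness 20/30.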
A single-point crossover operator is used. When the operator is applied to two parents, a random point is chosen in the weight vectors, and the individuals exchange the weights that fall to its right.
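A minimal sketch of the operator (not the program's exact code):

```java
import java.util.Random;

// Single-point crossover on two weight vectors: pick a random cut point
// and have the two children exchange everything to its right.
public class Crossover {
    public static double[][] cross(double[] p1, double[] p2, Random rng) {
        int point = 1 + rng.nextInt(p1.length - 1);  // cut strictly inside the vector
        double[] c1 = p1.clone(), c2 = p2.clone();
        for (int i = point; i < p1.length; i++) {    // swap weights right of the cut
            c1[i] = p2[i];
            c2[i] = p1[i];
        }
        return new double[][] { c1, c2 };
    }
}
```

The cut point is drawn from 1 to length-1 so that each child always inherits at least one weight from each parent.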
Because the weights vary over many
orders of magnitude, it was found that the mutation operator worked best when
the amount of mutation was proportional to the current value of the coefficient.
Instead of choosing a random number somewhere in the possible range of the
coefficients, a random number is chosen in the range [0, 1). This number is
squared so that the distribution is no longer uniform but is heavily skewed
towards zero. The current value of the coefficient is multiplied by this result
to determine the mutation amount, which is then added to or subtracted from the
current value to produce the new value. (If the current value of a coefficient
is zero, the new value is taken to be the squared random number itself.)
[Look at a crossover and mutation
example]
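The mutation scheme described above can be sketched in a few lines (the choice between adding and subtracting is modeled here as a fair coin flip, an assumption the text does not spell out):

```java
import java.util.Random;

// Mutation proportional to the coefficient's current value: draw r
// uniformly from [0, 1), square it (skewing the distribution towards
// zero), and add or subtract r^2 * value. A zero coefficient is
// replaced by r^2 itself.
public class Mutation {
    public static double mutate(double value, Random rng) {
        double r = rng.nextDouble();      // uniform in [0, 1)
        double amount = r * r;            // squared: heavily weighted toward 0
        if (value == 0.0) {
            return amount;                // special case for zero coefficients
        }
        double delta = value * amount;    // proportional to the current value
        return rng.nextBoolean() ? value + delta : value - delta;
    }
}
```

Because the squared draw is strictly less than 1, a mutated coefficient always stays strictly between zero and twice its previous value, however large or small that value is.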
This process provides a better approximation of polygenic inheritance in biology, where many genes code for a single trait (for example, height or intelligence). Because the trait is determined by the genes collectively, a mutation in one of them is less likely to produce a dramatic change in an individual than a mutation in a gene that codes for a trait in its entirety. Mutation can still produce a major change here, however: the new value of a coefficient can range from zero to twice its current value.