SLIDE 13
Selection pressure depends entirely on the fitness landscape, which is usually unknown. Roulette selection is not translation invariant: in a population of 10 individuals, if the best has a fitness of 11 and the worst a fitness of 1, the best individual is chosen with probability 16.6% and the worst with probability 1.5%. If 100 is added to every fitness value, the best and worst individuals have nearly identical probabilities of being chosen (10.4% and 9.5%)! Things can be partially improved by linear scaling of the fitness values, or by sigma truncation, but at the cost of increased complexity (additional parameters to adjust).
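The figures above can be checked with a few lines of Python. This is only a sketch: the text gives the best (11) and worst (1) fitness values, so the eight intermediate values below are assumptions, chosen so that the total fitness is 66, which is the total the quoted percentages imply (11/66 is about 16.7%, which the text truncates to 16.6%). The function name `roulette_probabilities` is illustrative.

```python
def roulette_probabilities(fitness):
    """Fitness-proportionate selection probability of each individual."""
    total = sum(fitness)
    return [f / total for f in fitness]

# Hypothetical population: only the best (11) and the worst (1) come from the
# text; the other eight values are assumptions chosen so that the total is 66.
population = [1, 4, 5, 6, 7, 7, 8, 8, 9, 11]
shifted = [f + 100 for f in population]  # same ordering, every value + 100

original_probs = roulette_probabilities(population)
shifted_probs = roulette_probabilities(shifted)

print(f"original: best {max(original_probs):.3f}, worst {min(original_probs):.3f}")
print(f"shifted:  best {max(shifted_probs):.3f}, worst {min(shifted_probs):.3f}")
```

The ordering of the individuals is the same in both populations, yet the shift alone is enough to flatten the selection probabilities.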
Roulette wheel selection is CPU-intensive, because the population needs to be sorted beforehand, leading to O(n log n) complexity.
Roulette requires the sum of the fitness values of all individuals. This is problematic if the evolutionary computation is distributed over several machines. (The same is true of all other selection algorithms except Tournament selection.)
The fitness function needs to yield positive values. This is not a serious problem in itself, but because Roulette is not translation invariant, shifting the values so that they are all positive changes the selection probabilities.
Other selection methods have been devised, mainly to circumvent the first problem (the lack of translation invariance):
Ranking (Baker, 1985): Selection is based on rank, not fitness. The population also has to be sorted, leading to O(n log n) complexity. The first problem (dependence on the raw fitness values) is solved, but the others remain.
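A minimal sketch of rank-based selection probabilities follows. The text does not say how ranks are mapped to probabilities, so the linear mapping used here, and its parameter `eta_plus` (the expected number of copies of the best individual, between 1 and 2), are assumptions made for illustration. The function name is hypothetical.

```python
def linear_ranking_probabilities(fitness, eta_plus=1.5):
    """Selection probability of each individual under linear ranking.

    Assumed mapping: the worst individual gets weight eta_minus = 2 - eta_plus,
    the best gets eta_plus, and the ranks in between are interpolated linearly.
    Requires at least two individuals.
    """
    n = len(fitness)
    eta_minus = 2.0 - eta_plus
    # Sorting is the O(n log n) step; order[0] is the index of the worst.
    order = sorted(range(n), key=lambda i: fitness[i])
    probs = [0.0] * n
    for rank, i in enumerate(order):
        probs[i] = (eta_minus + (eta_plus - eta_minus) * rank / (n - 1)) / n
    return probs

fitness = [3.0, 11.0, 1.0, 7.0]        # illustrative values
shifted = [f + 100 for f in fitness]   # same ranks, every value + 100

# Only the ordering matters, so the shift leaves the probabilities unchanged.
print(linear_ranking_probabilities(fitness))
print(linear_ranking_probabilities(shifted))
```

Ties are broken here by position in the list, which is one arbitrary choice among several.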
Stochastic Universal Sampling (Baker, 1987): Individuals are assigned slots of a weighted roulette wheel, as in Roulette selection. n equally spaced markers are then placed around the wheel, and the wheel is spun once. The complexity of this algorithm is also O(n log n), and it requires the sum of the fitness values of all individuals.
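The single spin with equally spaced markers can be sketched as follows. This is an illustrative version, not Baker's original formulation: the function name, the `rng` parameter, and the choice to return indices are assumptions, and fitness values are assumed to be positive.

```python
import random

def stochastic_universal_sampling(fitness, n_select, rng=random):
    """Return the indices of n_select individuals chosen with one spin."""
    total = sum(fitness)              # global sum over the whole population
    step = total / n_select           # distance between two markers
    start = rng.uniform(0, step)      # the single random spin
    selected = []
    i = 0
    cumulative = fitness[0]           # right edge of individual 0's slot
    for k in range(n_select):
        pointer = start + k * step
        # Advance to the slot containing this marker; the bound on i guards
        # against floating-point drift in the running sum.
        while cumulative <= pointer and i < len(fitness) - 1:
            i += 1
            cumulative += fitness[i]
        selected.append(i)
    return selected

# With fitness 3 and 1 and four markers, individual 0 is picked three times
# and individual 1 once, whatever the spin.
print(stochastic_universal_sampling([3.0, 1.0], 4))
```

Because the markers are evenly spaced, each individual is selected a number of times that differs from its expected value by less than one.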
Selection in Genitor (Whitley, 1989): Genitor is an evolutionary paradigm of the Steady State kind, in which only one individual is created per "generation". The population is initially ranked, after which each new child is inserted at its place and the worst individual of the population is discarded. This requires O(log n) steps, which must be repeated n times in order to simulate the creation of a whole population, so the complexity of the algorithm is O(n log n). The Genitor selection and replacement scheme may lead to premature convergence, which is why large population sizes are suggested.
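The insert-and-discard step can be sketched with the standard `bisect` module. This is a simplified illustration, not Whitley's implementation: it tracks fitness values only, and the function name `genitor_step` is hypothetical. How the parents of the child are chosen is not covered here.

```python
import bisect

def genitor_step(ranked_fitness, child_fitness):
    """Insert a child into a population kept sorted by fitness (worst first),
    then discard the worst individual so the population size is unchanged."""
    bisect.insort(ranked_fitness, child_fitness)  # binary search for the slot
    ranked_fitness.pop(0)                         # index 0 holds the worst
    return ranked_fitness

population = sorted([4.0, 9.0, 1.0, 6.0])  # initial ranking, done once
genitor_step(population, 7.0)              # the child displaces the worst
print(population)
```

The binary search takes O(log n) comparisons, but a Python list still shifts elements on insertion and removal, so a tree-like structure would be needed for the whole step to cost O(log n). A child that is worse than every current individual is inserted at the front and discarded immediately.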
Truncation selection (Mühlenbein & Schlierkamp-Voosen, 1993): This is the selection method used by breeders. Only the T best individuals are considered, and all of them have the same selection probability (random selection among T individuals). The population needs to be sorted first, so the complexity is O(n log n). Bad individuals (below threshold T) cannot be selected, so the loss of diversity can be substantial.
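A minimal sketch of truncation selection, assuming the population and its fitness values are held in two parallel lists. The function name and the parameters `t`, `n_select`, and `rng` are illustrative.

```python
import random

def truncation_selection(population, fitness, t, n_select, rng=random):
    """Pick n_select parents uniformly at random among the t best individuals."""
    # Sort indices from best to worst: the O(n log n) step.
    ranked = sorted(range(len(population)), key=lambda i: fitness[i], reverse=True)
    elite = ranked[:t]  # individuals below the threshold are never selected
    return [population[rng.choice(elite)] for _ in range(n_select)]

individuals = ["a", "b", "c", "d", "e"]
scores = [2.0, 9.0, 4.0, 7.0, 1.0]
# With t = 2, only "b" (9.0) and "d" (7.0) can ever be returned.
print(truncation_selection(individuals, scores, t=2, n_select=6))
```

The smaller `t` is relative to the population size, the stronger the selection pressure and the faster diversity is lost.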
Deterministic selection: Only the n best individuals are selected. This method requires sorting the individuals. The loss of diversity is substantial (as for Truncation selection), and this selection method may lead to premature convergence.
Random: Quick, but no selection pressure.
Then there is n-ary Tournament selection (Brindle, 1981; Blickle & Thiele, 1995). Unless there is a good reason to use another method, Tournament selection is most certainly the best of all. Binary tournament consists of picking two individuals at random and comparing their fitness. The individual with the higher fitness wins the tournament and is selected. Selection pressure can be increased by organising a tournament between three or more individuals. In contrast, if