SLIDE 1 ARTIFICIAL INTELLIGENCE
Lecturer: Silja Renooij
Evolutionary computing
Utrecht University The Netherlands
These slides are part of the INFOB2KI Course Notes available from www.cs.uu.nl/docs/vakken/b2ki/schema.html
INFOB2KI 2019-2020
SLIDE 2 What is it?
Evolutionary computing (EC) = the use of evolutionary algorithms (EAs):
- Population‐based, stochastic search algorithms
based on mechanisms of natural evolution.
– Evolution viewed as search algorithm or problem solver
– Natural evolution only used as metaphor for computational system design
SLIDE 3 Why the metaphor?
- Ability to efficiently guide a search through
a large solution space
- Ability to adapt solutions to changing
environments
- Goal: design high quality solutions through
“emergent” behavior
SLIDE 4 Darwinian process characteristics
5 key requirements of a Darwinian system:
- 1. Structures carry information
(e.g. DNA)
- 2. Structures are copied
(next generation)
- 3. Copies partially vary from the original (inheritance)
- 4. Structures compete for limited resources
(Struggle for life, i.e. selection)
- 5. Relative reproductive success depends on
environment (Survival of the fittest)
There is no evolution without competition.
SLIDE 5
Darwinian characteristics in EC
1. Structures
e.g. binary strings, real-valued vectors, programs, …
2. Structures are copied
Selection algorithm: e.g. tournament selection, …
3. Copies partially vary from the original
Mutation & crossover operators
4. Structures compete for limited resources
New offspring (partly) replaces parents in fixed size population
5. Relative reproductive success depends on environment
User-defined fitness function
SLIDE 6 EC streams
EC unites four traditionally distinct streams, based on historical differences in representation:
- Genetic algorithms (GA)
– Solutions encoded as discrete structures, e.g. bit-strings
- Evolution strategies (ES)
– Encoding using real‐valued vectors
- Evolutionary programming (EP)
– Encoding with finite state machines
- Genetic programming (GP)
– Encoding using trees
Nowadays we just use what seems most appropriate.
SLIDE 7 EAs: Main idea
- Take a population of candidate solutions to
a given problem
- Use operators inspired by the mechanisms
of natural genetic variation
- Apply selective pressure toward certain
properties (what does “fit” mean?)
- Evolve a more fit solution
SLIDE 8 General Scheme of EAs
[Figure: general scheme of an EA; the step to the next generation is based upon the replacement strategy]
SLIDE 9 Evolutionary terminology
Inspired by biology:
- Individuals: candidate solutions
- Phenotypes: solutions in the real world
- Genotypes: individuals within the EA
- Population:
– (representations of) possible solutions
– usually fixed size multiset of genotypes
– Diversity of a population refers to the number of different fitness values / phenotypes / genotypes present (note: these are not the same thing)
- Also: genes, chromosomes, …
SLIDE 10
Design of an Evolutionary Algorithm
SLIDE 11 The Steps - I
In order to build an evolutionary algorithm we have to perform various steps. External to the evolving system:
- Design a representation for individuals:
– A data-structure
– A mapping between phenotype and genotype (encoding and decoding)
- Decide how to initialize a population
- Decide when to stop the algorithm
SLIDE 12 The Steps - II
In order to build an evolutionary algorithm we have to perform various steps. Internal to the evolving system:
- Design a way of evaluating an individual
- Decide how to select parent individuals
- Design suitable recombination operator(s)
- Design suitable mutation operator(s)
- Decide how to select surviving individuals
(Recombination and mutation together: variation)
SLIDE 13
The external design steps
SLIDE 14 Initialisation
Usually done at random
- Uniformly on search space if possible
– Binary strings: 0 or 1 with probability 0.5
– Real-valued representations: uniformly on a given interval
Or: seed the population with previous results or those from heuristics.
– Possible loss of genetic diversity
– Possible unrecoverable bias
SLIDE 15 Stopping criterion
Termination condition checked every generation
- Reaching some (known/hoped for) fitness optimum
- Limit on CPU resources: reaching some maximum allowed number of generations
- Reaching some minimum level of diversity
- Limit on user’s patience: reaching some specified number of generations without fitness improvement
SLIDE 16 Designing a Representation
- Method/ data structure for representing an
individual as a genotype (encoding)
– often takes the form of a bit string in GAs
– usually different parts of structure represent different aspects of solution
- The choice of representation depends on:
– the problem to solve
– how genotypes will be evaluated
– what genetic operators might be used/suitable
SLIDE 17 Discrete Representation
- an individual can be represented using discrete values
– binary; integer; any other system with discrete set of values
Example: binary representation
[Figure: a chromosome drawn as a string of bits; each position is a gene]
SLIDE 18 Discrete Representation
- for each genotype there must be a phenotype
Example: an 8-bit genotype can encode a phenotype such as:
- Integer
- Real Number
- Schedule
- ...
- Anything?
SLIDE 19 Example: genotype to phenotype
- Phenotype could be integer numbers
Genotype: 10100011
Phenotype: $1 \cdot 2^7 + 0 \cdot 2^6 + 1 \cdot 2^5 + 0 \cdot 2^4 + 0 \cdot 2^3 + 0 \cdot 2^2 + 1 \cdot 2^1 + 1 \cdot 2^0 = 128 + 32 + 2 + 1 = 163$
- Phenotype could be real numbers (e.g. between 2.5 and 20.5)
Genotype: 10100011
Phenotype: $x = 2.5 + \frac{163}{255}(20.5 - 2.5) \approx 14.01$
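The two decodings above can be sketched in a few lines of Python. This is only an illustration: the function names `bits_to_int` and `bits_to_real` are made up here, and the real-valued mapping assumes the interval is covered linearly by the 2^8 possible bit strings.

```python
def bits_to_int(bits: str) -> int:
    """Decode a bit-string genotype to an integer phenotype."""
    return int(bits, 2)


def bits_to_real(bits: str, low: float, high: float) -> float:
    """Decode a bit-string genotype to a real number in [low, high].

    Assumption: the all-zeros string maps to low, the all-ones string to high,
    and everything in between is spaced linearly.
    """
    max_int = 2 ** len(bits) - 1  # 255 for 8 bits
    return low + bits_to_int(bits) / max_int * (high - low)


genotype = "10100011"
print(bits_to_int(genotype))              # integer phenotype
print(bits_to_real(genotype, 2.5, 20.5))  # real-valued phenotype
```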
SLIDE 20 Example: genotype to phenotype
- Phenotype could be a schedule
- Phenotype could be …
Genotype: 8-bit string, one gene per job
Phenotype:
Job        1 2 3 4 5 6 7 8
Time step  2 1 2 1 1 1 2 2
SLIDE 21
The internal design steps
SLIDE 22 Evaluating an Individual
- Fitness function:
– A measure of the goodness of the organism
– The more values the fitness function distinguishes, the better for selection
– Can also be based on results of some external process: e.g. competing models for a game
- This is by far the most costly step for real
applications
- You could use approximate fitness ‐ but not
for too long
SLIDE 23 Parent Selection Mechanism
- Selection pressure: ensuring that selection of better individuals is more probable than of less good individuals drives the population forward
- Note: less good individuals may still include some useful genetic material
- Selection mechanisms:
– Proportionate selection: typical textbook method; not often used in practice
– Rank-based selection, e.g. truncation selection and tournament selection
SLIDE 24 Fitness proportionate selection I
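Selection based on absolute fitness values
- Given a population of N individuals, individual i is selected with probability
$p_i = \frac{f_i}{\sum_{j=1}^{N} f_j}$
- Selecting again N parents, the expected number of times individual i is selected for mating is:
$N \cdot p_i = \frac{f_i}{\bar{f}}$, where $\bar{f} = \frac{1}{N} \sum_{j=1}^{N} f_j$ is the mean fitness of the population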
SLIDE 25 Fitness proportionate selection II
[Figure: individuals from best to worst, each with a share of the selection pool proportional to its fitness]
Better (fitter) individuals have:
- more space in the population pool
- more chances to be selected
Disadvantages:
- Danger of premature convergence: (seemingly) outstanding individuals take over the entire population very quickly
- Low selection pressure when fitness values are near each other (e.g. around local optima)
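A minimal sketch of fitness proportionate selection, assuming non-negative fitness values that are not all zero. The function names are illustrative; the sampling itself relies on `random.choices` from the Python standard library.

```python
import random


def selection_probabilities(fitnesses):
    """Probability of selecting each individual: its fitness divided by total fitness."""
    total = sum(fitnesses)
    return [f / total for f in fitnesses]


def proportionate_select(population, fitnesses, n, rng=random):
    """Draw n parents (with replacement), weighted by absolute fitness."""
    return rng.choices(population, weights=fitnesses, k=n)


# Four individuals with fitness values 24, 23, 20 and 11 (total 78).
probs = selection_probabilities([24, 23, 20, 11])
print([round(p, 2) for p in probs])
```

Note how the probabilities depend on absolute fitness values: adding a large constant to every fitness would make all probabilities nearly equal, which is the low selection pressure mentioned above.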
SLIDE 26 Truncation selection
- Rank based selection: use relative rather than
absolute fitness
- Individuals are sorted on their fitness value from
best to worst.
- Select the top τ %
- Copy each selected solution 100/τ times
(for fixed population size)
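The truncation steps can be sketched as follows. This is an illustration under assumptions: `truncation_select` is a made-up name, `tau` is given in percent, and copies are handed out round-robin so that the population size stays fixed even when 100/τ is not a whole number.

```python
def truncation_select(population, fitness, tau):
    """Keep the top tau percent and copy them until the population size is restored."""
    ranked = sorted(population, key=fitness, reverse=True)  # best first
    n_keep = max(1, round(len(population) * tau / 100))
    top = ranked[:n_keep]
    # Round-robin copying: each survivor appears about 100/tau times.
    return [top[i % n_keep] for i in range(len(population))]


population = [3, 1, 4, 1, 5, 9, 2, 6]
print(truncation_select(population, fitness=lambda x: x, tau=25))
```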
SLIDE 27 Tournament selection
For a tournament of size k:
- Select k random individuals for tournament
(with or without replacement)
- Select only the best individual as parent
Hold N tournaments to get a population of N
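A minimal sketch of tournament selection without replacement, using `random.sample` from the standard library. The helper names and the example fitness (the squared integer value of a bit string) are assumptions for illustration.

```python
import random


def tournament_select(population, fitness, k=2, rng=random):
    """Pick k distinct random individuals and return the fittest of them."""
    contestants = rng.sample(population, k)  # sampling without replacement
    return max(contestants, key=fitness)


def select_parents(population, fitness, k=2, rng=random):
    """Hold N tournaments to obtain N parents."""
    return [tournament_select(population, fitness, k, rng) for _ in population]


population = ["10010", "01100", "01001", "10100", "01000", "00111"]
parents = select_parents(population, lambda s: int(s, 2) ** 2, k=2)
print(parents)
```

Only the ordering of fitness values matters here, so rescaling the fitness function does not change the selection pressure; a larger k increases it.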
SLIDE 28
Variation operators
with examples for discrete structures (GAs)
SLIDE 29 Recombination/Crossover
We can have one or more recombination operators for a representation.
- Mimics biological recombination
– Some portion of genetic material is swapped between chromosomes
– Typically the swapping produces offspring
- Offspring inherits parts from each of m ≥ 2 parents
- Recombination should produce m valid chromosomes
- Stochastic:
– n cross-over point(s) selected randomly
– Operator applied with a certain probability
SLIDE 30 1-point cross-over
Each chromosome is cut into 2 pieces (at same cutpoint) which are recombined:
parents:    1 1 1 | 1 1 1 1     0 0 0 | 0 0 0 0
offspring:  1 1 1 | 0 0 0 0     0 0 0 | 1 1 1 1
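As a sketch, 1-point cross-over on two equal-length strings could look like this. The function name and the optional `cut` argument (handy for reproducing a specific example) are assumptions.

```python
import random


def one_point_crossover(p1, p2, cut=None, rng=random):
    """Cut both parents at the same point and exchange the tails."""
    if cut is None:
        cut = rng.randint(1, len(p1) - 1)  # both pieces are non-empty
    return p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]


print(one_point_crossover("1111111", "0000000", cut=3))
```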
SLIDE 31 n-point & uniform cross-over
- n‐point cross‐over: each chromosome is cut into n+1
pieces which are recombined
- Uniform cross‐over: for each position, swap with a
certain probability
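Uniform cross-over can be sketched in the same style, assuming string chromosomes of equal length. The name `uniform_crossover` and the default swap probability of 0.5 are choices made for this illustration.

```python
import random


def uniform_crossover(p1, p2, p_swap=0.5, rng=random):
    """For each position independently, swap the two parents' genes with probability p_swap."""
    c1, c2 = list(p1), list(p2)
    for i in range(len(c1)):
        if rng.random() < p_swap:
            c1[i], c2[i] = c2[i], c1[i]
    return "".join(c1), "".join(c2)


print(uniform_crossover("1111111", "0000000"))
```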
SLIDE 32 Mutation
We can have one or more mutation operators for a representation.
- Mechanism for preserving variety in the population
- At least one mutation operator should allow every
part of the search space to be reached
- Mutation should produce valid chromosome
- Stochastic:
– Alters the structure in a position with some probability
– Operator applied with a certain probability
SLIDE 33 Mutation
before:  1 1 1 1 1 1 1
after:   1 1 1 0 1 1 1   (the fourth gene is the mutated gene)
Mutation usually happens with small probability p_m for each gene
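Bit-flip mutation with a per-gene probability p_m can be sketched as below; the function name is an assumption and chromosomes are taken to be strings of '0' and '1'.

```python
import random


def bit_flip_mutation(bits, p_m, rng=random):
    """Flip each bit independently with probability p_m."""
    flipped = {"0": "1", "1": "0"}
    return "".join(flipped[b] if rng.random() < p_m else b for b in bits)


print(bit_flip_mutation("1111111", p_m=1 / 7))
```

With p_m = 1/length, one bit is flipped on average per chromosome, but any number of flips (including none) can occur.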
SLIDE 34 Recombination vs Mutation
- Recombination:
– Emphasized by Genetic Algorithms
– modifications depend on the whole population
– decreasing effects with convergence
– exploitation operator
- Mutation:
– Emphasized by Evolution Strategies and Evolutionary Programming
– mandatory to escape local optima
– exploration operator
SLIDE 35 Replacement strategy
a.k.a. Survivor selection: Most EAs use fixed population size so need a way of going from parents + offspring to next generation.
- Can use the stochastic selection methods in reverse
- But often deterministic:
– Fitness based: e.g., rank parents + offspring and take best
– Age based: select only from offspring (if #offspring = #parents: generational replacement)
- Elitism: regardless of strategy, never replace the best
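Two of these strategies can be sketched as follows. Both function names are illustrative, and the elitist variant shown is just one possible reading of "never replace the best": the best parent takes the place of the worst offspring if no offspring is at least as good.

```python
def rank_replacement(parents, offspring, fitness):
    """Fitness based: rank parents + offspring and keep the best len(parents)."""
    pool = sorted(parents + offspring, key=fitness, reverse=True)
    return pool[:len(parents)]


def generational_with_elitism(parents, offspring, fitness):
    """Age based (offspring replace parents), but the best parent is never lost."""
    best_parent = max(parents, key=fitness)
    next_gen = list(offspring)
    if fitness(best_parent) > max(fitness(o) for o in next_gen):
        worst = min(range(len(next_gen)), key=lambda i: fitness(next_gen[i]))
        next_gen[worst] = best_parent
    return next_gen


print(rank_replacement([5, 1, 3], [2, 4, 0], fitness=lambda x: x))
print(generational_with_elitism([5, 1, 3], [2, 4, 0], fitness=lambda x: x))
```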
SLIDE 36
EA design and application
toy example
SLIDE 37 Toy example: design
Consider maximizing
$f(x) = x^2$, $x \in \{0, 1, \ldots, 31\}$
- A representation for individuals:
– use binary integer representation: 5-bit strings
- Initialization of population
– random
- Stopping criterion:
– not relevant for example
- Evaluating an individual:
– use f(x) itself as fitness function
SLIDE 38 Toy example: design cntd
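Consider maximizing
$f(x) = x^2$, $x \in \{0, 1, \ldots, 31\}$
- Parent selection:
– size k=2 tournament selection
- Recombination:
– 1-point cross-over
- Mutation:
– standard for bitstrings
- Replacement strategy:
– generational replacement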
SLIDE 39 Toy example: EA applied
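Consider maximizing
$f(x) = x^2$, $x \in \{0, 1, \ldots, 31\}$
Example: initial random population:
1) 10010 : f = 324
2) 01100 : f = 144
3) 01001 : f = 81
4) 10100 : f = 400
5) 01000 : f = 64
6) 00111 : f = 49
- Generation 0 population mean fitness: $\bar{f} = 177$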
SLIDE 40 Example – generation 1
Recall: Tournament selection (k =2) Compete:
- string 1 vs string 2, 3 vs 4, 5 vs 6
- string 3 vs string 1, 2 vs 6, 5 vs 4
Parents  Fitness
10010    324
10100    400
01000    64
10010    324
01100    144
10100    400
SLIDE 41 Example – generation 1
Recall: 1‐point cross‐over, standard mutation
! : cross-over point
1/0 : mutated bit
Parents  Fitness  Offspring  Fitness
100!10   324      10100      400
101!00   400      10111      529
01!000   64       00010      4
10!010   324      10010      324
0110!0   144      11100      784
1010!0   400      10000      256
Recall: offspring replaces parents
- Generation 1 population mean fitness: $\bar{f} = 383$
SLIDE 42 Example – generation 3
! : cross-over point
1/0 : mutated bit
Parents  Fitness  Offspring  Fitness
1!1111   961      11110      900
1!1100   784      11011      729
110!00   576      11110      900
111!10   900      11101      841
1101!1   729      11111      961
1100!1   625      01001      81
- Generation 3 population mean fitness: $\bar{f} = 762$
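The whole toy example can be put together in a short sketch. It assumes the objective is f(x) = x² on 5-bit integers (which is what the fitness values in the tables correspond to), an even population size, and a per-bit mutation probability `p_m` of 0.05; the latter, the function names and the number of generations are choices made here, not values from the slides.

```python
import random


def fitness(bits):
    """f(x) = x^2, with x the integer encoded by the bit string."""
    return int(bits, 2) ** 2


def run_toy_ea(pop_size=6, n_bits=5, generations=20, p_m=0.05, seed=0):
    rng = random.Random(seed)
    # Random initialisation: each bit is 0 or 1 with probability 0.5.
    pop = ["".join(rng.choice("01") for _ in range(n_bits)) for _ in range(pop_size)]
    for _ in range(generations):
        # Parent selection: pop_size tournaments of size k = 2.
        parents = [max(rng.sample(pop, 2), key=fitness) for _ in range(pop_size)]
        offspring = []
        for a, b in zip(parents[0::2], parents[1::2]):
            cut = rng.randint(1, n_bits - 1)  # 1-point cross-over
            for child in (a[:cut] + b[cut:], b[:cut] + a[cut:]):
                # Bit-flip mutation, applied per gene with probability p_m.
                child = "".join(
                    ("1" if g == "0" else "0") if rng.random() < p_m else g
                    for g in child
                )
                offspring.append(child)
        pop = offspring  # generational replacement
    return pop


final = run_toy_ea()
print(final, sum(fitness(x) for x in final) / len(final))
```

Because there is no elitism, the best individual can be lost between generations; the mean fitness typically rises but is not guaranteed to do so monotonically.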
SLIDE 43
Solving 8-queens with a GA
SLIDE 44 8-queens representation
- Assume each queen has her own column.
- Represent a board-configuration by listing the row number (digits 1 to 8) per queen.
Example: the configuration below is represented as 16257483
[Figure: 8×8 board with rows numbered 1 to 8 and queens Q1 to Q8, one per column]
SLIDE 45 8-queens fitness
- Fitness: number of non-attacking pairs of queens
- There are 7*8/2 = 28 pairs of queens, so solutions have a fitness of 28
Example: Fitness of the configuration below is 27
[Figure: 8×8 board with rows numbered 1 to 8 and queens Q1 to Q8, one per column]
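This fitness function is easy to sketch in Python. The name `queens_fitness` is an assumption; a configuration is given as a string of row digits, one per column, as in the representation above.

```python
from itertools import combinations


def queens_fitness(config):
    """Number of non-attacking pairs of queens (28 means a solution for 8 queens)."""
    rows = [int(c) for c in config]
    n = len(rows)
    attacking = sum(
        1
        for (i, ri), (j, rj) in combinations(enumerate(rows), 2)
        # Columns differ by construction, so only rows and diagonals can clash.
        if ri == rj or abs(ri - rj) == abs(i - j)
    )
    return n * (n - 1) // 2 - attacking


print(queens_fitness("16257483"))  # 27: the queens in columns 4 and 7 share a diagonal
```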
SLIDE 46 8-queens selection & operators
- Parent Selection: fitness‐based proportionate selection
- Cross‐over: 1‐point, randomly chosen per pair (P1, P2)
- Mutation: random, small independent p for each location
- Total fitness in initial population: 24+23+20+11 = 78
- Proportionate Selection: 24/78 = 31%; 23/78 = 29%; etc
[Figure: parents P1 and P2 and the child C1 produced by cross-over]
SLIDE 47 Diversity due to cross-over
[Figure: boards of parents P1 and P2 and their child C1]
SLIDE 48
Designing for permutation problems: order based representations
SLIDE 49 Permutation problems
Permutation problems have different characteristics
- Adjacency, e.g. important aspect for TSP
- Relative order
- Absolute order
- Suitable variation operators are such that
offspring of a permutation is a permutation
- 8‐queens is a permutation problem; above
property violated by both cross‐over and mutation in previous example…
SLIDE 50 Recombination for order based representation
e.g. “cut and crossfill”
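[Figure: cut and crossfill of Parent 1 and Parent 2 (both permutations of the genes A to H) with the cross-over point after the third gene. Child 1 keeps the left part of Parent 1 (genes A, B, C) and is filled with Parent 2 minus a, b, c. Child 2 keeps the left part of Parent 2 (genes e, f, d) and is filled with Parent 1 minus E, F, D.]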
- Select random cross‐over point
- Copy left part per parent to offspring
- Fill right side by scanning other parent left to right,
skipping what’s already there
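The three steps above can be sketched as follows. The function name, the optional `cut` argument and the example order of the second parent are assumptions made for illustration.

```python
import random


def cut_and_crossfill(p1, p2, cut=None, rng=random):
    """Order-preserving crossover for permutations; returns two children."""
    if cut is None:
        cut = rng.randint(1, len(p1) - 1)

    def make_child(head_parent, fill_parent):
        head = list(head_parent[:cut])  # copy the left part
        # Scan the other parent left to right, skipping genes already present.
        return head + [g for g in fill_parent if g not in head]

    return make_child(p1, p2), make_child(p2, p1)


p1 = list("ABCDEFGH")
p2 = list("EFDBGHAC")  # hypothetical second parent
c1, c2 = cut_and_crossfill(p1, p2, cut=3)
print("".join(c1), "".join(c2))
```

Every gene appears exactly once in each child, so the offspring of two permutations are again permutations.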
SLIDE 51 Mutation for order based representation
e.g. “Swap”: Randomly select two different genes and swap them.
before:  7 8 3 4 1 2 6 5
after:   7 8 3 4 6 2 1 5
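A sketch of swap mutation; the function name and the optional `positions` argument (0-based indices, used here to reproduce the example) are assumptions.

```python
import random


def swap_mutation(perm, positions=None, rng=random):
    """Swap two different, randomly chosen genes; the result is still a permutation."""
    if positions is None:
        positions = rng.sample(range(len(perm)), 2)  # two distinct indices
    i, j = positions
    child = list(perm)
    child[i], child[j] = child[j], child[i]
    return child


print(swap_mutation([7, 8, 3, 4, 1, 2, 6, 5], positions=(4, 6)))
```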
SLIDE 52
Key issues in EC
SLIDE 53
Summary key issues
Exploration vs Exploitation
– Exploration = sample unknown regions
– Too much exploration = random search, no convergence
– Exploitation = try to improve the best-so-far individuals
– Too much exploitation = local search only … convergence to a local optimum
SLIDE 54
Summary key issues (2)
Genetic diversity
– differences of genetic characteristics in the population
– loss of genetic diversity = all individuals in the population look alike
– snowball effect
– convergence to the nearest local optimum
– in practice, it is irreversible
”An EA is the second best algorithm to any problem”
SLIDE 55 Mating robots
https://www.youtube.com/watch?v=BfcVSb-Q8ns
SLIDE 56
Additional material
(NOT MANDATORY)
SLIDE 57
EA with real-valued representations
SLIDE 58 Real-valued representation
- A very natural encoding: if the solution we are looking for is a list of real-valued numbers, then encode it as a list of real-valued numbers! (i.e., not as a string of 1’s and 0’s)
- Lots of applications, e.g. parameter optimisation
SLIDE 59 Example: Real valued representation, Representation of individuals
- Individuals are represented as a tuple of n real-valued numbers:
$X = (x_1, x_2, \ldots, x_n)$, $x_i \in \mathbb{R}$
- The fitness function maps tuples of real numbers to a single real number:
$f : \mathbb{R}^n \rightarrow \mathbb{R}$
SLIDE 60 Example: Mutation for real valued representation
Perturb values by adding some random noise.
Often, a Gaussian/normal distribution $N(0, \sigma)$ is used, where
- 0 is the mean value
- $\sigma$ is the standard deviation
and $x'_i = x_i + N(0, \sigma_i)$ for each parameter
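A minimal sketch of this mutation, using `random.gauss` from the standard library. The function name is an assumption, and the standard deviations are taken as given (one per parameter).

```python
import random


def gaussian_mutation(x, sigmas, rng=random):
    """Add zero-mean Gaussian noise with standard deviation sigmas[i] to each x[i]."""
    return [xi + rng.gauss(0.0, si) for xi, si in zip(x, sigmas)]


print(gaussian_mutation([1.0, 2.0, 3.0], sigmas=[0.1, 0.1, 0.5]))
```

Small standard deviations give fine-grained local changes; larger ones make big jumps more likely.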
SLIDE 61
Example: Recombination for real valued representation
Discrete recombination (uniform crossover): given two parents one child is created as follows
Parent 1:  a b c d e f g h
Parent 2:  A B C D E F G H
Child:     a b C d E f g H
SLIDE 62
Example: Recombination for real valued representation
Intermediate recombination (arithmetic crossover): given two parents one child is created as follows
Parent 1:  a b c d e f
Parent 2:  A B C D E F
Child:     (a+A)/2 (b+B)/2 (c+C)/2 (d+D)/2 (e+E)/2 (f+F)/2
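Both recombination operators for real-valued vectors fit in a few lines; the function names are illustrative and the parents are assumed to have equal length.

```python
import random


def discrete_recombination(p1, p2, rng=random):
    """Each gene of the child is copied from one of the two parents, chosen at random."""
    return [rng.choice((a, b)) for a, b in zip(p1, p2)]


def intermediate_recombination(p1, p2):
    """Each gene of the child is the average of the parents' genes."""
    return [(a + b) / 2 for a, b in zip(p1, p2)]


print(discrete_recombination([1.0, 2.0, 3.0], [10.0, 20.0, 30.0]))
print(intermediate_recombination([1.0, 2.0, 3.0], [10.0, 20.0, 30.0]))
```

Discrete recombination only reuses values already present in the parents, whereas intermediate recombination creates new values between them.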
SLIDE 63
EA with tree-based representations
SLIDE 64 Example: Tree-based representation
Individuals in the population are trees. Any S-expression can be drawn as a tree of functions
and terminals.
These functions and terminals can be anything:
Functions: sine, cosine, add, sub, and, If-Then-Else,
Turn...
Terminals: X, Y, 0.456, true, false, π, Sensor0…
Example: calculating the area of a circle:
π * r^2
As a tree: * (π, * (r, r))
SLIDE 65 Example: Tree-based representation
Pick a function f at random from the function set F.
This becomes the root node of the tree.
Every function has a fixed number of arguments
(unary, binary, ternary, …. , n-ary), z(f). For each of these arguments, create a node from either the function set F or the terminal set T.
If a terminal is selected then this becomes a leaf.
If a function is selected, then expand this function
recursively.
A maximum depth is used to make sure the process
stops.
SLIDE 66 Three Methods
The Full method ensures that every non-backtracking path in the tree is equal to a certain length by allowing only function nodes to be selected for all depths up to the maximum depth - 1, and selecting only terminal nodes at the lowest level.
With the Grow method, we create variable length paths by allowing a function or terminal to be placed at any level up to the maximum depth - 1. At the lowest level, we can set all nodes to be terminals.
Ramped half-and-half creates trees using a variable depth from 2 up to the maximum depth. For each depth of tree, half are created using the Full method, and the other half are created using the Grow method.
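The three initialisation methods can be sketched with trees stored as nested tuples. Everything named here is an assumption for illustration: the function set `FUNCTIONS` with its arities, the terminal set `TERMINALS`, the 50% chance of choosing a terminal in Grow, and the way depths and methods are cycled in `ramped_half_and_half`.

```python
import random

# Hypothetical function set (name -> number of arguments) and terminal set.
FUNCTIONS = {"add": 2, "mul": 2, "sin": 1}
TERMINALS = ["r", "pi", 1.0, 2.0]


def random_tree(max_depth, method="grow", rng=random, depth=1):
    """Return a terminal, or a tuple (function_name, child, ...)."""
    if depth == max_depth:
        return rng.choice(TERMINALS)  # lowest level: terminals only
    if method == "grow" and depth > 1 and rng.random() < 0.5:
        return rng.choice(TERMINALS)  # Grow may stop early below the root
    name = rng.choice(sorted(FUNCTIONS))
    children = [random_tree(max_depth, method, rng, depth + 1)
                for _ in range(FUNCTIONS[name])]
    return (name, *children)


def leaf_depths(tree, depth=1):
    """Depth of every leaf, i.e. the length of every root-to-leaf path."""
    if not isinstance(tree, tuple):
        return [depth]
    return [d for child in tree[1:] for d in leaf_depths(child, depth + 1)]


def ramped_half_and_half(pop_size, max_depth, rng=random):
    """Cycle through depths 2..max_depth, alternating Full and Grow per depth."""
    depths = list(range(2, max_depth + 1))
    trees = []
    for i in range(pop_size):
        method = "full" if (i // len(depths)) % 2 == 0 else "grow"
        trees.append(random_tree(depths[i % len(depths)], method, rng))
    return trees


print(random_tree(3, "full"))
```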
SLIDE 67
Example: Mutation
Single point mutation selects one node and replaces it with a similar one.
Before: 2 * (r * r), as a tree: * (2, * (r, r))
After: π * (r * r), as a tree: * (π, * (r, r))
SLIDE 68
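Example: Recombination
Parent 1: 2 * (r * r), as a tree: * (2, * (r, r))
Parent 2: π * (r + (1 / r)), as a tree: * (π, + (r, / (1, r)))
Two sub-trees are selected for swapping.
SLIDE 69
Example: Recombination
Swapping the selected sub-trees (here the leaf 2 of parent 1 and the leaf π of parent 2) results in 2 new expressions:
π * (r * r), as a tree: * (π, * (r, r))
2 * (r + (1 / r)), as a tree: * (2, + (r, / (1, r)))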