Machine Learning: Algorithms and Applications

Floriano Zini Free University of Bozen-Bolzano Faculty of Computer Science Academic Year 2011-2012 Lecture 4: 19th March 2012

Evolutionary computing

These slides are mainly taken from A.E. Eiben and J.E. Smith, Introduction to Evolutionary Computing


The Main EC Metaphor

    EVOLUTION        PROBLEM SOLVING
    Environment  ↔   Problem
    Individual   ↔   Candidate solution
    Fitness      ↔   Quality

› Quality → chance for seeding new solutions
› Fitness → chances for survival and reproduction
› Fitness in nature is only observed (secondary); in EC it is primary

Motivations for EC

› Developing, analyzing, applying problem solving

methods a.k.a. algorithms is a central theme in mathematics and computer science

› Time for thorough problem analysis decreases › Complexity of problems to be solved increases › Consequence: ROBUST PROBLEM SOLVING

technology needed


Evolutionary machine learning

› We have corresponding sets of inputs & outputs and seek a model that delivers the correct output for every known input

Modelling example: loan applicant creditworthiness

› A British bank evolved a creditworthiness model to predict the loan-paying behavior of new applicants
› Evolving: prediction models
› Fitness: model accuracy on historical data

Modelling example: mushroom edibility

› Classify mushrooms as edible or not edible
› Evolving: classification models
› Fitness: classification accuracy on a training set of edible and not edible mushrooms

EC metaphor

› A population of individuals exists in an environment with limited resources
› Competition for those resources causes selection of those fitter individuals that are better adapted to the environment
› These individuals act as seeds for the generation of new individuals through recombination and mutation
› The new individuals have their fitness evaluated and compete (possibly also with parents) for survival
› Over time, natural selection causes a rise in the fitness of the population


General scheme of EAs

[Diagram] Initialization → Population —(parent selection)→ Parents —(recombination/crossover, then mutation)→ Offspring —(survivor selection)→ back into the Population; the loop runs until Termination

EA scheme in pseudo-code
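The pseudocode itself did not survive extraction; below is a minimal, runnable Python sketch of the same loop. The operator parameter names (`init`, `evaluate`, `select_parents`, `recombine`, `mutate`, `select_survivors`) and the OneMax demo instantiation are illustrative assumptions, not taken from the slides.

```python
import random

def evolutionary_algorithm(init, evaluate, select_parents, recombine,
                           mutate, select_survivors, max_generations=100):
    """Generic EA loop; every operator is plugged in as a function."""
    population = init()                                # INITIALISE
    fitness = [evaluate(ind) for ind in population]    # EVALUATE
    for _ in range(max_generations):                   # TERMINATION: generation cap
        parents = select_parents(population, fitness)            # parent selection
        offspring = [mutate(child)                               # variation
                     for p1, p2 in zip(parents[::2], parents[1::2])
                     for child in recombine(p1, p2)]
        offspring_fitness = [evaluate(ind) for ind in offspring]
        population, fitness = select_survivors(                  # survivor selection
            population, fitness, offspring, offspring_fitness)
    return population, fitness

# Illustrative instantiation: OneMax (maximise the number of 1-bits).
rng = random.Random(0)
N, POP = 10, 20

def init():
    return [[rng.randint(0, 1) for _ in range(N)] for _ in range(POP)]

def evaluate(ind):
    return sum(ind)

def select_parents(pop, fit):
    # fitness-proportionate sampling of a full mating pool
    # (+1 offset so an all-zero population cannot break the weighted draw)
    return rng.choices(pop, weights=[f + 1 for f in fit], k=POP)

def recombine(p1, p2):
    cut = rng.randrange(1, N)                  # 1-point crossover
    return [p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]]

def mutate(ind):
    return [1 - g if rng.random() < 1 / N else g for g in ind]   # bit-flip

def select_survivors(pop, fit, off, off_fit):
    # generational replacement with one elite kept from the old population
    i_best = max(range(len(pop)), key=fit.__getitem__)
    pool = off + [pop[i_best]]
    pool_fit = off_fit + [fit[i_best]]
    order = sorted(range(len(pool)), key=pool_fit.__getitem__, reverse=True)
    return [pool[i] for i in order[:POP]], [pool_fit[i] for i in order[:POP]]

final_pop, final_fit = evolutionary_algorithm(
    init, evaluate, select_parents, recombine, mutate, select_survivors)
```

With elitism the best fitness never decreases, so the demo reliably climbs toward the all-ones string.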


Main EA components

› Representation
› Population
› Evaluation (fitness function)
› Selection (parent selection, survivor selection)
› Variation (mutation, recombination)
› Initialization
› Termination condition

Representation

› Encoding (representation): maps phenotype space to genotype space
› Decoding (inverse representation): maps genotype space back to phenotype space
› In order to find the global optimum, every feasible solution must be represented in genotype space


Population

› Role: holds the candidate solutions of the problem as individuals (genotypes)
› Formally, a population is a multiset of individuals, i.e., repetitions are possible
› The population is the basic unit of evolution, i.e., the population is evolving, not the individuals
› Selection operators act on the population level
› Variation operators act on the individual level

Evaluation (fitness) function

› A.k.a. quality function or objective function
› Role:
  › represents the task to solve, the requirements to adapt to (can be seen as "the environment")
  › enables selection (provides a basis for comparison)
    › e.g., some phenotypic traits are advantageous or desirable (big ears cool better); these traits are rewarded with more offspring that will expectedly carry the same trait
› Assigns a single real-valued fitness to each phenotype, which forms the basis for selection
  › so the more discrimination (different values) the better
› Typically we talk about fitness being maximised
  › some problems may be best posed as minimisation problems, but conversion is trivial


Selection

› Role:
  › identifies individuals to become parents or to survive
  › pushes the population towards higher fitness
› Usually probabilistic
  › high-quality solutions are more likely to be selected than low-quality ones, but selection is not guaranteed
  › even the worst individual in the current population usually has a non-zero probability of being selected
  › this stochastic nature can aid escape from local optima

Selection mechanism example: roulette wheel selection

    fitness(A) = 3  →  3/6 = 50%
    fitness(B) = 1  →  1/6 ≈ 17%
    fitness(C) = 2  →  2/6 ≈ 33%

In principle, any selection mechanism can be used for parent selection as well as for survivor selection
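The roulette-wheel mechanism above can be sketched in a few lines of Python; the function name `roulette_select` is illustrative, and the usage re-runs the slide's A/B/C example:

```python
import random

def roulette_select(population, fitnesses, rng=random):
    """Pick one individual; each owns a slice of the wheel proportional to its fitness."""
    total = sum(fitnesses)
    spin = rng.uniform(0, total)          # where the "ball" lands
    cumulative = 0.0
    for individual, f in zip(population, fitnesses):
        cumulative += f
        if spin < cumulative:
            return individual
    return population[-1]                 # guard against floating-point edge cases

# The slide's example: fitness(A) = 3, fitness(B) = 1, fitness(C) = 2
rng = random.Random(1)
counts = {"A": 0, "B": 0, "C": 0}
for _ in range(6000):
    counts[roulette_select(["A", "B", "C"], [3, 1, 2], rng)] += 1
# counts will be close to A: 50%, B: 17%, C: 33%
```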


Survivor selection

› A.k.a. replacement
› Most EAs use a fixed population size, so they need a way of going from (parents + offspring) to the next generation
› Often deterministic (while parent selection is usually stochastic)
  › fitness-based: e.g., rank parents + offspring and take the best
  › age-based: make as many offspring as parents and delete all parents
› Sometimes a combination of stochastic and deterministic (elitism)
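The two deterministic replacement schemes can be sketched as follows; the function names are illustrative, and the fitness function is supplied by the caller:

```python
def fitness_based_replacement(parents, offspring, fitness_fn):
    """Rank (parents + offspring) by fitness and keep the best |parents|."""
    pool = parents + offspring
    pool.sort(key=fitness_fn, reverse=True)
    return pool[:len(parents)]

def age_based_replacement(parents, offspring):
    """Make as many offspring as parents and delete all parents."""
    assert len(offspring) == len(parents)
    return list(offspring)
```

With integer individuals and identity fitness, `fitness_based_replacement([3, 1], [4, 2], lambda x: x)` keeps `[4, 3]`, while the age-based scheme keeps `[4, 2]` regardless of quality.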

Variation operators

› Role: generate new candidate solutions
› Usually divided into two types according to their arity (number of inputs):
  › arity 1: mutation operators
  › arity > 1: recombination operators
    › arity = 2 is typically called crossover
    › arity > 2 is formally possible, but seldom used in EC
› There has been much debate about the relative importance of recombination and mutation
  › nowadays most EAs use both
› Variation operators must match the given representation


Mutation

› Role: causes small, random variation
› Acts on one genotype and delivers another
› The element of randomness is essential and differentiates it from other unary heuristic operators

    before: 1 1 1 0 1 1 1
    after:  1 1 1 1 1 1 1

Recombination

› Role: merges information from parents into offspring
› The choice of what information to merge is stochastic
› Most offspring may be worse than, or the same as, the parents
› The hope is that some are better, by combining elements of genotypes that lead to good traits

    Parents:    1 1 1 1 1 1 1    0 0 0 0 0 0 0
                     cut              cut
    Offspring:  1 1 1 0 0 0 0    0 0 0 1 1 1 1
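The two bit-string operators shown above can be sketched directly; the function names are illustrative, and the usage reproduces the crossover example with the cut fixed at position 3:

```python
import random

def bit_flip_mutation(genotype, pm, rng=random):
    """Flip each bit independently with probability pm."""
    return [1 - g if rng.random() < pm else g for g in genotype]

def one_point_crossover(p1, p2, point=None, rng=random):
    """Choose one point, split both parents there, and exchange tails."""
    if point is None:
        point = rng.randrange(1, len(p1))
    return p1[:point] + p2[point:], p2[:point] + p1[point:]

# The slide's example: crossing all-ones with all-zeros, cut after position 3
c1, c2 = one_point_crossover([1] * 7, [0] * 7, point=3)
# c1 == [1, 1, 1, 0, 0, 0, 0], c2 == [0, 0, 0, 1, 1, 1, 1]
```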


Initialisation / Termination

› Initialisation is usually done at random
  › need to ensure an even spread and mixture of possible allele values
  › can include existing solutions, or use problem-specific heuristics, to "seed" the population
› The termination condition is checked every generation:
  › reaching some (known/hoped-for) fitness
  › reaching some maximum allowed number of generations
  › reaching some minimum level of diversity
  › reaching some specified number of generations without fitness improvement

Example: the 8-queens problem

Place 8 queens on an 8x8 chessboard in such a way that they cannot check each other


The 8-queens problem: representation

› Genotype: a permutation of the numbers 1-8
› Phenotype: a board configuration
› The mapping between the two is obvious

The 8-queens problem: fitness evaluation

› Penalty of one queen: the number of queens she can check
› Penalty of a configuration: the sum of the penalties of all queens
› Note: the penalty is to be minimized
› Fitness of a configuration: the inverse penalty, to be maximized
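This evaluation can be written down directly. A sketch, assuming queen i sits in column i and the permutation gives her row (so only diagonal clashes remain, blocking between queens ignored), and using 1/(1 + penalty) as one possible way to invert the penalty:

```python
def queen_checks(perm, i):
    """Penalty of queen i: how many other queens she can check.

    With a permutation genotype no two queens share a row or a column,
    so only diagonal clashes need to be counted.
    """
    return sum(1 for j in range(len(perm))
               if j != i and abs(perm[i] - perm[j]) == abs(i - j))

def penalty(perm):
    """Penalty of a configuration: the sum of the penalties of all queens."""
    return sum(queen_checks(perm, i) for i in range(len(perm)))

def fitness(perm):
    """Inverse penalty, to be maximised; equals 1 exactly at a solution."""
    return 1.0 / (1.0 + penalty(perm))
```

On the identity permutation all 8 queens sit on one diagonal, so every queen checks the other 7 (penalty 56), while a known solution such as `[1, 5, 8, 6, 3, 7, 2, 4]` has penalty 0 and fitness 1.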


The 8-queens problem: mutation

› Small variation in one permutation, e.g., swapping the values of two randomly chosen positions

[Figure: the permutation 1 2 3 4 5 6 7 8 before and after a swap of two values]

The 8-queens problem: recombination

Combining two permutations into two new permutations:
› choose a random crossover point
› copy the first parts into the children
› create the second part by inserting values from the other parent:
  › in the order they appear there
  › beginning after the crossover point
  › skipping values already in the child

    Parents:    8 7 6 4 2 5 3 1    1 3 5 2 4 6 7 8
    Offspring:  8 7 6 4 5 1 2 3    1 3 5 6 2 8 7 4
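The two permutation operators can be sketched as follows (this crossover is the one Eiben & Smith call "cut-and-crossfill"). The function names are illustrative, and the usage fixes the crossover point at 3 purely for reproducibility, so its offspring need not match the figure above:

```python
import random

def swap_mutation(perm, rng=random):
    """Swap the values of two randomly chosen positions."""
    child = list(perm)
    i, j = rng.sample(range(len(perm)), 2)
    child[i], child[j] = child[j], child[i]
    return child

def cut_and_crossfill(p1, p2, point=None, rng=random):
    """Copy first parts, then fill each child from the other parent."""
    n = len(p1)
    if point is None:
        point = rng.randrange(1, n)            # random crossover point

    def fill(head, other):
        child, seen = list(head), set(head)
        # scan the other parent beginning after the crossover point
        # (wrapping around), skipping values already in the child
        for value in other[point:] + other[:point]:
            if value not in seen:
                child.append(value)
                seen.add(value)
        return child

    return fill(p1[:point], p2), fill(p2[:point], p1)

# The parents from the figure, with the cut fixed after position 3
c1, c2 = cut_and_crossfill([8, 7, 6, 4, 2, 5, 3, 1],
                           [1, 3, 5, 2, 4, 6, 7, 8], point=3)
mutated = swap_mutation([1, 2, 3, 4, 5, 6, 7, 8], rng=random.Random(0))
```

Both operators keep the genotype a valid permutation, which is exactly why plain 1-point crossover cannot be used here.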


The 8-queens problem: selection

› Parent selection:
  › pick 5 parents and take the best two to undergo crossover
› Survivor selection (replacement):
  › when inserting a new child into the population, choose an existing member to replace by:
    › sorting the whole population by decreasing fitness
    › enumerating this list from high to low
    › replacing the first member with a fitness lower than the given child's
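A sketch of these two mechanisms, assuming list-based populations and a caller-supplied fitness function (the function names are illustrative):

```python
import random

def pick_parents(population, fitness_fn, rng=random):
    """Pick 5 random individuals and return the best two (to undergo crossover)."""
    contestants = rng.sample(population, 5)
    contestants.sort(key=fitness_fn, reverse=True)
    return contestants[0], contestants[1]

def insert_child(population, child, fitness_fn):
    """Sort by decreasing fitness, scan from high to low, and replace
    the first member whose fitness is lower than the child's."""
    ranked = sorted(population, key=fitness_fn, reverse=True)
    for i, member in enumerate(ranked):
        if fitness_fn(member) < fitness_fn(child):
            ranked[i] = child
            break                # child discarded if it beats nobody
    return ranked

# Tiny demo on integers with identity fitness
best, second = pick_parents(list(range(100)), lambda x: x, rng=random.Random(3))
```

With identity fitness, inserting a child of fitness 4 into `[5, 3, 9, 1]` replaces the 3 (the first member scanned, high to low, that the child beats).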

The 8-queens problem: summary

Note that this is only one possible set of choices of operators and parameters


Typical behavior of an EA

Stages in optimizing on a 1-dimensional fitness landscape:
› Early stage: quasi-random population distribution
› Mid-stage: population arranged around/on hills
› Late stage: population concentrated on the high hills

Typical run: progression of fitness

A typical run of an EA shows so-called "anytime behavior"

[Plot: best fitness in the population vs. time (number of generations)]


Typical run: progression of fitness

[Plot: best fitness in the population vs. time (number of generations), contrasting progress in the 1st half with progress in the 2nd half]

› Are long runs beneficial?
  › it depends on how much you want the last bit of progress
  › it may be better to do more short runs

Is it worth expending effort on smart initialization?

[Plot: F = fitness after smart initialisation; T = time needed to reach level F after random initialisation]

› Answer: it depends
  › possibly good, if good solutions/methods exist
  › care is needed


EAs as problem solvers: Goldberg's view (1989)

[Plot: performance of methods over the scale of "all" problems, comparing random search, an evolutionary algorithm, and a special, problem-tailored method]

EAs as problem solvers: Michalewicz's view (1996)

[Plot: performance of several tailored EAs (EA 1-EA 4) over the scale of "all" problems, around a particular problem P]


Genetic Algorithms (GA): quick overview

› Developed: USA in the 1970s
› Early names: J. Holland, K. DeJong, D. Goldberg
› Holland's original GA is now known as the simple genetic algorithm (SGA)
› Other GAs use different:
  › representations
  › mutations
  › crossovers
  › selection mechanisms


SGA technical summary tableau

    Representation:     binary strings
    Recombination:      N-point or uniform
    Mutation:           bitwise bit-flipping with fixed probability
    Parent selection:   fitness-proportionate
    Survivor selection: all children replace parents
    Speciality:         emphasis on crossover


SGA reproduction cycle

1. Select parents for the mating pool (size of mating pool = population size)
2. Shuffle the mating pool
3. Apply crossover to each consecutive pair with probability pc; otherwise copy the parents
4. Apply mutation to each offspring (bit-flip with probability pm, independently for each bit)
5. Replace the whole population with the resulting offspring

SGA operators: 1-point crossover

› Choose a random point on the two parents
› Split the parents at this crossover point
› Create children by exchanging tails
› pc is typically in the range (0.6, 0.9)


SGA operators: mutation

› Alter each gene independently with a probability pm
› pm is called the mutation rate
  › typically between 1/pop_size and 1/chromosome_length

SGA operators: selection

Main idea: better individuals get a higher chance
› chances proportional to fitness
› implementation: roulette wheel technique
  › assign to each individual a part of the roulette wheel
  › spin the wheel n times to select n individuals

    fitness(A) = 3  →  3/6 = 50%
    fitness(B) = 2  →  2/6 ≈ 33%
    fitness(C) = 1  →  1/6 ≈ 17%


An example after Goldberg '89 (1)

› Simple problem: max x² over {0, 1, …, 31}
› GA approach:
  › representation: binary code, e.g., 01101 ↔ 13
  › population size: 4
  › 1-point crossover, bitwise mutation
  › roulette wheel selection
  › random initialization
› We show one generational cycle done by hand

x² example: selection


x² example: crossover

x² example: mutation
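Goldberg's hand-worked tables did not survive extraction, but the whole cycle can be run in code. A sketch of the SGA on max x² over {0, …, 31}; pc = 0.7 and pm = 0.2 (= 1/chromosome_length) are illustrative values, not Goldberg's, and the +1 fitness offset is an assumption that keeps roulette selection defined for an all-zero population:

```python
import random

def decode(bits):
    """Binary genotype -> integer phenotype, e.g. [0,1,1,0,1] -> 13."""
    return int("".join(map(str, bits)), 2)

def fitness(bits):
    x = decode(bits)
    return x * x

def sga_generation(pop, pc=0.7, pm=0.2, rng=random):
    """One SGA reproduction cycle (steps 1-5)."""
    # 1. roulette-wheel selection of a full mating pool
    pool = rng.choices(pop, weights=[fitness(ind) + 1 for ind in pop],
                       k=len(pop))
    rng.shuffle(pool)                                  # 2. shuffle
    offspring = []
    for p1, p2 in zip(pool[::2], pool[1::2]):          # 3. crossover w.p. pc
        if rng.random() < pc:
            cut = rng.randrange(1, len(p1))
            offspring += [p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]]
        else:
            offspring += [list(p1), list(p2)]
    return [[1 - g if rng.random() < pm else g for g in ind]   # 4. mutation
            for ind in offspring]                      # 5. replace everyone

rng = random.Random(42)
pop = [[rng.randint(0, 1) for _ in range(5)] for _ in range(4)]  # pop. size 4
for _ in range(20):
    pop = sga_generation(pop, rng=rng)
```

With population size 4 a single run is noisy, which is exactly why the hand-worked single cycle is instructive: it exposes each step rather than averaged behaviour.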


The simple GA

› Has been the subject of many (early) studies
  › still often used as a benchmark for novel GAs
› Shows many shortcomings, e.g.:
  › the representation is too restrictive
  › the mutation & crossover operators are only applicable to bit-string & integer representations
  › the selection mechanism is sensitive to converging populations with close fitness values
  › the generational population model (step 5 in the SGA reproduction cycle) can be improved with explicit survivor selection

Alternative crossover operators

› Performance with 1-point crossover depends on the order in which variables occur in the representation
  › more likely to keep together genes that are near each other
  › can never keep together genes from opposite ends of the string
  › this is known as positional bias
  › it can be exploited if we know about the structure of our problem, but this is not usually the case


n-point crossover

› Choose n random crossover points
› Split along those points
› Glue the parts, alternating between parents
› Generalisation of 1-point crossover (still some positional bias)

Uniform crossover

› Assign 'heads' to one parent, 'tails' to the other
› Flip a coin for each gene of the first child
› Make an inverse copy of the gene for the second child
› Inheritance is independent of position
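Both operators can be sketched in a few lines; the function names are illustrative, and the usage crosses all-ones with all-zeros at points 2 and 4 to make the alternating segments visible:

```python
import random

def n_point_crossover(p1, p2, points):
    """Split both parents at the given points; glue parts, alternating parents."""
    c1, c2, from_first, prev = [], [], True, 0
    for cut in list(points) + [len(p1)]:
        if from_first:
            c1 += p1[prev:cut]; c2 += p2[prev:cut]
        else:
            c1 += p2[prev:cut]; c2 += p1[prev:cut]
        from_first, prev = not from_first, cut
    return c1, c2

def uniform_crossover(p1, p2, rng=random):
    """Flip a coin per gene; the second child gets the inverse pattern."""
    c1, c2 = [], []
    for g1, g2 in zip(p1, p2):
        if rng.random() < 0.5:      # 'heads': child 1 inherits from parent 1
            c1.append(g1); c2.append(g2)
        else:
            c1.append(g2); c2.append(g1)
    return c1, c2

c1, c2 = n_point_crossover([1, 1, 1, 1, 1, 1], [0, 0, 0, 0, 0, 0],
                           points=[2, 4])
# c1 == [1, 1, 0, 0, 1, 1], c2 == [0, 0, 1, 1, 0, 0]
u1, u2 = uniform_crossover([1] * 8, [0] * 8, rng=random.Random(0))
```

In the uniform case every position of the two children together always carries exactly one 1 and one 0, regardless of where the genes sit: inheritance is position-independent.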


Crossover OR mutation?

› Decade-long debate: which one is better / necessary / main vs. background
› Answer (at least, rather wide agreement):
  › it depends on the problem, but
  › in general, it is good to have both
  › each plays a different role
  › a mutation-only EA is possible; a crossover-only EA would not work

Crossover OR mutation? (cont'd)

› Exploration: discovering promising areas in the search space, i.e., gaining information on the problem
› Exploitation: optimising within a promising area, i.e., using information
› There is co-operation AND competition between them
› Crossover is explorative: it makes a big jump to an area somewhere "in between" two (parent) areas
› Mutation is exploitative: it creates random small diversions, thereby staying near (in the area of) the parent


Crossover OR mutation? (cont'd)

› Only crossover can combine information from two parents
› Only mutation can introduce new information (alleles)
› To hit the optimum you often need a 'lucky' mutation