SLIDE 1

Adaptive Operator Selection with Rank-based Multi-Armed Bandits

Alvaro Fialho, Marc Schoenauer & Michèle Sebag
26th COW, April 22, 2013

SLIDE 2

Context Operator Selection Credit Assignment Empirical Validation Conclusion

Outline

1 Context & Motivation
2 Operator Selection
3 Credit Assignment
4 Empirical Validation
5 Conclusions & Further Work

Fialho, Schoenauer, Sebag Rank-based Adaptive Operator Selection 2

SLIDE 3

Context & Motivation

  • Evolutionary Algorithms
  • Adaptive Operator Selection

SLIDE 4

Evolutionary Algorithms

Stochastic optimization algorithms (Darwinian paradigm)

Bottleneck: parameter setting

  • Population size and number of offspring
  • Selection and replacement methods (and their parameters)
  • Variation operators (application rate, internal parameters)

Goal: automatic setting ("Crossing the Chasm") [Moore, 1991]

SLIDE 5

Parameter Setting of Variation Operators

  • Operator performance is difficult to predict
  • Choices are problem-dependent and inter-dependent

Off-line tuning → best static strategy (expensive)

[Figure: performance of operators on OneMax as a function of the fitness of the parent, for the 1-Bit, 3-Bit, 5-Bit and 1/n BitFlip operators; sample figure with a (1+50)-EA]

Performance also depends on:

  • Fitness of the parents
  • Population fitness distribution

⇒ Should be adapted on-line, while solving the problem

SLIDE 7

Adaptive Operator Selection

Position of the Problem

  • Given a set of K variation operators
  • Select on-line the operator to be applied next
  • Based on their recent effects

[Diagram: the EA applies the chosen operator and evaluates its impact; within the AOS, Credit Assignment turns the impact into a credit (or reward) that updates the quality of op1, op2, . . ., opK, and Operator Selection uses these qualities to choose the next operator]
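The loop in the diagram can be sketched as follows. This is only an illustrative skeleton: the names `run_aos`, `make_op`, `uniform` and `instantaneous` are hypothetical, the "state" is a toy number rather than a population, and the selection rule shown is the naive uniform one, not a bandit.

```python
import random

def run_aos(operators, select, credit, state, steps, rng):
    """Generic AOS loop: select an operator, apply it, turn its impact into credit."""
    quality = [0.0] * len(operators)      # one quality estimate per operator
    usage = [0] * len(operators)          # how often each operator was applied
    for _ in range(steps):
        k = select(quality, rng)                      # Operator Selection
        state, impact = operators[k](state, rng)      # Operator Application + Impact Evaluation
        quality[k] = credit(quality[k], impact)       # Credit Assignment
        usage[k] += 1
    return state, quality, usage

# Hypothetical toy setting: the state is a number and each operator adds a fixed gain.
def make_op(gain):
    def op(state, rng):
        return state + gain, float(gain)   # (new state, measured impact)
    return op

ops = [make_op(1), make_op(2), make_op(3)]
uniform = lambda quality, rng: rng.randrange(len(quality))   # naive uniform selection
instantaneous = lambda old_quality, impact: impact            # credit = last impact

final_state, quality, usage = run_aos(ops, uniform, instantaneous, 0, 300, random.Random(0))
```

Swapping `select` and `credit` for other callables is where the bandit-based selection and the rank-based credit discussed in the following slides would plug in.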

SLIDE 8

Operator Selection

  • A Multi-Armed Bandit problem
  • Operator Selection: Discussion

SLIDE 9

A (kind of) Multi-Armed Bandit problem

The Basic Multi-Armed Bandit Problem

  • Given K arms (≡ operators)
  • At time t, the gambler plays arm j and gets
      r_{j,t} = 1 with (unknown) probability p_j
      r_{j,t} = 0 with probability 1 − p_j

Goal: maximize cumulative reward ≡ minimize regret

  $L(T) = \sum_{t=1}^{T} (r^*_t - r_t)$
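As a small illustration of the regret, the sketch below computes its expected version, in which each realized reward is replaced by the success probability of the arm played. The function name `pseudo_regret` and the probabilities are hypothetical.

```python
def pseudo_regret(probs, plays):
    """Expected regret of a sequence of plays on Bernoulli arms with success rates `probs`."""
    best = max(probs)                       # success rate of the best arm
    return sum(best - probs[j] for j in plays)

probs = [0.2, 0.5, 0.8]                     # hypothetical p_j for K = 3 arms
always_best = [2] * 100                     # an oracle that always plays the best arm
round_robin = [t % 3 for t in range(99)]    # a strategy that never learns

print(pseudo_regret(probs, always_best))    # no regret for the oracle
print(pseudo_regret(probs, round_robin))    # grows linearly with the number of plays
```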

SLIDE 10

The Upper Confidence Bound MAB algorithm

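  • Asymptotic optimality guarantees (static context) [Auer et al., 2002]
  • Optimal L(T) = O(log T)
  • At time t, choose the arm i maximizing:

    $\mathrm{score}_{i,t} = \underbrace{\hat{q}_{i,t}}_{\text{exploitation}} + \underbrace{\sqrt{\frac{2 \log \sum_k n_{k,t}}{n_{i,t}}}}_{\text{exploration}}$

    with $n_{i,t+1} = n_{i,t} + 1$ (number of times arm i has been played)
    and $\hat{q}_{i,t+1} = \left(1 - \frac{1}{n_{i,t+1}}\right) \cdot \hat{q}_{i,t} + \frac{1}{n_{i,t+1}} \cdot r_{i,t}$ (empirical quality)

Efficiency comes from the optimal Exploration vs. Exploitation (EvE) balance: the interval between exploration trials increases exponentially w.r.t. the number of time steps.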
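The selection rule and the two update equations above can be sketched in a few lines. This is a minimal illustration on simulated Bernoulli arms: the class name `UCB`, the helper `simulate`, and the choice of playing every arm once before using the score are assumptions, not details given on the slide.

```python
import math
import random

class UCB:
    def __init__(self, n_arms):
        self.counts = [0] * n_arms       # n_{i,t}: number of times each arm was played
        self.quality = [0.0] * n_arms    # q̂_{i,t}: empirical quality of each arm

    def select(self):
        # Assumption: play each arm once first, so that n_{i,t} > 0 in the score
        for i, n in enumerate(self.counts):
            if n == 0:
                return i
        total = sum(self.counts)
        scores = [q + math.sqrt(2.0 * math.log(total) / n)
                  for q, n in zip(self.quality, self.counts)]
        return max(range(len(scores)), key=scores.__getitem__)

    def update(self, arm, reward):
        self.counts[arm] += 1
        n = self.counts[arm]
        # Running average, written as in the update equation for q̂
        self.quality[arm] = (1.0 - 1.0 / n) * self.quality[arm] + reward / n

def simulate(probs, steps, seed=0):
    """Run UCB on Bernoulli arms with (hypothetical) success probabilities `probs`."""
    rng = random.Random(seed)
    bandit = UCB(len(probs))
    for _ in range(steps):
        arm = bandit.select()
        reward = 1.0 if rng.random() < probs[arm] else 0.0
        bandit.update(arm, reward)
    return bandit
```

Calling `simulate([0.2, 0.5, 0.8], 2000).counts` shows how the plays are distributed over the arms; in a static setting the arm with the highest success rate is expected to dominate.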

SLIDE 11

Operator Selection with UCB: shortcomings

Exploration vs. Exploitation (EvE) balance

  • In UCB theory, rewards ∈ {0, 1}; fitness-based rewards ∈ [a, b]
  • UCB's EvE balance is broken; scaling is needed:

    $\mathrm{score}_{i,t} = \hat{q}_{i,t} + C \sqrt{\frac{2 \log \sum_k n_{k,t}}{n_{i,t}}}$

Dynamical setting (best arm/op changes along evolution)

  • Adjusting the $\hat{q}$'s after a change takes a long time
  • Use a change-detection test (e.g. Page-Hinkley) [Hinkley, 1969]

⇒ Upon the detection of a change, restart the MAB.

DMAB = UCB + Scaling + Page-Hinkley
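A change-detection test of the Page-Hinkley family can be sketched as below. The slides do not give the exact variant used in DMAB, so this is a textbook two-sided form under stated assumptions: `delta` is a hypothetical tolerance parameter, `threshold` plays the role of the Page-Hinkley threshold γ, and the class name is illustrative.

```python
class PageHinkley:
    """Two-sided Page-Hinkley test on a stream of rewards (illustrative sketch)."""

    def __init__(self, delta=0.05, threshold=1.0):
        self.delta = delta            # tolerated deviation from the running mean
        self.threshold = threshold    # detection threshold (γ on the slides)
        self.reset()

    def reset(self):
        self.n = 0
        self.mean = 0.0
        self.cum_up = 0.0      # cumulative deviation used to detect an increase
        self.min_up = 0.0
        self.cum_down = 0.0    # cumulative deviation used to detect a decrease
        self.max_down = 0.0

    def update(self, x):
        """Feed one reward; return True when a change in the mean is detected."""
        self.n += 1
        self.mean += (x - self.mean) / self.n
        self.cum_up += x - self.mean - self.delta
        self.min_up = min(self.min_up, self.cum_up)
        self.cum_down += x - self.mean + self.delta
        self.max_down = max(self.max_down, self.cum_down)
        increase = self.cum_up - self.min_up > self.threshold
        decrease = self.max_down - self.cum_down > self.threshold
        return increase or decrease
```

In a DMAB-like scheme, a `True` return value would trigger a restart of the bandit statistics together with a call to `reset()` on the detector.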

SLIDE 12

Operator Selection: Discussion

MAB = UCB + Scaling

  • Optimal EvE, but in a static setting . . . whereas AOS is dynamic

DMAB = MAB + Page-Hinkley change-detection

  • Won the Pascal challenge on the On-line EvE trade-off [Hartland et al., 2007]
  • Used in the AOS context [GECCO'08]
  • 2 hyper-parameters: scaling C and Page-Hinkley threshold γ
  • Very efficient, but very sensitive to hyper-parameter setting
  • Change-detection works only when changes are abrupt

An alternative: a "more dynamic" reward

SLIDE 13

Credit Assignment

  • Fitness-based Rewards
  • Area-Under-the-Curve (AUC)
  • Rank-based AUC with MAB

SLIDE 14

Fitness-based Rewards

Impact of an operator application?

  • Most common: fitness improvement ∆F
  • For multi-modal problems, diversity is also important [CEC'09]

From Impact to Credit (or reward)

  • Instantaneous (∆F of the last application): likely to be unstable
  • Average of the last W applications
  • Extreme value over the last W applications [PPSN'08]
    Rare extreme events are more important than the average (e.g. rogue waves, epidemic propagation)

Issues: high sensitivity to scaling parameters (likely to be dynamic, too)
⇒ Higher robustness: Credit Assignment based on ranks
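The three fitness-based credits listed above differ only in how they summarize the recent impacts of one operator. A minimal sketch, assuming a sliding window of size W per operator (the class name `CreditWindow` and the sample values are hypothetical):

```python
from collections import deque

class CreditWindow:
    """Sliding window holding the last W fitness improvements of one operator."""

    def __init__(self, window_size):
        self.window = deque(maxlen=window_size)   # older entries drop out automatically

    def add(self, delta_f):
        self.window.append(delta_f)

    def instantaneous(self):
        return self.window[-1]                    # ∆F of the last application

    def average(self):
        return sum(self.window) / len(self.window)

    def extreme(self):
        return max(self.window)                   # best improvement in the window

cw = CreditWindow(5)
for delta_f in [0.0, 0.1, 0.0, 4.0, 0.0, 0.2]:
    cw.add(delta_f)
```

With these values the window holds the last five improvements, so the single large improvement of 4.0 dominates the extreme credit while barely showing in the instantaneous one.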

SLIDE 15

Area-Under-the-Curve (AUC)

Area Under ROC Curve in ML

  • Evaluation of binary classifiers [Fawcett, 2006]
      [ + + - - + + + - - - - . . . ]
  • Performance: % of misclassification
  • Equivalent to the Mann-Whitney-Wilcoxon test: Pr(rank(n+) > rank(n−))

Area Under ROC Curve in AOS

  • One operator versus the others [GECCO'10]
      [ op1, op2, op1, op1, op1, op2, op2, . . . ]
  • Fitness improvements are ranked
  • Size of the segment = assigned rank-value

[Figure: example curve contrasting the operator under assessment (1) with the other operators]

SLIDE 16

Rank-Based AUC

Example: 15 fitness improvements ranked (R) by decreasing ∆F, with the operator (Op) that produced each one.

  R    ∆F   Op
  1   5.0    2
  2   4.7    2
  3   4.2    1
  4   3.5    1
  5   3.4    2
  6   3.3    2
  7   3.1    2
  8   3.0    2
  9   2.9    2
 10   2.8    2
 11   2.5    3
 12   2.0    1
 13   1.5    0
 14   1.0    3
 15   0.8    0

Curve for Op. 2, built while going down the ranked list (a vertical step for each entry of Op. 2, a horizontal step for each entry of another operator), given as (horizontal, vertical) coordinates:

  • Step 0: (0, 0)
  • Step 1: (0, 1)
  • Step 2: (0, 2)
  • Step 3: (1, 2)
  • Step 4: (2, 2)
  • . . .
  • Step 15: (7, 8)

Original AUC in ML: equal widths (segments with the same size)

  • Op. 0: 1.39
  • Op. 1: 31.94
  • Op. 2: 61.11
  • Op. 3: 5.56

Exponential decay D^R (W − R) (example with D = 0.5) ⇒

  • Op. 0: 0.00
  • Op. 1: 5.36
  • Op. 2: 94.64
  • Op. 3: 0.00
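The rank-based AUC of the example can be recomputed with a short sketch. Several points are assumptions rather than statements from the slides: ranks 13 and 15 are taken as entries of Op. 0, the areas are normalized so that they sum to 100, the rank in the decay formula is counted from 0, and ties are ignored. Under these assumptions the sketch gives the values listed for both weightings. The function names are illustrative.

```python
def auc_per_operator(ranked_ops, n_ops, widths):
    """Area under the curve of each operator, for a list of operators sorted by decreasing ∆F."""
    areas = []
    for op in range(n_ops):
        height, area = 0.0, 0.0
        for who, w in zip(ranked_ops, widths):
            if who == op:
                height += w            # vertical segment: operator under assessment
            else:
                area += w * height     # horizontal segment: any other operator
        areas.append(area)
    return areas

def shares(areas):
    """Areas normalized to sum to 100 (assumed normalization), rounded to 2 decimals."""
    total = sum(areas)
    return [round(100.0 * a / total, 2) for a in areas]

# Operator column of the ranked table (ranks 1 to 15)
ranked_ops = [2, 2, 1, 1, 2, 2, 2, 2, 2, 2, 3, 1, 0, 3, 0]
W = len(ranked_ops)

equal_widths = [1.0] * W                                 # original AUC in ML
decay_widths = [0.5 ** r * (W - r) for r in range(W)]    # D^r (W - r) with D = 0.5, r from 0

print(shares(auc_per_operator(ranked_ops, 4, equal_widths)))   # [1.39, 31.94, 61.11, 5.56]
print(shares(auc_per_operator(ranked_ops, 4, decay_widths)))   # [0.0, 5.36, 94.64, 0.0]
```

The decay concentrates almost all the weight on the top-ranked entries, which is why Op. 2, holding ranks 1 and 2, jumps from about 61 to about 95.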

SLIDE 25

Rank-based AUC with MAB

Rationale

  • AUC reflects the behavior of all operators: dynamic by construction
  • AUC is already an aggregation ⇒ directly use AUC in UCB:

    $\mathrm{score}_{j,t} = \mathrm{AUC}_{j,t} + C \cdot \sqrt{\frac{2 \log \sum_k n_{k,t}}{n_{j,t}}}$

Area-Under-Curve (AUC)

  • Ranks over fitness improvements (∆F)
  • Invariant w.r.t. linear scaling of F

Fitness-based AUC (FAUC)

  • Ranks over fitness values (F), rather than ranks over ∆F
  • Invariant w.r.t. monotonic transformations of F → comparison-based AOS
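The two invariance claims can be checked on a toy example. The (parent, offspring) fitness pairs below are hypothetical, as are the function names; the sketch only compares rankings and does not compute any AUC.

```python
import math

def ranking(values):
    """Indices of `values` sorted from best (largest) to worst."""
    return sorted(range(len(values)), key=lambda i: -values[i])

# Hypothetical (parent fitness, offspring fitness) pairs
pairs = [(1.0, 3.0), (10.0, 11.5), (4.0, 4.5)]

def rank_by_improvement(g):
    """Ranking over ∆F after transforming the fitness with g (what AUC relies on)."""
    return ranking([g(child) - g(parent) for parent, child in pairs])

def rank_by_fitness(g):
    """Ranking over offspring fitness after transforming with g (what FAUC relies on)."""
    return ranking([g(child) for parent, child in pairs])

identity = lambda f: f
linear = lambda f: 3.0 * f + 7.0
square = lambda f: f ** 2

print(rank_by_improvement(identity), rank_by_improvement(linear))   # same ranking
print(rank_by_improvement(square))                                  # ranking changes
print(rank_by_fitness(identity), rank_by_fitness(square))           # same ranking
```

Ranks over ∆F survive the linear rescaling but not the squaring, whereas ranks over F are unchanged by any increasing transformation, including `math.log` and `math.exp`.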

SLIDE 26

Empirical Validation

  • Goals of Experiments
  • (1+50)-EA on the OneMax Problem
  • DE on BBOB continuous Benchmarks

SLIDE 27

Goals of Experiments

Given a set of K operators . . .

Performance?

Baseline methods

  1 Each operator being applied alone
  2 Naive uniform selection between operators
  3 Static off-line tuning of application rates (cost ≫)
  4 Optimal behavior (available only on simple benchmarks)
  5 State-of-the-art OS method: Adaptive Pursuit [Thierens, 2005]

Robustness/Generality?

  • AOS methods have hyper-parameters, tuned off-line by F-RACE [Birattari et al., 2002]
  • Robustness w.r.t. hyper-parameter setting
  • Generality w.r.t. different problems/landscapes: invariance properties

SLIDE 28

The OneMax Problem

  • 10^4 bits
  • Fitness: # of "1"s
  • (1+50)-GA
  • 4 mutation operators: 1-Bit, 3-Bit, 5-Bit, 1/n BitFlip

[Figure: performance of operators on OneMax as a function of the fitness of the parent (1-Bit, 3-Bit, 5-Bit, 1/n BitFlip)]

[Figure: Optimal Operator Selection (Oracle): operator selection over 5000 generations for 1-Bit, 3-Bit, 5-Bit and 1/n BitFlip, with the fitness curve and the change points]
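The benchmark setting can be sketched as follows. The slides only name the operators, so their definitions here are assumptions: the k-Bit operators are taken to flip exactly k distinct bits, and 1/n BitFlip is taken to be the standard independent bit-flip mutation. All function names are illustrative.

```python
import random

def one_max(bits):
    """OneMax fitness: number of ones in the bit string."""
    return sum(bits)

def k_bit_flip(k):
    """Mutation flipping exactly k distinct, uniformly chosen bits."""
    def mutate(bits, rng):
        child = list(bits)
        for i in rng.sample(range(len(bits)), k):
            child[i] = 1 - child[i]
        return child
    return mutate

def bit_flip(bits, rng):
    """Each bit is flipped independently with probability 1/n."""
    p = 1.0 / len(bits)
    return [1 - b if rng.random() < p else b for b in bits]

OPERATORS = {"1-Bit": k_bit_flip(1), "3-Bit": k_bit_flip(3),
             "5-Bit": k_bit_flip(5), "1/n BitFlip": bit_flip}

def one_plus_lambda_step(parent, operator, rng, n_offspring=50):
    """One generation of a (1+50)-style EA: keep the best of the parent and its offspring."""
    best = parent
    for _ in range(n_offspring):
        child = operator(parent, rng)
        if one_max(child) > one_max(best):
            best = child
    return best
```

An AOS method would choose which entry of `OPERATORS` to pass to `one_plus_lambda_step` at each generation; the sketch uses a single operator per generation for simplicity.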

SLIDE 29

Comparative Results

[Figure: three panels over 5000 generations: Extreme - Adaptive Pursuit; Extreme - Dynamic Multi-Armed Bandit; Area-Under-Curve - Bandit]

SLIDE 30

Monotonic Transformations of the Fitness

Original OneMax: $F = \sum_{i=1}^{n} b_i$

3 monotonic transformations: log(F), exp(F) and F^2

AOS tech.    (h-l)   F = Σ b_i   log(F)      exp(F)        F^2
AUC-MAB        485   5103/427    5195/430    5562/950      5588/950
Ext-AP         807   5123/218    5431/223    5930/334      5792/382
FAUC-MAB         0   5726/399    5726/399    5726/399      5726/399
Ext-DMAB      2591   5376/285    7967/718    7722/2151     6138/516
Ext-SLMAB     6971   6059/667    8863/694    13030/3053    12136/949
Ext-MAB       7052   9044/840    7947/1267   14999/0       14999/0

Other (artificial) scenarios

  • Binary: Long K-Path, Royal Road, . . .
  • Combinatorial: SAT
  • Continuous: BBOB

SLIDE 32

DE on BBOB continuous Benchmarks

  • Experimental framework for rigorous benchmarking [Hansen et al., 2010]
  • 24 continuous functions, 15 instances per function
  • Several problem dimensions (2, 3, 5, 10, 20, 40)

Adaptive Operator Selection in Differential Evolution

  • A completely different evolutionary algorithm [Storn and Price, 1995]
  • NP = 100 · DIM; CR = 1.0; F = 0.5
  • With 4 possible mutation strategies: rand/1, rand/2, rand-to-best/2, current-to-rand/1
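The four mutation strategies can be sketched as below. The slides only give their names, so the formulas follow one common formulation from the DE literature and should be read as an assumption; the helper names `pick` and `add_scaled` are hypothetical, and `F` is the scaling factor (0.5 in the setting above).

```python
import random

def pick(pop, i, n, rng):
    """n distinct individuals drawn uniformly, all different from the current index i."""
    return [pop[j] for j in rng.sample([j for j in range(len(pop)) if j != i], n)]

def add_scaled(base, diffs, F):
    """base + F * (sum of the difference vectors a - b), component-wise."""
    return [x + F * sum(a[d] - b[d] for a, b in diffs) for d, x in enumerate(base)]

def rand_1(pop, i, best, F, rng):
    r1, r2, r3 = pick(pop, i, 3, rng)
    return add_scaled(r1, [(r2, r3)], F)

def rand_2(pop, i, best, F, rng):
    r1, r2, r3, r4, r5 = pick(pop, i, 5, rng)
    return add_scaled(r1, [(r2, r3), (r4, r5)], F)

def rand_to_best_2(pop, i, best, F, rng):
    r1, r2, r3, r4, r5 = pick(pop, i, 5, rng)
    return add_scaled(r1, [(best, r1), (r2, r3), (r4, r5)], F)

def current_to_rand_1(pop, i, best, F, rng):
    r1, r2, r3 = pick(pop, i, 3, rng)
    return add_scaled(pop[i], [(r1, pop[i]), (r2, r3)], F)

# The set of "operators" an AOS method would choose from at each offspring generation
STRATEGIES = [rand_1, rand_2, rand_to_best_2, current_to_rand_1]
```

All four functions share the same signature so that a selection mechanism can treat them interchangeably; the population must contain at least six individuals for the strategies that draw five distinct ones.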

SLIDE 33

Pairwise comparisons of FAUC-Bandit with . . . (sample fig)

[Figure: proportion versus log10 of FEvals(A1)/FEvals(A0), functions f6-9; DE1: 4/4, DE2: 4/3, DE3: 4/4, DE4: 4/0]

SLIDE 34

Pairwise comparisons of FAUC-Bandit with . . .

[Figures: proportion versus log10 of FEvals(A1)/FEvals(A0), four panels]

  (a) all functions (f1-24): DE1: 15/15, DE2: 15/12, DE3: 15/15, DE4: 15/0
  (b) separable functions (f1-5): DE1: 3/3, DE2: 3/3, DE3: 3/3, DE4: 3/0
  (c) moderate functions (f6-9): DE1: 4/4, DE2: 4/3, DE3: 4/4, DE4: 4/0
  (d) ill-conditioned functions (f10-14): DE1: 5/5, DE2: 5/5, DE3: 5/5, DE4: 5/0

SLIDE 35

Pairwise comparisons of FAUC-Bandit with . . .

[Figures: proportion versus log10 of FEvals(A1)/FEvals(A0), four panels]

  (e) all functions (f1-24): pm: 15/15, AP: 15/15, DMAB: 15/15, SLMAB: 15/15, MAB: 15/15
  (f) separable functions (f1-5): pm: 3/3, AP: 3/3, DMAB: 3/3, SLMAB: 3/3, MAB: 3/3
  (g) moderate functions (f6-9): pm: 4/4, AP: 4/4, DMAB: 4/4, SLMAB: 4/4, MAB: 4/4
  (h) ill-conditioned functions (f10-14): pm: 5/5, AP: 5/5, DMAB: 5/5, SLMAB: 5/5, MAB: 5/5

SLIDE 36

Pairwise comparisons of FAUC-Bandit with . . .

[Figures: proportion versus log10 of FEvals(A1)/FEvals(A0), four panels]

  (i) all functions (f1-24): Naive: 15/15, StAll: 15/15, StEach: 15/15, CMA: 15/19
  (j) separable functions (f1-5): Naive: 3/3, StAll: 3/3, StEach: 3/3, CMA: 3/3
  (k) moderate functions (f6-9): Naive: 4/4, StAll: 4/4, StEach: 4/4, CMA: 4/4
  (l) ill-conditioned functions (f10-14): Naive: 5/5, StAll: 5/5, StEach: 5/5, CMA: 5/5

SLIDE 37

Conclusions & Further Work

SLIDE 38

Conclusions

Algorithmic Contributions

Operator Selection

  • MAB = UCB + Scaling
  • DMAB = MAB + Page-Hinkley test [GECCO'08]

Credit Assignment

  • Extreme value-based (∆F) [PPSN'08]
  • Rank-based methods [GECCO'10]

AOS Combinations

  • Extreme-xMAB: efficient, but sensitive w.r.t. hyper-parameters
  • (F)AUC-MAB: efficient and robust w.r.t. hyper-parameters
  • FAUC: comparison-based

⇒ Combining concepts from ML: MABs and AUC
⇒ Extending them to a dynamic context

SLIDE 39

Conclusions (2)

Empirical Validation (performance, robustness and generality)

Genetic Algorithms

  • Artificial scenarios [GECCO'08, AMAI'10, GECCO'10]
  • Boolean problems [PPSN'08, LION'09, GECCO'09, AMAI'10, GECCO'10]: OneMax, Long K-Path and Royal Road problems

Memetic Algorithms

  • SAT problems, with the Compass Credit Assignment [CEC'09, Chapter'10]: a highly multimodal context

Differential Evolution

  • Continuous problems [BBOB'10, PPSN'10]

SLIDE 40

Some Perspectives for Further Work (from 12/2010!)

Application extensions: the AOS paradigm is very general

  • Use within other meta-heuristics
  • Use at the level of hyper-heuristics: Cross-domain Heuristic Search Challenge (CHeSC)

Algorithmic extensions: towards real-world problems

  • Extend to multi-modal (diversity, pop. size, . . . )
  • Extend to multi-objective (Pareto, hyper-volume, . . . )

First real-world trial: sustainable development

  • Optimization of building designs for energy efficiency

SLIDE 41

Our Publications I

Da Costa, L., Fialho, A., Schoenauer, M., and Sebag, M. (2008). Adaptive operator selection with dynamic multi-armed bandits. In Proc. Genetic and Evolutionary Computation Conference (GECCO). ACM.

Fialho, A., Da Costa, L., Schoenauer, M., and Sebag, M. (2008). Extreme value based adaptive operator selection. In Proc. Intl. Conf. on Parallel Problem Solving from Nature (PPSN). Springer.

Fialho, A., Da Costa, L., Schoenauer, M., and Sebag, M. (2009). Dynamic multi-armed bandits and extreme value-based rewards for AOS in evolutionary algorithms. In Proc. Intl. Conf. on Learning and Intelligent Optimization (LION). Springer.

Maturana, J., Fialho, A., Saubion, F., Schoenauer, M., and Sebag, M. (2009). Extreme compass and dynamic multi-armed bandits for adaptive operator selection. In Proc. IEEE Congress on Evolutionary Computation (CEC). IEEE.

Fialho, A., Schoenauer, M., and Sebag, M. (2009). Analysis of adaptive operator selection techniques on the royal road and long k-path problems. In Proc. Genetic and Evolutionary Computation Conference (GECCO). ACM.

Maturana, J., Fialho, A., Saubion, F., Schoenauer, M., Lardeux, F., and Sebag, M. (2010). Adaptive operator selection and management in evolutionary algorithms. In Y. Hamadi et al., editor, Autonomous Search. Springer. (to appear)

Fialho, A., Da Costa, L., Schoenauer, M., and Sebag, M. (2010). Analyzing bandit-based adaptive operator selection mechanisms. Annals of Mathematics and A. I., Special Issue on Learning and Intelligent Optimization. Springer.

SLIDE 42

Our Publications II

Fialho, A., Schoenauer, M., and Sebag, M. (2010). Toward comparison-based adaptive operator selection. In Proc. Genetic and Evolutionary Computation Conference (GECCO). ACM.

Gong, W., Fialho, A., and Cai, Z. (2010). Adaptive strategy selection in differential evolution. In Proc. Genetic and Evolutionary Computation Conference (GECCO). ACM.

Fialho, A., Schoenauer, M., and Sebag, M. (2010). Fitness-AUC bandit adaptive strategy selection vs. the probability matching one within DE. In Black-Box Optimization Benchmarking Workshop (BBOB-GECCO). ACM.

Fialho, A., Gong, W., and Cai, Z. (2010). Probability matching-based adaptive strategy selection vs. uniform strategy selection within DE. In Black-Box Optimization Benchmarking Workshop (BBOB-GECCO). ACM.

Fialho, A. and Ros, R. (2010). Analysis of adaptive strategy selection within differential evolution on the BBOB-2010 noiseless benchmark. Research Report RR-7259, INRIA.

Fialho, A., Ros, R., Schoenauer, M., and Sebag, M. (2010). Comparison-based adaptive strategy selection in differential evolution. In Proc. Intl. Conf. on Parallel Problem Solving from Nature (PPSN). Springer.

Li, K., Fialho, A., and Kwong, S. (2011). Multi-objective differential evolution with adaptive control of parameters and operators. In Proc. Intl. Conf. on Learning and Intelligent Optimization (LION). Springer. (to appear)

SLIDE 44

Other References I

Auer, P., Cesa-Bianchi, N., and Fischer, P. (2002). Finite-time analysis of the multi-armed bandit problem. Machine Learning, 47(2-3):235–256.

Birattari, M., Stützle, T., Paquete, L., and Varrentrapp, K. (2002). A racing algorithm for configuring metaheuristics. In W.B. Langdon et al., editor, Proc. Genetic and Evolutionary Computation Conference (GECCO), pages 11–18. Morgan Kaufmann.

Fawcett, T. (2006). An introduction to ROC analysis. Pattern Recogn. Lett., 27(8):861–874.

Hansen, N., Auger, A., Finck, S., and Ros, R. (2010). Real-parameter black-box optimization benchmarking 2010: Experimental setup. Technical Report RR-7215, INRIA.

Hartland, C., Baskiotis, N., Gelly, S., Teytaud, O., and Sebag, M. (2007). Change point detection and meta-bandits for online learning in dynamic environments. In Proc. Conférence Francophone sur l'Apprentissage Automatique (CAPS).

Hinkley, D. (1969). Inference about the change point in a sequence of random variables. Biometrika, 57(1):1–17.

SLIDE 45

Other References II

Moore, G. (1991). Crossing the Chasm: Marketing and Selling High-Tech Products to Mainstream Customers. Collins Business Essentials.

Storn, R. and Price, K. (1995). Differential evolution - a simple and efficient adaptive scheme for global optimization over continuous spaces. Technical Report TR-95-012, Intl. Computer Science Institute.

Thierens, D. (2005). An adaptive pursuit strategy for allocating operator probabilities. In H.-G. Beyer et al., editor, Proc. Genetic and Evolutionary Computation Conference (GECCO), pages 1539–1546. ACM.