Adaptive Operator Selection with Rank-based Multi-Armed Bandits
Alvaro Fialho, Marc Schoenauer & Michèle Sebag
26th COW, April 22, 2013
Context Operator Selection Credit Assignment Empirical Validation Conclusion
Outline
1 Context & Motivation
2 Operator Selection
3 Credit Assignment
4 Empirical Validation
5 Conclusions & Further Work
Fialho, Schoenauer, Sebag Rank-based Adaptive Operator Selection 2
Context & Motivation
Evolutionary Algorithms
Adaptive Operator Selection
Evolutionary Algorithms
Stochastic optimization algorithms (Darwinian paradigm)
Bottleneck: parameter setting
Population size and number of offspring
Selection and replacement methods (and their parameters)
Variation Operators (application rate, internal parameters)
Goal: Automatic setting (Crossing the Chasm)
[Moore, 1991]
Parameter Setting of Variation Operators
Difficult to predict the performance
Problem-dependent and inter-dependent choices
Off-line tuning → best static strategy (expensive)
[Figure: Performance of operators on OneMax (1-Bit, 3-Bit, 5-Bit, 1/n BitFlip); x-axis: fitness of the parent; sample figure with a (1+50)-EA]
Performance also depends on:
- Fitness of the parents
- Pop. fitness distribution
⇒ Should be adapted on-line, while solving the problem
Adaptive Operator Selection
Position of the Problem
Given a set of K variation operators
Select on-line the operator to be applied next
Based on their recent effects
[Diagram: the EA (Operator Application, Impact Evaluation) sends the impact to the AOS; Credit Assignment turns it into a credit or reward; Operator Selection keeps quality op1, quality op2, . . ., quality opK and returns the operator to apply]
Operator Selection
A Multi-Armed Bandit problem
Operator Selection: Discussion
A (kind of) Multi-Armed Bandit problem
The Basic Multi-Armed Bandit Problem
Given K arms (≡ operators)
At time t, the gambler plays arm j and gets
$r_{j,t} = 1$ with (unknown) prob. $p_j$
$r_{j,t} = 0$ with prob. $1 - p_j$
Goal: maximize cumulative reward ≡ minimize regret
$$L(T) = \sum_{t=1}^{T} \left( r^*_t - r_t \right)$$
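As an illustration of the regret definition, the sketch below simulates Bernoulli arms and computes the expected regret of a sequence of plays, i.e. the sum over time of the gap between the best arm's success probability and that of the arm actually played. Using expectations instead of the realized rewards $r^*_t$ and $r_t$ is a simplification made here for determinism; the function names and the example probabilities are hypothetical.

```python
import random


def draw_reward(p, rng):
    """Bernoulli reward: 1 with probability p, 0 otherwise."""
    return 1 if rng.random() < p else 0


def expected_regret(probs, plays):
    """Expected regret of a play sequence.

    probs[j] is the success probability p_j of arm j (unknown to the gambler).
    plays is the list of arm indices chosen at t = 1..T.
    Each play of arm j costs (p* - p_j), where p* is the best probability.
    """
    best = max(probs)
    return sum(best - probs[j] for j in plays)


if __name__ == "__main__":
    rng = random.Random(0)
    probs = [0.2, 0.5, 0.9]          # hypothetical arms
    plays = [0, 1, 2, 2]             # two exploratory plays, then the best arm
    rewards = [draw_reward(probs[j], rng) for j in plays]
    print("rewards:", rewards)
    print("expected regret:", expected_regret(probs, plays))
```

Only the plays of sub-optimal arms contribute to the regret, which is why a good strategy must make exploratory plays increasingly rare.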
The Upper Confidence Bound MAB algorithm
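Asymptotic optimality guarantees (static context) [Auer et al., 2002]
Optimal L(T) = O(log T)
At time t, choose arm i maximizing:
$$\text{score}_{i,t} = \underbrace{\hat{q}_{i,t}}_{\text{exploitation}} + \underbrace{\sqrt{\frac{2 \log \sum_k n_{k,t}}{n_{i,t}}}}_{\text{exploration}}$$
with, for the arm i that was played,
$$n_{i,t+1} = n_{i,t} + 1 \quad \text{(number of times arm } i \text{ was played)}$$
and
$$\hat{q}_{i,t+1} = \left(1 - \frac{1}{n_{i,t+1}}\right) \hat{q}_{i,t} + \frac{1}{n_{i,t+1}} \, r_{i,t} \quad \text{(empirical quality)}$$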
Efficiency comes from optimal EvE balance
Interval between exploration trials increases exponentially w.r.t. # time steps
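A minimal sketch of the UCB selection and update rules, written directly from the score and update formulas of this slide. Playing every arm once before using the score is a common convention assumed here (it avoids dividing by zero); the class name, method names and the two-arm simulation are illustrative and not taken from the talk.

```python
import math
import random


class UCB1:
    """UCB bandit: score_i = q_i + sqrt(2 * log(sum_k n_k) / n_i)."""

    def __init__(self, n_arms):
        self.n = [0] * n_arms      # n_i: number of times arm i was played
        self.q = [0.0] * n_arms    # q_i: empirical quality of arm i

    def select(self):
        # Assumption: play each arm once first so that every n_i > 0.
        for arm, count in enumerate(self.n):
            if count == 0:
                return arm
        total = sum(self.n)
        scores = [
            self.q[i] + math.sqrt(2.0 * math.log(total) / self.n[i])
            for i in range(len(self.n))
        ]
        return max(range(len(scores)), key=scores.__getitem__)

    def update(self, arm, reward):
        # q <- (1 - 1/n) * q + (1/n) * r, written incrementally.
        self.n[arm] += 1
        self.q[arm] += (reward - self.q[arm]) / self.n[arm]


def simulate(probs, steps, seed=0):
    """Run UCB1 on Bernoulli arms with success probabilities `probs`."""
    rng = random.Random(seed)
    bandit = UCB1(len(probs))
    for _ in range(steps):
        arm = bandit.select()
        bandit.update(arm, 1.0 if rng.random() < probs[arm] else 0.0)
    return bandit


if __name__ == "__main__":
    print(simulate([0.2, 0.8], 2000).n)  # play counts per arm
```

The exploration term shrinks as an arm is played more often, so the sub-optimal arm is revisited less and less frequently.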
Operator Selection with UCB: shortcomings
Exploration vs. Exploitation (EvE) balance
In UCB theory, rewards ∈ {0, 1}; fitness-based rewards ∈ [a, b]
UCB's EvE balance is broken; scaling is needed:
$$\text{score}_{i,t} = \hat{q}_{i,t} + C \sqrt{\frac{2 \log \sum_k n_{k,t}}{n_{i,t}}}$$
Dynamical setting (best arm/op changes along evolution)
Adjusting the $\hat{q}$ values after a change takes a long time
Use a change-detection test (e.g. Page-Hinkley) [Hinkley, 1969]
⇒ Upon the detection of a change, restart the MAB.
DMAB = UCB + Scaling + Page-Hinkley
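The three ingredients can be sketched as follows. The Page-Hinkley test below is one common one-sided formulation (it detects a drop in the mean reward); the exact variant used by the authors may differ, and the tolerance `delta` as well as all class and method names are assumptions made for this sketch. One test per arm is kept, and any detection restarts the whole bandit.

```python
import math


class PageHinkley:
    """One-sided Page-Hinkley test for a drop in the mean of a stream."""

    def __init__(self, gamma, delta=0.05):
        self.gamma = gamma    # detection threshold
        self.delta = delta    # tolerance (assumed value)
        self.reset()

    def reset(self):
        self.t = 0
        self.mean = 0.0
        self.m = 0.0          # cumulated deviation from the running mean
        self.m_max = 0.0      # largest value of m seen so far

    def update(self, x):
        """Feed one reward; return True if a change is detected."""
        self.t += 1
        self.mean += (x - self.mean) / self.t
        self.m += x - self.mean + self.delta
        self.m_max = max(self.m_max, self.m)
        return self.m_max - self.m > self.gamma


class DMAB:
    """UCB with scaling factor C, restarted when Page-Hinkley fires."""

    def __init__(self, n_arms, C, gamma):
        self.C = C
        self.n = [0] * n_arms
        self.q = [0.0] * n_arms
        self.tests = [PageHinkley(gamma) for _ in range(n_arms)]

    def select(self):
        for arm, count in enumerate(self.n):
            if count == 0:
                return arm
        total = sum(self.n)
        return max(
            range(len(self.n)),
            key=lambda i: self.q[i]
            + self.C * math.sqrt(2.0 * math.log(total) / self.n[i]),
        )

    def update(self, arm, reward):
        """Update arm statistics; return True if the bandit was restarted."""
        self.n[arm] += 1
        self.q[arm] += (reward - self.q[arm]) / self.n[arm]
        if self.tests[arm].update(reward):
            self.restart()
            return True
        return False

    def restart(self):
        self.n = [0] * len(self.n)
        self.q = [0.0] * len(self.q)
        for test in self.tests:
            test.reset()
```

The two hyper-parameters of this sketch, `C` and `gamma`, correspond to the scaling factor and the Page-Hinkley threshold.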
Operator Selection: Discussion
MAB = UCB + Scaling
Optimal EvE, but in static setting. . . AOS is dynamic
DMAB = MAB + Page-Hinkley change-detection
Won the Pascal challenge on the On-line EvE trade-off [Hartland et al., 2007]
Utilization in the AOS context [GECCO'08]
2 hyper-parameters: scaling C and Page-Hinkley threshold γ
Very efficient, but very sensitive to hyper-parameter setting
Change-detection works only when changes are abrupt
An alternative: 'More Dynamic' Reward
Credit Assignment
Fitness-based Rewards
Area-Under-the-Curve (AUC)
Rank-based AUC with MAB
Fitness-based Rewards
Impact of an operator application?
Most common: Fitness Improvement ∆F
For multi-modal problems: diversity also important [CEC'09]
From Impact to Credit (or reward)
Instantaneous (∆F of the last application): likely to be unstable
Average of the last W applications
Extreme value over the last W applications [PPSN'08]
Rare extreme events are more important than the average, e.g. rogue waves, epidemic propagation
Issues: High sensitivity to scaling parameters . . . likely to be dynamic, too
Higher robustness: Credit Assignment based on Ranks
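The three fitness-based credits listed above (instantaneous, average and extreme over the last W applications) can be sketched with one sliding window per operator. Clamping negative improvements to zero and returning 0 for an operator that has not been applied yet are assumptions of this sketch; the class and method names are hypothetical.

```python
from collections import deque


class WindowCredit:
    """Fitness-based credit from the last W improvements of each operator."""

    def __init__(self, n_ops, window):
        # deque(maxlen=W) silently drops the oldest entry when full.
        self.windows = [deque(maxlen=window) for _ in range(n_ops)]

    def record(self, op, delta_f):
        # Assumption: a deterioration counts as "no improvement".
        self.windows[op].append(max(0.0, delta_f))

    def instantaneous(self, op):
        w = self.windows[op]
        return w[-1] if w else 0.0

    def average(self, op):
        w = self.windows[op]
        return sum(w) / len(w) if w else 0.0

    def extreme(self, op):
        return max(self.windows[op], default=0.0)


if __name__ == "__main__":
    credit = WindowCredit(n_ops=1, window=4)
    for delta_f in [0.0, 0.0, 5.0, 0.0]:
        credit.record(0, delta_f)
    print(credit.instantaneous(0), credit.average(0), credit.extreme(0))
```

In this toy sequence a single large improvement is invisible to the instantaneous credit, diluted by the average, and fully retained by the extreme credit for as long as it stays in the window.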
Area-Under-the-Curve (AUC)
Area Under ROC Curve in ML
Evaluation of binary classifiers [Fawcett, 2006]
[ + + - - + + + - - - - . . . ]
Performance: % of misclassification
Equivalent to the Mann-Whitney-Wilcoxon test: Pr(rank(n+) > rank(n−))
Area Under ROC Curve in AOS
One operator versus others [GECCO'10]
[ op1, op2, op1, op1, op1, op2, op2, . . . ]
Fitness improvements are ranked
Size of the segment = assigned rank-value
[Figure: ROC-like curve; axes: operator under assessment (1) vs. other operators]
Rank-Based AUC

R    ∆F    Op
1    5.0   2
2    4.7   2
3    4.2   1
4    3.5   1
5    3.4   2
6    3.3   2
7    3.1   2
8    3.0   2
9    2.9   2
10   2.8   2
11   2.5   3
12   2.0   1
13   1.5   0
14   1.0   3
15   0.8   0

Curve construction for Op. 2 (segments for Op. 2, segments for the other operators):
Step 0: (0, 0); Step 1: (1, 0); Step 2: (2, 0); Step 3: (2, 1); Step 4: (2, 2); . . . ; Step 15: (8, 7)

Original AUC in ML: equal widths (segments with same size)
Op. 0: 1.39   Op. 1: 31.94   Op. 2: 61.11   Op. 3: 5.56

Exponential decay: segment size $D^R (W - R)$ (example with D = 0.5)
Op. 0: 0.00   Op. 1: 5.36   Op. 2: 94.64   Op. 3: 0.00
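The worked example can be reproduced with the short sketch below. It assumes that ranks are counted from 0 for the best improvement when computing the segment size $D^R (W - R)$, that the areas are normalized so they sum to 1 across operators, and that ranks 13 and 15 (blank in the extracted table) belong to Op. 0; these choices are the ones that give back the percentages shown on the slide. Ties between improvements are ignored, and the function name is hypothetical.

```python
def auc_shares(ranked_ops, n_ops, decay=None):
    """Normalized rank-based AUC of each operator.

    ranked_ops[r] is the operator that produced the r-th best fitness
    improvement (r = 0 is the best). With decay=None all segments have the
    same size; otherwise the segment of rank r has size decay**r * (W - r).
    """
    W = len(ranked_ops)
    if decay is None:
        sizes = [1.0] * W
    else:
        sizes = [decay ** r * (W - r) for r in range(W)]

    areas = []
    for op in range(n_ops):
        height = 0.0
        area = 0.0
        for r, who in enumerate(ranked_ops):
            if who == op:
                height += sizes[r]          # step for the assessed operator
            else:
                area += height * sizes[r]   # step for the other operators
        areas.append(area)

    total = sum(areas)
    return [a / total if total > 0 else 0.0 for a in areas]


if __name__ == "__main__":
    # Operators in rank order, taken from the table (Op. 0 at ranks 13, 15).
    ranked = [2, 2, 1, 1, 2, 2, 2, 2, 2, 2, 3, 1, 0, 3, 0]
    print([round(100 * s, 2) for s in auc_shares(ranked, 4)])
    print([round(100 * s, 2) for s in auc_shares(ranked, 4, decay=0.5)])
```

With decay, the top-ranked applications dominate the area, which is why Op. 2 (ranks 1 and 2) takes almost all of the credit when D = 0.5.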
Rank-based AUC with MAB
Rationale
AUC reflects the behavior of all operators: dynamic by construction
AUC is already an aggregation ⇒ directly use AUC in UCB:
$$\text{score}_{j,t} = \text{AUC}_{j,t} + C \cdot \sqrt{\frac{2 \log \sum_k n_{k,t}}{n_{j,t}}}$$
Area-Under-Curve (AUC)
Ranks over fitness improvements (∆F)
Invariant w.r.t. linear scaling of F
Fitness-based AUC (FAUC)
Ranks over fitness values (F), rather than ranks over ∆F
Invariant w.r.t. monotonous transformations of F → Comparison-based AOS
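The two invariance claims can be checked on a toy example. The parent and offspring fitness values below are made up for illustration: the ranking of improvements ∆F survives a linear rescaling of F but not a non-linear monotonous one (here F²), while the ranking of the fitness values themselves survives any increasing transformation.

```python
import math


def rank_order(values):
    """Indices of `values`, sorted from the largest to the smallest value."""
    return sorted(range(len(values)), key=lambda i: -values[i])


# Hypothetical parent/offspring fitness pairs for three operator applications.
PARENTS = [10.0, 40.0, 90.0]
OFFSPRING = [14.0, 43.0, 92.0]


def delta_order(transform):
    """Ranking of the improvements after transforming the fitness (AUC)."""
    deltas = [transform(o) - transform(p) for p, o in zip(PARENTS, OFFSPRING)]
    return rank_order(deltas)


def fitness_order(transform):
    """Ranking of the offspring fitness after the transformation (FAUC)."""
    return rank_order([transform(o) for o in OFFSPRING])


if __name__ == "__main__":
    identity = lambda f: f
    linear = lambda f: 3.0 * f + 7.0
    square = lambda f: f * f
    print("ranks over delta F:", delta_order(identity), delta_order(linear),
          delta_order(square))
    print("ranks over F:      ", fitness_order(identity), fitness_order(math.log),
          fitness_order(square))
```

Because the AUC only depends on the order of the ranked items, an unchanged ranking means an unchanged credit for every operator.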
Empirical Validation
Goals of Experiments
(1+50)-EA on the OneMax Problem
DE on BBOB continuous Benchmarks
Goals of Experiments
Given a set of K operators . . .
Performance?
Baseline methods
1. Each operator being applied alone
2. Naive uniform selection between operators
3. Static off-line tuning of application rates (cost ≫)
4. Optimal behavior (available only on simple benchmarks)
5. State-of-the-art OS method: Adaptive Pursuit [Thierens, 2005]
Robustness/Generality?
AOS methods have hyper-parameters
Tuned off-line by F-RACE [Birattari et al., 2002]
Robustness w.r.t. hyper-parameter setting
Generality w.r.t. different problems/landscapes
Invariance properties
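For reference, the Adaptive Pursuit baseline can be sketched as below, following the usual description of the method: each operator keeps a quality estimate updated with learning rate α, and the selection probabilities are pushed toward p_max for the current best operator and toward p_min for the others with learning rate β. The default parameter values, the zero initialization of the quality estimates and all names are assumptions of this sketch and not the settings used in the experiments.

```python
import random


class AdaptivePursuit:
    """Adaptive Pursuit operator selection (sketch)."""

    def __init__(self, n_ops, alpha=0.8, beta=0.8, p_min=0.05):
        self.alpha = alpha
        self.beta = beta
        self.p_min = p_min
        # p_max is chosen so that the target probabilities sum to 1.
        self.p_max = 1.0 - (n_ops - 1) * p_min
        self.q = [0.0] * n_ops               # quality estimates (assumed init)
        self.p = [1.0 / n_ops] * n_ops       # selection probabilities

    def select(self, rng):
        """Draw an operator index according to the current probabilities."""
        return rng.choices(range(len(self.p)), weights=self.p)[0]

    def update(self, op, reward):
        # Exponential recency-weighted quality estimate of the applied operator.
        self.q[op] += self.alpha * (reward - self.q[op])
        best = max(range(len(self.q)), key=self.q.__getitem__)
        # Pursue the best operator: move it toward p_max, the others to p_min.
        for i in range(len(self.p)):
            target = self.p_max if i == best else self.p_min
            self.p[i] += self.beta * (target - self.p[i])


if __name__ == "__main__":
    rng = random.Random(0)
    ap = AdaptivePursuit(n_ops=4)
    for _ in range(20):
        op = ap.select(rng)
        ap.update(op, reward=1.0 if op == 2 else 0.0)
    print([round(p, 3) for p in ap.p])
```

The probabilities always sum to 1, and p_min keeps every operator selectable so that a change of the best operator can still be noticed.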
The OneMax Problem
10^4 bits
Fitness: # of "1"s
(1+50)-GA
4 mutation operators: 1-Bit, 3-Bit, 5-Bit, 1/n BitFlip
[Figure: Performance of operators on OneMax; x-axis: fitness of the parent]
[Figure: Optimal Operator Selection (Oracle); x-axis: generations; curves: Fitness, 1-Bit, 3-Bit, 5-Bit, 1/n BitFlip, Changes]
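A minimal sketch of this benchmark: the OneMax fitness, the four mutation operators, and one generation of a (1+50) scheme that applies a given operator to the parent. Replacing the parent only on strict improvement and reporting the improvement of the best offspring as the impact are assumptions of this sketch, and the short bit string in the usage example is only there to keep the run fast.

```python
import random


def onemax(bits):
    """OneMax fitness: number of 1s in the bit string."""
    return sum(bits)


def flip_k_bits(bits, k, rng):
    """Flip exactly k distinct, uniformly chosen bits."""
    child = list(bits)
    for i in rng.sample(range(len(bits)), k):
        child[i] ^= 1
    return child


def bitflip(bits, rng):
    """Flip each bit independently with probability 1/n."""
    p = 1.0 / len(bits)
    return [b ^ 1 if rng.random() < p else b for b in bits]


OPERATORS = {
    "1-Bit": lambda bits, rng: flip_k_bits(bits, 1, rng),
    "3-Bit": lambda bits, rng: flip_k_bits(bits, 3, rng),
    "5-Bit": lambda bits, rng: flip_k_bits(bits, 5, rng),
    "1/n BitFlip": bitflip,
}


def generation(parent, op_name, rng, n_offspring=50):
    """One (1+50) generation; returns (new parent, fitness improvement)."""
    op = OPERATORS[op_name]
    best = max((op(parent, rng) for _ in range(n_offspring)), key=onemax)
    delta_f = onemax(best) - onemax(parent)
    # Assumption: the parent is replaced only if the best offspring is better.
    return (best, delta_f) if delta_f > 0 else (parent, 0)


if __name__ == "__main__":
    rng = random.Random(0)
    parent = [0] * 100                      # short string for a quick demo
    for name in OPERATORS:
        _, gain = generation(parent, name, rng)
        print(name, gain)
```

Far from the optimum, operators that flip more bits yield larger gains, while near the optimum only small mutations still improve, which is why the best operator changes during the run.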
Comparative Results
[Figure, three panels; x-axis: generations (1000 to 5000): Extreme - Adaptive Pursuit; Extreme - Dynamic Multi-Armed Bandit; Area-Under-Curve - Bandit]
Monotonous Transformations of the Fitness
Original OneMax: $F = \sum_{i=1}^{n} b_i$
3 monotonous transformations: log(F), exp(F) and F^2

AOS tech.    (h-l)   F = Σ b_i    log(F)       exp(F)        F^2
AUC-MAB      485     5103/427     5195/430     5562/950      5588/950
Ext-AP       807     5123/218     5431/223     5930/334      5792/382
FAUC-MAB     0       5726/399     5726/399     5726/399      5726/399
Ext-DMAB     2591    5376/285     7967/718     7722/2151     6138/516
Ext-SLMAB    6971    6059/667     8863/694     13030/3053    12136/949
Ext-MAB      7052    9044/840     7947/1267    14999/0       14999/0
Other (artificial) scenarios
Binary: Long K-Path, Royal Road, . . .
Combinatorial: SAT
Continuous: BBOB
DE on BBOB continuous Benchmarks
Experimental framework for rigorous benchmarking [Hansen et al., 2010]
24 continuous functions, 15 instances per function
Several problem dimensions (2, 3, 5, 10, 20, 40)
Adaptive Operator Selection in Differential Evolution
A completely different evolutionary algorithm [Storn and Price, 1995]
NP = 100 · DIM; CR = 1.0; F = 0.5
With 4 possible mutation strategies: rand/1, rand/2, rand-to-best/2, current-to-rand/1
Pairwise comparisons of FAUC-Bandit with . . . (sample fig)
[Figure: x-axis: log10 of FEvals(A1)/FEvals(A0); y-axis: proportion; functions f6-9; legend: DE1: 4/4, DE2: 4/3, DE3: 4/4, DE4: 4/0]
Pairwise comparisons of FAUC-Bandit with . . .
[Figure, four panels; x-axis: log10 of FEvals(A1)/FEvals(A0); y-axis: proportion]
(a) all functions (f1-24): DE1: 15/15, DE2: 15/12, DE3: 15/15, DE4: 15/0
(b) separable functions (f1-5): DE1: 3/3, DE2: 3/3, DE3: 3/3, DE4: 3/0
(c) moderate functions (f6-9): DE1: 4/4, DE2: 4/3, DE3: 4/4, DE4: 4/0
(d) ill-conditioned functions (f10-14): DE1: 5/5, DE2: 5/5, DE3: 5/5, DE4: 5/0
Pairwise comparisons of FAUC-Bandit with . . .
[Figure, four panels; x-axis: log10 of FEvals(A1)/FEvals(A0); y-axis: proportion]
(e) all functions (f1-24): pm: 15/15, AP: 15/15, DMAB: 15/15, SLMAB: 15/15, MAB: 15/15
(f) separable functions (f1-5): pm: 3/3, AP: 3/3, DMAB: 3/3, SLMAB: 3/3, MAB: 3/3
(g) moderate functions (f6-9): pm: 4/4, AP: 4/4, DMAB: 4/4, SLMAB: 4/4, MAB: 4/4
(h) ill-conditioned functions (f10-14): pm: 5/5, AP: 5/5, DMAB: 5/5, SLMAB: 5/5, MAB: 5/5
Pairwise comparisons of FAUC-Bandit with . . .
[Figure, four panels; x-axis: log10 of FEvals(A1)/FEvals(A0); y-axis: proportion]
(i) all functions (f1-24): Naive: 15/15, StAll: 15/15, StEach: 15/15, CMA: 15/19
(j) separable functions (f1-5): Naive: 3/3, StAll: 3/3, StEach: 3/3, CMA: 3/3
(k) moderate functions (f6-9): Naive: 4/4, StAll: 4/4, StEach: 4/4, CMA: 4/4
(l) ill-conditioned functions (f10-14): Naive: 5/5, StAll: 5/5, StEach: 5/5, CMA: 5/5
Conclusions & Further Work
Conclusions
Algorithmic Contributions
Operator Selection
MAB = UCB + Scaling
DMAB = MAB + Page-Hinkley test [GECCO'08]
Credit Assignment
Extreme value-based (∆F) [PPSN'08]
Rank-based methods [GECCO'10]
AOS Combinations
Extreme-xMAB: efficient, but sensitive w.r.t. hyper-parameters
(F)AUC-MAB: efficient and robust w.r.t. hyper-parameters
FAUC: comparison-based
⇒ Combining concepts from ML: MABs and AUC
⇒ Extending them to a dynamic context
Conclusions (2)
Empirical Validation
(performance, robustness and generality)
Genetic Algorithms
Artificial scenarios [GECCO'08, AMAI'10, GECCO'10]
Boolean problems [PPSN'08, LION'09, GECCO'09, AMAI'10, GECCO'10]
OneMax, Long K-Path and Royal Road problems
Memetic Algorithms
SAT problems, with the Compass Credit Assign. [CEC'09, Chapter'10]
A highly multimodal context
Differential Evolution
Continuous problems [BBOB'10, PPSN'10]
Some Perspectives for Further Work (from 12/2010!)
Application extensions: AOS paradigm is very general
Use within other meta-heuristics
Use at the level of hyper-heuristics
Cross-domain Heuristic Search Challenge (CHeSC)
Algorithmic extensions: towards real-world problems
Extend to multi-modal (diversity, pop. size, . . . )
Extend to multi-objective (Pareto, hyper-volume, . . . )
First trial in real-world: sustainable development
Optimization of designs of buildings for energy efficiency
Our Publications I
Da Costa, L., Fialho, A., Schoenauer, M., and Sebag, M. (2008). Adaptive operator selection with dynamic multi-armed bandits. In Proc. Genetic and Evolutionary Computation Conference (GECCO). ACM.
Fialho, A., Da Costa, L., Schoenauer, M., and Sebag, M. (2008). Extreme value based adaptive operator selection. In Proc. Intl. Conf. on Parallel Problem Solving from Nature (PPSN). Springer.
Fialho, A., Da Costa, L., Schoenauer, M., and Sebag, M. (2009). Dynamic multi-armed bandits and extreme value-based rewards for AOS in evolutionary algorithms. In Proc. Intl. Conf. on Learning and Intelligent Optimization (LION). Springer.
Maturana, J., Fialho, A., Saubion, F., Schoenauer, M., and Sebag, M. (2009). Extreme compass and dynamic multi-armed bandits for adaptive operator selection. In Proc. IEEE Congress on Evolutionary Computation (CEC). IEEE.
Fialho, A., Schoenauer, M., and Sebag, M. (2009). Analysis of adaptive operator selection techniques on the royal road and long k-path problems. In Proc. Genetic and Evolutionary Computation Conference (GECCO). ACM.
Maturana, J., Fialho, A., Saubion, F., Schoenauer, M., Lardeux, F., and Sebag, M. (2010). Adaptive operator selection and management in evolutionary algorithms. In Y. Hamadi et al., editor, Autonomous Search. Springer. (to appear)
Fialho, A., Da Costa, L., Schoenauer, M., and Sebag, M. (2010). Analyzing bandit-based adaptive operator selection mechanisms. Annals of Mathematics and A. I., Special Issue on Learning and Intelligent Optimization. Springer.
Our Publications II
Fialho, A., Schoenauer, M., and Sebag, M. (2010). Toward comparison-based adaptive operator selection. In Proc. Genetic and Evolutionary Computation Conference (GECCO). ACM.
Gong, W., Fialho, A., and Cai, Z. (2010). Adaptive strategy selection in differential evolution. In Proc. Genetic and Evolutionary Computation Conference (GECCO). ACM.
Fialho, A., Schoenauer, M., and Sebag, M. (2010). Fitness-AUC bandit adaptive strategy selection vs. the probability matching one within DE. In Black-Box Optimization Benchmarking Workshop (BBOB-GECCO). ACM.
Fialho, A., Gong, W., and Cai, Z. (2010). Probability matching-based adaptive strategy selection vs. uniform strategy selection within DE. In Black-Box Optimization Benchmarking Workshop (BBOB-GECCO). ACM.
Fialho, A. and Ros, R. (2010). Analysis of adaptive strategy selection within differential evolution on the BBOB-2010 noiseless benchmark. Research Report RR-7259, INRIA.
Fialho, A., Ros, R., Schoenauer, M., and Sebag, M. (2010). Comparison-based adaptive strategy selection in differential evolution. In Proc. Intl. Conf. on Parallel Problem Solving from Nature (PPSN). Springer.
Li, K., Fialho, A., and Kwong, S. (2011). Multi-objective differential evolution with adaptive control of parameters and operators. In Proc. Intl. Conf. on Learning and Intelligent Optimization (LION). Springer. (to appear)
Other References I
Auer, P., Cesa-Bianchi, N., and Fischer, P. (2002). Finite-time analysis of the multi-armed bandit problem. Machine Learning, 47(2-3):235–256.
Birattari, M., Stützle, T., Paquete, L., and Varrentrapp, K. (2002). A racing algorithm for configuring metaheuristics. In W.B. Langdon et al., editor, Proc. Genetic and Evolutionary Computation Conference (GECCO), pages 11–18. Morgan Kaufmann.
Fawcett, T. (2006). An introduction to ROC analysis. Pattern Recogn. Lett., 27(8):861–874.
Hansen, N., Auger, A., Finck, S., and Ros, R. (2010). Real-parameter black-box optimization benchmarking 2010: Experimental setup. Technical Report RR-7215, INRIA.
Hartland, C., Baskiotis, N., Gelly, S., Teytaud, O., and Sebag, M. (2007). Change point detection and meta-bandits for online learning in dynamic environments. In Proc. Conférence Francophone sur l'Apprentissage Automatique (CAPS).
Hinkley, D. (1969). Inference about the change point in a sequence of random variables. Biometrika, 57(1):1–17.
Other References II
Moore, G. (1991). Crossing the Chasm: Marketing and Selling High-Tech Products to Mainstream Customers. Collins Business Essentials.
Storn, R. and Price, K. (1995). Differential evolution - a simple and efficient adaptive scheme for global optimization over continuous spaces. Technical Report TR-95-012, Intl. Computer Science Institute.
Thierens, D. (2005). An adaptive pursuit strategy for allocating operator probabilities. In H.-G. Beyer et al., editor, Proc. Genetic and Evolutionary Computation Conference (GECCO), pages 1539–1546. ACM.