Adaptive Operator Selection for Optimization Alvaro Fialho - PowerPoint PPT Presentation

Adaptive Operator Selection for Optimization ´ Alvaro Fialho Advisors: Marc Schoenauer & Mich` ele Sebag Ph.D. Defense ´ Ecole Doctorale d’Informatique Universit´ e Paris-Sud, Orsay, France December 22, 2010

Context Operator Selection Credit Assignment Empirical Validation Conclusion Outline Context & Motivation 1 Operator Selection 2 Credit Assignment 3 Empirical Validation 4 Conclusions & Further Work 5 ´ Alvaro Fialho – Ph.D. Defense – December 22, 2010 Adaptive Operator Selection for Optimization 2/46

Context Operator Selection Credit Assignment Empirical Validation Conclusion Context & Motivation Context & Motivation 1 Evolutionary Algorithms Parameter Setting in EAs Parameter Setting of Variation Operators Adaptive Operator Selection Operator Selection 2 Credit Assignment 3 Empirical Validation 4 Conclusions & Further Work 5 ´ Alvaro Fialho – Ph.D. Defense – December 22, 2010 Adaptive Operator Selection for Optimization 3/46

Context Operator Selection Credit Assignment Empirical Validation Conclusion Evolutionary Algorithms Stochastic optimization algorithms (Darwinian paradigm) Bottleneck: parameter setting Population size and number of offspring generated Parameters of selection and replacement methods Parameters of Variation Operators (application rate, etc) Goal: Automatic parameter setting (Crossing the Chasm) ´ Alvaro Fialho – Ph.D. Defense – December 22, 2010 Adaptive Operator Selection for Optimization 4/46

Context Operator Selection Credit Assignment Empirical Validation Conclusion Parameter Setting in EAs [Eiben et al., 2007] ´ Alvaro Fialho – Ph.D. Defense – December 22, 2010 Adaptive Operator Selection for Optimization 5/46

Context Operator Selection Credit Assignment Empirical Validation Conclusion Parameter Setting of Variation Operators Difficult to predict the performance Problem-dependent and inter-dependent choices Off-line tuning can find the best static strategy (expensive) Performance of operators on OneMax 5 1-Bit 3-Bit 4 5-Bit 1/n BitFlip Depends also on... 3 Fitness of the parents 2 Pop. fitness distribution 1 (sample fig. with a (1+50)-EA) 0 1000 3000 5000 7000 9000 fitness of the parent ´ Alvaro Fialho – Ph.D. Defense – December 22, 2010 Adaptive Operator Selection for Optimization 6/46

Context Operator Selection Credit Assignment Empirical Validation Conclusion Parameter Setting of Variation Operators Difficult to predict the performance Problem-dependent and inter-dependent choices Off-line tuning can find the best static strategy (expensive) Performance of operators on OneMax 5 1-Bit 3-Bit 4 5-Bit 1/n BitFlip Depends also on... 3 Fitness of the parents 2 Pop. fitness distribution 1 (sample fig. with a (1+50)-EA) 0 1000 3000 5000 7000 9000 fitness of the parent ⇒ Should be adapted on-line, while solving the problem ´ Alvaro Fialho – Ph.D. Defense – December 22, 2010 Adaptive Operator Selection for Optimization 6/46

Context Operator Selection Credit Assignment Empirical Validation Conclusion Parameter Setting in EAs [Eiben et al., 2007] ´ Alvaro Fialho – Ph.D. Defense – December 22, 2010 Adaptive Operator Selection for Optimization 7/46

Context Operator Selection Credit Assignment Empirical Validation Conclusion Adaptive Operator Selection Position of the Problem Given a set of K variation operators Select on-line the operator to be applied next Based on their recent performance EA AOS quality op1 Operator credit or operator Operator quality op2 reward Application Selection . . . quality opk Impact Credit impact Evaluation Assignment ´ Alvaro Fialho – Ph.D. Defense – December 22, 2010 Adaptive Operator Selection for Optimization 8/46

Context Operator Selection Credit Assignment Empirical Validation Conclusion Operator Selection Context & Motivation 1 Operator Selection 2 Related Work Discussion on Operator Selection A (kind of) Multi-Armed Bandit problem Dynamic Multi-Armed Bandit (DMAB) Sliding Multi-Armed Bandit (SLMAB) Contributions to Operator Selection: Summary Credit Assignment 3 Empirical Validation 4 Conclusions & Further Work 5 ´ Alvaro Fialho – Ph.D. Defense – December 22, 2010 Adaptive Operator Selection for Optimization 9/46

Context Operator Selection Credit Assignment Empirical Validation Conclusion Operator Selection - Related Work “Empirical quality”: ˆ q j , t +1 = (1 − α ) · ˆ q j , t + α · r j , t Probability Matching (PM) [Goldberg, 1990] s i proportional to ˆ q i ˆ q i , t +1 s i , t +1 = p min + (1 − K · p min ) · � K j =1 ˆ q j , t +1 Adaptive Pursuit (AP) [Thierens, 2005] s i ∗ is pushed to p max ; others to p min i ∗ = argmax { ˆ q i , t , i = 1 . . . K } s i ∗ , t +1 = s i ∗ , t + β · ( p max − s i ∗ , t ) , = s i , t + β · ( p min − s i , t ) , for i � = i ∗ s i , t +1 ´ Alvaro Fialho – Ph.D. Defense – December 22, 2010 Adaptive Operator Selection for Optimization 10/46

Context Operator Selection Credit Assignment Empirical Validation Conclusion Discussion on Operator Selection Exploration versus Exploitation In operators search space, not problem search space Acquire new information (use other operators) vs. Capitalize on the available knowledge (use current best) Probability-based Methods (PM and AP) Conservative approach: fixed p min Entails over-exploration when many operators EvE balance ⇒ Game Theory: Multi-Armed Bandits Level of exploration depends on confidence about knowledge i.e. , p min should be “dynamic” ´ Alvaro Fialho – Ph.D. Defense – December 22, 2010 Adaptive Operator Selection for Optimization 11/46

Context Operator Selection Credit Assignment Empirical Validation Conclusion A (kind of) Multi-Armed Bandit problem Original Multi-Armed Bandits (Machine Learning - ML) Given K arms ( ≡ operators) At time t , gambler plays arm j and gets r j , t = 1 with (unknown) prob. p j r j , t = 0 otherwise Goal : maximize cumulative reward ≡ minimize regret T � ( r ∗ L ( T ) = t − r t ) t =1 ´ Alvaro Fialho – Ph.D. Defense – December 22, 2010 Adaptive Operator Selection for Optimization 12/46

Context Operator Selection Credit Assignment Empirical Validation Conclusion The Upper Confidence Bound MAB algorithm Assymptotic optimality guarantees (static context) [Auer et al., 2002] Optimal L ( T ) = O (log T ) At time t , choose arm i maximizing: � 2 log � k n k , t = ˆ + score i , t q i , t n i , t �� exploitation � �� exploration with n i , t +1 = n i , t + 1 # times � � 1 1 and ˆ q i , t +1 = 1 − · ˆ q i , t + n i , t +1 · r i , t emp. qual. n i , t +1 Efficiency comes from optimal EvE balance Interval between exploration trials increases exponentially w.r.t. # time steps ´ Alvaro Fialho – Ph.D. Defense – December 22, 2010 Adaptive Operator Selection for Optimization 13/46

Context Operator Selection Credit Assignment Empirical Validation Conclusion Operator Selection with UCB: shortcomings Exploration vs. Exploitation (EvE) balance Original MAB: rewards ∈ { 0 , 1 } ; AOS: rewards ∈ [ a , b ] (e.g., fitness improvement) UCB’s EvE balance is broken, Scaling is needed: � 2 log P k n k , t score i , t = ˆ q i , t + C n i , t Dynamics When op i is not the best anymore . . . � � 1 1 q i , t +1 = ˆ 1 − · ˆ q i , t + n i , t +1 · r i , t n i , t +1 Weight of r is inversely proportional to n Adjusting ˆ q ’s after a change takes a long time ´ Alvaro Fialho – Ph.D. Defense – December 22, 2010 Adaptive Operator Selection for Optimization 14/46

Context Operator Selection Credit Assignment Empirical Validation Conclusion Dynamic Multi-Armed Bandit (DMAB) Rationale No need for exploration in stationary situations ⇒ Upon the detection of a change, restart the MAB. How to detect a change in a distribution? Page-Hinkley statistical test [Page, 1954] r t = 1 P t ¯ i =1 r i 1 t m t = P t i =1 ( r i − ¯ r i + δ ), 2 M t = max {| m i | , i = 1 . . . t } 3 Return ( M t − | m t | > γ ) 4 ´ Alvaro Fialho – Ph.D. Defense – December 22, 2010 Adaptive Operator Selection for Optimization 15/46

Context Operator Selection Credit Assignment Empirical Validation Conclusion Dynamic Multi-Armed Bandit (DMAB) Rationale No need for exploration in stationary situations ⇒ Upon the detection of a change, restart the MAB. How to detect a change in a distribution? Page-Hinkley statistical test [Page, 1954] r t = 1 P t ¯ i =1 r i 1 t m t = P t i =1 ( r i − ¯ r i + δ ), 2 M t = max {| m i | , i = 1 . . . t } 3 Return ( M t − | m t | > γ ) 4 DMAB = UCB + Scaling + Page-Hinkley ´ Alvaro Fialho – Ph.D. Defense – December 22, 2010 Adaptive Operator Selection for Optimization 15/46

Adaptive Operator Selection for Optimization Alvaro Fialho - PowerPoint PPT Presentation

Adaptive Operator Selection for Optimization Alvaro Fialho Advisors: Marc Schoenauer & Mich` ele Sebag Ph.D. Defense Ecole Doctorale dInformatique Universit e Paris-Sud, Orsay, France December 22, 2010 Context Operator

Demo (Step 1, Selection) Demo (Step 1, Optimization) Demo (Step 2, Selection) Demo (Step 2,

Neural Nets for Adaptive Filter and Adaptive Neural Nets as Adaptive Filters Pattern Recognition

Adaptive Control Chapter 1: Introduction to Adaptive Control Adaptive Control Landau, Lozano,

Adaptive Control Chapter 11: Direct Adaptive Control 1 Adaptive Control Landau, Lozano,

Adaptive Control Chapter 12: Indirect Adaptive Control 1 Adaptive Control Landau, Lozano,

Adaptive Operator Selection with Rank-based Multi-Armed Bandits Alvaro Fialho, Marc Schoenauer

Shuffle algebra perspective on operator valued probability theory 30 mars 2020 1/25 Operator

ERP Selection KIRTANE & PANDIT Suhas Deshpande Why ERP Selection is important ?

Optimization of the Poisson Operator Optimization of the Poisson Operator in Chombo in Chombo

Adaptive Control Chapter 13: Multimodel adaptive control with switching Chapter 13: Multimodel

Adaptive Control Chapter 14: Adaptive regulation Rejection of unknown disturbances 1

SECONDHAND SELECTION Sales Price - 275,000.00 EU SECONDHAND SELECTION INTERNAL VIEWS SECONDHAND

Variable selection bias Bias in Ensemble Bias in Ensemble Methods Methods Variable selection

SELECTION Deterministic Stochastic Proportionate selection: Roulette Wheel Selection

Selection 2 Selection Selection given a set of (distinct) elements, finding the element larger

15-780: Optimization J. Zico Kolter March 14-16, 2015 1 Outline Introduction to optimization

Flavour Steve King La Thuile, 22nd March, 2019 Electroweak Scale Flavour Problem t u d c e

PROGRAM GUIDE Day 1: Thursday, June 28, 2012 8:30-9:00 Welcome 9:00-10:00 Keynote Speech

ProbabilityandStatistics* ! forComputerScience** Who!discovered!this?! ! n 1 +

Cyber Intrusion Detection by Using Deep Neural Networks with Attack-sharing Loss IEEE DataCom

New computational methods for NLO and NNLO calculations in QCD Stefan Weinzierl Institut fr

A New Construction of QAM Golay Complementary Sequence Pair 2020 IEEE International Symposium on

Aaron Pulkka 2 Evolving International Partnerships from Outsourcing to Exporting Aaron

LEARNING SESSION 2: TEAMWORK & ENGAGEMENT: THE HEART OF HEALTHCARE Dr. Lawrence Yang ,

Adaptive Operator Selection for Optimization Alvaro Fialho - PowerPoint PPT Presentation

Adaptive Operator Selection for Optimization Alvaro Fialho Advisors: Marc Schoenauer & Mich` ele Sebag Ph.D. Defense Ecole Doctorale dInformatique Universit e Paris-Sud, Orsay, France December 22, 2010 Context Operator

Demo (Step 1, Selection) Demo (Step 1, Optimization) Demo (Step 2, Selection) Demo (Step 2,

Neural Nets for Adaptive Filter and Adaptive Neural Nets as Adaptive Filters Pattern Recognition

Adaptive Control Chapter 1: Introduction to Adaptive Control Adaptive Control Landau, Lozano,

Adaptive Control Chapter 11: Direct Adaptive Control 1 Adaptive Control Landau, Lozano,

Adaptive Control Chapter 12: Indirect Adaptive Control 1 Adaptive Control Landau, Lozano,

Adaptive Operator Selection with Rank-based Multi-Armed Bandits Alvaro Fialho, Marc Schoenauer

Shuffle algebra perspective on operator valued probability theory 30 mars 2020 1/25 Operator

ERP Selection KIRTANE &amp; PANDIT Suhas Deshpande Why ERP Selection is important ?

Optimization of the Poisson Operator Optimization of the Poisson Operator in Chombo in Chombo

Adaptive Control Chapter 13: Multimodel adaptive control with switching Chapter 13: Multimodel

Adaptive Control Chapter 14: Adaptive regulation Rejection of unknown disturbances 1

SECONDHAND SELECTION Sales Price - 275,000.00 EU SECONDHAND SELECTION INTERNAL VIEWS SECONDHAND

Variable selection bias Bias in Ensemble Bias in Ensemble Methods Methods Variable selection

SELECTION Deterministic Stochastic Proportionate selection: Roulette Wheel Selection

Selection 2 Selection Selection given a set of (distinct) elements, finding the element larger

15-780: Optimization J. Zico Kolter March 14-16, 2015 1 Outline Introduction to optimization

Flavour Steve King La Thuile, 22nd March, 2019 Electroweak Scale Flavour Problem t u d c e

PROGRAM GUIDE Day 1: Thursday, June 28, 2012 8:30-9:00 Welcome 9:00-10:00 Keynote Speech

Probability*and*Statistics* ! for*Computer*Science** Who!discovered!this?! ! n 1 +

Cyber Intrusion Detection by Using Deep Neural Networks with Attack-sharing Loss IEEE DataCom

New computational methods for NLO and NNLO calculations in QCD Stefan Weinzierl Institut fr

A New Construction of QAM Golay Complementary Sequence Pair 2020 IEEE International Symposium on

Aaron Pulkka 2 Evolving International Partnerships from Outsourcing to Exporting Aaron

LEARNING SESSION 2: TEAMWORK &amp; ENGAGEMENT: THE HEART OF HEALTHCARE Dr. Lawrence Yang ,

ERP Selection KIRTANE & PANDIT Suhas Deshpande Why ERP Selection is important ?

ProbabilityandStatistics* ! forComputerScience** Who!discovered!this?! ! n 1 +

LEARNING SESSION 2: TEAMWORK & ENGAGEMENT: THE HEART OF HEALTHCARE Dr. Lawrence Yang ,