Simple On-the-Fly Parameter Selection
Carola Doerr, CNRS and Sorbonne University, Paris, France
Markus Wagner, University of Adelaide, Australia
Presentation at GECCO 2018


  1. Simple On-the-Fly Parameter Selection
     - Carola Doerr, CNRS and Sorbonne University, Paris, France
     - Markus Wagner, University of Adelaide, Australia
     - Presentation at GECCO 2018
     - Carola Doerr, Markus Wagner: Simple On-the-Fly Parameter Selection Mechanisms for Two Classical Discrete Black-Box Optimization Benchmark Problems

  2. The Parameter Selection Problem
     - Evolutionary algorithms and related iterative optimization heuristics are parametrized algorithms
     - Example: $(\mu + \lambda)$ EAs
     - Parameters:
       - Memory size $\mu$
       - Offspring population size $\lambda$
       - Crossover rate
       - Mutation rate, search radius, etc.
       - Selective pressure
     - How shall I set these parameters to get a well-performing EA?

  3. Parameter Tuning vs. Parameter Control
     - Parameter Tuning:
       - Initial set of experiments
       - Deduce reasonable parameter settings
       - Does not have to be done manually: a number of powerful, ready-to-use tools are available (irace, SPOT, ParamILS, SMAC, GGA, …)
     - Parameter Control:
       - 2 main differences:
         - Parameters are set while optimizing
         - Parameters change over time
       - Key motivation: different parameter values can be optimal in different stages of an optimization process

  4. Goals of Parameter Control
     - to identify good parameter values “on the fly”
     - to track good parameter values when they change during the optimization process

  5. Parameter Control
     - Example: LeadingOnes: LO(110110101010) = 2
     - Randomized Local Search (RLS): flip $k$ bits, keep the better of parent and offspring
     - $k_{\mathrm{opt}}(x) = \lfloor n / (\mathrm{LO}(x) + 1) \rfloor$

  6. Parameter Control
     - Example: LeadingOnes: LO(110110101010) = 2
     - Randomized Local Search (RLS): flip $k$ bits, keep the better of parent and offspring
     - [Figure: $k = \lfloor n / (\mathrm{LO}(x) + 1) \rfloor$ for $n = 1000$]

  7. Parameter Control
     - Example: LeadingOnes: LO(110110101010) = 2
     - Randomized Local Search (RLS): flip $k$ bits, keep the better of parent and offspring
     - [Figure: $k = \lfloor n / (\mathrm{LO}(x) + 1) \rfloor$ for $n = 1000$]
     - 22% smaller optimization time
     - How can I find/predict such a dependence?
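The LeadingOnes example and the RLS step described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names `leading_ones`, `rls_step`, and `run_rls` are hypothetical, and accepting the offspring on ties is an assumption (the slide only says to keep the better of parent and offspring).

```python
import random


def leading_ones(x):
    """Return the number of consecutive 1-bits at the start of x."""
    count = 0
    for bit in x:
        if bit != 1:
            break
        count += 1
    return count


def rls_step(x, k, fitness, rng):
    """Flip exactly k distinct bits of x and keep the better of parent and offspring.

    Assumption: ties are resolved in favour of the offspring.
    """
    y = list(x)
    for i in rng.sample(range(len(x)), k):
        y[i] = 1 - y[i]
    return y if fitness(y) >= fitness(x) else x


def run_rls(n, k=1, seed=0, max_iters=100_000):
    """Run RLS with a static k on LeadingOnes; return (iterations, final point)."""
    rng = random.Random(seed)
    x = [rng.randint(0, 1) for _ in range(n)]
    iters = 0
    while leading_ones(x) < n and iters < max_iters:
        x = rls_step(x, k, leading_ones, rng)
        iters += 1
    return iters, x


# The example from the slide: LO(110110101010) = 2
print(leading_ones([int(c) for c in "110110101010"]))  # 2
```

Replacing the static `k` by a value that depends on the current fitness is what turns this into the fitness-dependent variant discussed on the slide.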

  8. Good News: You Don’t Have to!
     - Easy mechanisms which find close-to-optimal parameter values automatically:
     - [Figure: mutation strength (log scale, 1 to 1000) as a function of LO(x) (0 to 250); curves: optimal mutation strength, avg. mutation strength of adaptive EA]

  9. Good News: You Don’t Have to!
     - With close-to-optimal performance:
     - [Figure: mutation strength (log scale, 1 to 1000) and avg. hitting time (x 1000, 0 to 35) as functions of LO(x) (0 to 250); curves: optimal mutation strength, avg. mutation strength of adaptive EA, avg. hitting time of dynamic (1+1) EA, avg. hitting time of best static RLS, avg. hitting time of best dynamic RLS]

  10. Good News: You Don’t Have to!
      - Running time for update strengths $A = 2$, $b = 1/2$ (empirical):
        - around 20.5% performance gain over the (1+1) EA$_{>0}$ with static mutation rate $p = 1/n$
        - 14% performance gain over RLS
      - larger gains possible for other combinations of $A$ and $b$

  11. Success-Based Multiplicative Update Rule
      - Create offspring $y$ through standard bit mutation with mutation probability $p$
      - Update strengths: $A > 1$, $b < 1$

  12. Success-Based Multiplicative Update Rule
      - Standard bit mutation, conditioned on flipping at least one bit
      - Update strengths: $A > 1$, $b < 1$
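A success-based multiplicative update of the kind named on these slides might look as follows. This is a hedged sketch rather than the authors' implementation: the names `mutate_at_least_one` and `adaptive_ea` are hypothetical, and the initial rate `1/n`, the bounds `1/n^2` and `1/2` on the mutation rate, the acceptance of equally good offspring, and the assumption that the optimum has fitness `n` are all choices made for illustration.

```python
import random


def one_max(x):
    """Number of 1-bits in x."""
    return sum(x)


def mutate_at_least_one(x, p, rng):
    """Standard bit mutation with rate p, resampled until at least one bit flips."""
    n = len(x)
    while True:
        flips = [i for i in range(n) if rng.random() < p]
        if flips:
            break
    y = list(x)
    for i in flips:
        y[i] = 1 - y[i]
    return y


def adaptive_ea(fitness, n, A=2.0, b=0.5, seed=0, max_iters=1_000_000):
    """(1+1)-type EA whose mutation rate is multiplied by A on success, by b on failure.

    Assumptions (not confirmed by the slides): p starts at 1/n, is kept in
    [1/n^2, 1/2], and the optimum is the unique point of fitness n.
    """
    rng = random.Random(seed)
    p_min, p_max = 1.0 / n**2, 0.5
    x = [rng.randint(0, 1) for _ in range(n)]
    fx = fitness(x)
    p = 1.0 / n
    iters = 0
    while fx < n and iters < max_iters:
        y = mutate_at_least_one(x, p, rng)
        fy = fitness(y)
        if fy >= fx:          # success: accept and increase the rate
            x, fx = y, fy
            p = min(A * p, p_max)
        else:                 # failure: keep the parent and decrease the rate
            p = max(b * p, p_min)
        iters += 1
    return iters, x


iters, best = adaptive_ea(one_max, n=30, A=2.0, b=0.5, seed=1)
print(iters, one_max(best))
```

The resampling loop in `mutate_at_least_one` is the simplest way to condition on flipping at least one bit; it becomes slow when `p` is very small, which is acceptable for a small illustration like this one.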

  13. LeadingOnes
      - Average optimization time for different combinations of $A$ and $b$ (101 independent runs)
      - For comparison: RLS needs $n^2/2$ iterations ($0.5 \times 10^4$ and $3.125 \times 10^4$ for the heatmaps above, i.e., $n = 100$ and $n = 250$); the (1+1) EA$_{>0}$ needs $0.54 \times 10^4$ and $3.4 \times 10^4$ iterations, respectively

  14. LeadingOnes
      - Average optimization time for different combinations of $A$ and $b$ (101 independent runs)
      - For comparison: RLS needs $n^2/2$ iterations ($0.5 \times 10^4$, $3.125 \times 10^4$, and $1.25 \times 10^5$ for the heatmaps above, i.e., $n = 100$, $n = 250$, and $n = 500$); the (1+1) EA$_{>0}$ needs $0.54 \times 10^4$, $3.4 \times 10^4$, and $1.35 \times 10^5$ iterations, respectively

  15. LeadingOnes
      - Average optimization time for different combinations of $A$ and $b$ (101 independent runs)
      - For comparison: RLS needs $n^2/2$ iterations ($1.25 \times 10^5$ for $n = 500$); the (1+1) EA$_{>0}$ needs $1.35 \times 10^5$ iterations

  16. 1/5-th Success Rules
      - 1/5-th success rule:
        - originally from continuous optimization [Rechenberg, Devroye, Schumer/Steiglitz]
        - (1+1) ES optimizing the sphere function $f(x) = \sum_i x_i^2$
        - When success rate > 1/5: increase search radius
        - When success rate < 1/5: decrease search radius
      - In discrete optimization, e.g., [Kern/Müller/Hansen/Büche/Ocenasek/Koumoutsakos04, Auger09]:
        - When success rate ≈ 1/5, parameter value should be stable
      - In our algorithm:
        - If $f(y) \ge f(x)$: $p \leftarrow \min\{Ap, 1/2\}$, else $p \leftarrow \max\{bp, 1/n^2\}$
        - $b = A^{-1/4}$, since $A b^4 = 1$
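The stability condition behind a 1/5-th rule can be written out as a short calculation. The sketch below assumes a multiplicative update that scales the parameter by $A > 1$ after a success and by $b < 1$ after a failure, and it assumes that "stable at success rate 1/5" means that one success and four failures leave the parameter unchanged; the numerical example with $A = 2$ is only an illustration.

```latex
% One success and four failures in five iterations leave p unchanged:
\[
  p \cdot A \cdot b^{4} = p
  \quad\Longleftrightarrow\quad
  A\, b^{4} = 1
  \quad\Longleftrightarrow\quad
  b = A^{-1/4}.
\]
% Illustration for A = 2:
\[
  b = 2^{-1/4} \approx 0.841 .
\]
```

Under this pairing the rule has a single free parameter $A$, so results for it can be plotted against $A$ alone.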

  18. Results for the 1/5-th Success Rule
      - LO, $n = 500$, 100 independent runs
      - RLS performance: 125,000 iterations
      - [Figure: average optimization time (105,000 to 125,000) as a function of $A$ (1 to 2)]

  19. 1:x Success Rules
      - A priori, there is no reason to restrict ourselves to a 1:5 success ratio
      - We can also try different success rules

  20. Average Optimization Times of 1:x Rules
      - LO, $n = 500$, 100 independent runs
      - RLS performance: 125,000 iterations
      - [Figure: average optimization time (95,000 to 125,000) as a function of $A$ (1 to 2); one curve for each of x = 2, 3, 4, 5, 6, 7, 8]

  21. Overall Performance Summary
      - 50% of all configurations with $1 < A \le 2.5$ and $0.4 \le b < 1$ are better than RLS by at least 13%

  22. Results for OneMax
      - Average Runtime on OneMax for Different Dimensions (average optimization time)

      Dimension n      100     500    1000    2000    3000
      RLS              445   3,050   6,871  14,809  23,814
      RLS_opt          436   2,974   6,690  14,722  23,507

  23. Results for OneMax
      - Average Runtime on OneMax for Different Dimensions (average optimization time)

      Dimension n      100     500    1000    2000    3000
      (1+1) EA_>0      679   4,756  10,574  24,352  37,256
      RLS              445   3,050   6,871  14,809  23,814
      RLS_opt          436   2,974   6,690  14,722  23,507

  24. Results for OneMax
      - Average Runtime on OneMax for Different Dimensions (average optimization time)

      Dimension n         100     500    1000    2000    3000
      (1+1) EA_>0         679   4,756  10,574  24,352  37,256
      A=1.11, b=0.66      447   3,039   6,749  15,134  23,726
      A=1.2, b=0.85       450   3,059   6,751  14,801  23,558
      A=1.3, b=0.75       450   3,033   6,801  14,974  23,715
      A=2.0, b=0.5        455   3,013   6,753  14,613  23,417
      RLS                 445   3,050   6,871  14,809  23,814
      RLS_opt             436   2,974   6,690  14,722  23,507
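To read the OneMax numbers as relative gains, one can compute the percentage by which a configuration undercuts a baseline. The short sketch below does this for the averages reported on the slide; the helper name `gain_percent` and the dictionary layout are hypothetical, and the figures are only the averages copied from the table, with no statement about statistical significance.

```python
# Average optimization times on OneMax, copied from the slide.
dimensions = [100, 500, 1000, 2000, 3000]
runtimes = {
    "(1+1) EA_>0": [679, 4756, 10574, 24352, 37256],
    "A=1.11, b=0.66": [447, 3039, 6749, 15134, 23726],
    "A=1.2, b=0.85": [450, 3059, 6751, 14801, 23558],
    "A=1.3, b=0.75": [450, 3033, 6801, 14974, 23715],
    "A=2.0, b=0.5": [455, 3013, 6753, 14613, 23417],
    "RLS": [445, 3050, 6871, 14809, 23814],
    "RLS_opt": [436, 2974, 6690, 14722, 23507],
}


def gain_percent(baseline, candidate):
    """Percentage by which candidate is smaller than baseline."""
    return 100.0 * (baseline - candidate) / baseline


# Gains of the adaptive configuration A=2.0, b=0.5 for each dimension.
for i, n in enumerate(dimensions):
    adaptive = runtimes["A=2.0, b=0.5"][i]
    vs_ea = gain_percent(runtimes["(1+1) EA_>0"][i], adaptive)
    vs_rls = gain_percent(runtimes["RLS"][i], adaptive)
    print(f"n={n}: {vs_ea:.1f}% vs (1+1) EA_>0, {vs_rls:.1f}% vs RLS")
```

A negative value means the configuration was slower than the baseline on average, which happens against RLS for the smallest dimension.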

  25. Heatmaps for OneMax

  26. $y$% of Configurations Better than the (1+1) EA$_{>0}$ by at Least $x$%
      - [Figure: % of configurations (0% to 100%) as a function of % better (0% to 40%); one curve for each dimension 100, 500, 1000, 1500, 2000]
      - Even better results if we restrict to configurations with $1 < A \le 2.5$ and $0.4 \le b < 1$
