ATM Algorithm on the BBOB 2009 Noiseless Function Testbed - PowerPoint PPT Presentation
Benchmarking the ATM Algorithm on the BBOB 2009 Noiseless Function Testbed. Benjamin Bodner, Brown University, Providence, RI, USA. BBOB Workshop, GECCO 2019, Prague.


SLIDE 1

4/1/2020 Benchmarking the ATM Algorithm on the BBOB 2009 Noiseless Function Testbed 1 of 34

Benchmarking the ATM Algorithm on the BBOB 2009 Noiseless Function Testbed

Benjamin Bodner

Brown University, Providence, RI, USA

BBOB Workshop GECCO 2019 Prague

SLIDE 2

Content

01 Introduction: Motivation, Intuition
02 Main Components: Parameters & main equations, Parameter adaptation, Resource allocation
03 Results: BBOB Noiseless, BBOB Large-scale, Internal runtime
04 Summary: Recent progress, Goals moving forward, Conclusions


SLIDE 3

Motivation

  • Growing need for optimization methods for very high-dimensional settings

Image from: https://towardsdatascience.com/why- deep-learning-is-needed-over-traditional- machine-learning-1b6a99177063

Optimization Algorithms

  • Problems commonly have 10^5 to 10^8 optimizable variables [Devlin et al. 2019]

Deep Learning Physical Sciences

Image from GOMC: https://gomc-wsu.github.io/Manual/index.html

SLIDE 4

Motivation

Deep Learning

  • Gradient-based optimization methods can create many difficulties [Shalev-Shwartz et al. 2017]

Difficulties include: vanishing gradients, getting stuck in local minima, noise, hyperparameter tuning.

Current ways of mitigating these issues: architecture design, regularization [Sutskever 2013]. These do not always work.

Image from: https://towardsdatascience.com/gradient-descent-algorithm-and-its-variants-10f652806a3
Image from [He et al. 2015]
Image from: Srivastava, Nitish, et al. 2014

SLIDE 5

Motivation

Image by Thomas Splettstoesser: https://www.behance.net/gallery/10952399/Protein- Folding-Funnel Image from GOMC: https://gomc-wsu.github.io/Manual/index.html

  • Functions are non-convex
  • Notoriously have large numbers of local minima [Nichita 2002]

Image from: https://en.wikibooks.org/wiki/Structural_Biochemistry/Proteins/Protein_Folding_Problem

Physical Sciences

  • Simulated annealing and quasi-Newton methods can be slow
  • Do not always converge to the global minimum [Hao et al. 2015]


Interacting Particles Protein Folding

SLIDE 6

Motivation

Existing algorithms have been highly successful in these settings. These characteristics were intentionally designed into the BBOB function testbeds.

[BIPOP CMA-ES, Hansen 2009]

However, covariance matrices and Hessians limit their scalability: key components and operations are usually of order D^2.

Images from: Finck, Hansen, Ros, Auger 2015

SLIDE 7

Proposal

Eliminate the use of D^2 objects and operations.

The Adaptive Two Mode (ATM) Algorithm: a black-box optimization algorithm which only maintains objects and executes operations of order D.


SLIDE 8

The Adaptive Two Mode Algorithm

ATM uses a combination of two kinds of search distributions, or "modes":

  • Directional distribution (exploitation)
  • Isotropic distribution (exploration)

The two modes complement each other, and ATM uses a set of rules to control the amplitudes and interactions between the modes.


SLIDE 9

ATM Algorithm

1. Start from an isotropic distribution.
2. If a sample leads to improvement: suggest samples in that direction.
3. If new samples also lead to improvement: sample in the same direction at exponentially increasing amplitude.
4. Once no more "good" samples are found: start over with the isotropic search (using an evolution strategy).


Figure legend: best sample, best sample from last step, regular sample.


Repeat
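The loop above can be sketched in a few lines. This is a minimal illustration, not the author's implementation: the unit-direction bookkeeping, the doubling factor, and the restart behavior are assumptions made for clarity.

```python
import numpy as np

def atm_sketch(f, x0, r_max=1.0, budget=2000, seed=0):
    """Minimal sketch of the two-mode loop: isotropic exploration,
    then directional exploitation at growing amplitude, then restart."""
    rng = np.random.default_rng(seed)
    x_best, y_best = np.asarray(x0, dtype=float), f(x0)
    direction, amp = None, r_max
    for _ in range(budget):
        if direction is None:
            step = amp * rng.standard_normal(x_best.size)  # isotropic mode
        else:
            step = amp * direction                         # directional mode
        x = x_best + step
        y = f(x)
        if y < y_best:
            # improvement: lock onto this direction, grow the amplitude
            direction = step / np.linalg.norm(step)
            x_best, y_best = x, y
            amp *= 2.0
        else:
            # no "good" sample: restart the isotropic search
            direction, amp = None, r_max
    return x_best, y_best
```

On a convex function such as the sphere, the directional doublings give rapid progress until they overshoot, at which point the isotropic restarts take over.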

SLIDE 10

Parameters of the Algorithm

There are (currently) 8 parameters which play several roles in the ATM algorithm:

  • Controlling the growth factors of the modes
  • Controlling the amplitudes of the modes


if (X_best,t − X_best,t−1)^2 > ΔX_min^2:  d += 1, r = 0
else:  r += 1, d = 0

R = R_max · exp(G_r · sin(mod(π·r / 2T_r, π/2)) − 1)

D = R_max · exp(G_d·d − D_d·r)
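The update rules on this slide can be sketched as below. The symbol names (G_r, G_d, D_d, T_r, R_max, dX_min) are decoded from a garbled font mapping in the extracted slide, so treat both the names and the exact grouping of the amplitude formulas as assumptions.

```python
import numpy as np

def update_mode_amplitudes(x_best_t, x_best_prev, state, p):
    """Hedged reconstruction of the counter/amplitude rules.

    state: dict with counters d (directional) and r (isotropic).
    p: dict of ATM parameters (names are best-effort decodings).
    Returns (R, D): isotropic and directional mode amplitudes.
    """
    if (x_best_t - x_best_prev) ** 2 > p["dX_min"] ** 2:
        state["d"] += 1          # progress: grow the directional mode
        state["r"] = 0
    else:
        state["r"] += 1          # stagnation: grow the isotropic mode
        state["d"] = 0
    # amplitude of the isotropic (exploration) mode
    R = p["R_max"] * np.exp(
        p["G_r"] * np.sin(np.mod(np.pi * state["r"] / (2 * p["T_r"]), np.pi / 2)) - 1
    )
    # amplitude of the directional (exploitation) mode
    D = p["R_max"] * np.exp(p["G_d"] * state["d"] - p["D_d"] * state["r"])
    return R, D
```

After a significant improvement the directional amplitude D grows exponentially in d, while stagnation drives D down and cycles the isotropic amplitude R instead.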

SLIDE 11

Parameters of the Algorithm


  • Controlling the search distribution along different axes:

S = γ·S + (1 − γ) · mean[(O − O_Gbest) · (X − X_Gbest,t)^2]

A = β1·S + β2
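The per-axis update can be sketched as follows. The variable names (S, A, X, O, gamma, beta1, beta2) follow the decoded slide notation, and the exact weighting is a best-effort reconstruction, not the author's confirmed formula. The key property it illustrates is that everything stays O(D): only a length-D vector is maintained, never a DxD matrix.

```python
import numpy as np

def update_axis_scales(S, X, O, x_gbest, o_gbest, gamma=0.9, beta1=1.0, beta2=1e-3):
    """Hedged sketch of the per-axis scale update.

    S: (D,) running per-axis statistic.
    X: (n_samples, D) recent samples; O: (n_samples,) their objective values.
    x_gbest, o_gbest: global best sample and its objective value.
    """
    # weight squared per-axis displacements by the objective difference,
    # then smooth with an exponential moving average
    weighted = (O - o_gbest)[:, None] * (X - x_gbest) ** 2
    S = gamma * S + (1 - gamma) * weighted.mean(axis=0)
    A = beta1 * S + beta2          # per-axis amplitude, O(D) storage only
    return S, A
```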

SLIDE 12

Online Parameter Tuning

OP = (mean(ΔOP_best) + min(ΔOP_best)) / 2

ΔOP_best = best change in the true objective function, found by the parameter set
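The tuning score decoded above can be computed directly. The function name and the idea of passing a recent window of ΔOP_best values are assumptions for illustration; the slide only gives the mean/min combination itself.

```python
import numpy as np

def parameter_set_score(best_deltas):
    """Score OP for one parameter set: the average of the mean and the
    minimum of the best objective changes (ΔOP_best) it achieved.
    Mixing in the minimum rewards sets that found large improvements,
    not just consistent small ones."""
    best_deltas = np.asarray(best_deltas, dtype=float)
    return 0.5 * (best_deltas.mean() + best_deltas.min())
```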

Different functions, and changing characteristics at different stages of the search, create a need for online parameter tuning.

  • 4 intertwined parameter sets
  • Parameter sets are optimized by another Two-Mode algorithm
  • Objective function designed to reflect the "success" at the task of minimizing the true objective function

How to do this?



SLIDE 13

Problem with Online Tuning


New parameter sets in a changing local search space mean a good chance of unsuitable parameter sets.

Proposal: allocate fewer resources to "bad" parameter sets and more resources to better ones.

Figure: resources allocated to each parameter set vs. the performance of that parameter set.

SLIDE 14

Parallel Optimization with Resource Allocation

  • Given a fixed number of samples N_tot, distributed among m parameter sets
  • Change the allocation of samples to reflect their performance


N_{t+1} = N_t − K·M⁻¹·ΔOP_best,t − K₀·M⁻¹·(N_t − N₀)

N_t = resource allocation vector at iteration t

SLIDE 15

Parallel Optimization with Resource Allocation – Choice of Matrices


K = L ·
⎡ m−1   −1   ⋯   −1 ⎤
⎢  −1  m−1   ⋯   −1 ⎥
⎢   ⋮    ⋮    ⋱    ⋮ ⎥
⎣  −1   −1   ⋯  m−1 ⎦

K₀ = k₀·I,   M = ν·I

  • Conserves the total number of samples
  • Merit-based allocation system

N_{t+1} = N_t − K·M⁻¹·ΔOP_best,t − K₀·M⁻¹·(N_t − N₀)
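The allocation update and matrix choices above can be implemented in a few lines. The symbol names (K, K₀, M, L, k₀, ν) are decoded from the garbled slide and the parameter values below are illustrative; the sketch mainly demonstrates the conservation property: because every row of K sums to zero, the merit-driven term redistributes samples without changing their total.

```python
import numpy as np

def allocate_resources(N, N0, delta_op_best, m, L=0.1, k0=0.05, nu=1.0):
    """Sketch of N_{t+1} = N_t - K M^-1 dOP_best - K0 M^-1 (N_t - N0),
    with K = L*(m*I - ones), K0 = k0*I, M = nu*I."""
    I = np.eye(m)
    K = L * (m * I - np.ones((m, m)))   # rows sum to zero -> conservation
    K0 = k0 * I
    M_inv = I / nu
    # merit term shifts samples toward sets with better (more negative)
    # objective changes; the K0 term pulls allocations back toward N0
    return N - K @ M_inv @ delta_op_best - K0 @ M_inv @ (N - N0)
```

With equal starting allocations and ΔOP_best = (−1, 0, 1), the set with the largest improvement gains samples at the expense of the worst one, while the total stays fixed.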

SLIDE 16


Information Flow Throughout ATM Components

Diagram: parameter sets 1-4 produce suggestions for samples; the samples are evaluated; values of the objective function feed the resource allocation, which updates the parameter sets; repeat.

SLIDE 17

ATM Optimization Process


Convergence plots: Rotated Ellipse (f10), Sharp Ridge (f13), Sum of Different Powers (f14).

SLIDE 18

Results on BBOB Testbed - Overview

Succeeds at solving:

  • 23/24 functions in 2D
  • 8/24 functions in 40D (+ large budget)

  • One of the best optimizers for the separable functions subset (f1-f5)
  • Underperforms on non-separable functions, especially if ill-conditioned and/or noisy


SLIDE 19

Results on BBOB Testbed - Successes


  • Very effective at optimizing separable functions
  • Capable at optimizing functions with "large" convex regions around the global minima ("large" = comparable to R_max)

SLIDE 20

Results on BBOB Testbed - Underperformance


  • Poor performance on rotated and ill-conditioned functions
  • Poor performance on rotated and noisy/multimodal functions

SLIDE 21

Results from BBOB Large-scale


Budget = 3000D

SLIDE 22

Ability to Scale to Large Search Spaces


Internal runtime of the ATM algorithm scales linearly as a function of the number of variables in the search space

Results from timing experiment:

  • Internal runtime = total runtime - evaluation time
  • 128 function evaluations, averaged over 3 runs
  • f1 sphere function
  • Measured: number of variables needed to pass 1.0 sec of internal runtime
  • Google Colab GPU

Figure: internal runtime (seconds) as a function of the number of variables in the search space, comparing ATM, CMA (pip install cma), Nelder-Mead (scipy.optimize), BFGS (scipy.optimize), and L-BFGS-B (scipy.optimize).
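The timing protocol described above (internal runtime = total runtime minus objective-evaluation time) can be sketched as below. The `optimizer_step` interface is a hypothetical stand-in for one ATM sampling iteration, not the author's API.

```python
import time
import numpy as np

def internal_runtime(optimizer_step, f, dim, n_evals=128):
    """Measure time spent in the optimizer itself, excluding the time
    spent evaluating the objective function f."""
    x, y_best = np.zeros(dim), float("inf")
    eval_time = 0.0
    t0 = time.perf_counter()
    for _ in range(n_evals):
        cand = optimizer_step(x)
        te = time.perf_counter()
        y = f(cand)                          # excluded from internal time
        eval_time += time.perf_counter() - te
        if y < y_best:
            x, y_best = cand, y
    return (time.perf_counter() - t0) - eval_time
```

Running this for a range of `dim` values and plotting the result is how one would verify the linear-scaling claim on this slide.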

SLIDE 23

Recent Progress

  • Introduced a primary axis, updated by a moving-average rule
  • Performance of the ATM is improved on rotated functions


Figure: convergence plots (log10 Δg vs. iterations) for Rotated Ellipse (f10) and Sharp Ridge (f13), comparing the original ATM with the new version.

SLIDE 24

Goals Moving Forward

  • Improve performance on rotated and ill-conditioned functions (without using DxD objects)
  • Increase performance in noisy environments, using averaging and a moving mean
  • Add a second population with weak restart conditions, for multimodal functions
  • Make the ATM more user friendly and customizable

For more information see: https://github.com/BjBodner/ATM-optimization-algorithm


SLIDE 25

Conclusions

  • Scales linearly with the size of the search space: no DxD objects
  • Very efficient at optimizing separable functions
  • Underperforms on rotated functions, e.g., ill-conditioned and/or noisy functions
  • Good candidate for optimizing very high-dimensional problems
  • More research is needed


The ATM Algorithm

SLIDE 26

Acknowledgements

Dr. Brenda Rubenstein, Brown University, Providence, RI, USA, for her guidance in developing this algorithm. Her contributions and encouragement were essential in advancing this project and getting it to its current form.

Dr. Eran Treister, Ben-Gurion University, Beersheva, Israel, for his ongoing collaboration. Working with him is significantly helping improve the performance of the algorithm.


SLIDE 27

References


  • Nikolaus Hansen. Benchmarking a BI-Population CMA-ES on the BBOB-2009 Function Testbed. GECCO '09: Proceedings of the 11th Annual Conference Companion on Genetic and Evolutionary Computation Conference: Late Breaking Papers, pages 2389-2396.
  • Bin Qian, Angel R. Ortiz, David Baker. Improvement of comparative model accuracy by free-energy optimization along principal components of natural structural variation. PNAS, October 26, 2004, vol. 101, no. 43, 1534.
  • Dan Vladimir Nichita, Susana Gomez, Eduardo Luna. Multiphase equilibria calculation by direct minimization of Gibbs free energy with a global optimization method. Computers and Chemical Engineering 26 (2002), 1703-1724.

SLIDE 28

References

  • Jacob Devlin, Ming-Wei Chang, Kenton Lee, Kristina Toutanova. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. arXiv:1810.04805.
  • Shai Shalev-Shwartz, Ohad Shamir, Shaked Shammah. Failures of Gradient-Based Deep Learning. ICML '17: Proceedings of the 34th International Conference on Machine Learning, Volume 70, pages 3067-3075, 2017.
  • Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun. Deep Residual Learning for Image Recognition. arXiv:1512.03385v1 [cs.CV], 10 Dec 2015.
  • Sutskever, I., Martens, J., Dahl, G., Hinton, G. On the importance of initialization and momentum in deep learning. In Proceedings of the 30th International Conference on Machine Learning, Volume 28, ICML '13, III-1139-III-1147 (JMLR.org, 2013).


SLIDE 29

Thank you! Questions?

Email: benjamin_bodner@brown.edu
For more information see: https://github.com/BjBodner/ATM-optimization-algorithm