
Bayesian Optimization of Gaussian Processes applied to Performance Tuning

Ramki Ramakrishna (@ysr1729, #TwitterVMTeam), QCon Sao Paulo, 2019


  1. Bayesian Optimization of Gaussian Processes applied to Performance Tuning • Ramki Ramakrishna • @ysr1729 • #TwitterVMTeam • QCon Sao Paulo, 2019

  2. A JVM Engineer talks to a Data Scientist

  3. Many Hundreds of Services

  4. Several Tens of Thousands of Physical Servers

  5. Several Millions of CPU Cores

  6. Several Hundreds of Thousands of Twitter JVMs

  7. A Few Hundred Tunable JVM Parameters

  8. Mining for Gold • 1930s South Africa • Prospecting for gold and other minerals • Daniel Krige, 1951: “Kriging” in geostatistics • Jonas Mockus, 1970s • Jones et al., 1980s • Rasmussen & Williams, 1990s: Gaussian Processes

  9. Applications • design of expensive experiments • optimal designs • optimization of engineered materials • hyperparameter tuning (architectural parameters) of neural networks

  10. Engineering as Optimization • linear or non-linear objective function • finite convex or non-convex space, rectangular, linear (affine) or non-linear constraints • black box objective function • black box constraints • noisy objective function • noisy constraints

  11. Black Box Modeling • Model the unknown objective function • Model the unknown constraints • Model is a “surrogate” • Evaluations are expensive

  12. Models and Model Parameters • Parametric models • Non-parametric models

  13. Probabilistic Models • A measure of our uncertainty • A measure of measurement/observation noise

  14. Gaussian Process $\mathrm{GP}(\mu, \kappa)$ • $\mu(x)$: mean function • $\kappa(x, x')$: covariance function

  15. Gaussian Process • Two different views • a vector of possibly uncountably many Gaussian variables with given mean and a joint covariate distribution • a Gaussian distribution over functions

  16. Gaussian Process

  17. Gaussian Process

  18. Gaussian Process

  19. Gaussian Process

  20. Gaussian Process

  21. Gaussian Process

  22. Gaussian Process

  23. Gaussian Process

  24. $\mathrm{GP}_n$ • $\mu_n(x) = \kappa^T (K + \sigma^2_{\mathrm{noise}} I)^{-1} Y$ • $\kappa_n(x, x') = \kappa(x, x') - \kappa^T (K + \sigma^2_{\mathrm{noise}} I)^{-1} \kappa'$

  25. Covariance Kernel Function • Squared exponentials (SE) • “n/2” Matérn kernels

  26. Covariance Kernel Functions

  27. Covariance Kernel Functions

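  28. Acquisition Function • $\mathrm{GP}_{\mathrm{prior}} + \mathrm{Data}_n \xrightarrow{\text{Bayes}} \mathrm{GP}_n \xrightarrow{?} x_{n+1}$

  29. Acquisition Function • $\mathrm{GP}_{\mathrm{prior}} + \mathrm{Data}_{n+1} \xrightarrow{\text{Bayes}} \mathrm{GP}_{n+1} \xrightarrow{?} x_{n+2}$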

  30. Acquisition Functions • Thompson Sampling from the posterior GP (TS) • Probability of Improvement (PI) • Upper Confidence Bound (UCB) • Expected Improvement (EI)

  31. Thompson Sampling

  32. Probability of Improvement

  33. Upper Confidence Bound

  34. Expected Improvement

  35. Acquisition Function • Thompson Sampling from the posterior GP (TS) • Probability of Improvement (PI) • Upper Confidence Bound (UCB) • Expected Improvement (EI)

  36. Maximizing the Acquisition Function • piecewise infinitely smooth • gradient-based techniques work • modified Monte-Carlo techniques are typically used

  37. Optimizing Performance Parameters

      myExpt = Optimizer.declareDevice(Parm1: {Int, Min1, Max1},
                                       Parm2: {Real, Min2, Max2},
                                       Parm3: {Enum, enum1, enum2, enum3} …)
      myExpt.setSLA(…)                            // set performance SLA
      myExpt.setTerminationCriteria(…)            // set termination criteria
      while (!myExpt.shouldTerminate()) {
        parmSuggestion = myExpt.suggest()         // get another test suggestion
        newRun = myDevice.test(parmSuggestion)    // test device at given setting
        if (myExpt.isValid(newRun)) {             // is SLA met?
          myExpt.update(parmSuggestion, newRun)   // update w/new result
        }
      }
      return myExpt.bestConfig()

  38. Bayesian Optimization (same pseudocode as slide 37)

  39. Constraints (same pseudocode as slide 37)

  40. AUTOTUNE AS A SERVICE

  41. GizmoDuck & Garbage Collection Overhead via Tuning JVM Parameters

  42. TweetyPie & CPU Utilization via Tuning Graal JIT Parameters
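
As a rough illustration of how slides 24 through 37 fit together, the Python sketch below computes the GP posterior mean and variance from slide 24 with a squared-exponential kernel (slide 25), scores candidates by Expected Improvement (slide 34), and runs a suggest/update loop in the spirit of slide 37. Everything in it is an assumption made for illustration and not the Autotune implementation: the zero prior mean, the fixed kernel hyperparameters, the grid search standing in for the acquisition maximizer of slide 36, the hypothetical `objective` function, and all function names. It handles neither SLA constraints nor integer or enum parameters.

```python
import math


def se_kernel(a, b, length_scale=1.0, variance=1.0):
    """Squared-exponential covariance kappa(x, x') for scalar inputs."""
    return variance * math.exp(-0.5 * ((a - b) / length_scale) ** 2)


def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    z = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(M[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (M[i][n] - s) / M[i][i]
    return z


def gp_posterior(xs, ys, x, noise_var=1e-4):
    """Posterior mean mu_n(x) and variance kappa_n(x, x), zero prior mean."""
    n = len(xs)
    # A = K + sigma^2_noise * I
    A = [[se_kernel(xs[i], xs[j]) + (noise_var if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    k = [se_kernel(xi, x) for xi in xs]  # the vector kappa
    mean = sum(ki * ai for ki, ai in zip(k, solve(A, ys)))
    var = se_kernel(x, x) - sum(ki * vi for ki, vi in zip(k, solve(A, k)))
    return mean, max(var, 0.0)  # clip tiny negative round-off


def expected_improvement(mean, var, best):
    """EI for maximization: E[max(f(x) - best, 0)] with f(x) ~ N(mean, var)."""
    sd = math.sqrt(var)
    if sd < 1e-12:
        return max(mean - best, 0.0)
    z = (mean - best) / sd
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return (mean - best) * cdf + sd * pdf


def suggest(xs, ys, lo, hi, grid=200):
    """Return the grid point with the highest EI (crude acquisition maximizer)."""
    best = max(ys)
    cands = [lo + (hi - lo) * i / (grid - 1) for i in range(grid)]
    return max(cands,
               key=lambda x: expected_improvement(*gp_posterior(xs, ys, x), best))


def objective(x):
    """Hypothetical black-box score to maximize; its peak is at x = 2."""
    return math.exp(-(x - 2.0) ** 2)


# Suggest/update loop: start from the two endpoints, then run ten trials.
xs = [0.0, 5.0]
ys = [objective(x) for x in xs]
for _ in range(10):
    x_next = suggest(xs, ys, 0.0, 5.0)  # analogous to myExpt.suggest()
    xs.append(x_next)                   # analogous to myExpt.update(...)
    ys.append(objective(x_next))
best_x = xs[ys.index(max(ys))]
print(best_x, max(ys))
```

In a real tuning run each call to `objective` would be an expensive test of a service at the suggested setting, which is why the loop spends its effort on choosing the next point and not on evaluating many points.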
