Comparison of Ordinal and Metric Gaussian Process Regression as Surrogate Models for CMA Evolution Strategy


  1. Comparison of Ordinal and Metric Gaussian Process Regression as Surrogate Models for CMA Evolution Strategy. Zbyněk Pitra 1,2,3, Lukáš Bajer 1,4, Jakub Repický 1,4, Martin Holeňa 1. 1 Institute of Computer Science, Czech Academy of Sciences; 2 Faculty of Nuclear Sciences and Physical Engineering; 3 National Institute of Mental Health; 4 Faculty of Mathematics and Physics, Charles University. Prague, Czech Republic. GECCO 2017. Running header: DTS-CMA-ES, Surrogate models, Experimental results. Running footer: Z Pitra, L Bajer, J Repický, M Holeňa, Comparison Ordinal vs. Metric GP for CMA-ES.

  2. Contents: (1) DTS-CMA-ES; (2) Surrogate models: Metric Gaussian Processes, Ordinal Gaussian Processes; (3) Experimental results.

  3. DTS-CMA-ES. Initialize: standard CMA-ES initialization with population doubled. While not terminated: (1) CMA-ES sampling of the population, $x_i \sim \mathcal{N}(m, \sigma^2 C)$ for $i = 1, \dots, \lambda$. [Figure: CMA-ES sampling from $\mathcal{N}(m_1, \sigma_1)$.]

  4. DTS-CMA-ES, continued. (2) Train the first model $f_{M1}$ on the points evaluated so far with the original fitness. [Figure: 1st model training.]

  5. DTS-CMA-ES, continued. (3) Get the mean $\hat{\mu}_i$ and variance $\hat{s}^2_i$ of all $x_i$ with the model $f_{M1}$. [Figure: distribution prediction according to the 1st model.]

  6. DTS-CMA-ES, continued. (4) Select the most promising $\lceil \alpha\lambda \rceil$ points according to the model $f_{M1}$. [Figure: criterion ranking (1st, 2nd, 3rd) according to the 1st model.]

  7. DTS-CMA-ES, continued. (5) Evaluate the chosen points with the original fitness $f$. [Figure: fitness evaluation of a few chosen points.]

  8. DTS-CMA-ES, continued. (6) Re-train the second model $f_{M2}$ with these new points. [Figure: 2nd model training.]

  9. DTS-CMA-ES, continued. (7) Predict the fitness of the points not evaluated with the original fitness using $f_{M2}$. [Figure: 2nd model mean-prediction for the rest of the population.]

  10. DTS-CMA-ES, continued. (8) CMA-ES update of $m$, $\sigma$, $C$. [Figure: CMA-ES update, labeled $m_2, \sigma_2$.]
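
The eight steps built up on slides 3 to 10 can be sketched in Python roughly as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: `sq_exp_kernel`, `TinyGP`, and `dts_generation` are hypothetical names; the surrogate is a GP with a squared-exponential kernel and fixed hyperparameters; the fitness is minimized and "most promising" is taken to mean lowest predicted mean, which the slides do not specify; and step 8 is replaced by a crude mean shift, because a full CMA-ES update of m, σ, C does not fit a short example.

```python
import math
import numpy as np


def sq_exp_kernel(A, B, length_scale=1.0):
    """Squared-exponential covariance between rows of A and B (assumed kernel)."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * d2 / length_scale**2)


class TinyGP:
    """Constant-mean GP with fixed hyperparameters; stands in for f_M1 and f_M2."""

    def __init__(self, X, y, jitter=1e-6):
        self.X = X
        self.y_mean = y.mean()  # constant mean function (assumption)
        C = sq_exp_kernel(X, X) + jitter * np.eye(len(X))
        self.L = np.linalg.cholesky(C)
        self.alpha = np.linalg.solve(self.L.T, np.linalg.solve(self.L, y - self.y_mean))

    def predict(self, Xs):
        K = sq_exp_kernel(Xs, self.X)              # one row of k^T per new point
        mean = K @ self.alpha + self.y_mean
        v = np.linalg.solve(self.L, K.T)
        var = np.maximum(1.0 - (v**2).sum(axis=0), 0.0)  # k(x*, x*) = 1 for this kernel
        return mean, var


def dts_generation(f, m, sigma, C, archive_X, archive_y, lam, alpha, rng):
    """One generation of the doubly trained loop (steps 1 to 8), simplified."""
    X = rng.multivariate_normal(m, sigma**2 * C, size=lam)   # (1) sample population
    model1 = TinyGP(archive_X, archive_y)                    # (2) train first model
    mu_hat, s2_hat = model1.predict(X)                       # (3) mean and variance
    n_orig = math.ceil(alpha * lam)
    chosen = np.argsort(mu_hat)[:n_orig]                     # (4) assumed criterion: lowest mean
    y = np.empty(lam)
    y[chosen] = [f(x) for x in X[chosen]]                    # (5) original fitness
    archive_X = np.vstack([archive_X, X[chosen]])
    archive_y = np.concatenate([archive_y, y[chosen]])
    model2 = TinyGP(archive_X, archive_y)                    # (6) re-train second model
    rest = np.setdiff1d(np.arange(lam), chosen)
    y[rest], _ = model2.predict(X[rest])                     # (7) predict the rest
    best = np.argsort(y)[: lam // 2]
    m_new = X[best].mean(axis=0)                             # (8) placeholder, NOT a CMA-ES update
    return m_new, sigma, C, archive_X, archive_y


# Usage on a toy sphere function with a small initial archive.
rng = np.random.default_rng(0)
sphere = lambda x: float(np.sum(x**2))
init_X = rng.normal(size=(5, 2))
init_y = np.array([sphere(x) for x in init_X])
m_new, sigma, C, arch_X, arch_y = dts_generation(
    sphere, np.ones(2), 0.5, np.eye(2), init_X, init_y, lam=8, alpha=0.25, rng=rng
)
```

With `lam=8` and `alpha=0.25`, only two of the eight sampled points are evaluated with the original fitness in this generation; the remaining six receive predicted values from the re-trained model.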

  11. Gaussian Process. A GP is a stochastic process in which any finite collection of random variables has a joint Gaussian distribution: $f_{GP}(x) \sim \mathcal{GP}(\mu(x), k(x_1, x_2))$. It is defined by the mean function $\mu(x)$ (usually constant), the covariance function $k(x_1, x_2)$, and their (hyper)parameters.

  12. Gaussian Process, continued. A GP can express the uncertainty of the prediction at a new point $x$: it gives a probability distribution of the output value.

  13. Gaussian Process. Given a set of $N$ training points $X_N = (x_1 \dots x_N)$, $x_i \in \mathbb{R}^d$, and the corresponding measured values $y_N = (y_1, \dots, y_N)^\top$ of a function $f$ being approximated: $y_i = f(x_i)$, $i = 1, \dots, N$.

  14. Gaussian Process, continued. The GP considers the vector of these function values as a sample from an $N$-variate Gaussian distribution: $y_N \sim \mathcal{N}(0, C_N)$.
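
To make the statement $y_N \sim \mathcal{N}(0, C_N)$ concrete, the sketch below builds a covariance matrix for a few one-dimensional points and draws one sample from the corresponding Gaussian. The squared-exponential kernel, the small diagonal term `noise_var`, and all function names are assumptions for illustration; the slides do not fix a covariance function here.

```python
import numpy as np


def sq_exp_kernel(A, B, length_scale=1.0):
    """Squared-exponential covariance between rows of A and B (assumed kernel)."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * d2 / length_scale**2)


def covariance_matrix(X, noise_var=1e-6):
    """C_N: kernel values k(x_i, x_j) for every pair, plus a small diagonal term."""
    return sq_exp_kernel(X, X) + noise_var * np.eye(len(X))


rng = np.random.default_rng(0)
X_N = np.linspace(-3.0, 3.0, 10).reshape(-1, 1)   # N = 10 points in R^1
C_N = covariance_matrix(X_N)

# If z ~ N(0, I) and C_N = L L^T, then L z ~ N(0, C_N).
L = np.linalg.cholesky(C_N)
y_N = L @ rng.standard_normal(len(X_N))
```

Nearby inputs get strongly correlated values under this kernel, so the sampled `y_N` varies smoothly along `X_N`.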

  15. Gaussian Process prediction. When considering a new point $(x^*, y^*)$, the probability density of its $f$-value is a 1D Gaussian: $p(y^* \mid X_N, x^*, y_N) \sim \mathcal{N}(\hat{\mu}_{N+1}, \hat{s}^2_{N+1})$.

  16. Gaussian Process prediction, continued. The mean and variance of the predictive distribution are given by
      $\hat{\mu}_{N+1} = k^\top C_N^{-1} y_N$,
      $\hat{s}^2_{N+1} = \kappa - k^\top C_N^{-1} k$,
      where $C_N$ is the GP covariance matrix, i.e. the matrix of covariance-function values $k(x_i, x_j)$ for each pair $x_i, x_j$; $k$ is the vector of covariance-function values $k(x^*, x_i)$ between the new point $x^*$ and each $x_i \in X_N$; and $\kappa = k(x^*, x^*)$ is the variance of the new point itself.
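
The two prediction formulas translate almost directly into NumPy. The following is a sketch assuming a zero-mean GP with a squared-exponential kernel and a tiny diagonal term for numerical stability; `sq_exp_kernel`, `gp_predict`, and `noise_var` are hypothetical names, not part of the presentation.

```python
import numpy as np


def sq_exp_kernel(A, B, length_scale=1.0):
    """Squared-exponential covariance between rows of A and B (assumed kernel)."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * d2 / length_scale**2)


def gp_predict(X_N, y_N, x_star, noise_var=1e-8):
    """Return (mu_hat, s2_hat) at a single new point x_star for a zero-mean GP."""
    C_N = sq_exp_kernel(X_N, X_N) + noise_var * np.eye(len(X_N))
    k = sq_exp_kernel(X_N, x_star[None, :])[:, 0]               # k(x*, x_i) for all i
    kappa = sq_exp_kernel(x_star[None, :], x_star[None, :])[0, 0]  # k(x*, x*)
    # Solve linear systems instead of forming C_N^{-1} explicitly.
    mu_hat = k @ np.linalg.solve(C_N, y_N)         # k^T C_N^{-1} y_N
    s2_hat = kappa - k @ np.linalg.solve(C_N, k)   # kappa - k^T C_N^{-1} k
    return mu_hat, s2_hat


X_N = np.array([[-1.0], [0.0], [1.5]])
y_N = np.sin(X_N[:, 0])

mu_train, s2_train = gp_predict(X_N, y_N, X_N[2])          # at a training point
mu_far, s2_far = gp_predict(X_N, y_N, np.array([10.0]))    # far from all data
```

At a training point the predicted mean reproduces the observed value and the variance is close to zero; far from the data, `k` is nearly zero, so the mean falls back to the prior mean 0 and the variance approaches $\kappa$.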
