Knowing The What, But Not The Where in Bayesian Optimization


SLIDE 1

Vu Nguyen

Knowing The What, But Not The Where in Bayesian Optimization

Vu Nguyen & Michael A. Osborne, University of Oxford


SLIDE 2

Black-box Optimization

The relationship from the input x to the output y = f(x) is only through the black box.

[Diagram: input x → black box f(·) → output y = f(x); a few sampled evaluations f(x₁), f(x₂), f(x₃); we are looking for the maximizer of f.]
SLIDE 3

Properties of Black-box Function

f : x ∈ ℛᵈ → y ∈ ℛ

  • y = f(x): the functional form of f is not known, and no derivative form is available
  • f is expensive to evaluate (in time and cost)
  • Nothing is known about the function, except a few evaluations yᵢ = f(xᵢ)
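These properties can be mimicked in code. The wrapper below is an illustration (not from the talk): it hides the functional form and the gradients from the optimizer, exposing only evaluations, and counts the expensive calls.

```python
class BlackBox:
    """Expose an objective only through evaluations y = f(x):
    no closed form, no derivatives, and a counter for costly calls."""

    def __init__(self, f):
        self._f = f            # hidden from the optimizer
        self.n_evals = 0

    def __call__(self, x):
        self.n_evals += 1      # each evaluation is assumed expensive
        return self._f(x)

# The optimizer only ever sees input/output pairs.
f = BlackBox(lambda x: -(x - 0.3) ** 2)
y = f(0.5)                     # one (expensive) evaluation
```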

SLIDE 4

Bayesian Optimization Overview


The Bayes Opt loop:

  • Make a series of evaluations y₁ = f(x₁), y₂ = f(x₂), …
  • Fit a surrogate function (GP), giving the predictive mean μ(x) and predictive variance σ²(x)
  • Maximize an acquisition function balancing exploit and explore: α(x) = μ(x) + κ × σ(x)
  • Evaluate f at the chosen input, refine the surrogate, and repeat
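The loop can be sketched end-to-end in a few dozen lines. This is a minimal illustration, not the talk's implementation: a hand-rolled GP with a squared-exponential kernel at a fixed length-scale, GP-UCB as the acquisition α(x) = μ(x) + κ σ(x), and a 1-D grid of candidates; the toy objective and all settings are placeholders.

```python
import numpy as np

def rbf(a, b, ls=0.2):
    # Squared-exponential kernel for 1-D inputs.
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ls**2)

def gp_posterior(X, y, Xs, jitter=1e-5):
    # GP predictive mean mu(x) and variance sigma^2(x) at candidates Xs.
    K = rbf(X, X) + jitter * np.eye(len(X))
    Ks = rbf(Xs, X)
    mu = Ks @ np.linalg.solve(K, y)
    var = 1.0 - np.sum(Ks * np.linalg.solve(K, Ks.T).T, axis=1)
    return mu, np.maximum(var, 1e-12)

def bayes_opt(f, n_iter=15, kappa=2.0, seed=0):
    rng = np.random.default_rng(seed)
    Xs = np.linspace(0.0, 1.0, 200)       # candidate grid
    X = rng.uniform(0.0, 1.0, 3)          # small initial design
    y = np.array([f(x) for x in X])
    for _ in range(n_iter):
        mu, var = gp_posterior(X, y, Xs)
        ucb = mu + kappa * np.sqrt(var)   # exploit + explore
        x_next = Xs[np.argmax(ucb)]       # maximize the acquisition
        X, y = np.append(X, x_next), np.append(y, f(x_next))
    return X[np.argmax(y)], y.max()

# Toy black box: x*sin(6x) on [0, 1], maximum ~0.30 near x ~ 0.34.
x_best, y_best = bayes_opt(lambda x: x * np.sin(6 * x))
```

With f∗ unknown, UCB must trade off exploitation and exploration on its own; the later slides replace this acquisition with ones that exploit a known f∗.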

SLIDE 5

Outline

  • Bayesian Optimization
  • Bayes Opt with Known Optimum Value

SLIDE 6

Knowing Optimum Value of The Black-Box

We consider situations where the optimum value is known: f∗ = max f(x) is given, and the goal is to find x∗ = arg max f(x).

SLIDE 7

Examples of Knowing Optimal Value of The Black-Box

Deep reinforcement learning:

  • CartPole: 200
  • Pong: 18
  • Frozen Lake: 0.79 ± 0.05
  • InvertedPendulum: 950

Classification:

  • Skin dataset: accuracy 100%

Inverse optimization:

  • Given a database and a target property y∗, identify a corresponding data point x∗.

SLIDE 8

What can f∗ tell us about f?

1. f∗ tells us the upper bound: f∗ ≥ f(x), ∀x.

2. f∗ tells us that the function reaches f∗ at some point.

SLIDE 9

Transformed Gaussian process

g ∼ GP(√(2 f∗), K),   f(x) = f∗ − ½ g²(x),   g²(x) ≥ 0.   (1)

This condition ensures that f∗ ≥ f(x), ∀x.

SLIDE 10

Push down: the surrogate must not go above f∗. We want to control the surrogate using f∗.

[Figure: the standard GP surrogate μ(x) goes above f∗; the transformed GP (condition 1) stays below f∗.]

SLIDE 11

Transformed Gaussian process

g ∼ GP(0, K),   f(x) = f∗ − ½ g²(x),   g²(x) ≥ 0.   (2)

Zero-mean prior! This condition encourages that there is a point where g(x) = 0 and thus f(x) = f∗.

SLIDE 12

Lift up: the surrogate should reach f∗. We want to control the surrogate using f∗.

[Figure: the standard GP surrogate does not reach f∗; the transformed GP (condition 2) reaches f∗.]

SLIDE 13

Transformed Gaussian process

Linearization using a Taylor expansion around the GP posterior mean μ_g(x):

f(x) = f∗ − ½ g²(x) ≈ f∗ + ½ μ_g²(x) − μ_g(x) g(x)

A linear transformation of a GP remains Gaussian, so the predictive distribution is

f(x) ∼ N( f∗ − ½ μ_g²(x),  μ_g²(x) σ_g²(x) )

The Taylor expansion is very accurate at the mode, which is μ_g(x).
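Once the GP posterior over g is available, the linearized predictive distribution is cheap to compute. A sketch (the function name is illustrative): given μ_g(x) and σ_g²(x), the induced Gaussian on f(x) has mean f∗ − ½ μ_g²(x) and variance μ_g²(x) σ_g²(x).

```python
import numpy as np

def transformed_gp_predict(mu_g, var_g, f_star):
    # Linearize f(x) = f* - 0.5*g(x)^2 around the posterior mean mu_g(x):
    #   f ≈ f* + 0.5*mu_g^2 - mu_g * g,
    # a linear map of the Gaussian g, hence Gaussian.
    mu_f = f_star - 0.5 * mu_g**2    # always <= f*  (condition 1)
    var_f = mu_g**2 * var_g          # vanishes where mu_g = 0  (condition 2)
    return mu_f, var_f

# Where mu_g = 0, the surrogate touches f* with zero variance.
mu_f, var_f = transformed_gp_predict(np.array([2.0, 0.0]),
                                     np.array([0.25, 0.5]), f_star=1.0)
```

Note how both conditions show up directly: the mean never exceeds f∗, and wherever μ_g(x) = 0 the surrogate pins f(x) = f∗ exactly.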

SLIDE 14

Outline

  • Bayesian Optimization
  • Bayes Opt with Known Optimum Value f∗
    • Problem definition
    • Exploiting f∗
      • Building a better surrogate model
      • Making an informed decision

SLIDE 15

Confidence Bound Minimization

Under the GP surrogate model, we have this condition w.h.p.:

μ(x) − √β σ(x) ≤ f(x) ≤ μ(x) + √β σ(x),  ∀x

where β is defined following [Srinivas et al 2010]. At the optimum location, this means

μ(x∗) − √β σ(x∗) ≤ f∗ = f(x∗) ≤ μ(x∗) + √β σ(x∗)

Here f∗ is known, x∗ is unknown, and μ(x), σ(x) can be estimated ∀x.

SLIDE 16

Confidence Bound Minimization

The best candidate for x∗ is where the bound is tight:

x_next = arg minₓ |μ(x) − f∗| + √β σ(x)

The inequality becomes an equality at the true x∗ location, where

μ(x∗) − √β σ(x∗) = f∗ = μ(x∗) + √β σ(x∗)

i.e., when μ(x∗) = f∗ and σ(x∗) = 0.
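As a sketch (with β fixed as a placeholder rather than the schedule from [Srinivas et al 2010]), the CBM score can be computed and minimized over a candidate set:

```python
import numpy as np

def cbm(mu, sigma, f_star, beta=4.0):
    # Confidence Bound Minimization score (smaller is better): the bound
    # around f* is tight where mu(x) is close to f* and sigma(x) is small.
    return np.abs(mu - f_star) + np.sqrt(beta) * sigma

# Candidate 1 predicts close to f* = 1.0 with low uncertainty, so it wins.
mu = np.array([0.2, 0.9, 0.6])
sigma = np.array([0.3, 0.05, 0.4])
x_next = int(np.argmin(cbm(mu, sigma, f_star=1.0)))
```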

SLIDE 17

Expected Regret Minimization

Regret r(x) = f∗ − f(x), where f∗ = max f(x) ≥ f(x), ∀x. Finding the optimum location x∗ is equivalent to minimizing the regret, so we can select the next point by minimizing the expected regret.

SLIDE 18

Expected Regret Minimization

Using analytical derivation, we derive the closed-form computation for ERM:

α(x) = σ(x) φ(z) + [f∗ − μ(x)] Φ(z),   z = [f∗ − μ(x)] / σ(x)

where φ is the Gaussian PDF, Φ is the Gaussian CDF, and μ(x), σ(x) are the GP predictive mean and standard deviation. See the paper for details!
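The closed form is straightforward to implement; this sketch uses only the standard library and scalar inputs:

```python
import math

def erm(mu, sigma, f_star):
    # Expected regret E[max(f* - f, 0)] for f ~ N(mu, sigma^2); under the
    # transformed model f <= f*, so this equals the expected regret itself.
    z = (f_star - mu) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)  # Gaussian PDF
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))         # Gaussian CDF
    return sigma * pdf + (f_star - mu) * cdf

# The next point is the candidate x minimizing erm(mu(x), sigma(x), f_star).
```

Two limiting cases check the formula: at μ = f∗ the score reduces to σ φ(0), and when μ is far below f∗ with small σ it approaches f∗ − μ.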

SLIDE 19

Illustration

[Figure: existing baselines tend to explore elsewhere; the proposed methods correctly identify the true (unknown) optimum location.]

SLIDE 20

The GP transformation is helpful in high dimensions.


SLIDE 21

XGBoost Classification and DRL

  • Skin dataset (UCI): f∗ = 100
  • CartPole (DRL): f∗ = 200

SLIDE 22

Mis-specified f∗ will degrade the performance

  • Under-specified f∗ (smaller than the true f∗): more serious, as the algorithm will get stuck.
  • Over-specified f∗ (greater than the true f∗): less serious, but still poor performance.

SLIDE 23

Take Home Messages

  • Bayes Opt is efficient for optimizing black-box functions.
  • When the optimum value is known, we can exploit this knowledge for better optimization.

SLIDE 24

Question and Answer


vu@robots.ox.ac.uk @nguyentienvu https://ntienvu.github.io