Regret bounds for meta Bayesian optimization with an unknown Gaussian process prior (PowerPoint presentation)



SLIDE 1

Zi Wang* Beomjoon Kim* Leslie Pack Kaelbling

Regret bounds for meta Bayesian optimization with an unknown Gaussian process prior

Dec 5 @ NeurIPS 18

Poster #22

SLIDE 2

Bayesian optimization

Goal: x* = argmax_{x ∈ 𝔜} f(x)

Challenges:

  • f is expensive to evaluate
  • f is multi-peak
  • no gradient information
  • evaluations can be noisy
SLIDE 3

Bayesian optimization

Goal: x* = argmax_{x ∈ 𝔜} f(x)

Challenges:

  • f is expensive to evaluate
  • f is multi-peak
  • no gradient information
  • evaluations can be noisy

Assume a GP prior f ∼ GP(μ, k)

LOOP

  • choose new query point(s) to evaluate
  • compute the posterior GP model
[Figure: GP posterior of f(x) over x after queries 1, 2, and 3]
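The loop on this slide can be sketched in a few lines for a finite candidate set. This is an illustrative reconstruction, not the authors' code: it assumes a zero-mean GP prior with a squared-exponential kernel and a plain UCB acquisition rule, and all function names and parameter values here are made up for the example.

```python
import numpy as np

def sq_exp_kernel(A, B, lengthscale=0.2):
    """Squared-exponential kernel matrix between 1-D point sets A and B."""
    d = A[:, None] - B[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)

def gp_posterior(X_obs, y_obs, X_query, noise=1e-4):
    """Standard GP posterior mean and variance under a zero-mean prior."""
    K = sq_exp_kernel(X_obs, X_obs) + noise * np.eye(len(X_obs))
    Ks = sq_exp_kernel(X_obs, X_query)
    alpha = np.linalg.solve(K, y_obs)
    mean = Ks.T @ alpha
    v = np.linalg.solve(K, Ks)
    var = 1.0 - np.sum(Ks * v, axis=0)   # k(x, x) = 1 for this kernel
    return mean, np.maximum(var, 0.0)

def bayes_opt(f, X_cand, n_iter=10, beta=4.0):
    """LOOP: evaluate the UCB-maximizing candidate, update the posterior."""
    X_obs = np.array([X_cand[0]])
    y_obs = np.array([f(X_cand[0])])
    for _ in range(n_iter):
        mean, var = gp_posterior(X_obs, y_obs, X_cand)
        x_next = X_cand[np.argmax(mean + np.sqrt(beta * var))]
        X_obs = np.append(X_obs, x_next)
        y_obs = np.append(y_obs, f(x_next))
    best = np.argmax(y_obs)
    return X_obs[best], y_obs[best]
```

For example, `bayes_opt(lambda x: -(x - 0.3)**2, np.linspace(0, 1, 51))` concentrates its queries near x = 0.3 after a handful of evaluations.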

SLIDE 4

Bayesian optimization

Goal: x* = argmax_{x ∈ 𝔜} f(x)

Challenges:

  • f is expensive to evaluate
  • f is multi-peak
  • no gradient information
  • evaluations can be noisy

Assume a GP prior f ∼ GP(μ, k)

LOOP

  • choose new query point(s) to evaluate
  • compute the posterior GP model
[Figure: GP posterior of f(x) over x after queries 1, 2, and 3]

How to choose the prior?

SLIDE 5

Bayesian optimization

Goal: x* = argmax_{x ∈ 𝔜} f(x)

Assume a GP prior f ∼ GP(μ, k)

LOOP

  • choose new query point(s) to evaluate
  • compute the posterior GP model
[Figure: GP posterior of f(x) over x after queries 1, 2, and 3]

  • re-estimate the prior parameters, e.g. by maximizing the marginal data likelihood every few iterations

Challenges:

  • f is expensive to evaluate
  • f is multi-peak
  • no gradient information
  • evaluations can be noisy
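The re-estimation step ("maximizing marginal data likelihood") can be sketched as evidence maximization over kernel hyperparameters. A minimal illustration, assuming a zero-mean GP with a squared-exponential kernel and a simple grid search; the function names and the grid are hypothetical, not from the talk.

```python
import numpy as np

def log_marginal_likelihood(X, y, lengthscale, noise=1e-2):
    """log p(y | X, lengthscale) for a zero-mean GP with an SE kernel."""
    d = X[:, None] - X[None, :]
    K = np.exp(-0.5 * (d / lengthscale) ** 2) + noise * np.eye(len(X))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    return (-0.5 * y @ alpha                 # data-fit term
            - np.sum(np.log(np.diag(L)))     # complexity penalty (log det)
            - 0.5 * len(X) * np.log(2 * np.pi))

def fit_lengthscale(X, y, grid):
    """Re-estimate the prior: pick the lengthscale with highest evidence."""
    return max(grid, key=lambda ls: log_marginal_likelihood(X, y, ls))
```

The grid search stands in for the gradient-based optimization a practical GP library would use; the trade-off it balances (data fit vs. the log-determinant complexity penalty) is the same.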

SLIDE 6

Bayesian optimization

Goal: x* = argmax_{x ∈ 𝔜} f(x)

Assume a GP prior f ∼ GP(μ, k)

Which comes first? Data or prior?

Challenges:

  • f is expensive to evaluate
  • f is multi-peak
  • no gradient information
  • evaluations can be noisy

LOOP

  • choose new query point(s) to evaluate
  • compute the posterior GP model
  • re-estimate the prior parameters, e.g. by maximizing the marginal data likelihood every few iterations

SLIDE 7

Bayesian optimization

Goal: x* = argmax_{x ∈ 𝔜} f(x)

Assume a GP prior f ∼ GP(μ, k)

Hard to analyze. Which comes first? Data or prior?

Challenges:

  • f is expensive to evaluate
  • f is multi-peak
  • no gradient information
  • evaluations can be noisy

LOOP

  • choose new query point(s) to evaluate
  • compute the posterior GP model
  • re-estimate the prior parameters, e.g. by maximizing the marginal data likelihood every few iterations

SLIDE 8

Bayesian optimization with an unknown GP prior

[Figure: prior model and data collected on f, plotted over x]

SLIDE 9

Bayesian optimization with an unknown GP prior

[Figure: prior model and data collected on f, plotted over x]

Our problem setup: use past experience with similar functions as meta-training data to break the circular dependency.

SLIDE 10


Meta Bayesian optimization with an unknown GP prior

Offline phase | Online phase

SLIDE 11

[Figure: estimated prior μ̂(x) with μ̂(x) ± 3 √k̂(x) band over x]

Meta Bayesian optimization with an unknown GP prior

Offline phase: estimated prior μ̂, k̂

Online phase

Estimate the GP prior from offline data sampled from the same prior
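On a finite input set, the offline phase can be sketched as taking the empirical mean and the unbiased empirical covariance across the N meta-training functions, each evaluated on the same inputs. This is a simplified reading of the setup, not the paper's exact estimator:

```python
import numpy as np

def estimate_prior(F):
    """F: (N, m) array; row i holds meta-training function f_i evaluated
    on the same m inputs. Returns the estimated prior mean vector mu_hat
    and covariance matrix k_hat (unbiased empirical estimates)."""
    N = F.shape[0]
    mu_hat = F.mean(axis=0)
    centered = F - mu_hat
    k_hat = centered.T @ centered / (N - 1)
    return mu_hat, k_hat
```

Drawing many functions from a known GP and feeding them to `estimate_prior` recovers that GP's mean and covariance up to sampling noise, which shrinks as N grows (the point of the N = 1000 vs. N = 100 comparison later in the talk).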

SLIDE 12

[Figures: online posterior μ̂₀(x) with μ̂₀(x) ± ζ₁ √k̂₀(x) band; offline estimate μ̂(x) with μ̂(x) ± 3 √k̂(x) band, both over x]

Meta Bayesian optimization with an unknown GP prior

Offline phase: estimated prior μ̂, k̂

Online phase

Estimate the GP prior from offline data sampled from the same prior. Construct unbiased estimators of the posterior and use a variant of GP-UCB.
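One online step under the estimated prior can be sketched as a GP posterior update followed by a UCB-style pick with width ζ. This is an illustrative reconstruction on a finite input set; the paper's actual unbiased estimators and its schedule for ζ differ in detail, and all names here are made up for the example.

```python
import numpy as np

def meta_bo_step(mu_hat, k_hat, obs_idx, obs_y, zeta):
    """One online step: GP posterior under the *estimated* prior
    (mu_hat, k_hat) on a finite input set, then a UCB-style pick
    with confidence width zeta. Returns the next index to query."""
    if not obs_idx:
        post_mean = mu_hat.copy()
        post_var = np.diag(k_hat).copy()
    else:
        K_oo = k_hat[np.ix_(obs_idx, obs_idx)] + 1e-8 * np.eye(len(obs_idx))
        K_xo = k_hat[:, obs_idx]
        sol = np.linalg.solve(K_oo, np.array(obs_y) - mu_hat[obs_idx])
        post_mean = mu_hat + K_xo @ sol
        v = np.linalg.solve(K_oo, K_xo.T)
        post_var = np.diag(k_hat) - np.sum(K_xo.T * v, axis=0)
    ucb = post_mean + zeta * np.sqrt(np.maximum(post_var, 0.0))
    ucb[obs_idx] = -np.inf          # do not re-query observed points
    return int(np.argmax(ucb))
```

With no observations the rule picks the point with the highest estimated prior mean plus ζ times the estimated prior standard deviation; each observation then shrinks the band around the queried point, exactly the behaviour the animation on the next slides shows.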

SLIDE 13

[Figure, online phase: μ̂₁(x) with μ̂₁(x) ± ζ₂ √k̂₁(x) band over x]

SLIDE 14

[Figure, online phase: μ̂₂(x) with μ̂₂(x) ± ζ₃ √k̂₂(x) band over x]

SLIDE 15

[Figure, online phase: μ̂₃(x) with μ̂₃(x) ± ζ₄ √k̂₃(x) band over x]

SLIDE 16

[Figure, online phase: μ̂₄(x) with μ̂₄(x) ± ζ₅ √k̂₄(x) band over x]

SLIDE 17

Effect of N, the number of meta training functions

[Figure: μ̂ₜ(x) with μ̂ₜ(x) ± ζₜ₊₁ √k̂ₜ(x) band over x ∈ [0, 1], y-axis from −10 to 10, for N = 1000 (left) and N = 100 (right)]

SLIDE 18

Bounding the regret of meta BO with an unknown GP prior

Important assumptions:

  • meta-training functions come from the same prior
  • enough meta-training functions: N ≳ T + constant (constant ≈ 10)
  • observation noise σ²

Theorem (finite input space, linear kernel). Given T observations on the test function, with high probability, the simple regret satisfies

R_T ≲ O(√(1/(N − T))) + C_σ · O(√(log T / T)),

where C_σ is a constant depending on the observation noise σ².

Results for continuous input space @ poster #22

SLIDE 19

Empirical results on block picking and placing

[Figure: meta-training data f₁, f₂, …, f_N and a test function f; N = 1500]

[Plot: max observed value vs. number of evaluations of the test function (5 to 30), comparing Our method, UCB, TransLearn, and Rand]

[Plot: max observed value vs. proportion of meta-training data used (0.00 to 0.09), comparing Our method and UCB]

SLIDE 20

Poster #22

More results on:

  • estimation details for discrete and continuous input spaces
  • regret bounds for compact input spaces in ℝᵈ
  • regret bounds for probability of improvement in the meta learning setting
  • empirical results on robotics tasks

Regret bounds for meta Bayesian optimization with an unknown Gaussian process prior

https://ziw.mit.edu/meta_bo