Adversarially Robust Optimization with Gaussian Processes


Adversarially Robust Optimization with Gaussian Processes. Ilija Bogunovic, Jonathan Scarlett, Stefanie Jegelka, Volkan Cevher. Conference on Neural Information Processing Systems (Dec 2018).


SLIDE 1

Adversarially Robust Optimization with Gaussian Processes

Ilija Bogunovic, Jonathan Scarlett, Stefanie Jegelka, Volkan Cevher Conference on Neural Information Processing Systems (Dec 2018)

SLIDE 2

Gaussian Process Optimization

Non-robust problem:

$$x^* = \arg\max_{x \in D \subset \mathbb{R}^d} f(x)$$

Setting (GP/Bayesian optimization):

  • Unknown utility function $f$, modeled by a Gaussian process: $f \sim \mathrm{GP}(\mu, \kappa)$
  • Sequentially query the unknown function $f$
  • Noisy and expensive point evaluations

[Figure label: "Optimum"]
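
To make the modeling assumption concrete, here is a minimal sketch of the GP posterior that GP/Bayesian optimization methods rely on. The zero prior mean, the squared-exponential kernel, the lengthscale, the noise variance, and the names `rbf_kernel` and `gp_posterior` are all assumptions made for illustration; the slides only state $f \sim \mathrm{GP}(\mu, \kappa)$.

```python
import numpy as np

def rbf_kernel(a, b, lengthscale=0.5, variance=1.0):
    """Squared-exponential kernel between two sets of 1-D points (assumed choice of kappa)."""
    sq_dist = (a[:, None] - b[None, :]) ** 2
    return variance * np.exp(-0.5 * sq_dist / lengthscale**2)

def gp_posterior(x_train, y_train, x_test, noise_var=1e-2):
    """Posterior mean and standard deviation of a zero-mean GP at x_test."""
    k_tt = rbf_kernel(x_train, x_train) + noise_var * np.eye(len(x_train))
    k_ts = rbf_kernel(x_train, x_test)
    chol = np.linalg.cholesky(k_tt)
    # alpha = (K + noise_var * I)^{-1} y, computed via the Cholesky factor
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train))
    mean = k_ts.T @ alpha
    v = np.linalg.solve(chol, k_ts)
    var = rbf_kernel(x_test, x_test).diagonal() - np.sum(v**2, axis=0)
    return mean, np.sqrt(np.maximum(var, 0.0))

# Hypothetical noisy evaluations of an unknown function
x_train = np.array([0.1, 0.4, 0.9])
y_train = np.sin(6 * x_train)
x_test = np.linspace(0.0, 1.0, 5)
mean, std = gp_posterior(x_train, y_train, x_test)
```

The posterior mean and standard deviation are the ingredients of confidence bounds of the form mean ± (scaling) × std, which many BO acquisition rules use to decide where to query next.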

SLIDE 3

Adversarially Robust GP Optimization

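Robust problem:

$$x^* = \arg\max_{x \in D \subset \mathbb{R}^d} \min_{\delta \in \Delta_\epsilon(x)} f(x + \delta)$$

Set of input perturbations: $\Delta_\epsilon(x) = \{x' - x : \mathrm{dist}(x, x') \le \epsilon\}$

Setting:

  • Unknown utility function $f$, modeled by a Gaussian process: $f \sim \mathrm{GP}(\mu, \kappa)$
  • Sequentially query the unknown function $f$
  • Noisy and expensive point evaluations

[Figure label: "Optimum"]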

SLIDE 4

Adversarially Robust GP Optimization

Robust problem:

$$x^* = \arg\max_{x \in D \subset \mathbb{R}^d} \min_{\delta \in \Delta_\epsilon(x)} f(x + \delta)$$

Set of input perturbations: $\Delta_\epsilon(x) = \{x' - x : \mathrm{dist}(x, x') \le \epsilon\}$

Motivation: adversarial attack, implementation errors, etc.

Setting:

  • Unknown utility function $f$, modeled by a Gaussian process: $f \sim \mathrm{GP}(\mu, \kappa)$
  • Sequentially query the unknown function $f$
  • Noisy and expensive point evaluations

[Figure labels: "Optimum Non-Robust", "Optimum Robust", "Perturbed Function", "Original Function"]
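
The difference between the non-robust and the robust optimum can be illustrated on a known function evaluated on a grid. This is only a sketch under stated assumptions: the test function, the grid, the value of `eps`, the absolute-difference distance, and the helper name `robust_objective` are hypothetical, and in the actual problem $f$ is unknown and must be learned from noisy queries.

```python
import numpy as np

def robust_objective(f_values, grid, eps):
    """Worst-case value min over grid points x' with |x' - x| <= eps, for each grid point x."""
    robust = np.empty_like(f_values)
    for i, x in enumerate(grid):
        neighbours = np.abs(grid - x) <= eps + 1e-12  # small slack for float rounding
        robust[i] = f_values[neighbours].min()
    return robust

grid = np.linspace(0.0, 1.0, 201)
# Hypothetical function: a tall, narrow peak near 0.2 and a lower, wider bump near 0.7
f_values = (np.exp(-((grid - 0.2) / 0.02) ** 2)
            + 0.8 * np.exp(-((grid - 0.7) / 0.15) ** 2))

x_nonrobust = grid[np.argmax(f_values)]
x_robust = grid[np.argmax(robust_objective(f_values, grid, eps=0.05))]
```

On this toy function the plain maximizer sits on the narrow peak, while the max-min solution moves to the wider bump, because a small perturbation away from the narrow peak causes a large drop in value.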

SLIDE 5

Robust Algorithm: StableOpt

Non-robust BO methods:

Thompson [Thompson ’33], PI [Kushner ’64], EI [Mockus et al. ’78], GP-UCB [Srinivas et al. ’11], ES [Hennig et al. ’12], GP-UCB-PE [Contal et al. ’13], BamSOO [Wang et al. ’14], PES [Hernandez-Lobato et al. ’14], MRS [Metzen ’16], GLASSES [Gonzalez et al. ’15], OPES [Hoffman & Ghahramani ’15], TruVaR [Bogunovic et al. ’16], MES [Wang & Jegelka ’17], FITBO [Ru et al. ’18], KG [Wu et al. ’17], the list goes on…

SLIDE 6

Robust Algorithm: StableOpt

Non-robust BO methods:

Thompson [Thompson ’33], PI [Kushner ’64], EI [Mockus et al. ’78], GP-UCB [Srinivas et al. ’11], ES [Hennig et al. ’12], GP-UCB-PE [Contal et al. ’13], BamSOO [Wang et al. ’14], PES [Hernandez-Lobato et al. ’14], MRS [Metzen ’16], GLASSES [Gonzalez et al. ’15], OPES [Hoffman & Ghahramani ’15], TruVaR [Bogunovic et al. ’16], MES [Wang & Jegelka ’17], FITBO [Ru et al. ’18], KG [Wu et al. ’17], the list goes on…

Robust algorithm: StableOpt

Round $t$:

  • Choose: $\tilde{x}_t = \arg\max_{x \in D} \min_{\delta \in \Delta_\epsilon(x)} \mathrm{ucb}_{t-1}(x + \delta)$
SLIDE 7

Robust Algorithm: StableOpt

Non-robust BO methods:

Thompson [Thompson ’33], PI [Kushner ’64], EI [Mockus et al. ’78], GP-UCB [Srinivas et al. ’11], ES [Hennig et al. ’12], GP-UCB-PE [Contal et al. ’13], BamSOO [Wang et al. ’14], PES [Hernandez-Lobato et al. ’14], MRS [Metzen ’16], GLASSES [Gonzalez et al. ’15], OPES [Hoffman & Ghahramani ’15], TruVaR [Bogunovic et al. ’16], MES [Wang & Jegelka ’17], FITBO [Ru et al. ’18], KG [Wu et al. ’17], the list goes on…

Robust algorithm: StableOpt

Round $t$:

  • Choose: $\tilde{x}_t = \arg\max_{x \in D} \min_{\delta \in \Delta_\epsilon(x)} \mathrm{ucb}_{t-1}(x + \delta)$
  • Select: $\delta_t = \arg\min_{\delta \in \Delta_\epsilon(\tilde{x}_t)} \mathrm{lcb}_{t-1}(\tilde{x}_t + \delta)$
SLIDE 8

Robust Algorithm: StableOpt

Non-robust BO methods:

Thompson [Thompson ’33], PI [Kushner ’64], EI [Mockus et al. ’78], GP-UCB [Srinivas et al. ’11], ES [Hennig et al. ’12], GP-UCB-PE [Contal et al. ’13], BamSOO [Wang et al. ’14], PES [Hernandez-Lobato et al. ’14], MRS [Metzen ’16], GLASSES [Gonzalez et al. ’15], OPES [Hoffman & Ghahramani ’15], TruVaR [Bogunovic et al. ’16], MES [Wang & Jegelka ’17], FITBO [Ru et al. ’18], KG [Wu et al. ’17], the list goes on…

Robust algorithm: StableOpt

Round $t$:

  • Choose: $\tilde{x}_t = \arg\max_{x \in D} \min_{\delta \in \Delta_\epsilon(x)} \mathrm{ucb}_{t-1}(x + \delta)$
  • Select: $\delta_t = \arg\min_{\delta \in \Delta_\epsilon(\tilde{x}_t)} \mathrm{lcb}_{t-1}(\tilde{x}_t + \delta)$
  • Observe noisy function value at $\tilde{x}_t + \delta_t$
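
The three steps of a StableOpt round (choose, select, observe) can be sketched on a 1-D grid as follows. This is an illustrative sketch, not the authors' implementation. Assumptions not taken from the slides: a zero-mean GP with a squared-exponential kernel, a fixed confidence parameter `beta`, a finite grid standing in for $D$, perturbation sets made of grid points within `eps`, the toy objective `toy_f`, and the rule used to report a final point (the $\tilde{x}_t$ with the highest worst-case lower confidence bound).

```python
import numpy as np

def rbf(a, b, ls=0.1):
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ls**2)

def posterior(x_obs, y_obs, grid, noise_var=1e-3):
    """Zero-mean GP posterior mean and std on the grid (prior if no data yet)."""
    if len(x_obs) == 0:
        return np.zeros_like(grid), np.ones_like(grid)
    k = rbf(x_obs, x_obs) + noise_var * np.eye(len(x_obs))
    ks = rbf(x_obs, grid)
    mean = ks.T @ np.linalg.solve(k, y_obs)
    var = 1.0 - np.sum(ks * np.linalg.solve(k, ks), axis=0)
    return mean, np.sqrt(np.maximum(var, 0.0))

def stableopt(f, grid, eps, rounds=30, beta=4.0, noise_std=0.03, seed=0):
    rng = np.random.default_rng(seed)
    # near[i, j] is True when grid[j] lies in grid[i] + Delta_eps(grid[i])
    near = np.abs(grid[:, None] - grid[None, :]) <= eps + 1e-12
    x_obs, y_obs = np.empty(0), np.empty(0)
    best_x, best_score = None, -np.inf
    for _ in range(rounds):
        mean, std = posterior(x_obs, y_obs, grid)
        ucb = mean + np.sqrt(beta) * std
        lcb = mean - np.sqrt(beta) * std
        # Choose: x_tilde maximizes the worst-case UCB over its perturbation set
        worst_ucb = np.where(near, ucb[None, :], np.inf).min(axis=1)
        i = int(np.argmax(worst_ucb))
        # Select: the perturbation of x_tilde with the lowest LCB
        masked_lcb = np.where(near[i], lcb, np.inf)
        j = int(np.argmin(masked_lcb))
        # Assumed reporting rule: remember the x_tilde with the best worst-case LCB
        if masked_lcb[j] > best_score:
            best_x, best_score = grid[i], masked_lcb[j]
        # Observe: noisy function value at x_tilde + delta
        y = f(grid[j]) + noise_std * rng.standard_normal()
        x_obs, y_obs = np.append(x_obs, grid[j]), np.append(y_obs, y)
    return best_x

def toy_f(x):
    """Hypothetical objective: a tall narrow bump at 0.2 and a lower, wider bump at 0.7."""
    return np.exp(-((x - 0.2) / 0.07) ** 2) + 0.8 * np.exp(-((x - 0.7) / 0.2) ** 2)

grid = np.linspace(0.0, 1.0, 101)
x_reported = stableopt(toy_f, grid, eps=0.1)
```

Note that the queried point is the perturbed input $\tilde{x}_t + \delta_t$, not $\tilde{x}_t$ itself: the optimistic bound picks a promising candidate, and the pessimistic bound picks the perturbation that could hurt it most. On this toy objective one would hope for `x_reported` to land near the wide bump, but the sketch makes no guarantee of that.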
SLIDE 9

Theoretical Result

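Theorem:

StableOpt guarantees that if $T \gtrsim \gamma_T / \eta^2$, then the reported point $x^{(T)}$ satisfies the following w.h.p.:

$$\min_{\delta \in \Delta_\epsilon(x^{(T)})} f(x^{(T)} + \delta) \ge \max_{x \in D \subset \mathbb{R}^d} \min_{\delta \in \Delta_\epsilon(x)} f(x + \delta) - \eta,$$

where

  • $T$: Total number of points queried
  • $\eta$: Target accuracy
  • $\gamma_T$: Kernel-dependent information quantity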
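
Rearranged, the guarantee says that the gap between the best achievable worst-case value and the worst-case value at the reported point is at most $\eta$. For a function known on a grid, that gap can be computed directly. The sketch below assumes a 1-D grid, an absolute-difference distance, and the hypothetical helper names `worst_case` and `robust_gap`; it only checks the inequality and says nothing about how many queries are needed.

```python
import numpy as np

def worst_case(f_values, grid, x, eps):
    """min of f over grid points x' with |x' - x| <= eps."""
    return f_values[np.abs(grid - x) <= eps + 1e-12].min()

def robust_gap(f_values, grid, x_reported, eps):
    """Best achievable worst-case value minus the worst-case value at x_reported."""
    best = max(worst_case(f_values, grid, x, eps) for x in grid)
    return best - worst_case(f_values, grid, x_reported, eps)

# Hypothetical example: a concave quadratic with its maximum at 0.5
grid = np.linspace(0.0, 1.0, 101)
f_values = -(grid - 0.5) ** 2
gap_at_center = robust_gap(f_values, grid, 0.5, eps=0.1)
gap_at_edge = robust_gap(f_values, grid, 0.0, eps=0.1)
```

A reported point meets target accuracy `eta` in the sense of the theorem when `robust_gap(...) <= eta`.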

SLIDE 10

Variations

Robustness to unknown parameters:

$$\max_{x \in D} \min_{\theta \in \Theta} f(x, \theta)$$

  • Goal: Choose $x$ robust to different $\theta$
  • Application: Tuning hyperparameters robust to different data types
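
As a small illustration of the max-min objective over parameters, suppose the values $f(x, \theta)$ were already known for a few candidates. The score table below is made up for the example; in the setting of the talk these values are unknown and would have to be estimated from noisy queries.

```python
import numpy as np

# Hypothetical scores: rows are candidate settings x, columns are scenarios theta
scores = np.array([
    [0.95, 0.30, 0.95],   # excellent on two scenarios, poor on one
    [0.70, 0.65, 0.72],   # consistently good
    [0.50, 0.60, 0.40],
])

worst_per_x = scores.min(axis=1)                  # min over theta for each x
x_robust = int(np.argmax(worst_per_x))            # max over x of the worst case
x_average = int(np.argmax(scores.mean(axis=1)))   # average-case choice, for contrast
```

Here the average-case choice is the first row, while the max-min choice is the second row, whose worst scenario is the least bad.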

SLIDE 11

Variations

Robustness to unknown parameters:

$$\max_{x \in D} \min_{\theta \in \Theta} f(x, \theta)$$

  • Goal: Choose $x$ robust to different $\theta$
  • Application: Tuning hyperparameters robust to different data types

Robust group identification:

Input space is partitioned into groups $G_1, G_2, \dots, G_k$

$$\max_{G \in \mathcal{G}} \min_{x \in G} f(x)$$

  • Goal: Identify the group with the highest worst-case function value
  • Application: Robust group movie recommendation
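
The group objective can be illustrated in the same spirit for known values. The values, the group labels, and the helper name `best_worst_case_group` are hypothetical; the sketch only evaluates the max-min criterion and does not model the sequential, noisy setting.

```python
import numpy as np

def best_worst_case_group(values, groups):
    """Return the group label whose minimum value is largest, plus each group's minimum."""
    labels = np.unique(groups)
    worst = np.array([values[groups == g].min() for g in labels])
    return int(labels[np.argmax(worst)]), dict(zip(labels.tolist(), worst.tolist()))

# Hypothetical function values for eight inputs split into three groups
values = np.array([0.9, 0.1, 0.8, 0.6, 0.5, 0.7, 0.3, 0.95])
groups = np.array([0, 0, 0, 1, 1, 1, 2, 2])
best_group, worst_by_group = best_worst_case_group(values, groups)
```

Group 0 contains the second-highest single value but also the lowest one, so a criterion based on the worst member prefers a group whose values are uniformly decent.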
SLIDE 12

Adversarially Robust Optimization with Gaussian Processes

Poster #24

Wed Dec 5th 05:00 -- 07:00 PM @ Room 210 & 230 AB

Ilija Bogunovic, Jonathan Scarlett, Stefanie Jegelka, Volkan Cevher