Safe Exploration for Optimization with Gaussian Processes
Yanan Sui
Caltech
Alkis Gotovos
ETH Zurich
Joel W. Burdick
Caltech
Andreas Krause
ETH Zurich
Better safe than sorry
youtube.com/user/mattessons
Safe Exploration for Optimization with Gaussian Processes Alkis Gotovos 2
girardgibbs.com sjm.com
◮ Find electrode configurations that
◮ Bad configurations may cause pain or have
◮ Finite decision set $D$
◮ Unknown reward function $f : D \to \mathbb{R}$
◮ Safety threshold $h \in \mathbb{R}$
◮ Seed set $S_0$ of safe decisions ($\forall x \in S_0,\ f(x) \ge h$)
◮ For $t = 1, 2, \ldots$
    ◮ select $x_t \in D$
    ◮ observe $f(x_t) + n_t$
◮ Find $x^* \in \operatorname{argmax}_{x \in D} f(x)$
◮ Remain safe: $\forall t \ge 1,\ f(x_t) \ge h$
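The interaction loop above can be sketched as a small simulation. This is only an illustration: the names `run_protocol` and `round_robin`, the uniform bounded noise, and the toy reward are assumptions made for the example, not part of the talk.

```python
import random

def run_protocol(f, decisions, select, h, horizon, noise_bound=0.1, seed=0):
    """Simulate the loop: select x_t, observe f(x_t) + n_t, track safety."""
    rng = random.Random(seed)
    history = []       # (x_t, noisy observation) pairs available to the learner
    violations = 0     # number of rounds with f(x_t) < h
    for _ in range(horizon):
        x_t = select(decisions, history)
        # Assumption: zero-mean noise drawn uniformly from a bounded interval.
        y_t = f(x_t) + rng.uniform(-noise_bound, noise_bound)
        history.append((x_t, y_t))
        if f(x_t) < h:
            violations += 1
    # Report the sampled decision with the highest true reward.
    best = max((x for x, _ in history), key=f)
    return best, violations

def round_robin(decisions, history):
    """Hypothetical baseline that ignores safety and cycles through D."""
    return decisions[len(history) % len(decisions)]

reward = lambda x: -(x - 3) ** 2
best, violations = run_protocol(reward, list(range(7)), round_robin, h=-4, horizon=7)
print(best, violations)
```

A selection rule that ignores the threshold, like the baseline here, finds the maximizer but also samples unsafe decisions along the way, which is the behavior the safety requirement rules out.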
◮ Bayesian optimization: function evaluation is expensive
◮ Various proposed criteria, e.g.,
    ◮ Expected improvement [Mockus et al., 1974]
    ◮ UCB [Auer, 2002] [Srinivas et al., 2010]
◮ Related variants
    ◮ Level set estimation [Gotovos et al., 2013]
    ◮ Bayesian optimization with constraints [Gardner et al., 2014]
◮ Gaussian processes popular for modeling the unknown function
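To make the Gaussian process model concrete, here is a minimal sketch of GP regression on a finite set of points. It assumes a zero-mean prior, a squared exponential kernel, and Gaussian observation noise; the function names and default hyperparameters are illustrative choices, not taken from the talk.

```python
import numpy as np

def sq_exp_kernel(a, b, lengthscale=1.0, variance=1.0):
    """Squared exponential kernel between point sets a (n, d) and b (m, d)."""
    sq_dist = np.sum(a**2, axis=1)[:, None] + np.sum(b**2, axis=1)[None, :] - 2.0 * a @ b.T
    return variance * np.exp(-0.5 * np.maximum(sq_dist, 0.0) / lengthscale**2)

def gp_posterior(x_obs, y_obs, x_query, noise_var=1e-2):
    """Posterior mean and standard deviation of a zero-mean GP at x_query."""
    k_oo = sq_exp_kernel(x_obs, x_obs) + noise_var * np.eye(len(x_obs))
    k_qo = sq_exp_kernel(x_query, x_obs)
    chol = np.linalg.cholesky(k_oo)
    # alpha = K^{-1} y via two triangular solves.
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_obs))
    mean = k_qo @ alpha
    v = np.linalg.solve(chol, k_qo.T)
    var = sq_exp_kernel(x_query, x_query).diagonal() - np.sum(v**2, axis=0)
    return mean, np.sqrt(np.maximum(var, 0.0))

x_obs = np.array([[0.0], [1.0]])
y_obs = np.array([1.0, -1.0])
x_query = np.array([[0.0], [5.0]])
mean, std = gp_posterior(x_obs, y_obs, x_query)
print(mean, std)
```

Near an observed point the posterior standard deviation is small, while far from the data it returns to the prior value; this uncertainty is what the confidence bounds in the following slides are built from.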
◮ Use upper confidence bounds for optimistic sampling
◮ $x_t = \operatorname{argmax}_{x \in D} u_t(x)$
◮ Sublinear regret under suitable conditions on $f$ [Srinivas et al., 2010]
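The selection rule can be sketched as follows, assuming the common form $u_t(x) = \mu_{t-1}(x) + \sqrt{\beta}\,\sigma_{t-1}(x)$ for the upper confidence bound. The scaling parameter `beta` and the example numbers are assumptions for illustration; the slides only state that the point with the largest upper bound is sampled.

```python
import math

def ucb_select(mean, std, beta=4.0):
    """Return the index maximizing u_t(x) = mean(x) + sqrt(beta) * std(x)."""
    upper = [m + math.sqrt(beta) * s for m, s in zip(mean, std)]
    best_index = max(range(len(upper)), key=upper.__getitem__)
    return best_index, upper

# Posterior summaries at three decisions (illustrative values).
index, upper = ucb_select(mean=[0.5, 0.2, 0.0], std=[0.1, 0.4, 0.2])
print(index, upper)
```

In this example the second decision is chosen even though its posterior mean is not the highest, because its larger uncertainty makes it the most optimistic candidate.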
◮ Assume that $f$ is $L$-Lipschitz continuous w.r.t. a metric $d$
◮ If for some safe $x$ we know $f(x)$, then a safety certificate for $x'$ is $f(x) - L\, d(x, x') \ge h$
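The certificate follows from the Lipschitz property: $f(x') \ge f(x) - L\, d(x, x')$, so $f(x) - L\, d(x, x') \ge h$ implies $f(x') \ge h$. A minimal sketch on a one-dimensional decision set, assuming the metric $d(x, x') = |x - x'|$ and hypothetical function names:

```python
def is_certified_safe(f_x, dist, lipschitz, h):
    """True if f(x) - L * d(x, x') >= h, which implies f(x') >= h."""
    return f_x - lipschitz * dist >= h

def certified_neighbors(x, f_x, candidates, lipschitz, h):
    """Candidates whose safety is implied by knowing f(x) at the safe point x."""
    return [c for c in candidates if is_certified_safe(f_x, abs(c - x), lipschitz, h)]

candidates = [-1.0, -0.5, -0.25, 0.0, 0.25, 0.5, 1.0]
safe = certified_neighbors(x=0.0, f_x=1.0, candidates=candidates, lipschitz=2.0, h=0.0)
print(safe)
```

With $f(0) = 1$, $L = 2$, and $h = 0$, every candidate within distance $0.5$ of the known point is certified; a larger margin $f(x) - h$ or a smaller $L$ widens that neighborhood.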
◮ Initial goal of finding $f^* = \max_{x \in D} f(x)$ is unrealistic
◮ Instead, aim for the $\epsilon$-reachable maximum $f^*_\epsilon = \max_{x \in \bar{R}_\epsilon(S_0)} f(x)$
◮ Smaller $\epsilon$ → stricter goal → need more samples
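The slides do not spell out how the reachable set $\bar{R}_\epsilon(S_0)$ is built. The sketch below assumes one natural definition: a decision $x$ becomes reachable if some already reachable $x'$ satisfies $f(x') - \epsilon - L\, d(x', x) \ge h$, and the expansion is repeated until nothing changes. The function names, the metric $|x - x'|$, and the toy reward are assumptions for illustration.

```python
def reachable_set(f, decisions, seed_set, lipschitz, h, eps):
    """Iterate the one-step expansion until no new decision can be certified."""
    safe = set(seed_set)
    while True:
        new_points = {
            x for x in decisions
            if x not in safe
            and any(f(s) - eps - lipschitz * abs(s - x) >= h for s in safe)
        }
        if not new_points:
            return safe
        safe |= new_points

def reachable_maximum(f, decisions, seed_set, lipschitz, h, eps):
    """Largest reward over the eps-reachable set."""
    return max(f(x) for x in reachable_set(f, decisions, seed_set, lipschitz, h, eps))

values = [2, 1, 1, 2, 3]            # toy 1-Lipschitz reward on D = {0, ..., 4}
f = lambda x: values[x]
strict = reachable_maximum(f, range(5), {0}, lipschitz=1, h=0, eps=0)
loose = reachable_maximum(f, range(5), {0}, lipschitz=1, h=0, eps=1)
print(strict, loose)
```

In this toy case the smaller $\epsilon$ reaches every decision and the maximum is 3, while $\epsilon = 1$ stops expanding after two decisions and the reachable maximum drops to 2, matching the remark that a smaller $\epsilon$ is the stricter goal.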
◮ Keep set $S_t$ of certified safe points (starting with $S_0$)
◮ Use Lipschitz property with GP lower bounds to certify safety
◮ $x_t = \operatorname{argmax}_{x \in S_t} u_t(x)$
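One step of this safe variant of UCB can be sketched as below. The sketch assumes a one-dimensional decision set with metric $|x - x'|$, takes GP lower and upper bounds `lower` and `upper` as given lists, and keeps previously certified points in the safe set, which is a simplification chosen for the example. All names are hypothetical.

```python
def update_safe_set(points, lower, safe_prev, lipschitz, h):
    """Certify x' when some safe x has lower(x) - L * |x - x'| >= h."""
    safe = set(safe_prev)  # simplification: certified points stay certified
    for i in safe_prev:
        for j, x_j in enumerate(points):
            if lower[i] - lipschitz * abs(points[i] - x_j) >= h:
                safe.add(j)
    return safe

def safe_ucb_select(upper, safe):
    """Index of the certified point with the largest upper confidence bound."""
    return max(safe, key=lambda i: upper[i])

points = [0.0, 1.0, 2.0, 3.0, 4.0]
lower = [-1.0, 0.5, 1.5, -2.0, -2.0]
upper = [2.0, 1.0, 2.0, 5.0, 6.0]
safe = update_safe_set(points, lower, safe_prev={2}, lipschitz=1.0, h=0.0)
choice = safe_ucb_select(upper, safe)
print(sorted(safe), choice)
```

Here the unconstrained UCB rule would pick index 4, but that point is not certified, so the restricted rule samples index 3 instead.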
◮ Encourage expansion of $S_t$ → keep set $G_t \subseteq S_t$ of potential expanders
◮ Encourage locating the maximum within $S_t$ → keep set $M_t \subseteq S_t$ of potential maximizers
◮ Pick most uncertain point within $G_t \cup M_t$
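The selection step can be sketched as follows. The slides do not give formal definitions of the two sets, so the sketch assumes these: a safe point is a potential maximizer if its upper bound is at least the best lower bound in $S_t$, and a potential expander if an optimistic value $u_t(x)$ would certify at least one point outside $S_t$. Uncertainty is measured by the width $u_t(x) - l_t(x)$. Names and numbers are illustrative.

```python
def safeopt_select(points, lower, upper, safe, lipschitz, h):
    """Pick the most uncertain point among potential maximizers and expanders."""
    best_lower = max(lower[i] for i in safe)
    # Potential maximizers: could still be the best point in the safe set.
    maximizers = {i for i in safe if upper[i] >= best_lower}
    outside = [j for j in range(len(points)) if j not in safe]
    # Potential expanders: an optimistic value would certify a new point.
    expanders = {
        i for i in safe
        if any(upper[i] - lipschitz * abs(points[i] - points[j]) >= h for j in outside)
    }
    candidates = maximizers | expanders
    choice = max(candidates, key=lambda i: upper[i] - lower[i])
    return choice, maximizers, expanders

points = [0.0, 1.0, 2.0, 3.0, 4.0]
lower = [-1.0, 0.5, 1.5, 0.2, -2.0]
upper = [2.0, 0.8, 1.8, 1.2, 6.0]
choice, maximizers, expanders = safeopt_select(
    points, lower, upper, safe={1, 2, 3}, lipschitz=1.0, h=0.0
)
print(choice, maximizers, expanders)
```

In this example index 2 is the only potential maximizer and index 3 the only potential expander; index 3 has the wider confidence interval, so it is sampled next.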
◮ $f$ has bounded norm in the RKHS defined by $k$
◮ $f$ is $L$-Lipschitz continuous
◮ $n_t$ is a uniformly bounded martingale difference sequence
◮ $\forall t \ge 1,\ f(x_t) \ge h$
◮ $\forall t \ge t^*,\ f(\hat{x}_t) \ge f^*_\epsilon - \epsilon$
◮ Draw 100 random 2-D functions from GP prior (sq. exponential kernel)
◮ Use random singleton seed set $S_0$ per function
◮ Run 100 iterations of each algorithm
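A setup of this kind could be generated roughly as follows. The grid resolution, lengthscale, jitter, and the choice of threshold are assumptions made for the sketch, since the slides do not state them; the function names are hypothetical.

```python
import numpy as np

def sample_gp_prior_2d(grid_size=10, lengthscale=0.3, seed=0):
    """Draw one function from a zero-mean GP prior on a grid over [0, 1]^2."""
    rng = np.random.default_rng(seed)
    axis = np.linspace(0.0, 1.0, grid_size)
    xx, yy = np.meshgrid(axis, axis)
    points = np.column_stack([xx.ravel(), yy.ravel()])
    # Squared exponential kernel matrix over all grid points.
    diff = points[:, None, :] - points[None, :, :]
    cov = np.exp(-0.5 * np.sum(diff**2, axis=-1) / lengthscale**2)
    cov += 1e-6 * np.eye(len(points))  # jitter for numerical stability
    chol = np.linalg.cholesky(cov)
    values = chol @ rng.standard_normal(len(points))
    return points, values

def random_seed_set(values, h, seed=0):
    """Pick one random index with value >= h as a singleton seed set."""
    rng = np.random.default_rng(seed)
    safe_indices = np.flatnonzero(values >= h)
    if safe_indices.size == 0:
        raise ValueError("no decision satisfies the safety threshold")
    return {int(rng.choice(safe_indices))}

points, values = sample_gp_prior_2d()
seed_set = random_seed_set(values, h=np.median(values))
print(points.shape, values.shape, seed_set)
```

Repeating the draw with different `seed` values gives a collection of test functions, each paired with its own singleton seed set.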
[Figure: $f^*_\epsilon - \max_{\tau \le t} f(x_\tau)$ for Safe-UCB, SafeOpt, and GP-UCB; separate panels for SafeOpt, Safe-UCB, and GP-UCB]
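The plotted quantity $f^*_\epsilon - \max_{\tau \le t} f(x_\tau)$ is the gap between the reachable maximum and the best reward sampled so far. A minimal sketch of how it could be computed from a run, with a hypothetical function name and made-up numbers:

```python
def regret_curve(sampled_values, reachable_max):
    """Gap between the reachable maximum and the best sampled reward so far."""
    curve, best = [], float("-inf")
    for value in sampled_values:
        best = max(best, value)
        curve.append(reachable_max - best)
    return curve

curve = regret_curve([1.0, 3.0, 2.0, 4.0], reachable_max=5.0)
print(curve)
```

The curve is non-increasing by construction, since the running maximum can only grow as more points are sampled.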
◮ Electrode configurations are
◮ Fit sq. exponential ARD kernel
◮ Run 300 iterations of each algorithm
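For reference, a squared exponential kernel with automatic relevance determination (ARD) uses one lengthscale per input dimension. The sketch below shows only the kernel itself; how the lengthscales were fitted is not described on the slide, and the function name and example values are assumptions.

```python
import numpy as np

def sq_exp_ard_kernel(a, b, lengthscales, variance=1.0):
    """Squared exponential kernel with a separate lengthscale per dimension."""
    scales = np.asarray(lengthscales, dtype=float)
    a_scaled = np.asarray(a, dtype=float) / scales
    b_scaled = np.asarray(b, dtype=float) / scales
    diff = a_scaled[:, None, :] - b_scaled[None, :, :]
    return variance * np.exp(-0.5 * np.sum(diff**2, axis=-1))

# A unit step along dimension 0 (short lengthscale) lowers the correlation
# far more than the same step along dimension 1 (long lengthscale).
k = sq_exp_ard_kernel([[0.0, 0.0]], [[1.0, 0.0], [0.0, 1.0]], lengthscales=[1.0, 10.0])
print(k)
```

A long fitted lengthscale in some dimension indicates that the function varies slowly along it, so that input matters less for prediction.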
[Figure: SafeOpt, Safe-UCB, and GP-UCB compared in "stop" and "non-stop" panels; separate panels for SafeOpt, Safe-UCB, and GP-UCB]
◮ We formulated safe optimization using the concept of reachability
◮ We proposed SafeOpt, an algorithm with theoretical guarantees
◮ Rigorous theoretical setup and analysis
◮ Another application: safe movie recommendation