Scalable Gaussian Processes
Zhenwen Dai
Amazon
9 September 2019 @GPSS 2019
Zhenwen Dai (Amazon) Scalable Gaussian Processes 9 September 2019 @GPSS 2019 1 / 46
Gaussian process
Input and Output Data: $X = (x_1, \ldots, x_N)$, $y = (y_1, \ldots, y_N)$
[Figure: plot with legend Mean, Data, Confidence]
$k(x_i, x_j) \propto \exp\left(-\frac{1}{2l^2}(x_i - x_j)^\top (x_i - x_j)\right)$
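As a minimal sketch of the kernel whose exponent appears above, the following NumPy function evaluates a squared-exponential kernel matrix. The function name `rbf_kernel`, the `variance` prefactor and the default values are illustrative assumptions, not taken from the slides.

```python
import numpy as np

def rbf_kernel(X1, X2, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel: variance * exp(-||x_i - x_j||^2 / (2 l^2))."""
    # Pairwise squared distances (x_i - x_j)^T (x_i - x_j), shape (N1, N2).
    sq_dist = (
        np.sum(X1**2, axis=1)[:, None]
        + np.sum(X2**2, axis=1)[None, :]
        - 2.0 * X1 @ X2.T
    )
    sq_dist = np.maximum(sq_dist, 0.0)  # guard against tiny negative round-off
    return variance * np.exp(-sq_dist / (2.0 * lengthscale**2))

# Five equally spaced 1-D inputs (hypothetical data).
X = np.linspace(0.0, 1.0, 5)[:, None]
K = rbf_kernel(X, X, lengthscale=0.2)
```

The resulting matrix is symmetric with `variance` on the diagonal; the lengthscale `l` controls how quickly the covariance decays with distance.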
[Figure: computation time (seconds) against data size (N). Legend: Mean, Data, Confidence]
computer speed
◮ After 10 years, it will take about 176 days.
◮ After 50 years, it will take about 2.9 hours.
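The assumptions behind these two figures did not survive on this slide, so the following is only a back-of-envelope check of what the numbers jointly imply. It assumes, hypothetically, that computer speed grows by a constant factor `g` every year; the variable names are illustrative.

```python
# Times quoted above, converted to hours.
t_after_10_years = 176 * 24.0   # 176 days
t_after_50_years = 2.9          # 2.9 hours

# With a constant yearly speed-up factor g, running time shrinks by g each year,
# so t_10 / t_50 = g ** 40.
g = (t_after_10_years / t_after_50_years) ** (1.0 / 40.0)

# Extrapolate back 10 years to the implied running time today, in years.
t_now_years = t_after_10_years * g**10 / (24.0 * 365.0)

print(f"implied yearly speed-up factor: {g:.3f}")
print(f"implied running time today: {t_now_years:.2f} years")
```

Under this assumption the two figures are consistent with a speed-up of roughly 20% per year and a present-day running time of about three years.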
[Figure: plot with legend Mean, Data, Confidence]
[Figure: plot with legend Mean, Data, Confidence]
$K(X, X) \approx K_z K_{zz}^{-1} K_z^\top$, where $K_z = K(X, Z)$ and $K_{zz} = K(Z, Z)$.
$K_z K_{zz}^{-1} K_z^\top + \sigma^2 I$
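$(K_z K_{zz}^{-1} K_z^\top + \sigma^2 I)^{-1} = \sigma^{-2} I - \sigma^{-4} K_z (K_{zz} + \sigma^{-2} K_z^\top K_z)^{-1} K_z^\top$
$(K_{zz} + \sigma^{-2} K_z^\top K_z) \in \mathbb{R}^{M \times M}$.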
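The identity above can be checked numerically. This is a minimal sketch on synthetic 1-D data: the kernel helper `rbf`, the sizes `N` and `M`, the noise level and the jitter term are all assumptions made for illustration. Only an M x M system is solved on the Woodbury side, which is where the saving comes from.

```python
import numpy as np

rng = np.random.default_rng(0)
N, M, sigma2 = 200, 10, 0.1

X = rng.uniform(0.0, 1.0, size=(N, 1))   # hypothetical inputs
Z = np.linspace(0.0, 1.0, M)[:, None]    # hypothetical inducing inputs

def rbf(A, B, lengthscale=0.1):
    """Unit-variance RBF kernel for 1-D inputs of shape (n, 1)."""
    return np.exp(-0.5 * (A - B.T) ** 2 / lengthscale**2)

Kz = rbf(X, Z)                       # (N, M)
Kzz = rbf(Z, Z) + 1e-6 * np.eye(M)   # (M, M), small jitter for stability

# Direct inverse of the N x N matrix: O(N^3).
direct = np.linalg.inv(Kz @ np.linalg.solve(Kzz, Kz.T) + sigma2 * np.eye(N))

# Woodbury form: only the M x M matrix `inner` is factorised.
inner = Kzz + Kz.T @ Kz / sigma2
woodbury = np.eye(N) / sigma2 - Kz @ np.linalg.solve(inner, Kz.T) / sigma2**2

print(np.max(np.abs(direct - woodbury)))
```

In practice the N x N inverse is never formed explicitly; the Woodbury expression is applied to a vector, keeping the cost at O(NM^2).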
$p(y \mid u) = \mathcal{N}\left(y \mid K_{fu} K_{uu}^{-1} u,\; K_{ff} - K_{fu} K_{uu}^{-1} K_{fu}^\top + \sigma^2 I\right)$
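$K_{fu} K_{uu}^{-1} K_{fu}^\top + \sigma^2 I$.
$\mathcal{N}\left(y \mid K_{fu} K_{uu}^{-1} u,\; \Lambda + \sigma^2 I\right)$
$\Lambda = (K_{ff} - K_{fu} K_{uu}^{-1} K_{fu}^\top) \circ I$.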
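$K_{fu} K_{uu}^{-1} K_{fu}^\top + \Lambda + \sigma^2 I$
$(K_z K_{zz}^{-1} K_z^\top + \Lambda + \sigma^2 I)^{-1} = A - A K_z (K_{zz} + K_z^\top A K_z)^{-1} K_z^\top A$, where $A = (\Lambda + \sigma^2 I)^{-1}$.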
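A sketch of the same check with the diagonal correction Λ included. It takes A = (Λ + σ²I)^{-1}, which is what the Woodbury identity requires, and exploits the fact that A is diagonal. The data, the kernel helper `rbf` and all sizes are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
N, M, sigma2 = 150, 8, 0.05

X = rng.uniform(0.0, 1.0, size=(N, 1))   # hypothetical inputs
Z = np.linspace(0.0, 1.0, M)[:, None]    # hypothetical inducing inputs

def rbf(A, B, lengthscale=0.15):
    """Unit-variance RBF kernel for 1-D inputs of shape (n, 1)."""
    return np.exp(-0.5 * (A - B.T) ** 2 / lengthscale**2)

Kz = rbf(X, Z)
Kzz = rbf(Z, Z) + 1e-6 * np.eye(M)   # small jitter for stability
Q = Kz @ np.linalg.solve(Kzz, Kz.T)  # low-rank part K_z K_zz^{-1} K_z^T

# Lambda = (K_ff - Q) o I; diag(K_ff) is 1 for this unit-variance kernel.
lam = np.maximum(1.0 - np.diag(Q), 0.0)
a = 1.0 / (lam + sigma2)             # diagonal of A = (Lambda + sigma^2 I)^{-1}

# Direct N x N inverse, for reference only.
direct = np.linalg.inv(Q + np.diag(lam) + sigma2 * np.eye(N))

# Woodbury form with diagonal A: A K_z is a row-wise rescaling of K_z.
AKz = a[:, None] * Kz
inner = Kzz + Kz.T @ AKz             # (M, M)
woodbury = np.diag(a) - AKz @ np.linalg.solve(inner, AKz.T)

print(np.max(np.abs(direct - woodbury)))
```

Because Λ is diagonal, the correction adds no more than O(N) work on top of the low-rank case.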
$p(y \mid u) = \mathcal{N}\left(y \mid K_{fu} K_{uu}^{-1} u,\; K_{ff} - K_{fu} K_{uu}^{-1} K_{fu}^\top + \sigma^2 I\right)$
$\mathcal{N}\left(y \mid K_{fu} K_{uu}^{-1} u,\; \sigma^2 I\right)$
$(K_{fu} K_{uu}^{-1} u - y)^\top (K_{fu} K_{uu}^{-1} u - y)$
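$\max_{\mu, \Sigma} \mathcal{L} = \log \mathcal{N}\left(y \mid 0,\; K_{fu} K_{uu}^{-1} K_{fu}^\top + \sigma^2 I\right) - \frac{1}{2\sigma^2} \mathrm{tr}\left(K_{ff} - K_{fu} K_{uu}^{-1} K_{fu}^\top\right)$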
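To make the role of the two terms concrete, here is a sketch that evaluates the standard collapsed variational bound for sparse GP regression, log N(y | 0, Q + σ²I) - tr(K_ff - Q) / (2σ²) with Q = K_fu K_uu^{-1} K_fu^T, and compares it with the exact log marginal likelihood. The toy data, the kernel helper `rbf` and the helper `log_gauss_zero_mean` are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
N, M, sigma2 = 100, 10, 0.1

X = np.sort(rng.uniform(0.0, 1.0, size=(N, 1)), axis=0)          # hypothetical inputs
y = np.sin(6.0 * X[:, 0]) + np.sqrt(sigma2) * rng.standard_normal(N)
Z = np.linspace(0.0, 1.0, M)[:, None]                             # inducing inputs

def rbf(A, B, lengthscale=0.2):
    """Unit-variance RBF kernel for 1-D inputs of shape (n, 1)."""
    return np.exp(-0.5 * (A - B.T) ** 2 / lengthscale**2)

def log_gauss_zero_mean(y, C):
    """log N(y | 0, C) via a Cholesky factorisation of C."""
    L = np.linalg.cholesky(C)
    alpha = np.linalg.solve(L, y)
    return -0.5 * len(y) * np.log(2.0 * np.pi) - np.sum(np.log(np.diag(L))) - 0.5 * alpha @ alpha

Kff = rbf(X, X)
Kfu = rbf(X, Z)
Kuu = rbf(Z, Z) + 1e-6 * np.eye(M)   # small jitter for stability
Q = Kfu @ np.linalg.solve(Kuu, Kfu.T)

# Exact log marginal likelihood: needs the full N x N covariance.
exact = log_gauss_zero_mean(y, Kff + sigma2 * np.eye(N))

# Collapsed bound: low-rank likelihood term minus the trace penalty.
bound = log_gauss_zero_mean(y, Q + sigma2 * np.eye(N)) - 0.5 / sigma2 * np.trace(Kff - Q)

print(f"exact = {exact:.4f}, bound = {bound:.4f}")
```

The bound never exceeds the exact value, and the trace term penalises inducing inputs that explain the prior covariance poorly. For clarity the sketch forms N x N matrices; an efficient implementation would use the Woodbury identity instead.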
[Figure: plot with legend Mean, Inducing, Data, Confidence]
Matérn-1/2 kernel, also known as the exponential kernel:
l2
$D = \bigcup_{c=1}^{C} D_c$.
$\frac{1}{N} \sum_{c=1}^{C} \sum_{n_c \in D_c} \| y_{n_c} - f_\theta(x_{n_c}) \|^2$.
$\sum_{c=1}^{C} l_c$ and $\sum_{c=1}^{C} \partial l_c / \partial\theta$.
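The sums over partitions above can be sketched as follows. The model f_θ(x) = xᵀθ, the toy data and the function name `local_loss_and_grad` are assumptions chosen to keep the example small; the list comprehension stands in for work that would run on separate workers.

```python
import numpy as np

rng = np.random.default_rng(3)
N, C = 1000, 4

X = rng.standard_normal((N, 3))                                   # hypothetical inputs
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.standard_normal(N)
theta = np.zeros(3)                                               # current parameters

def local_loss_and_grad(theta, Xc, yc):
    """Squared-error loss l_c and gradient dl_c/dtheta on one partition D_c,
    for the assumed linear model f_theta(x) = x^T theta."""
    r = Xc @ theta - yc
    return r @ r, 2.0 * Xc.T @ r

# Split the data set into C partitions D_1, ..., D_C.
parts = np.array_split(np.arange(N), C)

# Each partition is processed independently (sequentially here).
results = [local_loss_and_grad(theta, X[idx], y[idx]) for idx in parts]

# Aggregate: the loss and gradient are sums over partitions, scaled by 1/N.
loss = sum(l for l, _ in results) / N
grad = sum(g for _, g in results) / N

# Reference computed on the whole data set at once.
full_loss, full_grad = local_loss_and_grad(theta, X, y)
print(np.isclose(loss, full_loss / N), np.allclose(grad, full_grad / N))
```

Only the scalar `l_c` and the gradient vector leave each worker, so the communication cost does not depend on the partition size.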
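$\mathcal{L} = -\frac{N}{2} \log 2\pi\sigma^2 + \frac{1}{2} \log |K_{uu}| - \frac{1}{2} \log |K_{uu} + \sigma^{-2} \Phi| - \frac{1}{2\sigma^2} y^\top y + \frac{1}{2\sigma^4} y^\top K_{fu} (K_{uu} + \sigma^{-2} \Phi)^{-1} K_{fu}^\top y - \frac{\phi}{2\sigma^2} + \frac{1}{2\sigma^2} \mathrm{tr}\left(K_{uu}^{-1} \Phi\right)$,
where $\Phi = K_{fu}^\top K_{fu}$ and $\phi = \mathrm{tr}(K_{ff})$.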
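The point of Φ and φ is that, together with yᵀy and yᵀK_fu, they are the only quantities in the bound that touch all N data points. The sketch below evaluates the standard collapsed bound log N(y | 0, Q + σ²I) - tr(K_ff - Q) / (2σ²), with Q = K_fu K_uu^{-1} K_fu^T, from these statistics alone and compares it with a direct N x N evaluation. The toy data, the kernel helper `rbf` and the variable names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(4)
N, M, sigma2 = 80, 6, 0.2

X = rng.uniform(0.0, 1.0, size=(N, 1))                            # hypothetical inputs
y = np.cos(5.0 * X[:, 0]) + np.sqrt(sigma2) * rng.standard_normal(N)
Z = np.linspace(0.0, 1.0, M)[:, None]                             # inducing inputs

def rbf(A, B, lengthscale=0.2):
    """Unit-variance RBF kernel for 1-D inputs of shape (n, 1)."""
    return np.exp(-0.5 * (A - B.T) ** 2 / lengthscale**2)

Kff = rbf(X, X)
Kfu = rbf(X, Z)
Kuu = rbf(Z, Z) + 1e-6 * np.eye(M)   # small jitter for stability

# Data-dependent statistics: everything below them is O(M^3) or cheaper.
yy = y @ y                # y^T y
yKfu = y @ Kfu            # y^T K_fu, shape (M,)
Phi = Kfu.T @ Kfu         # (M, M)
phi = np.trace(Kff)       # only the diagonal of K_ff is actually needed

B = Kuu + Phi / sigma2
bound_stats = (
    -0.5 * N * np.log(2.0 * np.pi * sigma2)
    + 0.5 * np.linalg.slogdet(Kuu)[1]
    - 0.5 * np.linalg.slogdet(B)[1]
    - 0.5 * yy / sigma2
    + 0.5 * yKfu @ np.linalg.solve(B, yKfu) / sigma2**2
    - 0.5 * phi / sigma2
    + 0.5 * np.trace(np.linalg.solve(Kuu, Phi)) / sigma2
)

# Direct evaluation with N x N matrices, for reference only.
Q = Kfu @ np.linalg.solve(Kuu, Kfu.T)
Cov = Q + sigma2 * np.eye(N)
bound_direct = (
    -0.5 * N * np.log(2.0 * np.pi)
    - 0.5 * np.linalg.slogdet(Cov)[1]
    - 0.5 * y @ np.linalg.solve(Cov, y)
    - 0.5 * np.trace(Kff - Q) / sigma2
)

print(bound_stats, bound_direct)
```

The two values agree up to round-off, which follows from the Woodbury identity and the matrix determinant lemma.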
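$y^\top y = \sum_{n=1}^{N} y_n^2$, $y^\top K_{fu} = \sum_{n=1}^{N} y_n K_{f_n u}$, $\Phi = \sum_{n=1}^{N} K_{f_n u}^\top K_{f_n u}$, $\phi = \sum_{n=1}^{N} K_{f_n f_n}$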
$y_c^\top y_c$, $y_c^\top K_{f_c u}$, $\Phi_c$ and $\phi_c$.
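A sketch of how the per-partition statistics listed above add up to the full-data statistics. The partitioning with `np.array_split`, the helper `local_stats`, the kernel helper `rbf` and the toy data are assumptions for illustration; the loop stands in for workers running in parallel.

```python
import numpy as np

rng = np.random.default_rng(5)
N, M, C = 500, 12, 5

X = rng.uniform(0.0, 1.0, size=(N, 1))   # hypothetical inputs
y = rng.standard_normal(N)               # hypothetical outputs
Z = np.linspace(0.0, 1.0, M)[:, None]    # inducing inputs shared by all workers

def rbf(A, B, lengthscale=0.1):
    """Unit-variance RBF kernel for 1-D inputs of shape (n, 1)."""
    return np.exp(-0.5 * (A - B.T) ** 2 / lengthscale**2)

def local_stats(Xc, yc, Z):
    """Statistics of one partition c; none of them is larger than M x M."""
    Kfcu = rbf(Xc, Z)
    return {
        "yy": yc @ yc,            # y_c^T y_c
        "yKfu": yc @ Kfcu,        # y_c^T K_{f_c u}
        "Phi": Kfcu.T @ Kfcu,     # Phi_c
        "phi": float(len(yc)),    # phi_c = tr(K_{f_c f_c}); diagonal entries are 1 here
    }

parts = np.array_split(np.arange(N), C)
stats = [local_stats(X[idx], y[idx], Z) for idx in parts]

# The full-data statistics are plain sums of the local ones.
total = {key: sum(s[key] for s in stats) for key in stats[0]}

Kfu = rbf(X, Z)
print(np.allclose(total["Phi"], Kfu.T @ Kfu), np.allclose(total["yKfu"], y @ Kfu))
```

Each worker communicates O(M^2) numbers regardless of how many data points it holds, which is what makes the bound suitable for data-parallel evaluation.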
[Figure: average time per iteration (seconds) against number of datapoints, for 1, 2, 4, 8, 16 and 32 CPUs and 1, 2 and 4 GPUs]
[Figure: percentage of indistributable computational time against number of datapoints, for 1, 2, 4, 8, 16 and 32 CPU cores and 1, 2 and 4 GPUs]