Gradient Estimation for Implicit Models with Stein’s Method
Yingzhen Li Microsoft Research Cambridge
Joint work with Rich Turner, Wenbo Gong, and José Miguel Hernández-Lobato.
A little about my research...
Current methods: VI + Gaussian, MCMC, VI + implicit dist.
$\min_\phi \mathrm{JS}[p_D \| p_\phi] = \min_\phi \max_D \mathbb{E}_{p_D}[\log D(x)] + \mathbb{E}_{p_\phi}[\log(1 - D(x))]$
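As a small sanity check on the min-max objective above, the sketch below evaluates the inner maximisation for two discrete distributions. The distributions and helper names (`kl`, `js`, `gan_value`) are made up for illustration. With the optimal discriminator $D^*(x) = p_D(x) / (p_D(x) + p_\phi(x))$, the objective equals $2\,\mathrm{JS}[p_D \| p_\phi] - \log 4$, so the two sides agree up to the scale and constant that the slide omits.

```python
import numpy as np

def kl(p, q):
    """KL divergence between two discrete distributions with full support."""
    return float(np.sum(p * np.log(p / q)))

def js(p, q):
    """Jensen-Shannon divergence (natural log)."""
    m = 0.5 * (p + q)
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def gan_value(p_data, p_model, d):
    """E_{p_D}[log D(x)] + E_{p_phi}[log(1 - D(x))] for a tabular discriminator d."""
    return float(np.sum(p_data * np.log(d)) + np.sum(p_model * np.log(1.0 - d)))

# Hypothetical toy distributions over four outcomes.
p_data = np.array([0.1, 0.2, 0.3, 0.4])
p_model = np.array([0.4, 0.3, 0.2, 0.1])

# Optimal discriminator for fixed p_data and p_model.
d_star = p_data / (p_data + p_model)
value_at_optimum = gan_value(p_data, p_model, d_star)

# Any other discriminator gives a smaller value of the inner objective.
value_perturbed = gan_value(p_data, p_model, d_star + 0.05)
```

In an implicit model $p_\phi$ is only available through samples, so $D^*$ has to be learned rather than written down as it is here.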
[Figure: true loss as a function of φ, with the true minimum and the true gradient marked.]
Boundary condition: $\lim_{x \to \infty} q(x) h(x) = 0$.
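$-\frac{1}{K} H G + \text{err} = \overline{\nabla_x h}$.

$\hat{G}_V^{\text{Stein}} = \arg\min_{\hat{G} \in \mathbb{R}^{K \times d}} \left\| \overline{\nabla_x h} + \frac{1}{K} H \hat{G} \right\|_F^2 + \frac{\eta}{K^2} \| \hat{G} \|_F^2$,

with solution $\hat{G}_V^{\text{Stein}} = -(\mathbf{K} + \eta I)^{-1} \langle \nabla, \mathbf{K} \rangle$, where $\mathbf{K} = H^\top H$ and $\langle \nabla, \mathbf{K} \rangle_{ij} = \sum_{k=1}^{K} \nabla_{x^k_j} \mathcal{K}(x^i, x^k)$.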
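The ridge-regression problem above can be solved in closed form. Setting its gradient with respect to $\hat{G}$ to zero gives $\hat{G} = -(\mathbf{K} + \eta I)^{-1} \langle \nabla, \mathbf{K} \rangle$, where $\mathbf{K}_{ik} = \mathcal{K}(x^i, x^k)$. The following is a minimal NumPy sketch of that solution, not the authors' implementation. It assumes an RBF kernel with a median-heuristic bandwidth, and the function name and the default `eta` are hypothetical.

```python
import numpy as np

def stein_gradient_estimator(x, eta=1.0, sigma=None):
    """Estimate grad_x log q(x) at samples x of shape (K, d), drawn from q."""
    num_samples = x.shape[0]
    diff = x[:, None, :] - x[None, :, :]            # diff[i, k] = x^i - x^k
    sq_dist = np.sum(diff ** 2, axis=-1)            # (K, K)
    if sigma is None:
        # Median heuristic over distinct pairs (an assumption of this sketch).
        upper = np.triu_indices(num_samples, k=1)
        sigma = np.sqrt(np.median(sq_dist[upper]))
    kmat = np.exp(-sq_dist / (2.0 * sigma ** 2))    # K_ik = K(x^i, x^k)
    # <grad, K>_ij = sum_k d/dx^k_j K(x^i, x^k)
    #              = sum_k K_ik * (x^i_j - x^k_j) / sigma^2 for the RBF kernel.
    grad_k = np.sum(kmat[:, :, None] * diff, axis=1) / sigma ** 2
    # Closed-form ridge solution: G_hat = -(K + eta I)^{-1} <grad, K>.
    return -np.linalg.solve(kmat + eta * np.eye(num_samples), grad_k)

# Usage: for q = N(0, 1) the true score is -x, so the estimate should track it.
rng = np.random.default_rng(0)
samples = rng.standard_normal((200, 1))
g_hat = stein_gradient_estimator(samples)
```

The estimate needs only samples from $q$, which is the setting for implicit models. The regulariser `eta` trades shrinkage against noise amplification in the small-eigenvalue directions of the kernel matrix.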
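$\hat{G}_V^{\text{Stein}} = \arg\min_{\hat{G} \in \mathbb{R}^{K \times d}} \mathcal{S}_V^2(q, \hat{q}) + \frac{\eta}{K^2} \| \hat{G} \|_F^2$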
H ∝ KSD
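To make the kernelized Stein discrepancy (KSD) concrete, here is a sketch of its V-statistic for samples from $q$ and a candidate score function. It assumes an RBF kernel with a fixed bandwidth, and the function name `ksd_v_statistic` is hypothetical. The statistic is close to zero when the candidate score matches the true score of $q$ and grows when it does not.

```python
import numpy as np

def ksd_v_statistic(x, score, sigma=1.0):
    """V-statistic of the squared KSD between samples x (K, d) and a score function."""
    dim = x.shape[1]
    diff = x[:, None, :] - x[None, :, :]               # diff[i, j] = x^i - x^j
    sq_dist = np.sum(diff ** 2, axis=-1)
    kmat = np.exp(-sq_dist / (2.0 * sigma ** 2))       # RBF kernel matrix
    s = score(x)                                       # (K, d) candidate scores
    # s(x)^T s(x') k(x, x')
    term1 = (s @ s.T) * kmat
    # s(x)^T grad_{x'} k, with grad_{x'} k = k * (x - x') / sigma^2
    term2 = kmat * np.einsum("id,ijd->ij", s, diff) / sigma ** 2
    # grad_x k^T s(x'), with grad_x k = -k * (x - x') / sigma^2
    term3 = -kmat * np.einsum("jd,ijd->ij", s, diff) / sigma ** 2
    # trace of the mixed second derivative of the kernel
    term4 = kmat * (dim / sigma ** 2 - sq_dist / sigma ** 4)
    return float(np.mean(term1 + term2 + term3 + term4))

rng = np.random.default_rng(0)
samples = rng.standard_normal((300, 1))               # draws from q = N(0, 1)
ksd_true = ksd_v_statistic(samples, lambda x: -x)         # correct score of N(0, 1)
ksd_wrong = ksd_v_statistic(samples, lambda x: -(x - 3))  # score of N(3, 1)
```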
Salimans et al. (2015), Song et al. (2017), Levy et al. (2018)
Andrychowicz et al. (2016), Li and Malik (2017), Wichrowska et al. (2017), Li and Turner (2018)
[Figure labels: momentum SGD; correction]
[Figure: error against epoch and iteration for Adam, SGD-M, SGHMC, NNSGHMC, and SGLD. Panels: Network Generalization, Sigmoid Generalization.]
Current methods: VI + Gaussian, MCMC, VI + implicit dist.