Deterministic Approximations 2
Laplace and variational approximations
Iain Murray http://iainmurray.net/
Posterior distributions
p(θ|D, M) = P(D|θ) p(θ) P(D|M) E.g., logistic regression:
p(θ) = N(θ; 0, σ2I) P(D|θ) =
- σ(z(n)w⊤x(n)),
labels z(n) ∈ ±1 Integrate large product non-linear functions.
Goals: summarize posterior in simple form, estimate model evidence P(D|M)
Non-Gaussian example
p(w) ∝ N(w; 0, 1) p(w|D) ∝ N(w; 0, 1) σ(10 − 20w)
−4 −2 2 4
Posterior after 500 datapoints
N =50 labels generated with w=1 at x(n) ∼ N(0, 102)
p(w) ∝ N(w; 0, 1) p(w|D) ∝ N(w; 0, 1)
500
- n=1
σ(wx(n)z(n))
−4 −2 2 4 −4 −2 2 4
Gaussian fit overlaid