Slice sampling Dr. Jarad Niemi STAT 615 - Iowa State University - - PowerPoint PPT Presentation

slice sampling
SMART_READER_LITE
LIVE PREVIEW

Slice sampling Dr. Jarad Niemi STAT 615 - Iowa State University - - PowerPoint PPT Presentation

Slice sampling Dr. Jarad Niemi STAT 615 - Iowa State University November 14, 2017 Jarad Niemi (STAT615@ISU) Slice sampling November 14, 2017 1 / 26 Slice sampling Slice sampling Suppose the target distribution is p ( | y ) with scalar


slide-1
SLIDE 1

Slice sampling

  • Dr. Jarad Niemi

STAT 615 - Iowa State University

November 14, 2017

Jarad Niemi (STAT615@ISU) Slice sampling November 14, 2017 1 / 26

slide-2
SLIDE 2

Slice sampling

Slice sampling

Suppose the target distribution is p(θ|y) with scalar θ. Then, p(θ|y) = p(θ|y) 1 du Thus, p(θ|y) can be thought of as the marginal distribution of (θ, U) ∼ Unif{(θ, u) : 0 < u < p(θ|y)} where u is an auxiliary variable. Slice sampling performs the following Gibbs sampler:

  • 1. ut|θt−1, y ∼ Unif{u : 0 < u < p(θt−1|y)} and
  • 2. θt|ut, y ∼ Unif{θ : ut < p(θ|y)}.

Jarad Niemi (STAT615@ISU) Slice sampling November 14, 2017 2 / 26

slide-3
SLIDE 3

Slice sampling

Slice sampler for exponential distribution

Consider the target θ|y ∼ Exp(1),then {θ : u < p(θ|y)} = (0, − log(u)).

0.0 0.5 1.0 1.5 0.0 0.2 0.4 0.6 0.8 1.0

Target disribution

θ u

Jarad Niemi (STAT615@ISU) Slice sampling November 14, 2017 3 / 26

slide-4
SLIDE 4

Slice sampling

Slice sampling in R

slice = function(n,init_theta,target,A) { u = theta = rep(NA,n) theta[1] = init_theta u[1] = runif(1,0,target(theta[1])) # This never actually gets used for (i in 2:n) { u[i] = runif(1,0,target(theta[i-1])) endpoints = A(u[i],theta[i-1]) # The second argument is used in the second example theta[i] = runif(1, endpoints[1],endpoints[2]) } return(list(theta=theta,u=u)) } set.seed(6) A = function(u,theta=NA) c(0,-log(u)) res = slice(10, 0.1, dexp, A) Jarad Niemi (STAT615@ISU) Slice sampling November 14, 2017 4 / 26

slide-5
SLIDE 5

Slice sampling Jarad Niemi (STAT615@ISU) Slice sampling November 14, 2017 5 / 26

0.0 0.5 1.0 1.5 2.0 0.0 0.2 0.4 0.6 0.8

Slice sampling an Exp(1) distribution

θ u

slide-6
SLIDE 6

Slice sampling

Histogram of draws

Slice sampling approximation to Exp(1) distribution

θ Density 2 4 6 8 10 0.0 0.2 0.4 0.6 0.8

Jarad Niemi (STAT615@ISU) Slice sampling November 14, 2017 6 / 26

slide-7
SLIDE 7

Slice sampling Posterior

Normal model with unknown mean

Let Yi

ind

∼ N(θ, 1) and θ ∼ La(0, 1) then p(θ|y) ∝ n

  • i=1

N(yi; θ, 1)

  • La(θ; 0, 1)

n = 5 y = rnorm(n,.2) f = Vectorize(function(theta, y.=y) exp(sum(dnorm(y., theta, log=TRUE)) + dexp(abs(theta), log=TRUE))) # Function to numerically find endpoints A = function(u, xx, f.=f) { left_endpoint = uniroot(function(x) f.(x) - u, c(-10^10, xx)) right_endpoint = uniroot(function(x) f.(x) - u, c( 10^10, xx)) c(left_endpoint$root, right_endpoint$root) } res = slice(20, mean(y), f, A) Jarad Niemi (STAT615@ISU) Slice sampling November 14, 2017 7 / 26

slide-8
SLIDE 8

Slice sampling Posterior

Slice sampling using numerically calculated endpoints

Jarad Niemi (STAT615@ISU) Slice sampling November 14, 2017 8 / 26

−0.4 −0.2 0.0 0.2 0.4 0e+00 1e−04 2e−04 3e−04 4e−04

Slice sampling a posterior

θ u

slide-9
SLIDE 9

Slice sampling Posterior

Histogram of draws

Slice sampling approximation to posterior

θ Density −2.0 −1.5 −1.0 −0.5 0.0 0.5 1.0 0.0 0.2 0.4 0.6 0.8 1.0

Jarad Niemi (STAT615@ISU) Slice sampling November 14, 2017 9 / 26

slide-10
SLIDE 10

Slice sampling Learning the endpoints

An alternative augmentation

Suppose Yi

ind

∼ N(θ, 1) and θ ∼ La(0, 1) but now, we will use the augmentation p(u, θ) ∝ p(θ)I(0 < u < p(y|θ)) The full conditional distributions are now

  • 1. u|θ, y ∼ Unif(0, p(y|θ)) and
  • 2. θ|u, y ∼ p(θ)I(u < p(y|θ)).

Jarad Niemi (STAT615@ISU) Slice sampling November 14, 2017 10 / 26

slide-11
SLIDE 11

Slice sampling Learning the endpoints

Sampling θ|u, y

Now we need to sample from p(θ)I(u < p(y|θ)). If p(θ) is unimodal, then this is equivalent to p(θ)I(θL(u) < θ < θU(u)) for some bounds θL(u) and θU(u) which depend on u. One way to learn these is to sample from p(θ) and update the bounds, e.g. if θ(i−1) is our current value in the chain, we know u < p(y|θ(i−1)) or, equivalently, θL(u) < θ(i−1) < θU(u). Letting u(i) be the current value for the auxiliary variable and setting θL(u(i)) [θU(u(i))] to the lower [upper] bound of the support for θ, we can

  • 1. Sample θ∗ ∼ p(θ)I(θL(u(i)) < θ < θU(u(i))).
  • 2. Set θ(i) = θ∗ if u(i) < p(y|θ∗), otherwise
  • a. set θL(u(i)) = θ∗ if θ∗ < θ(i−1) or
  • b. set θU(u(i)) = θ∗ if θ∗ > θ(i−1) and
  • c. return to Step 1.

Jarad Niemi (STAT615@ISU) Slice sampling November 14, 2017 11 / 26

slide-12
SLIDE 12

Slice sampling Learning the endpoints

Learning the endpoints

−2 −1 1 2 0.05 0.15 0.25 0.35 θ u Current Proposed

Jarad Niemi (STAT615@ISU) Slice sampling November 14, 2017 12 / 26

slide-13
SLIDE 13

Slice sampling Learning the endpoints

R code

slice2 = function(n, init_theta, like, qprior) { u = theta = rep(NA, n) theta[1] = init_theta u[1] = runif(1, 0, like(theta[1])) for (i in 2:n) { u[i] = runif(1, 0, like(theta[i - 1])) success = FALSE endpoints = 0:1 while (!success) { # Inverse CDF up = runif(1, endpoints[1], endpoints[2]) theta[i] = qprior(up) if (u[i] < like(theta[i])) { success = TRUE } else { # Updated endpoints when proposed value is rejected if (theta[i] > theta[i - 1]) endpoints[2] = up if (theta[i] < theta[i - 1]) endpoints[1] = up } } } return(list(theta = theta, u = u)) } Jarad Niemi (STAT615@ISU) Slice sampling November 14, 2017 13 / 26

slide-14
SLIDE 14

Slice sampling Learning the endpoints

Histogram

Slice sampling approximation to posterior

θ Density −2 −1 1 0.0 0.2 0.4 0.6 0.8 1.0

Jarad Niemi (STAT615@ISU) Slice sampling November 14, 2017 14 / 26

slide-15
SLIDE 15

Slice sampling Stepping-out slice sampling

Bimodal target distributions

Consider the posterior p(θ|y) = 1 2N(θ; −2, 1) + 1 2N(θ; 2, 1)

−4 −2 2 4 0.00 0.05 0.10 0.15 0.20 x target(x)

Jarad Niemi (STAT615@ISU) Slice sampling November 14, 2017 15 / 26

slide-16
SLIDE 16

Slice sampling Stepping-out slice sampling

Stepping-out slice sampler

To sample from θ|u, y, let θ(i−1) be the current draw for θ u(1) be the current draw for the auxiliary variable u w be a tuning parameter that you choose Perform the following

  • 1. Randomly place an interval (θL(u(i)), θU(u(i))) of length w around

the current value θ(i−1).

  • 2. Step the endpoints of this interval out in increments of w until

u(i) > p(θL(u(i))|y) and u(i) > p(θB(u(i))|y).

  • 3. Sample θ∗ ∼ Unif(θL(u(i)), θL(u(i))).
  • 4. If u(i) < p(θ∗|y), then set θ(i) = θ∗, otherwise
  • a. set θL(u(i)) = θ∗ if θ∗ < θ(i−1) or
  • b. set θU(u(i)) = θ∗ if θ∗ > θ(i−1) and
  • c. return to Step 3.

Jarad Niemi (STAT615@ISU) Slice sampling November 14, 2017 16 / 26

slide-17
SLIDE 17

Slice sampling Stepping-out slice sampling

Current Endpoints Current Endpoints

Jarad Niemi (STAT615@ISU) Slice sampling November 14, 2017 17 / 26

slide-18
SLIDE 18

Slice sampling Stepping-out slice sampling create_interval = function(theta, u, target, w, max_steps) { L = theta - runif(1,0,w) R = L + w # Step out J = floor(max_steps * runif(1)) K = (max_steps - 1) - J while ((u < target(L)) & J > 0) { L = L - w J = J - 1 } while ((u < target(R)) & K > 0) { R = R + w K = K - 1 } return(list(L=L,R=R)) } shrink_and_sample = function(theta, u, target, int) { L = int$L; R = int$R repeat { theta_prop = runif(1, L, R) if (u < target(theta_prop)) return(theta_prop) # shrink if (theta_prop > theta) R = theta_prop if (theta_prop < theta) L = theta_prop } } Jarad Niemi (STAT615@ISU) Slice sampling November 14, 2017 18 / 26

slide-19
SLIDE 19

Slice sampling Stepping-out slice sampling slice = function(n, init_theta, target, w, max_steps) { u = theta = rep(NA, n) theta[1] = init_theta for (i in 2:n) { u[i] = runif(1, 0, target(theta[i - 1])) theta[i] = shrink_and_sample(theta = theta[i-1], u = u[i], target = target, int = create_interval(theta = theta[i-1], u = u[i], target = target, w = w, max_steps = max_steps)) } return(data.frame(theta = theta, u = u)) } Jarad Niemi (STAT615@ISU) Slice sampling November 14, 2017 19 / 26

slide-20
SLIDE 20

Slice sampling Stepping-out slice sampling

Sampling from mixture of normals

res = slice(n = 1e4, init_theta=0, target=target, w=1, max_steps=10)

Stepping out slice sampler for bimodal target

θ target −4 −2 2 4 0.00 0.05 0.10 0.15 0.20

Jarad Niemi (STAT615@ISU) Slice sampling November 14, 2017 20 / 26

slide-21
SLIDE 21

Slice sampling Doubling slice sampler

Doubling slice sampler

To sample from θ|u, y, let θ(i−1) be the current draw for θ u(1) be the current draw for the auxiliary variable u w be a tuning parameter that you choose Perform the following

  • 1. Randomly place an interval (θL(u(i)), θU(u(i))) of length w around

the current value θ(i−1).

  • 2. Randomly double the size of the interval to either the left or right

until u(i) > p(θL(u(i))|y) and u(i) > p(θB(u(i))|y).

  • 3. Sample θ∗ ∼ Unif(θL(u(i)), θL(u(i))).
  • 4. If u(i) < p(θ∗|y) and a reversibility criterion is satisfied, then set

θ(i) = θ∗, otherwise

  • a. set θL(u(i)) = θ∗ if θ∗ < θ(i−1) or
  • b. set θU(u(i)) = θ∗ if θ∗ > θ(i−1) and
  • c. return to Step 3.

Jarad Niemi (STAT615@ISU) Slice sampling November 14, 2017 21 / 26

slide-22
SLIDE 22

Slice sampling Doubling slice sampler

Current Endpoints Current Endpoints

Jarad Niemi (STAT615@ISU) Slice sampling November 14, 2017 22 / 26

slide-23
SLIDE 23

Slice sampling Doubling slice sampler

Reversibility criterion

This procedure works backward through the intervals that the doubling procedure would pass through to arrive at [the doubled interval] when starting from the new point, checking that none of [the intermediate intervals] has both ends outside the slice, which would lead to earlier termination of the doubling procedure.

accept = function(theta0, theta1, L, R, u, w) { D = FALSE while (R - L > 1.1 * w) { M = (L + R)/2 if ((theta0 < M & theta1 >= M) | (theta0 >= M & theta1 < M)) D = TRUE if (theta1 < M) { R = M } else { L = M } if (D & u >= target(L) & u >= target(R)) { return(FALSE) } } return(TRUE) } Jarad Niemi (STAT615@ISU) Slice sampling November 14, 2017 23 / 26

slide-24
SLIDE 24

Slice sampling Doubling slice sampler slice = function(n, init_theta, target, w, max_doubling) { u = theta = rep(NA, n) theta[1] = init_theta for (i in 2:n) { u[i] = runif(1, 0, target(theta[i - 1])) L = theta[i - 1] - runif(1, 0, w) R = L + w # Step out K = max_doubling while ((u[i] < target(L) | u[i] < target(R)) & K > 0) { if (runif(1) < 0.5) { L = L - (R - L) } else { R = R + (R - L) } K = K - 1 } # Sample and shrink repeat { theta[i] = runif(1, L, R) if (u[i] < target(theta[i]) & accept(theta[i - 1], theta[i], L, R, u[i], w)) break # shrink if (theta[i] > theta[i - 1]) R = theta[i] if (theta[i] < theta[i - 1]) L = theta[i] } } return(list(theta = theta, u = u)) } Jarad Niemi (STAT615@ISU) Slice sampling November 14, 2017 24 / 26

slide-25
SLIDE 25

Slice sampling Doubling slice sampler

Doubling slice sampler for bimodal target

res = slice(n=1e4, init_theta=0, target=target, w=1, max_doubling=10)

Stepping out slice sampler for bimodal target

θ target −4 −2 2 4 0.00 0.05 0.10 0.15 0.20

Jarad Niemi (STAT615@ISU) Slice sampling November 14, 2017 25 / 26

slide-26
SLIDE 26

Slice sampling Multivariate slice sampling

Multivariate slice sampling

Suppose, we are interested in sampling from p(θ1, θ2|y) = p(θ1,θ2|y) 1 du Treat each variable independently, i.e.

  • 1. u|θ1, θ2, y ∼ Unif(0, p(θ1, θ2|y))
  • 2. θ1|u, θ2, y ∼ Unif(u < p(θ1, θ2|y))
  • 3. θ2|u, θ1, y ∼ Unif(u < p(θ1, θ2|y))

Use overrelaxation to avoid random walk behavior

Hyperrectangle slice sampling

Replace interval constructed from w with a hyperrectangle W placed randomly over the slice Shrink as points are rejected

Reflective slice sampling

Candidate samples are kept within the bounds by reflecting the direction of sampling when the boundary is hit

Jarad Niemi (STAT615@ISU) Slice sampling November 14, 2017 26 / 26