Adaptive rejection Metropolis sampling Dr. Jarad Niemi STAT 615 - - - PowerPoint PPT Presentation

adaptive rejection metropolis sampling
SMART_READER_LITE
LIVE PREVIEW

Adaptive rejection Metropolis sampling Dr. Jarad Niemi STAT 615 - - - PowerPoint PPT Presentation

Adaptive rejection Metropolis sampling Dr. Jarad Niemi STAT 615 - Iowa State University November 14, 2017 Jarad Niemi (STAT615@ISU) Adaptive rejection Metropolis sampling November 14, 2017 1 / 20 (Logarithmically) Concave Univariate Function


slide-1
SLIDE 1

Adaptive rejection Metropolis sampling

  • Dr. Jarad Niemi

STAT 615 - Iowa State University

November 14, 2017

Jarad Niemi (STAT615@ISU) Adaptive rejection Metropolis sampling November 14, 2017 1 / 20

slide-2
SLIDE 2

(Logarithmically) Concave Univariate Function

A function p(θ) is concave if p((1 − t)x + t y) ≥ (1 − t)p(x) + t p(y) for any 0 ≤ t ≤ 1.

x y p(x) p(y)

If p(x) is twice differentiabe, then p(x) is concave if and only if p′′(x) ≤ 0. A function p(x) is log-concave if log p(x) is concave.

Jarad Niemi (STAT615@ISU) Adaptive rejection Metropolis sampling November 14, 2017 2 / 20

slide-3
SLIDE 3

Examples

X ∼ N(0, 1) has a log-concave density since d2 dx2 log e−x2/2 = d2 dx2 − x2/2 = d dx − x = −1. X ∼ Ca(0, 1) has a non-log-concave density since d2 dx2 log 1 1 + x2 = d dx −2x 1 + x2 = 2(x2 − 1) (1 + x2)2 .

1e−05 1e−03 1e−01 1 2 3 4 5

x density distribution

cauchy exponential normal

Jarad Niemi (STAT615@ISU) Adaptive rejection Metropolis sampling November 14, 2017 3 / 20

slide-4
SLIDE 4

Log-concave distributions

Log-concave distributions

normal exponential Uniform Laplace Gamma (shape parameter is ≥ 1) Wishart (n ≥ p + 1) Dirichlet (all parameters ≥ 1)

Non-log-concave distributions

Log-normal Student t F-distribution

Jarad Niemi (STAT615@ISU) Adaptive rejection Metropolis sampling November 14, 2017 4 / 20

slide-5
SLIDE 5

Exponential distribution

An exponential distribution has pdf p(θ; b) = be−bθ and thus has log-density log p(θ; b) = log(b) − bθ which is trivially log-concave since d2 dθ2 log(b) − bθ = d dθ − b = 0 ≤ 0. The exponential distribution, or exponential function, is unique in that it matches the bound for the definition of log-concavity.

Jarad Niemi (STAT615@ISU) Adaptive rejection Metropolis sampling November 14, 2017 5 / 20

slide-6
SLIDE 6

Prior-posterior example

The product of log-concave functions is also log-concave since log n

  • i=1

pi(x)

  • =

n

  • i=1

log pi(x). Assume Yi

ind

∼ N(θ, 1) and θ ∼ La(0, 1) then the posterior p(θ|y) ∝ n

  • i=1

N(yi; θ, 1)

  • La(θ; 0, 1)

is log-concave since - N(yi; θ, 1) is a log-concave function for θ for each yi and - La(θ; 0, 1) is a log-concave distribution.

Jarad Niemi (STAT615@ISU) Adaptive rejection Metropolis sampling November 14, 2017 6 / 20

slide-7
SLIDE 7

Rejection sampling

Suppose we are interested in sampling from a target distribution p(θ|y) using a proposal q(θ). To use this algorithm, we must find M ≥ p(θ|y) q(θ) ∀θ where the optimal M is supθp(θ|y)/q(θ). Rejection sampling performs the following

  • 1. Sample θ ∼ q(θ).
  • 2. Accept θ as a draw from p(θ|y) with probability

1 M p(θ|y) q(θ)

  • therwise return to step 1.

Jarad Niemi (STAT615@ISU) Adaptive rejection Metropolis sampling November 14, 2017 7 / 20

slide-8
SLIDE 8

Rejection sampling envelope

Target N+(0, 1) and proposal Exp(1). Then

  • 2/πe−θ2/2

e−θ ≤ 1.315489 = M

0.0 0.5 1.0 1 2 3

x density distribution

Exp(1) M*Exp(1) N^+(0,1) 0.01 0.10 1.00 1 2 3

x log density

Jarad Niemi (STAT615@ISU) Adaptive rejection Metropolis sampling November 14, 2017 8 / 20

slide-9
SLIDE 9

Rejection sampling example

0.0 0.3 0.6 0.9 1 2 3 4 5

x M*Exp(x;1) accept

FALSE TRUE

Jarad Niemi (STAT615@ISU) Adaptive rejection Metropolis sampling November 14, 2017 9 / 20

slide-10
SLIDE 10

Adaptive rejection sampling

Idea: build a piece-wise linear envelope to the log-density as a proposal distribution

log density density Jarad Niemi (STAT615@ISU) Adaptive rejection Metropolis sampling November 14, 2017 10 / 20

slide-11
SLIDE 11

Pseudo-algorithm for adaptive rejection sampling

  • 1. Choose starting locations θ, call the set Θ
  • 2. Construct piece-wise linear envelope log q(θ) to the log-density
  • a. Calculate log f(θ|y) and (log f(θ|y))′.
  • b. Find line intersections
  • 3. Sample a proposed value θ∗ from the envelope q(θ)
  • a. Sample an interval
  • b. Sample a truncated (and possibly negative of an) exponential r.v.
  • 4. Perform rejection sampling
  • a. Sample u ∼ Unif(0, 1)
  • b. Accept θ∗ if u ≤ f(θ∗|y)/q(θ∗).
  • 5. If rejected, add θ∗ to Θ and return to 2.

Jarad Niemi (STAT615@ISU) Adaptive rejection Metropolis sampling November 14, 2017 11 / 20

slide-12
SLIDE 12

Adaptive rejection sampling (ARS) in R

library(ars) f = function(x) -x^2/2 # log of standard normal density fp = function(x) -x # derivative of log of standard normal density x = ars(1e4, f, fp)

ARS samples

x Density −2 2 4 0.0 0.1 0.2 0.3 0.4

Jarad Niemi (STAT615@ISU) Adaptive rejection Metropolis sampling November 14, 2017 12 / 20

slide-13
SLIDE 13

ARS in R - non-log-concave density

f = function(x) log(1/(1+x^2)) # log of standard cauchy density fp = function(x) -2*x/(1+x^2) # derivative of log of cauchy density x = ars(1e4, f, fp) ## ## Error in sobroutine initial_... ## ifault= 5 Jarad Niemi (STAT615@ISU) Adaptive rejection Metropolis sampling November 14, 2017 13 / 20

slide-14
SLIDE 14

ARS in R - prior-posterior example

Yi

ind

∼ N(θ, 1) and θ ∼ La(0, 1)

y = rnorm(10) f = Vectorize(function(theta) sum(-(y-theta)^2/2) - abs(theta)) fp = Vectorize(function(theta) sum((y-theta)) - (theta>0) + (theta<0)) x = ars(1e4, f, fp) Posterior for Normal data with Laplace prior on mean

x Density −1.0 −0.5 0.0 0.5 1.0 0.0 0.5 1.0 1.5

Jarad Niemi (STAT615@ISU) Adaptive rejection Metropolis sampling November 14, 2017 14 / 20

slide-15
SLIDE 15

Comments on ARS

Derivative free ARS Checking for log-concavity

Decreasing derivatives

Initial points for unbounded support:

initial derivative must be positive final derivative must be negative

Lower bound for multiple samples

Connect points

Probability of acceptance increases at subsequent steps

Jarad Niemi (STAT615@ISU) Adaptive rejection Metropolis sampling November 14, 2017 15 / 20

slide-16
SLIDE 16

Adaptive rejection Metropolis sampling (ARMS)

Adaptive rejection sampling is only suitable for log-concave densities. For non-log-concave densities adaptive rejection Metropolis sampling can be used

Jarad Niemi (STAT615@ISU) Adaptive rejection Metropolis sampling November 14, 2017 16 / 20

slide-17
SLIDE 17

ARMS algorithm

  • 1. Choose starting locations for θ, call the set Θ.
  • 2. Construct piece-wise linear pseudo-envelope log q(θ) to log p(θ|y).
  • 3. Sample θ∗ ∼ q(θ) and U ∼ Unif(0, 1).
  • a. If U ≤ p(θ∗|y)/q(θ∗), proceed to Step 4.
  • b. Otherwise, add θ∗ to Θ and return to 3.
  • 4. Perform Metropolis step: Set θ(i) = θ∗ with probability

min

  • 1, p(θ∗|y)

p(θ(i)|y) min{p(θ(i−1)|y), q(θ(i−1))} min{p(θ∗|y), q(θ∗)}

  • therwise set θ(i) = θ(i−1).

Jarad Niemi (STAT615@ISU) Adaptive rejection Metropolis sampling November 14, 2017 17 / 20

slide-18
SLIDE 18

ARMS pseudo-envelope

−4 −2 2 4 −5 −4 −3 −2 −1

t_5

x f(x, v)

Jarad Niemi (STAT615@ISU) Adaptive rejection Metropolis sampling November 14, 2017 18 / 20

slide-19
SLIDE 19

ARMS in R

f = function(x,mean) -(x-mean)^2/2 x = dlm::arms(runif(1,3,17), f, function(x,mean) ((x-mean)>-7)*((x-mean)<7), 1e4, mean=10) hist(x,101,prob=TRUE,main="Gaussian(10,1)") curve(dnorm(x,10), add=TRUE, lwd=2, col='red')

Gaussian(10,1)

x Density 6 8 10 12 14 0.0 0.1 0.2 0.3 0.4

Jarad Niemi (STAT615@ISU) Adaptive rejection Metropolis sampling November 14, 2017 19 / 20

slide-20
SLIDE 20

Theoretical consideration of ARMS

ARMS is an independent Metropolis-Hastings algorithm

Proposal changes, due to updating q, i.e. adding more points in to Θ, thus inhomogenous. We need to stop updating q at some point to enforce homogeneity.

Jarad Niemi (STAT615@ISU) Adaptive rejection Metropolis sampling November 14, 2017 20 / 20