Presentation: "Sticky Proposals" (January 2014)


  1. Sticky proposal densities for adaptive MCMC methods. L. Martino†, R. Casarin‡, F. Leisen§, D. Luengo¶. †University of Helsinki, ‡Università Ca' Foscari, §University of Kent, ¶Universidad Politécnica de Madrid. MCQMC, 2014.

  2. Introduction
     ◮ Markov Chain Monte Carlo (MCMC) methods convert samples from a proposal pdf $\tilde{q}(x) \propto q(x)$ into correlated samples from a target pdf $\tilde{\pi}(x) \propto \pi(x)$, generating a chain
       $x_0 \Rightarrow x_1 \Rightarrow \cdots \Rightarrow x_t \Rightarrow x_{t+1} \Rightarrow \cdots \Rightarrow x_{t+\tau} \sim \tilde{\pi}(x),$
       where each transition is governed by the kernel $K(x_t \mid x_{t-1})$.
     ◮ Within the Monte Carlo (MC) techniques, adaptive rejection sampling (ARS) [Gilks et al. (1992)] and adaptive rejection Metropolis sampling (ARMS) [Gilks et al. (1995)] are samplers for univariate pdfs.
     ◮ They are often used within Gibbs sampling.
     ◮ Both techniques present different limitations.
     ◮ GOAL: overcome these drawbacks by proposing a more general and efficient class of adaptive samplers.

  3. Performance
     ◮ The performance of an MCMC method depends strictly on the discrepancy between the proposal $q$ and the target $\pi$.
     ◮ If proposal = target, we have an exact sampler: in an independent MH, for instance, the acceptance probability satisfies $\alpha \approx 1$.
       [Figure: three panels showing proposals $q(x)$ progressively better matched to the target $\pi(x)$.]
     ◮ Need to adapt the proposal density, while ensuring ergodicity.
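To make this point concrete, here is a minimal sketch (not from the slides) of an independent Metropolis-Hastings sampler with a Gaussian proposal and a standard normal target; the parameters mu_q and sig_q are hypothetical illustration values. As (mu_q, sig_q) approach the target's (0, 1), the empirical acceptance rate approaches 1 and the sampler becomes exact.

```python
# A minimal sketch, assuming a Gaussian proposal q and standard normal target pi.
# Normalization constants cancel in the MH ratio, so both are unnormalized.
import numpy as np

rng = np.random.default_rng(0)

def log_pi(x):
    return -0.5 * x**2  # log of the unnormalized standard normal target

def acceptance_rate(mu_q, sig_q, n=20000):
    """Run an independent MH chain and return its empirical acceptance rate."""
    def log_q(x):  # log of the (unnormalized) Gaussian proposal
        return -0.5 * ((x - mu_q) / sig_q) ** 2
    x, accepted = 0.0, 0
    for _ in range(n):
        xp = mu_q + sig_q * rng.standard_normal()  # candidate x' ~ q
        # alpha = 1 ^ [pi(x') q(x)] / [pi(x) q(x')], computed in the log-domain
        log_alpha = (log_pi(xp) + log_q(x)) - (log_pi(x) + log_q(xp))
        if np.log(rng.uniform()) < log_alpha:
            x, accepted = xp, accepted + 1
    return accepted / n

print(acceptance_rate(mu_q=2.0, sig_q=3.0))  # large discrepancy: low rate
print(acceptance_rate(mu_q=0.0, sig_q=1.0))  # q = pi: rate is exactly 1.0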

  4. Adaptive procedures
     ◮ Parametric: learn parameters of the proposal (location and/or scale parameter).
     ◮ Non-parametric: approximate the target via non-parametric procedures (as in kernel density estimation).
     ◮ Simple idea: update the proposal taking into account the histogram of the generated samples (after "burn-in") $x_1, \ldots, x_t, \ldots, x_{t+\tau}, \ldots$, as sketched below.
       [Figure: mixture proposal combining the adapted histogram-based component, with weight $(1-\beta_t)$, and a random-walk component, with weight $\beta_t$.]
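One possible reading of that mixture, sketched under stated assumptions (the beta_t schedule, bin count, and random-walk scale are all hypothetical, not taken from the slides):

```python
# A sketch of the histogram + random-walk mixture proposal described above.
import numpy as np

rng = np.random.default_rng(1)

def sample_mixture(x_t, past, beta_t, rw_scale=1.0, bins=30):
    """With prob. beta_t, propose a random-walk move from the current state;
    otherwise draw from the histogram of past (post burn-in) samples."""
    if rng.uniform() < beta_t or len(past) < bins:
        return x_t + rw_scale * rng.standard_normal()     # random-walk component
    counts, edges = np.histogram(past, bins=bins)
    j = rng.choice(len(counts), p=counts / counts.sum())  # pick a bin by count
    return rng.uniform(edges[j], edges[j + 1])            # uniform within it
```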

  5. Other useful information
     ◮ We have several evaluations of the target pdf available (at least at each state of the chain): $x_1, \ldots, x_t, \ldots, x_{t+\tau}$ and $\pi(x_1), \ldots, \pi(x_t), \ldots, \pi(x_{t+\tau})$.
     ◮ Can we incorporate all this information (or a subset) in the learning procedure?
     ◮ AIM: interpolative construction of a proposal that depends on a subset $\mathcal{S}_t \subset \{x_1, \ldots, x_t\}$: $\tilde{q}_t(x) \propto q_t(x \mid \mathcal{S}_t)$.
     ◮ Adaptive proposal $\Rightarrow$ adaptive MCMC.

  6. Interpolation procedures
     ◮ Consider a set of support points $\mathcal{S}_t = \{s_1, \ldots, s_{m_t}\}$, and define $V(x) = \log \pi(x)$ and $W_t(x) = \log q_t(x \mid \mathcal{S}_t)$.
     ◮ Interpolation procedure:
       [Figure: (a) P2: log-domain; (b) P3: log-domain; (c) P4: pdf-domain. Each panel shows the construction over the support points $s_1, \ldots, s_6$.]
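As one concrete illustration of the pdf-domain construction P4, the sketch below linearly interpolates the stored target evaluations $\pi(s_i)$ between consecutive support points. It is a simplification under stated assumptions: interior pieces only, with the unbounded tails ($s_0 = -\infty$, $s_{m_t+1} = +\infty$) omitted.

```python
# A sketch of a P4-style proposal (pdf-domain): q_t(x | S_t) linearly
# interpolates the stored target evaluations pi(s_i). Interior pieces only;
# the unbounded tail pieces are omitted here (an assumption).
import numpy as np

def q_t(x, support, pi_vals):
    """Evaluate the piecewise-linear proposal built from the sorted support
    points s_1 < ... < s_{m_t} and the evaluations pi(s_i)."""
    return np.interp(x, support, pi_vals, left=0.0, right=0.0)

s = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])  # example support set S_t
f = np.exp(-0.5 * s**2)                    # pi(s_i): unnormalized Gaussian
print(q_t(0.5, s, f))                      # linear interpolation of pi at 0.5
```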

  7. Interpolation procedures
     ◮ Similar to the constructions in the adaptive rejection sampling (ARS) [Gilks et al., 1992] and adaptive rejection Metropolis sampling (ARMS) [Gilks et al., 1995] methods.
       [Figure: (d) log-domain (ARS); (e) P1: log-domain (ARMS).]
     ◮ ARS: only for log-concave pdfs.
     ◮ ARMS: sometimes incomplete adaptation.

  8. Interpolation procedures
       [Figure: P4 construction for $|\mathcal{S}_t| = 6, 7, 8, 9$ and $|\mathcal{S}_t| > 100$.]
     ◮ Here the points are not adaptively chosen.

  9. Drawing from $q_t$
     1. Calculate analytically the area below each piece, i.e.,
        $A_j = \int_{s_j}^{s_{j+1}} q_t(x \mid \mathcal{S}_t)\, dx, \quad j = 0, \ldots, m_t,$
        denoting $s_0 = -\infty$ and $s_{m_t+1} = +\infty$.
     2. Choose the $j^*$-th piece according to the weights
        $\omega_j = \frac{A_j}{\sum_{i=0}^{m_t} A_i}, \quad j = 0, \ldots, m_t.$
     3. Draw a sample $x'$ from $q_t(x \mid \mathcal{S}_t)$ restricted to $x \in (s_{j^*}, s_{j^*+1})$.
     P2 → exponential pieces; P3 → uniform pieces; P4 → linear pieces.
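For the P4 (linear-piece) case, steps 1-3 can be carried out in closed form. The sketch below keeps the same simplifying assumption as before (interior pieces only, no unbounded tails): it computes the trapezoid areas $A_j$, selects a piece with probability $\omega_j$, and inverts the within-piece CDF of the linear density analytically.

```python
# A sketch of steps 1-3 for P4: areas A_j are trapezoids, piece j* is chosen
# with prob. proportional to A_j, and the linear piece is inverted exactly.
import numpy as np

rng = np.random.default_rng(2)

def draw_from_qt(support, pi_vals):
    # step 1: areas of the trapezoidal pieces between consecutive points
    areas = 0.5 * (pi_vals[:-1] + pi_vals[1:]) * np.diff(support)
    # step 2: choose piece j* with probability omega_j = A_j / sum_i A_i
    j = rng.choice(len(areas), p=areas / areas.sum())
    a, b = support[j], support[j + 1]
    fa, fb = pi_vals[j], pi_vals[j + 1]
    u = rng.uniform()
    if np.isclose(fa, fb):            # flat piece: reduces to a uniform draw
        return a + u * (b - a)
    # step 3: invert G(t) = fa*t + c*t^2/2 = u*A_j for t = x' - a,
    # where c is the slope of the linear piece
    c = (fb - fa) / (b - a)
    t = (-fa + np.sqrt(fa**2 + 2.0 * c * u * areas[j])) / c
    return a + t

s = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])  # example support set S_t
f = np.exp(-0.5 * s**2)                    # stored evaluations pi(s_i)
print(draw_from_qt(s, f))                  # one draw from the piecewise proposal
```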

  10. Computational cost vs. efficiency
     ◮ More points: better approximation of the target ⇒ more efficiency (i.e., less correlation ⇔ faster convergence).
     ◮ More points: drawing from $q_t$ becomes more costly.
       $m_t \uparrow \;\Rightarrow\; \text{efficiency} \uparrow \;+\; \text{computational cost} \uparrow$
     ◮ Desired adaptive strategy: manage the set $\mathcal{S}_t$ in order to build a "good" proposal with a small number $m_t$ of points, while keeping the ergodicity of the sampler.

  11. Adaptive Sticky Metropolis (ASM)
     1. Construction of the proposal: build a proposal $q_t(x \mid \mathcal{S}_t)$, using the set $\mathcal{S}_t = \{s_1, \ldots, s_{m_t}\}$ (e.g., using P1, P2, P3 or P4).
     2. MH step:
        2.1 Draw $x'$ from $\tilde{q}_t(x) \propto q_t(x \mid \mathcal{S}_t)$.
        2.2 Set $x_{t+1} = x'$ and $z = x_t$ with probability
            $\alpha = 1 \wedge \frac{\pi(x')\, q_t(x_t \mid \mathcal{S}_t)}{\pi(x_t)\, q_t(x' \mid \mathcal{S}_t)},$
            and set $x_{t+1} = x_t$ and $z = x'$ with probability $1 - \alpha$.
     3. Test to update $\mathcal{S}_t$: set $\mathcal{S}_{t+1} = \mathcal{S}_t \cup \{z\}$ with probability $P_a = \eta(d_t(z))$; otherwise $\mathcal{S}_{t+1} = \mathcal{S}_t$.
     ◮ $d_t(z)$: a positive measure of the distance between $q_t$ and $\pi$ at $z$ (see the sketch below).
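Putting the pieces together, here is a minimal end-to-end sketch of the ASM loop, reusing q_t and draw_from_qt from the sketches above. The choices $\eta(d) = d/(d+1)$ and $d_t(z) = |\pi(z) - q_t(z \mid \mathcal{S}_t)| / \pi(z)$ are hypothetical: the slides only require that $P_a = \eta(d_t(z))$ with $d_t$ a positive mismatch measure, so this is one possible instantiation, not the authors' implementation.

```python
# A minimal sketch of the ASM loop under the assumptions stated above.
# Reuses q_t(x, support, pi_vals) and draw_from_qt(support, pi_vals).
import numpy as np

rng = np.random.default_rng(3)

def pi_target(x):
    return np.exp(-0.5 * x**2)   # example unnormalized target

def asm(n_iter=5000, x0=0.0, s_init=(-3.0, -1.0, 1.0, 3.0)):
    S, x, chain = sorted(s_init), x0, []
    for _ in range(n_iter):
        s = np.array(S)
        f = pi_target(s)                      # step 1: (re)build q_t from S_t
        xp = draw_from_qt(s, f)               # step 2.1: draw x' from q_t
        # step 2.2: MH acceptance; normalization constants cancel in the ratio
        alpha = min(1.0, pi_target(xp) * q_t(x, s, f)
                         / (pi_target(x) * q_t(xp, s, f)))
        if rng.uniform() < alpha:
            x, z = xp, x                      # accept: old state is tested
        else:
            z = xp                            # reject: candidate is tested
        # step 3 ("sticky" update): add z to S_t with prob. eta(d_t(z)),
        # so points where q_t mismatches pi are added with high probability
        d = abs(pi_target(z) - q_t(z, s, f)) / pi_target(z)
        if rng.uniform() < d / (d + 1.0):
            S = sorted(S + [float(z)])
        chain.append(x)
    return np.array(chain), S

samples, S_final = asm()
print(len(S_final), samples.mean(), samples.std())
```

As the mismatch $d_t(z)$ shrinks with adaptation, the update probability $\eta(d_t(z))$ vanishes, so the support set stabilizes and $m_t$ stays small, matching the trade-off described on the previous slide.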
