Sampling multimodal densities in high dimensional sampling space
Gersende Fort, LTCI, CNRS & Telecom ParisTech, Paris, France
Journées MAS, Toulouse, Août 2014
Sampling multimodal densities in high dimensional sampling space Introduction
Sample from a target distribution π dλ on X ⊆ Rℓ, when π is (possibly) known only up to a normalizing constant. Hereafter, to keep the notations simple, π is assumed to be normalized; in the context of this talk, π is multimodal and the sampling space is high dimensional.
Research guided by Computational Bayesian Statistics: π is the a posteriori distribution, known up to a normalizing constant. Needed: algorithms to explore π, to compute expectations w.r.t. π, etc.
Sampling multimodal densities in high dimensional sampling space Introduction
Talk based on joint works with Eric Moulines, Amandine Schreck (Telecom ParisTech); Pierre Priouret (Paris VI); Benjamin Jourdain, Tony Lelièvre, Gabriel Stoltz (ENPC); Estelle Kuhn (INRA).
Sampling multimodal densities in high dimensional sampling space Introduction
Outline
Introduction: Usual Monte Carlo samplers; The proposal mechanism; Adaptive Monte Carlo samplers; Conclusion
Tempering-based Monte Carlo samplers
Biasing Potential-based Monte Carlo sampler
Convergence Analysis
Sampling multimodal densities in high dimensional sampling space Introduction Usual Monte Carlo samplers
Usual Monte Carlo samplers
1. Markov chain Monte Carlo (MCMC)
Sample a Markov chain (Xk)k having π as unique invariant distribution. Approximation: π ≈ (1/n) Σ_{k=1}^{n} δ_{Xk}.
Example: Hastings-Metropolis algorithm with proposal kernel q(x,y): given Xk, sample Y ∼ q(Xk, ·), then apply the accept-reject mechanism
Xk+1 = Y with probability 1 ∧ [π(Y) q(Y,Xk)] / [π(Xk) q(Xk,Y)], and Xk+1 = Xk otherwise.
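A minimal sketch of the Hastings-Metropolis step above; the 1D bimodal target, the random-walk proposal and its scale are illustrative assumptions, not taken from the talk:

```python
import numpy as np

rng = np.random.default_rng(0)

def pi_unnorm(x):
    # illustrative bimodal target, known only up to a normalizing constant
    return np.exp(-0.5 * (x - 2.0) ** 2) + np.exp(-0.5 * (x + 2.0) ** 2)

def metropolis_hastings(n, x0=0.0, sigma=1.0):
    """Random-walk Hastings-Metropolis: q(x,.) = N(x, sigma^2) is symmetric,
    so the acceptance ratio reduces to 1 ^ pi(Y)/pi(X_k)."""
    chain = np.empty(n)
    x = x0
    for k in range(n):
        y = x + sigma * rng.normal()
        if rng.uniform() < min(1.0, pi_unnorm(y) / pi_unnorm(x)):
            x = y        # accept the proposal
        chain[k] = x     # otherwise keep X_{k+1} = X_k
    return chain

chain = metropolis_hastings(20_000, sigma=2.5)
print(np.mean(chain))  # empirical mean, close to 0 for this symmetric target
```

Because the random-walk proposal is symmetric, the ratio q(Y,Xk)/q(Xk,Y) cancels; this is the simplification used later in the talk.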
2. Importance Sampling (IS)
Sample i.i.d. points (Xk)k with density q, a proposal distribution chosen by the user. Approximation: π ≈ (1/n) Σ_{k=1}^{n} [π(Xk)/q(Xk)] δ_{Xk}.
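A minimal self-normalized IS sketch: since π is only known up to a constant in practice, the weights are renormalized to sum to one. The Gaussian target, proposal and test function are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)

def pi_unnorm(x):
    # illustrative target known up to a constant: N(1, 1), unnormalized
    return np.exp(-0.5 * (x - 1.0) ** 2)

def importance_sampling(n, q_sample, q_pdf, f):
    """Self-normalized IS: weights pi(X_k)/q(X_k), renormalized so that the
    unknown normalizing constant of pi cancels out."""
    x = q_sample(n)
    w = pi_unnorm(x) / q_pdf(x)
    w /= w.sum()
    return np.sum(w * f(x))

# proposal q = N(0, 2^2), chosen by the user (heavier-tailed than the target)
est = importance_sampling(
    100_000,
    q_sample=lambda n: 2.0 * rng.normal(size=n),
    q_pdf=lambda x: np.exp(-0.5 * (x / 2.0) ** 2) / (2.0 * np.sqrt(2 * np.pi)),
    f=lambda x: x,
)
print(est)  # estimates E[X] = 1 under pi
```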
Sampling multimodal densities in high dimensional sampling space Introduction The proposal mechanism
The proposal mechanism: MCMC
Toy example: Hastings-Metropolis algorithm with Gaussian proposal kernel q(x,y) ∝ exp(−(1/2)(y − x)^T Σ^{−1}(y − x)). Since this kernel is symmetric, the acceptance-rejection ratio reduces to 1 ∧ π(Y)/π(Xk).
Fig.: for three different values of Σ: [top] trace of the chain (in R); [bottom] autocorrelation function.
Sampling multimodal densities in high dimensional sampling space Introduction The proposal mechanism
The proposal mechanism: Importance Sampling (1/2)
Toy example: compute ∫_R |x| π(x) dx when π is the Student t(3) density, π(x) ∝ (1 + x²/3)^{−2}. Consider in turn the proposal q equal to a Student t(1), and then to a Normal N(0,1).
Fig.: plot of the densities q (green, blue) and π (red); boxplots of the estimate, computed from 100 runs of the algorithm, as a function of the number of samples, for q ∼ t(1) and for q ∼ N(0,1).
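This toy example can be reproduced as a sketch, assuming self-normalized weights and the unnormalized t(3) density; the sample size and seed are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(2)

def pi_unnorm(x):
    # Student t(3) density, up to its normalizing constant
    return (1.0 + x ** 2 / 3.0) ** (-2.0)

def snis(x, q_pdf, f):
    """Self-normalized importance sampling estimate of E_pi[f(X)]."""
    w = pi_unnorm(x) / q_pdf(x)
    return np.sum(w * f(x)) / np.sum(w)

n = 100_000
# heavy-tailed proposal t(1) (Cauchy): the importance ratios stay bounded
x_t1 = rng.standard_cauchy(n)
est_t1 = snis(x_t1, lambda x: 1.0 / (np.pi * (1.0 + x ** 2)), np.abs)

# light-tailed proposal N(0,1): the ratios blow up in the tails
x_norm = rng.normal(size=n)
est_norm = snis(x_norm, lambda x: np.exp(-0.5 * x ** 2) / np.sqrt(2 * np.pi), np.abs)

print(est_t1)    # close to the true value 2*sqrt(3)/pi ≈ 1.103
print(est_norm)  # typically biased low and highly variable across runs
```

The contrast illustrates the point of the figure: a proposal with lighter tails than the target gives a degenerate weight distribution.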
Sampling multimodal densities in high dimensional sampling space Introduction The proposal mechanism
The proposal mechanism: Importance Sampling (2/2)
The efficiency of the algorithm depends upon the proposal distribution q: if there are few large weights and all the others are negligible, the approximation is likely to be inaccurate. Monitoring the convergence: there exist criteria measuring the proportion of “ineffective draws”: Coefficient of Variation, Effective Sample Size, Normalized perplexity.
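The three criteria are related: for normalized weights, ESS = n/(1 + CV²), and the normalized perplexity is exp(entropy of the weights)/n. A sketch (the function name and the demo weights are illustrative):

```python
import numpy as np

def is_diagnostics(logw):
    """Weight-degeneracy criteria for importance weights given on the log scale.
    Returns (ESS/n, CV^2, normalized perplexity); ESS/n and perplexity close
    to 1 indicate healthy weights, values close to 0 indicate degeneracy."""
    w = np.exp(logw - np.max(logw))  # stabilize the exponentials
    w /= w.sum()
    n = len(w)
    ess = 1.0 / np.sum(w ** 2)                  # effective sample size
    cv2 = n * np.sum((w - 1.0 / n) ** 2)        # squared coefficient of variation
    ent = -np.sum(w[w > 0] * np.log(w[w > 0]))  # entropy of the weights
    return ess / n, cv2, np.exp(ent) / n

print(is_diagnostics(np.zeros(100)))  # uniform weights: ≈ (1, 0, 1)
degenerate = np.zeros(100); degenerate[0] = 50.0
print(is_diagnostics(degenerate))     # ESS/n and perplexity collapse, CV^2 blows up
```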
Sampling multimodal densities in high dimensional sampling space Introduction Adaptive Monte Carlo samplers
Adaptive Monte Carlo samplers
To tune the design parameters and make the samplers more efficient, adaptive Monte Carlo samplers were proposed. Adaptive algorithms:
- The optimal design parameters are defined as the solutions of an optimality criterion. In practice, this criterion cannot be solved explicitly.
- Based on the past history of the sampler, solve an approximation of this criterion and compute the design parameters for the current run of the sampler.
- Repeat the scheme: adaption/sampling.
Sampling multimodal densities in high dimensional sampling space Introduction Adaptive Monte Carlo samplers
Adaptive MC sampler: example of adaptive MCMC (1/2)
Adaptive Hastings-Metropolis algorithm with Gaussian proposal distribution qΣ(x,y) ∝ exp(−(1/2)(y − x)^T Σ^{−1}(y − x)).
Design parameter: the covariance matrix Σ.
Optimality criterion: the scaling approach for Markov chains advocates (pioneering work: Roberts, Gelman, Gilks (1997)) Σ = (2.38²/ℓ) × covariance of π.
Iterative algorithm (Haario, Saksman, Tamminen (2001)):
Adaption: update the covariance matrix Σt = (2.38²/ℓ) × Σ̂t, where Σ̂t is the empirical covariance of π computed from the past draws.
Sampling: one step of a Hastings-Metropolis algorithm with proposal qΣt to sample Xt+1.
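A sketch of this adaption/sampling loop in the spirit of Haario, Saksman, Tamminen (2001); the Gaussian target, the recursive moment updates and the regularization ǫI are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(3)
ell = 2  # dimension

def log_pi(x):
    # illustrative target: standard Gaussian on R^2
    return -0.5 * np.dot(x, x)

def adaptive_mh(n, eps=1e-6):
    """Proposal covariance (2.38^2/ell) * empirical covariance of the past
    draws, plus eps*I for numerical stability (a common regularization)."""
    x = np.zeros(ell)
    chain = np.empty((n, ell))
    mean, cov = x.copy(), np.eye(ell)
    for t in range(n):
        sigma = (2.38 ** 2 / ell) * cov + eps * np.eye(ell)
        y = rng.multivariate_normal(x, sigma)
        if np.log(rng.uniform()) < log_pi(y) - log_pi(x):
            x = y
        chain[t] = x
        # recursive update of the empirical mean and covariance
        g = 1.0 / (t + 2)
        diff = x - mean
        mean = mean + g * diff
        cov = (1 - g) * cov + g * np.outer(diff, diff)
    return chain

chain = adaptive_mh(20_000)
print(np.cov(chain.T))  # approaches the identity, the covariance of pi
```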
Sampling multimodal densities in high dimensional sampling space Introduction Adaptive Monte Carlo samplers
Adaptive MC sampler: example of adaptive MCMC (2/2)
Nevertheless, this recipe is not designed for every context. Example: multimodality. Target distribution: mixture of 20 Gaussians in R²; the means of the Gaussians are indicated with a red cross. Figures: 5 × 10⁶ i.i.d. draws vs. Adaptive Hastings-Metropolis with 5 × 10⁶ draws.
Sampling multimodal densities in high dimensional sampling space Introduction Adaptive Monte Carlo samplers
Adaptive MC sampler: example of Adaptive Importance Sampling (1/2)
Design parameter: the proposal distribution. Optimality criterion: choose the proposal density q among a (parametric) family Q as the solution of
argmin_{q∈Q} ∫ log(π(x)/q(x)) π(x) λ(dx) ⇐⇒ argmax_{q∈Q} ∫ log q(x) π(x) λ(dx).
Iterative algorithm: O. Cappé, A. Guillin, J.M. Marin, C. Robert (2004).
Adaption: update the sampling distribution
q_t = argmax_{q∈Q} (1/n) Σ_{k=1}^{n} [π(X_k^{(t−1)}) / q_{t−1}(X_k^{(t−1)})] log q(X_k^{(t−1)}).
Sampling: draw points (X_k^{(t)})_k and apply importance reweighting:
π ≈ (1/n) Σ_{k=1}^{n} [π(X_k^{(t)}) / q_t(X_k^{(t)})] δ_{X_k^{(t)}}.
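For a Gaussian family Q, the argmax in the adaption step is available in closed form: it is the Gaussian matching the weighted mean and variance of the current sample. A sketch under that assumption (the bimodal target, sample sizes and initial proposal are illustrative):

```python
import numpy as np

rng = np.random.default_rng(4)

def pi_unnorm(x):
    # illustrative bimodal target on R: mixture of N(-3,1) and N(3,1)
    return np.exp(-0.5 * (x - 3.0) ** 2) + np.exp(-0.5 * (x + 3.0) ** 2)

def norm_pdf(x, m, s):
    return np.exp(-0.5 * ((x - m) / s) ** 2) / (s * np.sqrt(2.0 * np.pi))

def adaptive_is(n_iter=20, n=5000):
    """At each iteration: sample from the current Gaussian proposal, compute
    the importance weights, and set the next proposal to the Gaussian that
    maximizes the weighted log-likelihood, i.e. the one matching the
    weighted mean and variance."""
    m, s = 0.0, 5.0  # initial proposal, deliberately wide
    for _ in range(n_iter):
        x = m + s * rng.normal(size=n)
        w = pi_unnorm(x) / norm_pdf(x, m, s)
        w /= w.sum()
        m = np.sum(w * x)
        s = np.sqrt(np.sum(w * (x - m) ** 2))
    return m, s

m, s = adaptive_is()
print(m, s)  # the best Gaussian fit of pi has mean 0 and variance 1 + 9 = 10
```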
Sampling multimodal densities in high dimensional sampling space Introduction Adaptive Monte Carlo samplers
Adaptive MC sampler: example of Adaptive Importance Sampling (2/2)
Nevertheless, it is known that such Importance Sampling techniques are not robust to the dimension: when sampling on R^ℓ with ℓ > 15, the degeneracy of the importance ratios π(Xk)/q(Xk) cannot be avoided.
Sampling multimodal densities in high dimensional sampling space Introduction Conclusion
Conclusion
Usual adaptive Monte Carlo samplers are not robust (enough) to:
- multimodality of the target distribution π: how to jump from mode to mode?
- large dimension of the sampling space.
In both cases the bottleneck is a ratio of densities. Importance Sampling: the weight π(x)/q(x). MCMC: the acceptance ratio 1 ∧ [π(y) q(y,x)] / [π(x) q(x,y)] = 1 ∧ π(y)/π(x) when q is a symmetric kernel.
New Monte Carlo samplers combine tempering techniques and/or biasing-potential techniques with sampling steps.
Sampling multimodal densities in high dimensional sampling space Tempering-based Monte Carlo samplers
Outline
Introduction Tempering-based Monte Carlo samplers The Equi-Energy sampler Biasing Potential-based Monte Carlo sampler Convergence Analysis
Sampling multimodal densities in high dimensional sampling space Tempering-based Monte Carlo samplers
Tempering: the idea
[Figure: the target density (densité cible) and a tempered density (densité tempérée)]
Learn a well-fitted proposal mechanism by considering tempered versions π^{1/T} (T > 1) of the target distribution π.
Hereafter, an example where tempering is plugged into a MCMC sampler.
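A small numeric illustration of why tempering helps: the ratio between the mode height and the valley depth of π^{1/T} shrinks as T grows. The bimodal target and the grid are illustrative assumptions:

```python
import numpy as np

def pi_unnorm(x):
    # illustrative bimodal target with modes near ±3 and a deep valley at 0
    return np.exp(-0.5 * (x - 3.0) ** 2) + np.exp(-0.5 * (x + 3.0) ** 2)

def barrier(tempered_power):
    """Mode-height / valley-depth ratio for pi^(1/T), with tempered_power = 1/T:
    power 1 keeps the deep valley, smaller powers flatten it and ease
    mode-to-mode moves."""
    x = np.linspace(-6.0, 6.0, 2001)  # grid containing x = 0 exactly
    f = pi_unnorm(x) ** tempered_power
    return f.max() / f[np.abs(x).argmin()]

print(barrier(1.0))   # ≈ e^{4.5}/2 ≈ 45: essentially impassable for a local move
print(barrier(0.25))  # ≈ 45^{0.25} ≈ 2.6 after tempering with T = 4
```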
Sampling multimodal densities in high dimensional sampling space Tempering-based Monte Carlo samplers The Equi-Energy sampler
Example: Equi-Energy sampler (1/6)
Kou, Zhou and Wong (2006)
In the MCMC proposal mechanism, allow picking a point from an auxiliary process designed to have better mixing properties.
[Diagram: an auxiliary process (Y1, Y2, ..., Yt) with target π^β and its empirical measures (θ1, ..., θt), feeding the process of interest (X1, X2, ..., Xt) with target π]
Sampling multimodal densities in high dimensional sampling space Tempering-based Monte Carlo samplers The Equi-Energy sampler
Example: Equi-Energy sampler (2/6)
Algorithm: at iteration t, given the current state Xt and the samples Y1, ..., Yt from the auxiliary process:
1. with probability 1 − ǫ, draw Xt+1 from a MCMC kernel with invariant distribution π;
2. with probability ǫ, choose a point Yℓ among the auxiliary samples in the same energy level as Xt, and accept/reject the move Xt+1 = Yℓ.
[Figure: target distribution and tempered distribution, with the current state, a local move, an equi-energy jump, and the energy-level boundaries 1 and 2]
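A toy sketch of this scheme with a single energy ring (the ring selection is dropped here for brevity, an assumption not made in the talk): the auxiliary chain targets π^β, and the main chain occasionally proposes one of its past samples:

```python
import numpy as np

rng = np.random.default_rng(5)

def pi_unnorm(x):
    # illustrative bimodal target with well separated modes
    return np.exp(-0.5 * (x - 4.0) ** 2) + np.exp(-0.5 * (x + 4.0) ** 2)

def rw_step(x, target, sigma):
    """One symmetric random-walk Hastings-Metropolis step."""
    y = x + sigma * rng.normal()
    return y if rng.uniform() < min(1.0, target(y) / target(x)) else x

def equi_energy(n, beta=0.2, eps=0.1):
    """Toy Equi-Energy sampler, one energy ring: the auxiliary chain targets
    pi^beta (flatter, mixes across modes); with probability eps the main chain
    proposes a past auxiliary sample y, accepted with probability
    1 ^ [pi(y) pi^beta(x)] / [pi(x) pi^beta(y)]."""
    aux_target = lambda z: pi_unnorm(z) ** beta
    x_aux, x = 0.0, -4.0
    aux_hist, chain = [], np.empty(n)
    for t in range(n):
        x_aux = rw_step(x_aux, aux_target, sigma=3.0)
        aux_hist.append(x_aux)
        if rng.uniform() < eps:
            y = aux_hist[rng.integers(len(aux_hist))]
            ratio = (pi_unnorm(y) * aux_target(x)) / (pi_unnorm(x) * aux_target(y))
            if rng.uniform() < min(1.0, ratio):
                x = y  # equi-energy jump
        else:
            x = rw_step(x, pi_unnorm, sigma=1.0)
        chain[t] = x
    return chain

chain = equi_energy(30_000)
print((chain > 0).mean())  # both modes (±4) are visited thanks to the jumps
```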
Sampling multimodal densities in high dimensional sampling space Tempering-based Monte Carlo samplers The Equi-Energy sampler
Example: Equi-Energy sampler (3/6), numerical illustration
π is a mixture of 20 Gaussian distributions. Four auxiliary processes target π^{β4}, ..., π^{β1}, with 0 < β4 < · · · < β1 < 1.
[Figure: draws and means of the components, for the target density at temperatures 1 to 5 and for the target itself, a mixture of 2-dimensional Gaussians]
Sampling multimodal densities in high dimensional sampling space Tempering-based Monte Carlo samplers The Equi-Energy sampler
Example: Equi-Energy sampler (4/6), numerical illustration
Schreck, F. and Moulines (2013)
Problem: motif sampling in biological sequences.
Objective: find where motifs (subsequences of length w = 12) are located in a DNA sequence of length L = 2000. Observation: (s1, ..., sL) with sl ∈ {A, C, G, T}. Quantity of interest: the motif positions, collected in (a1, ..., aL) with aj ∈ {0, ..., w}.
[Diagram: the sequence annotated with the motif positions]
Sampling multimodal densities in high dimensional sampling space Tempering-based Monte Carlo samplers The Equi-Energy sampler
Example: Equi-Energy sampler (4/6), numerical illustration
Schreck, F. and Moulines (2013)
Result: EE with 4 auxiliary chains and 3 energy rings
Sampling multimodal densities in high dimensional sampling space Tempering-based Monte Carlo samplers The Equi-Energy sampler
Example: Equi-Energy sampler (5/6), design parameters
Design parameters: the probability of interaction ǫ; the number of auxiliary processes and the scale of the βi; the energy rings; the MCMC kernels for the local moves.
Convergence analysis: Andrieu, Jasra, Doucet and Del Moral (2007, 2008); F., Moulines, Priouret (2012); F., Moulines, Priouret and Vandekerkhove (2013). Adaptive version of the Equi-Energy sampler: Schreck, F. and Moulines (2013); Baragatti, Grimaud and Pommeret (2013).
Sampling multimodal densities in high dimensional sampling space Tempering-based Monte Carlo samplers The Equi-Energy sampler
Example: Equi-Energy sampler (6/6), transition kernel
Let us describe the conditional distribution of Xt+1 given the past:
Pθt(Xt, A) = (1 − ǫ) P(Xt, A) + ǫ ∫_A [1 ∧ (π(y) g(y,Xt) π^β(Xt)) / (π(Xt) g(Xt,y) π^β(y))] g(Xt,y) θt(dy) + · · ·
where the bracket is the acceptance-rejection ratio, g(Xt,y) θt(dy) is the proposition-with-selection mechanism, and θt = (1/t) Σ_{k=1}^{t} δ_{Yk} is the empirical measure of the auxiliary process.
Sampling multimodal densities in high dimensional sampling space Biasing Potential-based Monte Carlo sampler
Outline
Introduction Tempering-based Monte Carlo samplers Biasing Potential-based Monte Carlo sampler Wang-Landau samplers Convergence Analysis
Sampling multimodal densities in high dimensional sampling space Biasing Potential-based Monte Carlo sampler
The idea
Among the Importance Sampling Monte Carlo samplers:
π ≈ (1/n) Σ_{t=1}^{n} [π(Xt)/q⋆(Xt)] δ_{Xt}, where (Xt)t approximates q⋆.
Idea from the molecular dynamics field; see e.g. Chopin, Lelièvre and Stoltz (2012) for the extension to Computational Statistics. Choose a proposal distribution of the form
q⋆(x) = π(x) exp(−A(ξ(x))),
where A(ξ(x)) is a biasing potential depending on a few “directions of metastability” ξ(x), and such that q⋆ is “less multimodal” than π.
Example: consider a partition of X in d strata X1, ..., Xd, and set ξ(x) = i for any x ∈ Xi.
Sampling multimodal densities in high dimensional sampling space Biasing Potential-based Monte Carlo sampler Wang-Landau samplers
Example: Wang-Landau algorithms (1/4)
Wang and Landau (2001): a very popular algorithm in the molecular dynamics field.
Wang-Landau type algorithms learn the proposal distribution q⋆ adaptively. At the same time:
- learn the proposal distribution,
- sample points Xt approximating this proposal distribution,
- compute the associated importance weights, to approximate the target π.
At iteration t: update the approximation qt of q⋆; draw Xt approximating qt, and compute its associated importance weight π(Xt)/qt(Xt):
π ≈ (1/n) Σ_{t=1}^{n} [π(Xt)/qt(Xt)] δ_{Xt}.
Sampling multimodal densities in high dimensional sampling space Biasing Potential-based Monte Carlo sampler Wang-Landau samplers
Wang-Landau algorithms (2/4)
Key idea: q⋆ is obtained by locally biasing the target distribution:
q⋆(x) ∝ Σ_{i=1}^{d} [π(x)/θ⋆(i)] 1_{Xi}(x),
where X1, ..., Xd is a partition of the sampling space X, and the weights θ⋆(i) are given by θ⋆(i) = ∫_{Xi} π(x) dx.
With this biasing strategy, the proposal distribution visits each stratum Xi with the same frequency: ∫_{Xi} q⋆(x) dx = 1/d.
Unfortunately,
- the θ⋆(i) are unknown and have to be learnt on the fly;
- exact sampling under q⋆ is not possible, but it can be replaced by a MCMC step.
Sampling multimodal densities in high dimensional sampling space Biasing Potential-based Monte Carlo sampler Wang-Landau samplers
Wang-Landau algorithms (3/4)
Wang-Landau algorithm: at iteration t, given the current point Xt and the current bias θt = (θt(1), ..., θt(d)):
1. Draw a new point Xt+1 ∼ MCMC kernel with invariant distribution qt(x) ∝ Σ_{i=1}^{d} [π(x)/θt(i)] 1_{Xi}(x).
2. Update the bias θt+1.
3. In parallel, update the approximation of π:
π ∝ (1/n) Σ_{t=1}^{n} [Σ_{i=1}^{d} θt(i) 1_{Xi}(Xt)] δ_{Xt}.
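A toy sketch of these three steps with d = 2 strata; the target, the multiplicative update of θt (a variant of the stochastic-approximation updating rules discussed on the next slides) and the step-size decay are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(6)

def pi_unnorm(x):
    # illustrative target: the right mode carries ~90% of the mass
    return 0.9 * np.exp(-0.5 * (x - 4.0) ** 2) + 0.1 * np.exp(-0.5 * (x + 4.0) ** 2)

def stratum(x):
    return 0 if x < 0 else 1  # d = 2 strata: X_1 = (-inf, 0), X_2 = [0, +inf)

def wang_landau(n):
    """Step 1: one Hastings-Metropolis move targeting q_t ∝ pi(x)/theta_t(i)
    on stratum i. Step 2: penalize the stratum currently visited through a
    multiplicative update of theta, then renormalize."""
    theta = np.array([0.5, 0.5])
    x = 4.0
    for t in range(1, n + 1):
        y = x + 4.0 * rng.normal()
        ratio = (pi_unnorm(y) / theta[stratum(y)]) / (pi_unnorm(x) / theta[stratum(x)])
        if rng.uniform() < min(1.0, ratio):
            x = y
        gamma = t ** -0.7  # decaying step size
        theta[stratum(x)] *= 1.0 + gamma
        theta /= theta.sum()
    return theta

theta = wang_landau(100_000)
print(theta)  # approaches (pi(X_1), pi(X_2)) ≈ (0.1, 0.9)
```

At the fixed point, the visit frequencies π(Xi)/θ(i) are equalized, so θ converges to the strata masses.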
Sampling multimodal densities in high dimensional sampling space Biasing Potential-based Monte Carlo sampler Wang-Landau samplers
Wang-Landau algorithms (4/4)
To learn θ⋆ on the fly: different strategies exist in the literature, based on Stochastic Approximation algorithms with controlled Markov chain dynamics (Xt)t:
θt+1(i) = θt(i) + γt+1 Hi(θt, Xt+1),
where Hi is chosen so that
- it penalizes the stratum currently visited: Hi(θt, Xt+1) > 0 iff Xt+1 ∈ Xi;
- the mean-field function θ ↦ ∫ H(θ, x) q⋆(x) dx admits θ⋆ as its unique root.
Two examples of updating rules:
1. if Xt+1 ∈ Xi: θt+1(i) = θt(i) + γt+1 θt(i)(1 − θt(i)), and θt+1(k) = θt(k) − γt+1 θt(i) θt(k) for k ≠ i;
2. St+1(j) = St(j) + γ θt(j) 1_{Xj}(Xt+1), and θt+1(j) = St+1(j) / Σ_{r=1}^{d} St+1(r).
Sampling multimodal densities in high dimensional sampling space Biasing Potential-based Monte Carlo sampler Wang-Landau samplers
Transition kernel
The conditional distribution of Xt+1 given the past is a MCMC kernel with invariant distribution qt, denoted by Pθt. Example: Hastings-Metropolis with Gaussian proposal distribution:
Pθ(x, A) = ∫_A [1 ∧ (π(y) θ(str(x))) / (π(x) θ(str(y)))] N(x, Σ)[dy] + δx(A) ∫ (1 − [1 ∧ (π(y) θ(str(x))) / (π(x) θ(str(y)))]) N(x, Σ)[dy],
where str(x) denotes the index of the stratum containing x.
Sampling multimodal densities in high dimensional sampling space Biasing Potential-based Monte Carlo sampler Wang-Landau samplers
Numerical illustration, a toy example
Target distribution: mixture of 20 Gaussians in R²; the means of the Gaussians are indicated with a red cross. Wang-Landau algorithm: 50 strata, obtained by partitioning the energy levels. 5 × 10⁶ draws approximating q⋆: the sampler was able to jump the deep valleys and draw points around all the modes.
Sampling multimodal densities in high dimensional sampling space Biasing Potential-based Monte Carlo sampler Wang-Landau samplers
Numerical illustration: Structure of a protein
In biophysics: recover the structure of a protein from its sequence. AB model: two types of monomers, A (hydrophobic) and B (hydrophilic), linked by rigid bonds of unit length to form (2D) chains. Given a sequence, what is the optimal shape of the N monomers?
[Figure: two example 2D configurations]
Minimize the energy function H(x) over x = (x_{1,2}, x_{2,3}, ..., x_{N−2,N−1}) ∈ [−π, π]^{N−2}, where
H(x) = (1/4) Σ_{i=1}^{N−2} (1 − cos(x_{i,i+1})) + 4 Σ_{i=1}^{N−2} Σ_{j=i+2}^{N} [1/r_{ij}^{12} − C(σi, σj)/r_{ij}^{6}],
x_{i,j} is the angle between the i-th and j-th bond vectors, r_{ij} is the distance between monomers i and j, and C(σi, σj) = 1 (resp. 1/2 and −1/2) between monomers AA (resp. BB and AB).
Sampling multimodal densities in high dimensional sampling space Biasing Potential-based Monte Carlo sampler Wang-Landau samplers
Numerical illustration: Structure of a protein
Minimizing the energy function H(x) amounts to a sampling problem: min H(x) ⇐⇒ max πn(x) ∝ exp(−βn H(x)), with βn > 0.
Sampling multimodal densities in high dimensional sampling space Biasing Potential-based Monte Carlo sampler Wang-Landau samplers
Numerical illustration: Structure of a protein
(left) WL: initial configuration, with energy 0.1945; (center) WL: optimal configuration found, with energy −3.2925; (right) optimal configuration in the literature, with energy −3.2941.
Sampling multimodal densities in high dimensional sampling space Biasing Potential-based Monte Carlo sampler Wang-Landau samplers
Design parameters (1/4)
Choice of the biasing potential A(ξ(x)), i.e., in the Wang-Landau algorithms:
- the number of strata and the strata themselves;
- the update strategy for the bias vector θt;
- the MCMC kernels with target distribution qt.
Convergence analysis: Liang (2005); Liang, Liu and Carroll (2007); Atchadé and Liu (2010); Jacob and Ryder (2012); F., Jourdain, Kuhn, Lelièvre and Stoltz (2014a); F., Jourdain, Lelièvre and Stoltz (submitted). Efficiency analysis: F., Jourdain, Kuhn, Lelièvre and Stoltz (2014b); F., Jourdain, Lelièvre and Stoltz (submitted). Adaptive Wang-Landau: Bornn, Jacob, Del Moral and Doucet (2012).
Sampling multimodal densities in high dimensional sampling space Biasing Potential-based Monte Carlo sampler Wang-Landau samplers
Design parameters (2/4)
Role on the limiting behavior of the sampler: convergence occurs whatever the number of strata and the strata themselves, for many MCMC samplers and many update strategies of the bias vector.
Role on the transient phase of the sampler: for example, how long is the exit time from a mode? Let us illustrate the role of some design parameters on the exit time from a mode when
π(x1, x2) ∝ exp(−β U(x1, x2)) 1_{[−R,R]}(x1).
[Figure: the potential U and the d strata (right plot); the chains are initialised at (−1, 0)]
Sampling multimodal densities in high dimensional sampling space Biasing Potential-based Monte Carlo sampler Wang-Landau samplers
Design parameters (3/4)
Fig.: [left] Wang-Landau, T = 110 000 and d = 48; [right] Hastings-Metropolis, T = 2 × 10⁶; the red line is at x = 110 000. Both runs with β = 4.
Sampling multimodal densities in high dimensional sampling space Biasing Potential-based Monte Carlo sampler Wang-Landau samplers
Design parameters (4/4)
F., Jourdain, Lelièvre, Stoltz (2014)
We compute the mean exit times tβ from the left mode (time to reach the right mode, x > 1) for different values of d, with [left] a fixed proposal scale σ in the MCMC samplers, and [right] a proposal scale σ ∝ 1/d. We observe tβ = C(β, σ, d) exp(βµ), with a slope µ independent of β.
[Figure: exit time versus β on a log scale, for d = 3 to 96 [left] and d = 3 to 24 [right]; both families of curves exhibit slope 1.25]
Sampling multimodal densities in high dimensional sampling space Convergence Analysis
Outline
Introduction Tempering-based Monte Carlo samplers Biasing Potential-based Monte Carlo sampler Convergence Analysis Controlled Markov chains Sufficient conditions for the cvg in distribution Convergence results
Sampling multimodal densities in high dimensional sampling space Convergence Analysis Controlled Markov chains
Controlled Markov chains (1/2)
These new samplers combine adaption/interaction and sampling: the draws (Xt)t are from a controlled Markov chain,
E[h(Xt+1) | Ft] = ∫ h(y) Pθt(Xt, dy),
where (Pθ, θ ∈ Θ) is a family of Markov kernels, Pθ having an invariant distribution πθ. Examples:
1. Wang-Landau: the conditional distribution of Xt+1 given Ft is a MCMC kernel with invariant distribution qt(x) ∝ Σ_{i=1}^{d} [π(x)/θt(i)] 1_{Xi}(x). Here, πθ depends on θ and its expression is known.
2. Equi-Energy: the conditional distribution of Xt+1 given Ft is a MCMC kernel indexed by the empirical distribution θt of the auxiliary process. Here, πθ exists but its expression is unknown.
3. Adaptive Hastings-Metropolis: the conditional distribution of Xt+1 given Ft is a MCMC kernel with invariant distribution π and proposal distribution N(Xt, θt). Here, all the kernels have the same invariant distribution.
Sampling multimodal densities in high dimensional sampling space Convergence Analysis Controlled Markov chains
Controlled Markov chains (2/2)
Question: let (Pθ, θ ∈ Θ) be a family of Markov kernels having the same invariant distribution π. Let (θt)t be an Ft-adapted random process, and draw Xt+1 | Ft ∼ Pθt(Xt, ·). Does (Xt)t converge (say, in distribution) to π?
No. Example on {0, 1}: if Xt = 0, draw Xt+1 ∼ P0(Xt, ·); if Xt = 1, draw Xt+1 ∼ P1(Xt, ·), where
Pℓ = [1 − tℓ, tℓ ; tℓ, 1 − tℓ].
We have π Pℓ = π with π ∝ (1, 1), but the transition matrix of (Xt)t is
P̃ = [1 − t0, t0 ; t1, 1 − t1], with invariant distribution π̃ ∝ (t1, t0).
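The counter-example can be checked numerically; the values of t0, t1 and the run length are illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(7)

# Two kernels on {0, 1}, both with invariant distribution proportional to (1, 1)
t0, t1 = 0.9, 0.1
P = {0: np.array([[1 - t0, t0], [t0, 1 - t0]]),
     1: np.array([[1 - t1, t1], [t1, 1 - t1]])}

def run(n):
    """Adaptation rule theta_t = X_t: use kernel P_0 when X_t = 0 and P_1 when
    X_t = 1. Each kernel leaves the uniform distribution invariant, yet the
    resulting chain has transition matrix [[1-t0, t0], [t1, 1-t1]]."""
    x, counts = 0, np.zeros(2)
    for _ in range(n):
        x = rng.choice(2, p=P[x][x])  # row x of the kernel indexed by theta = x
        counts[x] += 1
    return counts / n

freq = run(200_000)
print(freq)  # ≈ (t1, t0)/(t0 + t1) = (0.1, 0.9), not the uniform (0.5, 0.5)
```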
Sampling multimodal densities in high dimensional sampling space Convergence Analysis Sufficient conditions for the cvg in distribution
Sufficient conditions for the cvg in distribution (1/3)
[Diagram: the adaptive MCMC chain moves from (Xt−N, θt−N) at time t−N to (Xt, θt) at time t through the kernels Pθt−N(Xt−N, ·), ..., Pθt−1(Xt−1, ·); the “frozen” chain instead iterates the single kernel Pθt−N N times from Xt−N; in the limit, its draw Xt is distributed under πθt−N]
Sampling multimodal densities in high dimensional sampling space Convergence Analysis Sufficient conditions for the cvg in distribution
Sufficient conditions for the cvg in distribution (2/3)
Decompose the error:
E[h(Xt) | past_{t−N}] − ∫ h(y) πθ⋆(dy)
= ( E[h(Xt) | past_{t−N}] − ∫ h(y) P^N_{θ_{t−N}}(X_{t−N}, dy) )
+ ( ∫ h(y) P^N_{θ_{t−N}}(X_{t−N}, dy) − ∫ h(y) π_{θ_{t−N}}(dy) )
+ ( ∫ h(y) π_{θ_{t−N}}(dy) − ∫ h(y) πθ⋆(dy) ).
The three terms are controlled respectively by:
- Diminishing adaptation condition. Roughly speaking: dist(Pθ, Pθ′) ≤ dist(θ, θ′); if θt and θt−1 are close, then the transition kernels Pθt and Pθt−1 are close too.
- Containment condition. Roughly speaking: lim_{N→∞} dist(P^N_θ, πθ) = 0 at some rate depending smoothly on θ.
- Regularity in θ of πθ, so that lim_t θt = θ⋆ ⟹ dist(πθt, πθ⋆) → 0.
Sampling multimodal densities in high dimensional sampling space Convergence Analysis Sufficient conditions for the cvg in distribution
Sufficient conditions for the cvg in distribution (3/3)
F., Moulines, Priouret (2012)
Assume:
A. (Containment condition) There exists πθ s.t. πθ Pθ = πθ; for any ǫ > 0, there exists a non-decreasing positive sequence {rǫ(n), n ≥ 0} such that lim sup_{n→∞} rǫ(n)/n = 0 and
lim sup_{n→∞} E[ ‖P^{rǫ(n)}_{θ_{n−rǫ(n)}}(X_{n−rǫ(n)}, ·) − π_{θ_{n−rǫ(n)}}‖_tv ] ≤ ǫ.
B. (Diminishing adaptation) For any ǫ > 0,
lim_{n→∞} Σ_{j=0}^{rǫ(n)−1} E[ sup_x ‖P_{θ_{n−rǫ(n)+j}}(x, ·) − P_{θ_{n−rǫ(n)}}(x, ·)‖_tv ] = 0.
C. (Convergence of the invariant distributions) (πθn)n converges weakly to π almost surely.
Then, for any bounded and continuous function f,
lim_n E[f(Xn)] = π(f).
Sampling multimodal densities in high dimensional sampling space Convergence Analysis Convergence results
Convergence results
The literature provides sufficient conditions for Convergence in distribution of (Xt)t Strong law of large numbers for (Xt)t Central Limit Theorem for (Xt)t
G.O. Roberts, J.S. Rosenthal. Coupling and ergodicity of adaptive Markov chain Monte Carlo algorithms. J. Appl. Prob., 2007.
G. Fort, E. Moulines, P. Priouret. Convergence of adaptive MCMC algorithms: ergodicity and law of large numbers. Ann. Stat., 2012.
G. Fort, E. Moulines, P. Priouret, P. Vandekerkhove. A central limit theorem for adaptive and interacting Markov chains. Bernoulli, 2013.
Conditions successfully applied to establish the convergence of Adaptive Hastings-Metropolis, (adaptive) Equi-Energy, Wang-Landau, · · ·
Sampling multimodal densities in high dimensional sampling space Convergence Analysis Convergence results
Example: Application to Wang Landau (1/2)
Theorem (F., Jourdain, Kuhn, Lelièvre, Stoltz (2014a))
Assume · · · . Then, for any bounded measurable function f,
lim_t E[f(Xt)] = ∫ f(x) q⋆(x) dλ(x),
lim_T (1/T) Σ_{t=1}^{T} f(Xt) = ∫ f(x) q⋆(x) dλ(x) almost surely.
Sampling multimodal densities in high dimensional sampling space Convergence Analysis Convergence results
Example: Application to Wang Landau (1/2)
Theorem (F., Jourdain, Kuhn, Lelièvre, Stoltz (2014a))
Assume:
1. The target distribution π dλ satisfies 0 < inf_X π ≤ sup_X π < ∞ and inf_i π(Xi) > 0.
2. For any θ, Pθ is a Hastings-Metropolis kernel with invariant distribution ∝ Σ_{i=1}^{d} [π(x)/θ(i)] 1_{Xi}(x), and proposal distribution q(x,y) dλ(y) such that inf_{X×X} q > 0.
3. The step-size sequence is non-increasing, positive, with Σ_t γt = ∞ and Σ_t γt² < ∞.
Sampling multimodal densities in high dimensional sampling space Convergence Analysis Convergence results
Example: Application to Wang Landau (2/2)
Sketch of proof.
(1.) The containment condition: there exist ρ ∈ (0,1) and C such that
sup_x sup_θ ‖P^t_θ(x, ·) − πθ‖_TV ≤ C ρ^t.
(2.) The diminishing adaptation condition: there exists C such that, for any θ, θ′,
sup_x ‖Pθ(x, ·) − Pθ′(x, ·)‖_TV ≤ C Σ_{i=1}^{d} |1 − θ(i)/θ′(i)|.
The update of the parameter satisfies: there exists C′ such that ∀t