SLIDE 1

AXDA: efficient sampling through variable-splitting-inspired Bayesian hierarchical models

P. Chainais, with Maxime Vono & Nicolas Dobigeon
March 12th 2019, Centrale Lille - CRIStAL

slide-2
SLIDE 2

Flight schedule

1. Motivations
2. Split Gibbs sampling (SP)
3. Split & Augmented Gibbs sampling (SPA)
4. Asymptotically exact data augmentation: AXDA

SLIDE 3

Outline

1. Motivations
2. Split Gibbs sampling (SP)
3. Split & Augmented Gibbs sampling (SPA)
4. Asymptotically exact data augmentation: AXDA

SLIDE 4

Motivations: applications in image processing

Image deblurring

SLIDE 6

Motivations: applications in image processing

Image inpainting

SLIDE 8

Motivations: applications in image processing

Confidence intervals

[Figure: pixel-wise confidence intervals; colorbar 30–80]

SLIDE 10

Motivations

I have a dream...

◮ solve complex ill-posed inverse problems
◮ big data in large dimensions
◮ excellent performance
◮ fast inference algorithms
◮ credibility intervals

with maybe some additional options such as:

◮ parallel distributed computing
◮ privacy preserving

⇒ Bayesian approach + MCMC method!

SLIDE 12

Motivations

The optimization-based approach

Inverse problems & optimization ⇒ define a cost function $f(x) = f_1(x|y) + f_2(x)$, where $f_2$ is typically

◮ convex (or not)
◮ not differentiable ⇒ proximal operators
◮ a sum of various penalties

Solution: proximal operators and splitting techniques:

$$\arg\min_{x} \; f_1(x) + f_2(z) \quad \text{such that} \quad x = z,$$

maybe relaxed to (simplified version of ADMM)

$$\arg\min_{x} \; f_1(x) + f_2(z) + \frac{\alpha}{2}\|x - z\|_2^2 + u^T(x - z).$$

SLIDE 14

Motivations

The Bayesian approach

Inverse problems & Bayes: posterior ∝ likelihood ($f_1$) × prior ($f_2$) ⇒ define a posterior distribution $p(x|y) = p_1(x|y) \cdot p_2(x)$, where $p_2$ is typically

◮ log-concave (or not) ↔ $f_2$ convex
◮ conjugate ⇒ easy sampling/inference
◮ a combination of various priors

Solution: MCMC methods and Gibbs sampling, $x_i \sim p(x_i \mid x_{\setminus i})$ for all $1 \le i \le d$.

Can we adapt splitting and augmentation from optimization?

$$\pi_\rho(x, z, u) \propto \exp\left[-f_1(x) - f_2(z) - \frac{1}{2\rho^2}\|u - x + z\|_2^2 - \frac{1}{2\alpha^2}\|u\|_2^2\right]$$

SLIDE 15

Motivations

The Bayesian approach

Inverse problems & Bayes: posterior ∝ likelihood ($f_1$) × prior ($f_2$) ⇒ define a posterior distribution $p(x|y) = p_1(x|y) \cdot p_2(x)$.

Computational motivations: difficult sampling

◮ non-conjugate priors [conj. priors ⇒ easy inference]
◮ rich models: complicated prior distributions
◮ big datasets: expensive likelihood computation

Strategy: Divide-to-Conquer ⇒ splitting (SP) and augmentation (SPA)

SLIDE 16

Outline

1. Motivations
2. Split Gibbs sampling (SP)
3. Split & Augmented Gibbs sampling (SPA)
4. Asymptotically exact data augmentation: AXDA

SLIDE 18

Split Gibbs sampling (SP)

$$\pi(x) \propto \exp\left[-f_1(x) - f_2(x)\right] \quad \Downarrow \quad \pi_\rho(x, z) \propto \exp\left[-f_1(x) - f_2(z) - \phi_\rho(x, z)\right], \qquad \text{e.g. } \phi_\rho(x, z) = \frac{1}{2\rho^2}\|x - z\|_2^2$$

[Graphical models: original hierarchy ($\theta_x \to x \to y$, potentials $f_1$, $f_2$) vs. split hierarchy where $z$, coupled to $x$ through $\phi_\rho$ with scale $\rho$, carries $f_2$]

SLIDE 19

Split Gibbs sampling (SP): Theorem 1

Consider the marginal of $x$ under $\pi_\rho$:

$$p_\rho(x) = \int_{\mathbb{R}^d} \pi_\rho(x, z)\,dz \;\propto\; \int_{\mathbb{R}^d} \exp\left[-f_1(x) - f_2(z) - \phi_\rho(x, z)\right] dz.$$

Theorem

Assume that, in the limiting case $\rho \to 0$, $\phi_\rho$ is such that

$$\frac{\exp(-\phi_\rho(x, z))}{\int_{\mathbb{R}^d} \exp(-\phi_\rho(x, z))\,dx} \xrightarrow[\rho \to 0]{} \delta_x(z).$$

Then $p_\rho$ coincides with $\pi$ when $\rho \to 0$, that is, $\|p_\rho - \pi\|_{TV} \xrightarrow[\rho \to 0]{} 0$.
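For the quadratic coupling used throughout, this condition holds by construction: the normalized kernel is a Gaussian density in $z$ that concentrates as $\rho \to 0$,

$$\frac{\exp(-\phi_\rho(x, z))}{\int_{\mathbb{R}^d} \exp(-\phi_\rho(x, z))\,dx} = \frac{1}{(2\pi\rho^2)^{d/2}} \exp\left(-\frac{\|x - z\|_2^2}{2\rho^2}\right) \xrightarrow[\rho \to 0]{} \delta_x(z).$$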

SLIDE 21

Split Gibbs sampling (SP): conditional distributions

Full conditional distributions under the split distribution $\pi_\rho$:

$$\pi_\rho(x|z) \propto \exp\left[-f_1(x) - \frac{1}{2\rho^2}\|x - z\|_2^2\right], \qquad \pi_\rho(z|x) \propto \exp\left[-f_2(z) - \frac{1}{2\rho^2}\|x - z\|_2^2\right].$$

Note that $f_1$ and $f_2$ are now separated into two distinct distributions.

State-of-the-art sampling methods:

◮ P-MYULA = proximal MCMC (Pereyra 2016; Durmus et al. 2018)
◮ Fourier or Aux-V1 or E-PO for Gaussian variables
◮ ...
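To make the mechanics concrete, here is a minimal sketch of the split Gibbs sampler on a toy 1-D target where both conditionals are conjugate Gaussians. The toy target and all variable names are ours, not the experiments of the talk; it also illustrates the compromise discussed later, since a moderate ρ trades a small bias (the prior variance is inflated to τ² + ρ²) for faster mixing.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 1-D target: pi(x) ∝ exp(-f1(x) - f2(x)) with
#   f1(x) = (x - y)^2 / (2 sigma^2)   (Gaussian likelihood)
#   f2(x) = x^2 / (2 tau^2)           (Gaussian prior)
# so both split conditionals are Gaussian and can be sampled exactly.
y, sigma, tau, rho = 2.0, 1.0, 1.5, 0.3

x, z = 0.0, 0.0
samples = []
for t in range(20000):
    # x | z: combines f1 with the coupling (1 / 2 rho^2) (x - z)^2
    var_x = 1.0 / (1.0 / sigma**2 + 1.0 / rho**2)
    x = var_x * (y / sigma**2 + z / rho**2) + np.sqrt(var_x) * rng.standard_normal()
    # z | x: combines f2 with the coupling
    var_z = 1.0 / (1.0 / tau**2 + 1.0 / rho**2)
    z = var_z * (x / rho**2) + np.sqrt(var_z) * rng.standard_normal()
    if t >= 2000:  # discard burn-in
        samples.append(x)

# Exact target moments for comparison: pi is N(m, s2)
s2 = 1.0 / (1.0 / sigma**2 + 1.0 / tau**2)
m = s2 * y / sigma**2
print(f"SGS mean {np.mean(samples):.3f} vs exact {m:.3f}")
print(f"SGS var  {np.var(samples):.3f} vs exact {s2:.3f}")
```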

SLIDE 22

Split Gibbs sampling (SP): inverse problems

Linear Gaussian inverse problems

$$y = Px + n, \qquad P = \text{damaging operator}, \quad n \sim \mathcal{N}(0_d, \sigma^2 I_d) = \text{noise}.$$

$$f_1(x) = \frac{1}{2\sigma^2}\|y - Px\|_2^2, \qquad f_2(x) = \tau\,\psi(x), \ \tau > 0, \quad \forall x \in \mathbb{R}^d.$$

Then the SP conditional distributions are:

$$\pi_\rho(x|z) = \mathcal{N}\left(\mu_x, Q_x^{-1}\right), \qquad \pi_\rho(z|x) \propto \exp\left[-\tau\psi(z) - \frac{1}{2\rho^2}\|z - x\|_2^2\right].$$
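Completing the square in the $x$-conditional (a short derivation from the $f_1$ and coupling above) gives the Gaussian parameters explicitly:

$$Q_x = \frac{P^T P}{\sigma^2} + \frac{I_d}{\rho^2}, \qquad \mu_x = Q_x^{-1}\left(\frac{P^T y}{\sigma^2} + \frac{z}{\rho^2}\right).$$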

SLIDE 23

Split Gibbs sampling (SP): efficient sampling

Linear Gaussian inverse problems

$$\pi_\rho(x|z) = \mathcal{N}\left(\mu_x, Q_x^{-1}\right), \qquad \pi_\rho(z|x) \propto \exp\left[-\tau\psi(z) - \frac{1}{2\rho^2}\|z - x\|_2^2\right].$$

Examples:

◮ Convex non-smooth: $\psi(x)$ = TV, $\ell_1$ sparsity... ⇒ proximal MCMC
◮ Tikhonov regularization: $\psi(z) = \|Qz\|_2^2$ ⇒ Gaussian variables (e.g., $P$ or $Q$ diagonalizable in Fourier → E-PO)

SLIDE 24

Split Gibbs sampling (SP): TV deblurring

Linear Gaussian inverse problems

Posterior distribution:

$$p(x|y) \propto \exp\left[-\frac{1}{2\sigma^2}\|Px - y\|_2^2 - \beta\,\mathrm{TV}(x)\right],$$

where $P$ = damaging operator (blur, binary mask...) and

$$\mathrm{TV}(x) = \sum_{1 \le i,j \le N} \left\|(\nabla x)_{i,j}\right\|_2.$$

Direct sampling is challenging because of:

1. the generally high dimension of the image,
2. the non-conjugacy of the TV-based prior,
3. the non-differentiability of the TV term (which rules out Hamiltonian Monte Carlo algorithms).

SLIDE 25

Split Gibbs sampling (SP): TV deblurring

Linear Gaussian inverse problems

[Figures: TV deblurring results (images); colorbar scale 20–65]

SLIDE 28

Split Gibbs sampling (SP): TV deblurring

Linear Gaussian inverse problems

                      SALSA   FISTA   SGS     P-MYULA
time (s)              1       10      470     3600
time (× var. split.)  1       10      1       7.7
nb. iterations        22      214     ~10^4   10^5
SNR (dB)              17.87   17.86   18.36   17.97

Remark: ρ² = 9

SLIDE 29

Split Gibbs sampling (SP): TV deblurring

Linear Gaussian inverse problems

Short autocorrelation of the Markov chain. [Figure: ACF vs. lag (1 to 500) for SGS and P-MYULA]

SLIDE 30

Split Gibbs sampling (SP): TV deblurring

Linear Gaussian inverse problems

ρ sets a compromise between computation time and quality.

[Figure: f(x) + g(x) vs. iteration t for MYULA and the AXDA Gibbs sampler, for ρ between 10⁻¹ and 10¹]

SLIDE 31

Outline

1. Motivations
2. Split Gibbs sampling (SP)
3. Split & Augmented Gibbs sampling (SPA)
4. Asymptotically exact data augmentation: AXDA

SLIDE 32

Split & Augmented Gibbs sampling (SPA)

Motivation for augmentation: better mixing properties of the Markov chain.

$$\pi_{\rho,\alpha}(x, z, u) \triangleq p(x, z, u; \rho, \alpha) \propto \exp\left[-f(x) - g(z)\right] \times \exp\left[-\phi_1(x, z - u; \rho) - \phi_2(u; \alpha)\right]$$

Assumption 2

$\phi_2$ and $\phi_1$ are such that, for all $x, z \in \mathbb{R}^d$,

$$\int_{\mathbb{R}^d} \exp\left[-\phi_1(x, z - u; \rho) - \phi_2(u; \alpha)\right] du \;\propto\; \exp\left[-\phi_1(x, z; \eta(\rho, \alpha))\right]. \tag{1}$$
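With the Gaussian choices used throughout ($\phi_1$ quadratic with scale $\rho$, $\phi_2$ quadratic with scale $\alpha$), (1) is a standard Gaussian convolution identity with $\eta(\rho, \alpha) = \sqrt{\rho^2 + \alpha^2}$:

$$\int_{\mathbb{R}^d} \exp\left(-\frac{\|x - z + u\|_2^2}{2\rho^2} - \frac{\|u\|_2^2}{2\alpha^2}\right) du \;\propto\; \exp\left(-\frac{\|x - z\|_2^2}{2(\rho^2 + \alpha^2)}\right).$$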

SLIDE 33

Split & Augmented Gibbs sampling (SPA)

$$\pi_{\rho,\alpha}(x, z, u) \triangleq p(x, z, u; \rho, \alpha) \propto \exp\left[-f(x) - g(z)\right] \times \exp\left[-\phi_1(x, z - u; \rho) - \phi_2(u; \alpha)\right]$$

[Graphical model: variable of interest $x$ (parameter $\theta_x$, potential $f$), variable splitting $z$ (parameter $\theta_z$, scale $\rho$), data augmentation $u$ (parameter $\theta_u$, scale $\alpha$)]

SLIDE 35

Split & Augmented Gibbs sampling (SPA)

SPA Gibbs sampler

The conditional split-augmented distributions are:

$$p(x|z, u; \rho) \propto \exp\left[-f(x) - \frac{1}{2\rho^2}\|x - z + u\|_2^2\right]$$
$$p(z|x, u; \rho) \propto \exp\left[-g(z) - \frac{1}{2\rho^2}\|x - z + u\|_2^2\right]$$
$$p(u|x, z; \rho, \alpha) \propto \exp\left[-\frac{\|u\|_2^2}{2\alpha^2} - \frac{1}{2\rho^2}\|x - z + u\|_2^2\right].$$
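Note that the last conditional is quadratic in $u$, so the auxiliary variable can be refreshed exactly at each Gibbs iteration (a short derivation by completing the square in $u$):

$$p(u|x, z; \rho, \alpha) = \mathcal{N}\left(\mu_u, \sigma_u^2 I_d\right), \qquad \frac{1}{\sigma_u^2} = \frac{1}{\alpha^2} + \frac{1}{\rho^2}, \qquad \mu_u = \frac{\alpha^2}{\alpha^2 + \rho^2}\,(z - x).$$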

SLIDE 36

SPA & ADMM

By replacing each Gibbs sampling step by the corresponding optimization, ADMM appears:

Algorithm 1: ADMM (scaled version)
Input: functions f, g, penalty ρ², init. t ← 0, z(0), u(0)
1  while stopping criterion not satisfied do
2      x(t) ∈ arg min_x − log p(x | z(t−1), u(t−1); ρ)
3      z(t) ∈ arg min_z − log p(z | x(t), u(t−1); ρ)
4      u(t) = u(t−1) + x(t) − z(t)
5      t ← t + 1
6  end
Output: approximate solution x̂ of the optimization problem.
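As an illustration of this correspondence, here is a minimal runnable sketch of the scaled ADMM for the common instance f(x) = ½‖Px − y‖² and g = τ‖·‖₁, whose z-step is a soft threshold. The function names, defaults, and toy data are ours, a sketch rather than the talk's implementation.

```python
import numpy as np

def soft_threshold(v, lam):
    # Proximal operator of lam * ||.||_1
    return np.sign(v) * np.maximum(np.abs(v) - lam, 0.0)

def admm_lasso(P, y, tau, rho2=1.0, n_iter=200):
    """Scaled ADMM for min_x 0.5 ||Px - y||^2 + tau ||x||_1:
    each step is the minimizer of the corresponding SPA conditional."""
    d = P.shape[1]
    z, u = np.zeros(d), np.zeros(d)
    A = P.T @ P + np.eye(d) / rho2      # x-step normal equations
    Pty = P.T @ y
    for _ in range(n_iter):
        x = np.linalg.solve(A, Pty + (z - u) / rho2)   # x-step (quadratic min)
        z = soft_threshold(x + u, tau * rho2)          # z-step (prox of g)
        u = u + x - z                                  # scaled dual update
    return z

# Tiny usage example with a random operator and a sparse ground truth
rng = np.random.default_rng(0)
P = rng.standard_normal((50, 100))
x_true = np.zeros(100); x_true[:5] = 3.0
y = P @ x_true + 0.1 * rng.standard_normal(50)
x_hat = admm_lasso(P, y, tau=1.0)
print("nonzeros recovered:", np.sum(np.abs(x_hat) > 1e-6))
```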

SLIDE 37

Split & Augmented Gibbs sampling (SPA): TV restoration

Linear Gaussian inverse problems

The conditional distributions associated with SPA are:

$$p(x|z, u) \propto \exp\left[-\frac{1}{2\sigma^2}\|Px - y\|_2^2\right] \times \exp\left[-\frac{1}{2\rho^2}\|x - (z - u)\|_2^2\right]$$
$$p(z|x, u) \propto \exp\left[-\beta\,\mathrm{TV}(z) - \frac{1}{2\rho^2}\|z - (x + u)\|_2^2\right]$$
$$p(u|x, z) \propto \exp\left[-\frac{1}{2\alpha^2}\|u\|_2^2 - \frac{1}{2\rho^2}\|u - (z - x)\|_2^2\right]$$

Remark: sampling from $p(z|x, u)$ ⇒ P-MYULA + Chambolle's algorithm.

SLIDE 38

Split & Augmented Gibbs sampling (SPA): TV inpainting

Linear Gaussian inverse problems

[Figures: TV inpainting results (images)]

SLIDE 41

Split & Augmented Gibbs sampling (SPA): confidence intervals

Linear Gaussian inverse problems

[Figure: pixel-wise confidence intervals; colorbar 30–80]

SLIDE 42

Split & Augmented Gibbs sampling (SPA): TV inpainting

Linear Gaussian inverse problems

            SALSA   P-MYULA   SP      SPA
Balloons    26.18   23.00     26.19   26.18
Baboon      14.37   13.35     14.60   14.59
Elaine      23.61   21.21     23.86   23.84
Clock       25.72   24.50     25.45   25.42
Donna       24.71   21.69     23.87   23.82
House       20.21   19.59     20.43   20.43
Peppers     20.35   19.20     20.22   20.20
Cameraman   19.48   18.76     19.34   19.34
Boat        20.81   19.80     20.74   20.71

SLIDE 43

Split & Augmented Gibbs sampling (SPA)

Applications

Many problems can be considered using SPA:

◮ Laplacian + ℓ2 regularizer for deconvolution
  M. Vono et al., "Split-and-augmented Gibbs sampler - Application to large-scale inference problems," IEEE Trans. Signal Processing, 2019
◮ Poisson noise + blur + non-negativity + ...
  M. Vono et al., "Bayesian image restoration under Poisson noise and log-concave prior," in Proc. ICASSP 2019
◮ Machine learning: logistic regression, ...
  M. Vono et al., "Sparse Bayesian binary logistic regression using the split-and-augmented Gibbs sampler," in Proc. IEEE MLSP 2018

SLIDE 44

Split & Augmented Gibbs sampling (SPA)

Poisson denoising + deblurring: sparse wavelet transform + non-negativity

◮ 6 times faster than P-MYULA,
◮ no approximation (e.g., the Anscombe transform in P-MYULA)
◮ ... but using SPA + P-MYULA!
◮ performance similar to PIDAL (Figueiredo & Bioucas-Dias 2010)
◮ + confidence intervals

SLIDE 47

Outline

1. Motivations
2. Split Gibbs sampling (SP)
3. Split & Augmented Gibbs sampling (SPA)
4. Asymptotically exact data augmentation: AXDA

SLIDE 48

Asymptotically exact data augmentation (AXDA)

Motivations

Let $\pi \in L^1$ be a target probability distribution with density with respect to (w.r.t.) the Lebesgue measure, $\pi(x) \propto \exp(-f(x))$, where $f : \mathsf{X} \subseteq \mathbb{R}^d \to (-\infty, +\infty]$ stands for a potential function. With a slight abuse of notation, $\pi$ shall refer to

◮ a prior $\pi(x)$,
◮ a likelihood $\pi(x) \triangleq \pi(y|x)$,
◮ a posterior $\pi(x) \triangleq \pi(x|y)$,

where $y$ are observations.

SLIDE 49

Asymptotically exact data augmentation (AXDA)

Motivations

Assumption 1

Inference from $\pi$ is difficult and possibly inefficient. Examples:

◮ non-trivial maximum likelihood estimation
◮ difficult posterior sampling with poorly mixing chains

SLIDE 50

Data augmentation (DA)

A standard surrogate is to introduce auxiliary variables $z \in \mathsf{Z} \subseteq \mathbb{R}^k$ such that

$$\int_{\mathsf{Z}} \pi(x, z)\,dz = \pi(x).$$

Numerous well-known advantages:

◮ augmented likelihood $\pi(x, z) \triangleq \pi(y, z|x)$ easier to work with
◮ joint posterior $\pi(x, z) \triangleq \pi(x, z|y)$ with simpler full conditionals
◮ improved inference (multimodal problems, mixing properties)

SLIDE 51

The art of exact data augmentation: XDA

Unfortunately, satisfying

$$\int_{\mathsf{Z}} \pi(x, z)\,dz = \pi(x) \tag{XDA}$$

is a matter of art (van Dyk and Meng 2001). Difficulties:

◮ finding $\pi(x, z)$ (Geman and Yang 1995)
◮ scaling in high-dimensional/big data settings (Neal 2003; Polson et al. 2013)

Goal: relax (XDA) while keeping XDA's advantages.

SLIDE 52

Asymptotically exact data augmentation (AXDA)

Consider an augmented density $p_\rho(x, z)$ and define

$$\pi_\rho(x) = \int_{\mathsf{Z}} p_\rho(x, z)\,dz, \qquad \rho > 0.$$

Assumption 2

For all $x \in \mathsf{X}$, $\lim_{\rho \to 0} \pi_\rho(x) = \pi(x)$.

Theorem 1 (Scheffé 1947)

Under Assumption 2, $\|\pi_\rho - \pi\|_{TV} \xrightarrow[\rho \to 0]{} 0$.

SLIDE 53

Choice of the augmented density

Take inspiration from variable splitting in optimization (Boyd et al. 2011)... This motivates the choice (Vono et al. 2019)

$$p_\rho(x, z) \propto \exp(-f(z) - \phi_\rho(x, z)),$$

which makes it possible to:

◮ simplify the inference (splitting complicated potentials) (Vono et al. 2019)
◮ distribute the inference (Rendell et al. 2018)
◮ accelerate the inference (Vono et al. 2019).

SLIDE 54

Properties based on convolution

(H1) $\pi \in L^1$ is log-concave.
(H2) $\phi_\rho(x, z) = \tilde{\phi}_\rho(x - z)$, so that $\pi_\rho(x) = \int_{\mathsf{Z}} p_\rho(x, z)\,dz = (\pi \ast K_\rho)(x)$, where the kernel $K_\rho \propto \exp(-\tilde{\phi}_\rho)$ is $C^\infty$ and log-concave, $\partial^k K_\rho$ is bounded for all $k \ge 0$, and $\lim_{\rho \to 0} K_\rho(u) = \delta(u)$ with $\mathbb{E}_{K_\rho}(U) = 0$.

Then:

i) $\pi_\rho \xrightarrow[\rho \to 0]{} \pi$
ii) $\pi_\rho$ is log-concave
iii) $\pi_\rho$ is infinitely differentiable on $\mathsf{X}$
iv) $\mathbb{E}_{\pi_\rho}(X) = \mathbb{E}_\pi(X)$ and $\mathrm{var}_{\pi_\rho}(X) = \mathrm{var}_\pi(X) + \mathrm{var}_{K_\rho}(X)$.

SLIDE 55

Non-asymptotic bound on the TV distance

(H3) $f$ is $L_f$-Lipschitz. (H4) $\phi_\rho(x, z) = \frac{1}{2\rho^2}\|x - z\|_2^2$.

Let $d = \dim(\mathsf{X})$. For all $\rho > 0$,

$$\|\pi_\rho - \pi\|_{TV} \le 1 - \Delta_d(\rho) = 1 - \frac{D_{-d}(L_f \rho)}{D_{-d}(-L_f \rho)}, \qquad 1 - \Delta_d(\rho) \underset{\rho \to 0}{\sim} \frac{2\sqrt{2}\,\Gamma\!\left(\frac{d+1}{2}\right)}{\Gamma\!\left(\frac{d}{2}\right)}\, L_f \rho.$$

The function $D_{-d}$ is the parabolic cylinder special function.
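The bound is directly computable; a small sketch using SciPy's parabolic cylinder function pbdv (the helper name tv_bound is ours):

```python
import numpy as np
from scipy.special import pbdv  # returns (D_v(x), derivative)

def tv_bound(d, Lf, rho):
    """Upper bound 1 - Delta_d(rho) on ||pi_rho - pi||_TV under (H3)-(H4)."""
    num, _ = pbdv(-d, Lf * rho)     # D_{-d}(Lf * rho)
    den, _ = pbdv(-d, -Lf * rho)    # D_{-d}(-Lf * rho)
    return 1.0 - num / den

# The bound vanishes linearly in rho and degrades with the dimension d.
for rho in (1e-3, 1e-2, 1e-1):
    print(f"d=10, Lf=1, rho={rho:g}: TV <= {tv_bound(10, 1.0, rho):.4f}")
```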

SLIDE 56

Behavior when ρ → 0 & illustration

[Figure: 1 − Δd(ρ) versus Lf ρ (log-log), for dimensions d ranging from 10⁰ to 10⁶]

SLIDE 57

Bounds on potentials

$$f_\rho(x) = \frac{d}{2}\log(2\pi\rho^2) - \log \int_{\mathsf{X}} \exp\left(-f(z) - \frac{1}{2\rho^2}\|z - x\|_2^2\right) dz$$

For all $\rho > 0$ and $x \in \mathsf{X}$, $L_\rho \le f_\rho(x) - f(x) \le U_\rho$, with

$$L_\rho = \log M_\rho - \log D_{-d}(-L_f \rho), \qquad U_\rho = \log M_\rho - \log D_{-d}(L_f \rho), \qquad M_\rho = \frac{2^{d/2 - 1}\,\Gamma(d/2)}{\Gamma(d)} \exp\left(L_f^2 \rho^2 / 4\right).$$
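These bounds are just as easy to evaluate numerically; a sketch under the same (H3)-(H4) assumptions (helper name ours, reusing SciPy's pbdv and gammaln):

```python
import numpy as np
from scipy.special import gammaln, pbdv

def potential_gap_bounds(d, Lf, rho):
    """Return (L_rho, U_rho) bracketing f_rho(x) - f(x) for all x."""
    log_M = ((d / 2 - 1) * np.log(2.0) + gammaln(d / 2) - gammaln(d)
             + Lf**2 * rho**2 / 4)
    L = log_M - np.log(pbdv(-d, -Lf * rho)[0])
    U = log_M - np.log(pbdv(-d, Lf * rho)[0])
    return L, U

print(potential_gap_bounds(d=10, Lf=1.0, rho=0.01))
```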

SLIDE 58

Bounds on credibility intervals

Illustration

$$(1 - \alpha)\,\frac{M_\rho}{D_{-d}(-L_f \rho)} \;\le\; \int_{\mathcal{C}_\alpha^\rho} \pi(x)\,dx \;\le\; \min\left\{1,\ (1 - \alpha)\,\frac{M_\rho}{D_{-d}(L_f \rho)}\right\}$$

[Figure: three panels comparing π and πρ on x1 ∈ [−3, 3] for increasing ρ]

ρ      Cα              Cαρ             ∫_{Cαρ} π(x1) dx1   bounds
10⁻²   [-0.47, 1.24]   [-0.47, 1.24]   0.95                [0.948, 0.952]
10⁻¹   idem            [-0.47, 1.24]   0.95                [0.88, 1]
1      idem            [-0.47, 1.37]   0.96                [0.34, 1]

SLIDE 59

Distributed sampling and data privacy

Regularized logistic regression

$$\forall i \in \{1, \dots, n\}, \quad y_i \sim \mathrm{Bernoulli}\left(\sigma(a_i^T x)\right), \qquad \pi(x|y) \propto \exp\left[-f(x) - \sum_{j=1}^{b} g^{(j)}(x)\right]$$

where

◮ $g^{(j)}(x) = \sum_{i \in \mathcal{D}_j} \log\left(1 + \exp\left(-y_i a_i^T x\right)\right)$,
◮ $\mathcal{D}_j$ = indices associated with the $j$th block of data,
◮ $f$ = prior on the regressor $x$.

Issues:

◮ the full data set is distributed over $b$ nodes, $b \in \{1, \dots, n\}$
◮ data privacy.

SLIDE 60

Distributed sampling and data privacy

Applying AXDA $b$ times:

$$p_\rho(x, z_{1:b}) \propto \exp\left[-f(x) - \sum_{j=1}^{b}\left(\frac{1}{2\rho^2}\|x - z_j\|^2 + \sum_{i \in \mathcal{D}_j} \log\left(1 + \exp\left(-y_i a_i^T z_j\right)\right)\right)\right]$$

Benefits of AXDA:

◮ inference via a Gibbs sampler distributed over b nodes
◮ the master node never sees the data set: privacy
◮ theoretical guarantees on the approximation
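A minimal serial sketch of the resulting master/worker Gibbs sweep, assuming a N(0, τ²I) prior for f and a random-walk Metropolis step on each node. All names and the toy data are ours; a real deployment would run the workers in parallel and exchange only x and z_j over the network.

```python
import numpy as np

rng = np.random.default_rng(1)

def worker_step(z_j, x, A_j, y_j, rho, n_mh=5, step=0.05):
    """One update of z_j on node j via random-walk Metropolis targeting
    p(z_j | x) ∝ exp(-sum_i log(1 + exp(-y_i a_i^T z_j)) - ||x - z_j||^2 / (2 rho^2)).
    Only z_j is exchanged with the master; the data (A_j, y_j) stay local."""
    def neg_log_p(z):
        return (np.logaddexp(0.0, -y_j * (A_j @ z)).sum()
                + np.sum((x - z) ** 2) / (2 * rho**2))
    for _ in range(n_mh):
        prop = z_j + step * rng.standard_normal(z_j.shape)
        if np.log(rng.uniform()) < neg_log_p(z_j) - neg_log_p(prop):
            z_j = prop
    return z_j

def master_step(zs, rho, tau):
    """Draw x | z_{1:b} exactly, assuming a N(0, tau^2 I) prior f."""
    b, d = zs.shape
    var = 1.0 / (1.0 / tau**2 + b / rho**2)
    mean = var * zs.sum(axis=0) / rho**2
    return mean + np.sqrt(var) * rng.standard_normal(d)

# Gibbs sweeps over b nodes (serial stand-in for the distributed loop)
b, d, n_per = 4, 3, 50
A = [rng.standard_normal((n_per, d)) for _ in range(b)]
yb = [np.where(rng.uniform(size=n_per) < 0.5, -1.0, 1.0) for _ in range(b)]
x, zs = np.zeros(d), np.zeros((b, d))
for _ in range(100):
    zs = np.array([worker_step(zs[j], x, A[j], yb[j], rho=1.0) for j in range(b)])
    x = master_step(zs, rho=1.0, tau=10.0)
print("posterior draw x:", x)
```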

SLIDE 61

Conclusion

◮ SP & SPA: a split-and-augment strategy
    Bayesian inference for complex models
    large-scale problems (big & tall)
    confidence intervals
◮ Efficient algorithms for inference
    acceleration of state-of-the-art sampling algorithms
    distributed inference (simulation, optimization, variational approx.)
◮ AXDA: a unifying statistical framework
    asymptotically exact: control parameter ρ
    non-asymptotic theoretical guarantees on the approximation under mild assumptions

SLIDE 62

Interested in AXDA for your statistical problems?

Theory and methods

◮ M. Vono et al. (2019), "Asymptotically exact data augmentation: models, properties and algorithms". Technical report. https://arxiv.org/abs/1902.05754/
◮ M. Vono et al. (2019), "Split-and-augmented Gibbs sampler - Application to large-scale inference problems," IEEE Transactions on Signal Processing.
◮ L. J. Rendell et al. (2018), "Global consensus Monte Carlo". Technical report. https://arxiv.org/abs/1807.09288/

Applications

◮ M. Vono et al. (2019), "Bayesian image restoration under Poisson noise and log-concave prior," in Proc. ICASSP.
◮ M. Vono et al. (2018), "Sparse Bayesian binary logistic regression using the split-and-augmented Gibbs sampler," in Proc. MLSP.

Code

◮ https://github.com/mvono

SLIDE 63

Beaumont, M. A., Zhang, W., and Balding, D. J. (2002), "Approximate Bayesian Computation in Population Genetics," Genetics, 162, 2025–2035.
Besag, J. and Green, P. J. (1993), "Spatial Statistics and Bayesian Computation," Journal of the Royal Statistical Society, Series B, 55, 25–37.
Boyd, S., Parikh, N., Chu, E., Peleato, B., and Eckstein, J. (2011), "Distributed Optimization and Statistical Learning via the Alternating Direction Method of Multipliers," Foundations and Trends in Machine Learning, 3, 1–122.
Damien, P., Wakefield, J., and Walker, S. (1999), "Gibbs sampling for Bayesian non-conjugate and hierarchical models by using auxiliary variables," Journal of the Royal Statistical Society, Series B, 61, 331–344.
Del Moral, P., Doucet, A., and Jasra, A. (2012), "An adaptive sequential Monte Carlo method for approximate Bayesian computation," Statistics and Computing, 22, 1009–1020.
Dempster, A. P., Laird, N. M., and Rubin, D. B. (1977), "Maximum Likelihood from Incomplete Data via the EM Algorithm," Journal of the Royal Statistical Society, Series B, 39, 1–38.
Doucet, A., Godsill, S. J., and Robert, C. P. (2002), "Marginal maximum a posteriori estimation using Markov chain Monte Carlo," Statistics and Computing, 12, 77–84.
Durmus, A., Moulines, E., and Pereyra, M. (2018), "Efficient Bayesian Computation by Proximal Markov chain Monte Carlo: When Langevin Meets Moreau," SIAM Journal on Imaging Sciences, 11, 473–506.
Filstroff, L., Lumbreras, A., and Févotte, C. (2018), "Closed-form Marginal Likelihood in Gamma-Poisson Matrix Factorization," in Proceedings of the 35th International Conference on Machine Learning (ICML), vol. 80, pp. 1506–1514.
Geman, D. and Yang, C. (1995), "Nonlinear image recovery with half-quadratic regularization," IEEE Transactions on Image Processing, 4, 932–946.
Ising, E. (1925), "Beitrag zur Theorie des Ferromagnetismus," Zeitschrift für Physik, 31, 253–258.
Neal, R. M. (2003), "Slice sampling," The Annals of Statistics, 31, 705–767.
Peel, D. and McLachlan, G. J. (2000), "Robust mixture modelling using the t distribution," Statistics and Computing, 10, 339–348.
Pereyra, M. (2016), "Proximal Markov chain Monte Carlo algorithms," Statistics and Computing, 26, 745–760.

SLIDE 64

Polson, N. G., Scott, J. G., and Windle, J. (2013), "Bayesian Inference for Logistic Models Using Polya-Gamma Latent Variables," Journal of the American Statistical Association, 108, 1339–1349.
Potts, R. B. (1952), "Some generalized order-disorder transformations," Mathematical Proceedings of the Cambridge Philosophical Society, 48, 106–109.
Ratmann, O., Andrieu, C., Wiuf, C., and Richardson, S. (2009), "Model criticism based on likelihood-free inference, with an application to protein network evolution," Proceedings of the National Academy of Sciences, 106, 10576–10581.
Rendell, L. J., Johansen, A. M., Lee, A., and Whiteley, N. (2018), "Global consensus Monte Carlo," [online]. Technical report. Available at https://arxiv.org/abs/1807.09288/.
Scheffé, H. (1947), "A useful convergence theorem for probability distributions," The Annals of Mathematical Statistics, 18, 434–438.
Scott, S. L., Blocker, A. W., Bonassi, F. V., Chipman, H. A., George, E. I., and McCulloch, R. E. (2016), "Bayes and Big Data: The Consensus Monte Carlo Algorithm," International Journal of Management Science and Engineering Management, 11, 78–88.
Sisson, S., Fan, Y., and Beaumont, M. (eds.) (2018), Handbook of Approximate Bayesian Computation, Chapman and Hall/CRC Press, 1st ed.
Swendsen, R. H. and Wang, J.-S. (1987), "Nonuniversal critical dynamics in Monte Carlo simulations," Physical Review Letters, 58, 86–88.
van Dyk, D. A. and Meng, X.-L. (2001), "The Art of Data Augmentation," Journal of Computational and Graphical Statistics, 10, 1–50.
Vono, M., Dobigeon, N., and Chainais, P. (2018), "Sparse Bayesian binary logistic regression using the split-and-augmented Gibbs sampler," in IEEE International Workshop on Machine Learning for Signal Processing, Aalborg, Denmark.
— (2019), "Split-and-augmented Gibbs sampler - Application to large-scale inference problems," IEEE Transactions on Signal Processing, 67, 1648–1661.
Wang, C. and Blei, D. M. (2018), "A General Method for Robust Bayesian Modeling," Bayesian Analysis, 13, 1163–1191.