Bayesian inference and convex geometry: theory, methods, and algorithms

  1. Bayesian inference and convex geometry: theory, methods, and algorithms
Dr. Marcelo Pereyra, http://www.macs.hw.ac.uk/~mp71/
Maxwell Institute for Mathematical Sciences, Heriot-Watt University
March 2019, Paris

  2. Outline
1. Bayesian inference in imaging inverse problems
2. MAP estimation with Bayesian confidence regions
3. A decision-theoretic derivation of MAP estimation
4. Empirical Bayes MAP estimation with unknown regularisation parameters
5. Conclusion

  3. Imaging inverse problems
We are interested in an unknown image $x \in \mathbb{R}^d$. We measure $y$, related to $x$ by a statistical model $p(y \mid x)$.
The recovery of $x$ from $y$ is ill-posed or ill-conditioned, resulting in significant uncertainty about $x$.
For example, in many imaging problems $y = Ax + w$, for some operator $A$ that is rank-deficient, and additive noise $w$.

  4. The Bayesian framework
We use priors to reduce uncertainty and deliver accurate results. Given the prior $p(x)$, the posterior distribution of $x$ given $y$,
$$p(x \mid y) = p(y \mid x)\, p(x) / p(y),$$
models our knowledge about $x$ after observing $y$.
In this talk we consider that $p(x \mid y)$ is log-concave; i.e.,
$$p(x \mid y) = \exp\{-\phi(x)\}/Z,$$
where $\phi(x)$ is a convex function and $Z = \int \exp\{-\phi(x)\}\,\mathrm{d}x$.
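
As a concrete illustration (not from the slides), here is a minimal numpy sketch of such a log-concave negative log-posterior $\phi$ for a Gaussian likelihood with an $\ell_1$ prior; the toy operator A, noise level sigma and regularisation parameter theta are placeholder assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy problem: y = A x + w with Gaussian noise w.
d, m, sigma, theta = 64, 32, 0.1, 1.0
A = rng.standard_normal((m, d)) / np.sqrt(m)   # placeholder rank-deficient operator (m < d)
x_true = np.zeros(d); x_true[:5] = 1.0         # sparse ground truth
y = A @ x_true + sigma * rng.standard_normal(m)

def phi(x):
    """Convex negative log-posterior: Gaussian likelihood + l1 (Laplace) prior."""
    return np.sum((y - A @ x) ** 2) / (2 * sigma ** 2) + theta * np.sum(np.abs(x))

# p(x | y) = exp(-phi(x)) / Z is log-concave because phi is convex.
print(phi(x_true))
```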

  5. Maximum-a-posteriori (MAP) estimation
The predominant Bayesian approach in imaging is MAP estimation,
$$\hat{x}_{\mathrm{MAP}} = \operatorname*{argmax}_{x \in \mathbb{R}^d} p(x \mid y) = \operatorname*{argmin}_{x \in \mathbb{R}^d} \phi(x), \qquad (1)$$
efficiently computed by convex optimisation (Chambolle and Pock, 2016).
However, MAP estimation has some limitations, e.g.,
1. it provides little information about $p(x \mid y)$,
2. it is not theoretically well understood (yet),
3. it struggles with unknown/partially unknown models.
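
To make (1) concrete, below is a minimal proximal-gradient (ISTA) sketch that computes $\hat{x}_{\mathrm{MAP}}$ for the toy $\ell_2$-$\ell_1$ model above; the step size and iteration count are illustrative choices, and in practice accelerated schemes such as FISTA or Chambolle-Pock would be preferred.

```python
import numpy as np

rng = np.random.default_rng(0)
d, m, sigma, theta = 64, 32, 0.1, 1.0
A = rng.standard_normal((m, d)) / np.sqrt(m)
x_true = np.zeros(d); x_true[:5] = 1.0
y = A @ x_true + sigma * rng.standard_normal(m)

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

# ISTA: x_{k+1} = prox_{gamma * theta * ||.||_1}( x_k - gamma * grad f(x_k) ),
# where f(x) = ||y - A x||^2 / (2 sigma^2) and gamma <= 1/L with L = ||A||_2^2 / sigma^2.
L = np.linalg.norm(A, 2) ** 2 / sigma ** 2
gamma = 1.0 / L
x = np.zeros(d)
for _ in range(500):
    grad = A.T @ (A @ x - y) / sigma ** 2
    x = soft_threshold(x - gamma * grad, gamma * theta)

x_map = x
print(np.round(x_map[:8], 3))   # recovers the sparse support of x_true
```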

  6. Illustrative example: astronomical image reconstruction
Recover $x \in \mathbb{R}^d$ from the low-dimensional degraded observation
$$y = M F x + w,$$
where $F$ is the continuous Fourier transform, $M \in \mathbb{C}^{m \times d}$ is a measurement operator and $w$ is Gaussian noise. We use the model
$$p(x \mid y) \propto \exp\{-\|y - M F x\|^2 / 2\sigma^2 - \theta \|\Psi x\|_1\}\, \mathbf{1}_{\mathbb{R}^n_+}(x). \qquad (2)$$
Figure: radio-interferometric image reconstruction of the W28 supernova (observation $y$ and MAP estimate $\hat{x}_{\mathrm{MAP}}$).
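
As a rough illustration of the measurement model in (2), here is a sketch in which a discrete 2-D FFT and a random sampling mask stand in for the continuous Fourier transform and the interferometric operator $M$; both stand-ins, and all the numbers, are assumptions of this sketch.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 64                                         # image is n x n, so d = n * n
x = np.zeros((n, n)); x[20:30, 20:30] = 1.0    # toy "sky" image

mask = rng.random((n, n)) < 0.3                # keep ~30% of Fourier coefficients
sigma = 0.05

def forward(img):
    """y = M F x: masked (discrete) Fourier measurements, orthonormal scaling."""
    return np.fft.fft2(img, norm="ortho")[mask]

# Complex Gaussian noise on the retained coefficients.
m = int(mask.sum())
y = forward(x) + sigma * (rng.standard_normal(m) + 1j * rng.standard_normal(m)) / np.sqrt(2)
print(y.shape)   # m complex measurements, m << d
```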

  7. Outline
1. Bayesian inference in imaging inverse problems
2. MAP estimation with Bayesian confidence regions
3. A decision-theoretic derivation of MAP estimation
4. Empirical Bayes MAP estimation with unknown regularisation parameters
5. Conclusion

  8. Posterior credible regions
Where does the posterior probability mass of $x$ lie? A set $C_\alpha$ is a posterior credible region of confidence level $(1-\alpha)\%$ if
$$P[x \in C_\alpha \mid y] = 1 - \alpha.$$
The highest posterior density (HPD) region
$$C^*_\alpha = \{x : \phi(x) \leq \gamma_\alpha\}$$
is decision-theoretically optimal (Robert, 2001), with $\gamma_\alpha \in \mathbb{R}$ chosen such that $\int_{C^*_\alpha} p(x \mid y)\,\mathrm{d}x = 1 - \alpha$ holds.
We could estimate $C^*_\alpha$ by numerical integration (e.g., MCMC sampling), but in high-dimensional log-concave settings this is not necessary because something beautiful happens...
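
When MCMC is affordable, $\gamma_\alpha$ is simply the $(1-\alpha)$ quantile of $\phi(x)$ under $p(x \mid y)$, so it can be read off posterior samples; the Gaussian toy posterior below is a placeholder assumption used only to keep the sketch self-contained.

```python
import numpy as np

rng = np.random.default_rng(2)
alpha, d, N = 0.05, 10, 100_000

# Toy log-concave posterior: standard Gaussian, phi(x) = ||x||^2 / 2 (up to a constant).
phi = lambda x: 0.5 * np.sum(x ** 2, axis=-1)
samples = rng.standard_normal((N, d))          # exact samples here; MCMC output in general

# gamma_alpha is the (1 - alpha) quantile of phi(x) under p(x | y),
# so that P[ phi(x) <= gamma_alpha | y ] = 1 - alpha.
gamma_alpha = np.quantile(phi(samples), 1 - alpha)
print(gamma_alpha)
```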

  9. A concentration phenomenon...
Figure: Convergence to the "typical" set $\{x : \log p(x \mid y) \approx \mathrm{E}[\log p(x \mid y)]\}$.

  10. Proposed approximation of $C^*_\alpha$
Theorem 2.1 (Pereyra (2016)). Suppose that the posterior $p(x \mid y) = \exp\{-\phi(x)\}/Z$ is log-concave on $\mathbb{R}^d$. Then, for any $\alpha \in (4\exp(-d/3), 1)$, the HPD region $C^*_\alpha$ is contained by
$$\tilde{C}_\alpha = \{x : \phi(x) \leq \phi(\hat{x}_{\mathrm{MAP}}) + \sqrt{d}\,\tau_\alpha + d\},$$
with universal positive constant $\tau_\alpha = \sqrt{16\log(3/\alpha)}$, independent of $p(x \mid y)$.
Remark 1: $\tilde{C}_\alpha$ is a conservative approximation of $C^*_\alpha$, i.e., $x \notin \tilde{C}_\alpha \implies x \notin C^*_\alpha$.
Remark 2: $\tilde{C}_\alpha$ is available as a by-product in any convex inverse problem that is solved by MAP estimation!
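
A minimal sketch of how this by-product region might be used in code; the numbers phi_xmap = 120.0, d = 256*256 and alpha = 0.01 are made up for illustration.

```python
import numpy as np

def gamma_tilde(phi_xmap, d, alpha):
    """Conservative HPD threshold from Theorem 2.1: phi(x_MAP) + sqrt(d)*tau_alpha + d."""
    tau_alpha = np.sqrt(16.0 * np.log(3.0 / alpha))
    return phi_xmap + np.sqrt(d) * tau_alpha + d

def in_credible_region(phi_x, phi_xmap, d, alpha):
    """x belongs to the approximate region C~_alpha iff phi(x) <= gamma~_alpha."""
    return phi_x <= gamma_tilde(phi_xmap, d, alpha)

print(gamma_tilde(120.0, 256 * 256, 0.01))          # threshold for a hypothetical problem
print(in_credible_region(1.5e5, 120.0, 256 * 256, 0.01))
```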

  11. Approximation error bounds
Is $\tilde{C}_\alpha$ a reliable approximation of $C^*_\alpha$?
Theorem 2.2 (Finite-dimensional error bound (Pereyra, 2016)). Let $\tilde{\gamma}_\alpha = \phi(\hat{x}_{\mathrm{MAP}}) + \sqrt{d}\,\tau_\alpha + d$. If $p(x \mid y)$ is log-concave on $\mathbb{R}^d$, then
$$0 \leq \frac{\tilde{\gamma}_\alpha - \gamma_\alpha}{d} \leq 1 + \eta_\alpha d^{-1/2},$$
with universal positive constant $\eta_\alpha = \sqrt{16\log(3/\alpha)} + \sqrt{1/\alpha}$.
Remark 3: $\tilde{C}_\alpha$ is stable (as $d$ becomes large, the per-dimension error $(\tilde{\gamma}_\alpha - \gamma_\alpha)/d$ remains of order 1).
Remark 4: The lower and upper bounds are asymptotically tight w.r.t. $d$.
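
A quick arithmetic sketch of the upper bound as stated above, evaluated for a few image sizes; the specific values of d and alpha are arbitrary choices for illustration.

```python
import numpy as np

def upper_bound(d, alpha):
    """Theorem 2.2 upper bound on the per-dimension error (gamma~_alpha - gamma_alpha)/d."""
    eta_alpha = np.sqrt(16.0 * np.log(3.0 / alpha)) + np.sqrt(1.0 / alpha)
    return 1.0 + eta_alpha / np.sqrt(d)

for d in (64, 64 ** 2, 256 ** 2, 1024 ** 2):
    print(d, round(upper_bound(d, alpha=0.01), 3))   # decreases towards 1 as d grows
```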

  12. Uncertainty visualisation in radio-interferometric imaging
Astro-imaging experiment with a redundant wavelet frame (Cai et al., 2017). Local credible intervals at scale $10 \times 10$ pixels.
Figure: dirty image, $\hat{x}_{\mathrm{penML}}(y)$, $\hat{x}_{\mathrm{MAP}}$, approximate credible intervals, and "exact" intervals (MCMC) for the 3C288 radio galaxy (size $256 \times 256$ pixels, computation time 1.8 s) and for the M31 radio galaxy (size $256 \times 256$ pixels, computation time 1.8 s).
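
A hedged sketch of the local credible interval idea: for a given pixel block, find by bisection the lowest and highest constant values the block can take while the modified image stays inside $\tilde{C}_\alpha$. The arguments phi, x_map and gamma_tilde_alpha are placeholders, and this is only a schematic version of the procedure, not the implementation of Cai et al.

```python
import numpy as np

def local_credible_interval(phi, x_map, gamma_tilde_alpha, block, iters=30):
    """Bisection for the lowest/highest values a pixel block can take while the
    modified image x' (block replaced by a constant) still satisfies
    phi(x') <= gamma_tilde_alpha. `block` is a boolean mask selecting the region."""
    base = float(x_map[block].mean())

    def feasible(value):
        x_mod = x_map.copy()
        x_mod[block] = value
        return phi(x_mod) <= gamma_tilde_alpha

    def search(direction):                        # direction = -1 (lower) or +1 (upper)
        lo, hi = 0.0, 1.0
        while feasible(base + direction * hi) and hi < 1e6:
            lo, hi = hi, 2.0 * hi                 # expand until infeasible (or capped)
        for _ in range(iters):                    # bisect the feasibility boundary
            mid = 0.5 * (lo + hi)
            lo, hi = (mid, hi) if feasible(base + direction * mid) else (lo, mid)
        return base + direction * lo

    return search(-1.0), search(+1.0)             # (xi_minus, xi_plus) for this block
```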

  13. Hypothesis testing
Bayesian hypothesis test for specific image structures (e.g., lesions):
H0: The structure of interest is ABSENT in the true image.
H1: The structure of interest is PRESENT in the true image.
The null hypothesis H0 is rejected with significance $\alpha$ if $P(H_0 \mid y) \leq \alpha$.
Theorem (Repetti et al., 2018). Let $S$ denote the region of $\mathbb{R}^d$ associated with $H_0$, containing all images without the structure of interest. Then
$$S \cap \tilde{C}_\alpha = \emptyset \implies P(H_0 \mid y) \leq \alpha.$$
If in addition $S$ is convex, then checking $S \cap \tilde{C}_\alpha = \emptyset$ is a convex problem:
$$\min_{\bar{x},\, x \in \mathbb{R}^d} \|\bar{x} - x\|_2^2 \quad \text{s.t.} \quad \bar{x} \in \tilde{C}_\alpha,\; x \in S.$$
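
One possible way to attack this convex feasibility check numerically is alternating projections between the two convex sets; the projectors proj_S and proj_C below are assumed to be supplied by the user (projecting onto $\tilde{C}_\alpha$ generally requires an inner solver), so this is only a sketch of the idea, not the algorithm of Repetti et al.

```python
import numpy as np

def set_distance(proj_S, proj_C, x0, iters=200):
    """Alternating projections between the convex sets S (structure absent) and
    C~_alpha: the limiting gap ||x_bar - x|| approximates the minimum distance
    between the two sets. proj_S and proj_C are user-supplied Euclidean projectors."""
    x_bar = np.asarray(x0, dtype=float).copy()
    for _ in range(iters):
        x = proj_S(x_bar)        # closest image without the structure of interest
        x_bar = proj_C(x)        # closest image inside the credible region
    return np.linalg.norm(x_bar - x)

# Usage idea: a (numerically) zero distance means the sets intersect and we cannot
# reject H0; a strictly positive gap certifies S ∩ C~_alpha = ∅, so P(H0 | y) <= alpha.
```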

  14. Uncertainty quantification in MRI imaging
Figure: $\hat{x}_{\mathrm{MAP}}$, $\bar{x} \in \tilde{C}_{0.01}$, and $x \in S$ (full images and zooms).
MRI experiment: the test images satisfy $\bar{x} = x$, hence we fail to reject $H_0$ and conclude that there is little evidence to support the observed structure.

  15. Uncertainty quantification in MRI imaging
Figure: $\hat{x}_{\mathrm{MAP}}$, $\bar{x} \in \tilde{C}_{0.01}$, and $x \in S_0$ (full images and zooms).
MRI experiment: the test images satisfy $\bar{x} \neq x$, hence we reject $H_0$ and conclude that there is significant evidence in favour of the observed structure.

  16. Outline
1. Bayesian inference in imaging inverse problems
2. MAP estimation with Bayesian confidence regions
3. A decision-theoretic derivation of MAP estimation
4. Empirical Bayes MAP estimation with unknown regularisation parameters
5. Conclusion

  17. Bayesian point estimators
Bayesian point estimators arise from the decision "what point $\hat{x} \in \mathbb{R}^d$ summarises $x \mid y$ best?". The optimal decision under uncertainty is
$$\hat{x}_L = \operatorname*{argmin}_{u \in \mathbb{R}^d} \mathrm{E}\{L(u, x) \mid y\} = \operatorname*{argmin}_{u \in \mathbb{R}^d} \int L(u, x)\, p(x \mid y)\,\mathrm{d}x,$$
where the loss $L(u, x)$ measures the "dissimilarity" between $u$ and $x$.
Example: in the Euclidean setting, $L(u, x) = \|u - x\|^2$ and $\hat{x}_L = \hat{x}_{\mathrm{MMSE}} = \mathrm{E}\{x \mid y\}$.
General desiderata:
1. $L(u, x) \geq 0$ for all $u, x \in \mathbb{R}^d$,
2. $L(u, x) = 0 \iff u = x$,
3. $L$ strictly convex w.r.t. its first argument (for estimator uniqueness).
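
For instance, under the squared loss the optimal decision is the posterior mean, so given posterior samples (in imaging, typically produced by an MCMC sampler) the estimator is simply their average; the Gaussian toy "posterior" below is a placeholder.

```python
import numpy as np

rng = np.random.default_rng(3)
d, N = 16, 50_000

# Placeholder posterior samples; in an imaging problem these would come from MCMC.
samples = 2.0 + rng.standard_normal((N, d))

x_mmse = samples.mean(axis=0)        # argmin_u E[ ||u - x||^2 | y ]  =  E[x | y]
print(np.round(x_mmse[:5], 2))       # close to the true posterior mean (here, 2.0)
```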

  18. Bayesian point estimators
Does the convex geometry of $p(x \mid y)$ define an interesting loss $L(u, x)$?
We use differential geometry to relate the convexity of $p(x \mid y)$, the geometry of the parameter space, and the loss $L$ to perform estimation.

  19. Differential geometry
A Riemannian manifold $\mathcal{M} = (\mathbb{R}^d, g)$, with metric $g : \mathbb{R}^d \to \mathcal{S}^d_{++}$ and global coordinate system $x$, is a vector space that is locally Euclidean.
For any point $x \in \mathbb{R}^d$ we have a Euclidean tangent space $T_x\mathbb{R}^d$ with inner product $\langle u, x \rangle = u^\top g(x)\, x$ and norm $\|x\| = \sqrt{x^\top g(x)\, x}$.
This geometry is local and may vary smoothly from $T_x\mathbb{R}^d$ to $T_{x'}\mathbb{R}^d$ following the affine connection $\Gamma \in \mathbb{R}^{d \times d \times d}$, given by $\Gamma_{ij,k}(x) = \partial_k g_{i,j}(x)$.
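
A tiny numpy sketch of the metric-weighted inner product and norm on the tangent space at $x$; the diagonal metric g(x) below is an arbitrary placeholder, chosen only to show that the geometry depends on the base point.

```python
import numpy as np

def metric(x):
    """Placeholder Riemannian metric g(x): a position-dependent SPD matrix."""
    return np.diag(1.0 + x ** 2)

def inner(u, v, x):
    """Tangent-space inner product <u, v>_x = u^T g(x) v."""
    return u @ metric(x) @ v

def norm(v, x):
    """Tangent-space norm ||v||_x = sqrt(v^T g(x) v)."""
    return np.sqrt(inner(v, v, x))

x = np.array([0.0, 1.0, 2.0])
u = np.array([1.0, 0.0, 1.0])
print(inner(u, u, x), norm(u, x))   # values change with the base point x
```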
