Variational regularisation for inverse problems with imperfect forward operators and general noise models

Leon Bungert (1), Martin Burger (1), Yury Korolev (2) and Carola Schönlieb (2)

(1) Department Mathematik, University of Erlangen-Nürnberg, Germany
(2) Department of Applied Mathematics and Theoretical Physics, University of Cambridge, UK

SIAM Conference on Imaging Science, 15 July 2020

Layout
- Introduction
- Convergence Analysis
- Discrepancy Principle

Inverse Problems

Inverse problem: $Au = \bar f$, where
- $A \colon U \to F$ is the forward operator (linear in this talk),
- $\bar f \in F$ is the exact (unattainable) data,
- $f^\delta$ is the noisy measurement, with the amount of noise characterised by $\delta > 0$.

Variational Regularisation

Variational regularisation:

$$\min_{u \in U} \; \frac{1}{\alpha} H(Au \mid f^\delta) + J(u),$$

where
- $H(\cdot \mid f^\delta)$ is the fidelity function that models the noise (e.g., Kullback-Leibler divergence, $L^p$-norm, Wasserstein distance),
- $J(\cdot)$ is the regularisation term (e.g., Total Variation, $\ell^1$-norm),
- $\alpha$ is the regularisation parameter.
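To make the role of $\alpha$ concrete, here is a minimal sketch under assumptions that are not on the slide: a diagonal forward operator on $\mathbb{R}^n$, the quadratic fidelity $H(v \mid f) = \tfrac12\|v - f\|^2$ and the quadratic regulariser $J(u) = \tfrac12\|u\|^2$. In that special case the minimiser is available componentwise in closed form. The names `tikhonov_diagonal`, `a` and `f_delta` are illustrative only.

```python
def tikhonov_diagonal(a, f_delta, alpha):
    """Minimise (1/alpha) * 0.5*sum((a_i*u_i - f_i)^2) + 0.5*sum(u_i^2).

    Setting the derivative to zero componentwise gives
    u_i = a_i * f_i / (a_i^2 + alpha).
    """
    return [ai * fi / (ai * ai + alpha) for ai, fi in zip(a, f_delta)]


# Hypothetical ill-conditioned operator: the diagonal entries decay quickly.
a = [1.0, 0.1, 0.01]
u_true = [1.0, 1.0, 1.0]
noise = [0.01, -0.01, 0.01]  # assumed perturbation of the data
f_delta = [ai * ui + ni for ai, ui, ni in zip(a, u_true, noise)]

u_naive = [fi / ai for ai, fi in zip(a, f_delta)]   # unregularised inversion
u_reg = tikhonov_diagonal(a, f_delta, alpha=1e-2)   # regularised solution

print("naive:      ", u_naive)  # the last component amplifies the noise
print("regularised:", u_reg)    # components with small a_i are damped
```

A larger `alpha` damps the poorly determined components more strongly, at the price of a larger bias in the well determined ones.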

Imperfect Forward Operators

Forward operator $A \colon U \to F$ often
- is not perfectly known (errors in geometry, coefficients of a PDE, convolution kernel), or
- can only be evaluated approximately (simplified models, discretisation errors).

Regularisation under operator errors:
- Goncharskii, Leonov, Yagola (1973). A generalized discrepancy principle;
- Hofmann (1986). Optimization aspects of the generalized discrepancy principle in regularization;
- Neubauer, Scherzer (1990). Finite-dimensional approximation of Tikhonov regularized solutions of nonlin. ill-posed prob.;
- Pöschl, Resmerita, Scherzer (2010). Discretization of variational regularization in Banach spaces;
- Bleyer, Ramlau (2013). A double regularization approach for inverse problems with noisy data and inexact operator;
- YK, Yagola (2013). Making use of a partial order in solving inverse problems;
- YK (2014). Making use of a partial order in solving inverse problems: II;
- YK, Lellmann (2018). Image reconstruction with imperfect forward models and applications in deblurring;
- Burger, YK, Rasch (2019). Convergence rates and structure of solutions of inv. prob. with imperfect forward models;
- Dong et al. (2019). Fixing nonconvergence of algebraic iterative reconstruction with an unmatched backprojector.

Bayesian approximation error modelling:
- Kaipio, Somersalo (2005). Statistical and computational inverse problems;
- Arridge et al. (2006). Approximation errors and model reduction with an application in optical diffusion tomography;
- Hansen et al. (2014). Accounting for imperfect forw. model. in geophys. inv. prob. - exemplified for crosshole tomography;
- Calvetti et al. (2018). Iterative updating of model error for Bayesian inversion;
- Rimpiläinen et al. (2019). Improved EEG source localization with Bayes. uncert. modelling of unknown skull conductivity;
- Riis, Dong, Hansen (2020). Computed tomography reconstr. with uncert. view angles by iter. updated model discrepancy.

Learned Forward Operators

Forward operator (or a correction to it) is learned from training pairs $(u_i, f_i)_{i=1}^n$ s.t. $Au_i = f_i$.

Learned forward operators:
- Aspri, YK, Scherzer (2019). Data-driven regularisation by projection;
- Bubba et al. (2019). Learning the invisible: A hybrid deep learning-shearlet framework for limited angle computed tomography;
- Schwab, Antholzer, Haltmeier (2019). Deep null space learning for inverse problems: convergence analysis and rates;
- Boink, Brune (2019). Learned SVD: solving inverse problems via hybrid autoencoding;
- Lunz et al. (2020). On learned operator correction;
- Nelsen, Stuart (2020). The random feature model for input-output maps between Banach spaces.

Contribution: combining general fidelities and operator errors

Variational regularisation with exact operator:

$$\min_{u \in U} \; \frac{1}{\alpha} H(Au \mid f^\delta) + J(u).$$

Modelling operator error using partial order in a Banach lattice:

$$A^l \le A \le A^u$$

(in a sense made precise later).

Proposed: variational regularisation with interval operator

$$\min_{u \in U,\; v \in F} \; \frac{1}{\alpha} H(v \mid f^\delta) + J(u) \quad \text{s.t.} \quad A^l u \le_F v \le_F A^u u.$$

- Convergence rates for a priori choices of $\alpha$ (depending on $\delta$ and $\|A^u - A^l\|$);
- Convergence rates for a posteriori choices of $\alpha$ (discrepancy principle; depending on $\delta$, $f^\delta$, $A^l$ and $A^u$).

Bungert, Burger, YK, Schönlieb (2020). Variational regularisation for inverse problems with imperfect forward operators and general noise models. arXiv:2005.14131
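The interval formulation can be illustrated with a minimal sketch under assumptions that go beyond the slide: $F = \mathbb{R}^m$ with the componentwise order, $A^l$ and $A^u$ given as matrices, a fixed candidate $u \ge 0$, and the quadratic fidelity $H(v \mid f) = \tfrac12\|v - f\|^2$. For fixed $u$ the inner minimisation over $v$ is then a projection of $f^\delta$ onto the box $[A^l u, A^u u]$, that is, a componentwise clip. All helper names are hypothetical.

```python
def matvec(A, u):
    """Dense matrix-vector product for a list-of-rows matrix."""
    return [sum(aij * uj for aij, uj in zip(row, u)) for row in A]


def best_v(A_low, A_up, u, f_delta):
    """Minimise 0.5*||v - f_delta||^2 subject to A_low u <= v <= A_up u.

    The constraint set is a box, so the minimiser clips f_delta componentwise.
    """
    lo, hi = matvec(A_low, u), matvec(A_up, u)
    return [min(max(fi, li), ui) for fi, li, ui in zip(f_delta, lo, hi)]


def objective(A_low, A_up, u, f_delta, alpha, J):
    """Value of (1/alpha) H(v | f_delta) + J(u) at the optimal v for this u."""
    v = best_v(A_low, A_up, u, f_delta)
    H = 0.5 * sum((vi - fi) ** 2 for vi, fi in zip(v, f_delta))
    return H / alpha + J(u)


# Assumed example: nominal matrix with an entrywise uncertainty of +-0.1.
A_nom = [[1.0, 0.5], [0.0, 1.0]]
A_low = [[a - 0.1 for a in row] for row in A_nom]
A_up = [[a + 0.1 for a in row] for row in A_nom]

u = [1.0, 2.0]          # nonnegative candidate solution
f_delta = [2.0, 1.5]    # assumed noisy data
v = best_v(A_low, A_up, u, f_delta)
print(v)  # first entry already lies in the box, second is clipped to the lower bound
print(objective(A_low, A_up, u, f_delta, 0.1, lambda w: sum(abs(x) for x in w)))
```

Data components that already lie between $A^l u$ and $A^u u$ incur no fidelity cost, which is how the interval constraint absorbs the operator error.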

Banach Lattices

- A vector space $X$ with a partial order $\le$ is called an ordered vector space if
  $x \le y \implies x + z \le y + z$ for all $x, y, z \in X$,
  $x \le y \implies \lambda x \le \lambda y$ for all $x, y \in X$ and $\lambda \in \mathbb{R}_+$.
- A vector lattice (or a Riesz space) is an ordered vector space $X$ with well defined suprema and infima: for all $x, y \in X$ there exist $x \vee y \in X$ and $x \wedge y \in X$; moreover
  $x \vee 0 = x_+$, $(-x)_+ = x_-$, $x = x_+ - x_-$, $|x| = x_+ + x_-$.
- A Banach lattice is a vector lattice $X$ with a monotone norm, i.e. for all $x, y \in X$: $|x| \le |y| \implies \|x\| \le \|y\|$.
- Partial order for linear operators $A, B \colon X \to Y$: $A \le B$ if $Ax \le Bx$ in $Y$ for all $x \ge 0$ in $X$.
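These definitions can be checked by hand in the simplest lattice. The sketch below assumes $X = \mathbb{R}^n$ with the componentwise order (an illustrative choice, not one fixed by the slides), where suprema and infima are componentwise maxima and minima. For matrices, the operator order $A \le B$ then reduces to an entrywise comparison. Function names are hypothetical.

```python
def pos(x):
    """Positive part x_+ = x v 0 (componentwise supremum with zero)."""
    return [max(xi, 0.0) for xi in x]


def neg(x):
    """Negative part x_- = (-x)_+."""
    return pos([-xi for xi in x])


def leq(x, y):
    """Componentwise partial order on R^n: x <= y iff x_i <= y_i for all i."""
    return all(xi <= yi for xi, yi in zip(x, y))


def op_leq(A, B):
    """Operator order for matrices: A <= B iff Ax <= Bx for all x >= 0.

    For matrices this is equivalent to the entrywise inequality
    (test with unit vectors, then take nonnegative combinations).
    """
    return all(leq(row_a, row_b) for row_a, row_b in zip(A, B))


x = [1.5, -2.0, 0.0]
x_plus, x_minus = pos(x), neg(x)

# Check the identities x = x_+ - x_-  and  |x| = x_+ + x_-.
recombined = [p - m for p, m in zip(x_plus, x_minus)]
modulus = [p + m for p, m in zip(x_plus, x_minus)]
print(recombined, modulus)

# The order is only partial: these two vectors are not comparable.
y, z = [1.0, 0.0], [0.0, 1.0]
print(leq(y, z), leq(z, y))
```

The last two lines show why the order is called partial: neither `y <= z` nor `z <= y` holds.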

Convergence of the Data and the Operator

We consider sequences
- $A^l_n, A^u_n$: $A^l_n \le A \le A^u_n$ for all $n$, $\|A^u_n - A^l_n\| \le \eta_n \to 0$ as $n \to \infty$,
- $f_n, \delta_n$: $H(\bar f \mid f_n) \le \delta_n$ for all $n$, $\delta_n \to 0$ as $n \to \infty$,
- $\alpha_n$: $\alpha_n \to 0$ as $n \to \infty$.

Sequence of corresponding primal solutions $(u_n, v_n)$, $n = 1, \dots, \infty$.

General Estimate

Assumption (Source condition). There exists $\omega^\dagger \in F^*$ s.t. $A^* \omega^\dagger \in \partial J(u^\dagger_J)$.

Theorem (Bungert, Burger, YK, Schönlieb '20). Under standard assumptions the following estimate holds for the Bregman distance $D_J(u_n, u^\dagger_J)$ between the approximate solution $u_n$ and the $J$-minimising solution $u^\dagger_J$:

$$D_J(u_n, u^\dagger_J) \le \frac{\delta_n}{\alpha_n} + \frac{1}{\alpha_n}\Big[ H^*(\alpha_n \omega^\dagger \mid f_n) - \langle \alpha_n \omega^\dagger, \bar f \rangle \Big] + C \eta_n.$$
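For orientation, the bracketed term can be evaluated in the simplest case. The following is a sketch under assumptions not stated on the slide: $F$ is a Hilbert space and $H(v \mid f) = \tfrac12\|v - f\|^2$, whose convex conjugate with respect to the first argument is $H^*(q \mid f) = \tfrac12\|q\|^2 + \langle q, f \rangle$.

```latex
\begin{align*}
\frac{1}{\alpha_n}\Big[ H^*(\alpha_n \omega^\dagger \mid f_n)
      - \langle \alpha_n \omega^\dagger, \bar f \rangle \Big]
  &= \frac{1}{\alpha_n}\Big[ \tfrac{1}{2}\alpha_n^2 \|\omega^\dagger\|^2
      + \alpha_n \langle \omega^\dagger, f_n - \bar f \rangle \Big] \\
  &= \frac{\alpha_n}{2}\|\omega^\dagger\|^2
      + \langle \omega^\dagger, f_n - \bar f \rangle \\
  &\le \frac{\alpha_n}{2}\|\omega^\dagger\|^2
      + \|\omega^\dagger\| \sqrt{2\delta_n},
\end{align*}
% last step: H(\bar f \mid f_n) = \tfrac12 \|\bar f - f_n\|^2 \le \delta_n
% and the Cauchy-Schwarz inequality. Inserting this into the estimate gives
\[
D_J(u_n, u_J^\dagger) \le \frac{\delta_n}{\alpha_n}
  + \frac{\alpha_n}{2}\|\omega^\dagger\|^2
  + \|\omega^\dagger\|\sqrt{2\delta_n} + C\eta_n .
\]
```

With $\alpha_n$ proportional to $\sqrt{\delta_n}$, every data-dependent term in this special case is of order $\sqrt{\delta_n}$.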

ϕ-divergences

Definition. Let $\varphi \colon (0, \infty) \to \mathbb{R}_+$ be convex with $\varphi(1) = 0$. For $\rho, \nu \in \mathcal{P}(\Omega)$ with $\rho \ll \nu$ the $\varphi$-divergence is defined as follows:

$$d_\varphi(\rho \mid \nu) := \int_\Omega \varphi\!\left(\frac{d\rho}{d\nu}\right) d\nu.$$

We further assume that $\varphi^*(x) = x + r(x)$, where $\varphi^*$ is the convex conjugate and $r(x)/x \to 0$ as $x \to 0$.

- Kullback-Leibler divergence: $\varphi(x) = x \log(x) - x + 1$;
- $\chi^2$ divergence: $\varphi(x) = (x - 1)^2$;
- Squared Hellinger distance: $\varphi(x) = (\sqrt{x} - 1)^2$;
- Total variation: $\varphi(x) = |x - 1|$.
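For discrete distributions the integral becomes a finite sum, $d_\varphi(\rho \mid \nu) = \sum_i \nu_i \, \varphi(\rho_i / \nu_i)$. The sketch below evaluates the four divergences in that setting. It assumes probability vectors given as Python lists and uses the standard Kullback-Leibler generator $x \log x - x + 1$. The names `GENERATORS` and `phi_divergence` are illustrative.

```python
import math

# Generators phi with phi(1) = 0 (Kullback-Leibler in its standard form,
# extended to x = 0 by its limit value 1).
GENERATORS = {
    "kl": lambda x: x * math.log(x) - x + 1.0 if x > 0 else 1.0,
    "chi2": lambda x: (x - 1.0) ** 2,
    "hellinger": lambda x: (math.sqrt(x) - 1.0) ** 2,
    "tv": lambda x: abs(x - 1.0),
}


def phi_divergence(rho, nu, name):
    """Discrete phi-divergence: sum_i nu_i * phi(rho_i / nu_i).

    Assumes rho is absolutely continuous w.r.t. nu, i.e. rho_i = 0 wherever
    nu_i = 0; indices with nu_i == 0 are skipped.
    """
    phi = GENERATORS[name]
    return sum(n * phi(r / n) for r, n in zip(rho, nu) if n > 0)


rho = [0.5, 0.5]
nu = [0.25, 0.75]
for name in GENERATORS:
    print(name, phi_divergence(rho, nu, name))
```

Each divergence vanishes when the two distributions coincide, which reflects the normalisation $\varphi(1) = 0$.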

ϕ-divergences

Theorem (Bungert, Burger, YK, Schönlieb '20). Under standard assumptions the following convergence rate holds:

$$D_J(u_n, u^\dagger_J) = O\!\left( \frac{\delta_n}{\alpha_n} + \frac{r(\alpha_n)}{\alpha_n} + \eta_n \right).$$

For an optimal choice of $\alpha$ we get
- Kullback-Leibler divergence, $\chi^2$ divergence, Squared Hellinger distance: $D_J(u_n, u^\dagger_J) = O\big(\delta_n^{1/2} + \eta_n\big)$;
- Total variation: $D_J(u_n, u^\dagger_J) = O(\delta_n + \eta_n)$ (exact penalisation).
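The square-root rate can be checked numerically. The sketch below assumes the Kullback-Leibler generator $\varphi(x) = x \log x - x + 1$, whose conjugate is $\varphi^*(x) = e^x - 1$, so that $r(\alpha) = e^\alpha - 1 - \alpha \approx \alpha^2/2$. It ignores constants and the $\eta_n$ term. Balancing $\delta/\alpha$ against $r(\alpha)/\alpha$ then suggests $\alpha \approx \sqrt{2\delta}$. The grid search is a crude illustration, not a parameter choice rule from the talk.

```python
import math


def bound(alpha, delta):
    """delta/alpha + r(alpha)/alpha with r(alpha) = exp(alpha) - 1 - alpha."""
    return delta / alpha + math.expm1(alpha) / alpha - 1.0


def best_alpha(delta, grid_size=20000):
    """Crude grid search for the alpha in (0, 1] minimising the bound."""
    grid = [(k + 1) / grid_size for k in range(grid_size)]
    return min(grid, key=lambda a: bound(a, delta))


for delta in (1e-2, 1e-3, 1e-4):
    a = best_alpha(delta)
    # Both ratios approach sqrt(2) as delta decreases if alpha ~ sqrt(delta).
    print(delta, a / math.sqrt(delta), bound(a, delta) / math.sqrt(delta))
```

For total variation the remainder $r$ vanishes near zero, so no such balancing is needed, which is consistent with the linear rate in $\delta_n$.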

Strongly Coercive Fidelities

Theorem (Bungert, Burger, YK, Schönlieb '20). Suppose that the fidelity function $H$ satisfies

$$\frac{1}{\lambda} \|v - f\|_F^\lambda \le H(v \mid f) \quad \text{for all } v, f \in F,$$

where $\lambda \ge 1$. Then under standard assumptions and for an optimal choice of $\alpha$ the following rate holds:

$$D_J(u_n, u^\dagger_J) = O\big( \delta_n^{1/\lambda} + \eta_n \big).$$

Examples:
- Powers of norms;
- Wasserstein distances (coercive in the Kantorovich-Rubinstein norm).
