SLIDE 1

Statistical Properties of the Regularized Least Squares Functional and a Hybrid LSQR-Newton Method for Finding the Regularization Parameter: Application in Image Deblurring and Signal Restoration

Rosemary Renaut

Midwest Conference on Mathematical Methods for Images and Surfaces

April 18, 2009

SLIDE 2

Outline

1. Motivation
2. Least Squares Problems
3. Statistical Results for Least Squares
4. Implications of Statistical Results for Regularized Least Squares
5. Newton algorithm
6. Algorithm with LSQR (Paige and Saunders)
7. Results
8. Conclusions and Future Work

SLIDE 3

Signal/Image Restoration: Integral Model of Signal Degradation

$$b(t) = \int K(t, s)\, x(s)\, ds$$

K(t, s) describes the blur of the signal. Convolutional model: the shift-invariant kernel K(t, s) = K(t − s) is the Point Spread Function (PSF). Sampling typically includes noise e(t), so the model is

$$b(t) = \int K(t - s)\, x(s)\, ds + e(t)$$

Discrete model: given discrete samples b, find samples x of x(s). Let A discretize K, assumed known; the model is b = Ax + e. Naïvely invert the system to find x!
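To make the failure of naïve inversion concrete, here is a minimal sketch (my own illustration, not code from the talk) that builds a Gaussian-PSF blur matrix, adds a little noise, and attempts the direct inverse; the condition number of A shows why the recovered signal is garbage and regularization is needed.

```python
# Minimal sketch (illustration only): naive inversion of a Gaussian blur.
import numpy as np

n = 128
t = np.linspace(-1, 1, n)
x = (np.abs(t) < 0.3).astype(float)        # simple box signal

sigma_psf = 0.05                           # assumed PSF width
A = np.exp(-(t[:, None] - t[None, :])**2 / (2 * sigma_psf**2))
A /= A.sum(axis=1, keepdims=True)          # row-normalized blur matrix

rng = np.random.default_rng(0)
b = A @ x + 1e-3 * rng.standard_normal(n)  # blurred signal plus small noise

x_naive = np.linalg.solve(A, b)            # "naively invert the system"
print(f"cond(A) = {np.linalg.cond(A):.2e}")              # astronomically large
print(f"||x_naive - x|| = {np.linalg.norm(x_naive - x):.2e}")  # error dwarfs the noise
```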

SLIDE 4

Example: 1-D Original and Blurred Noisy Signal

Figure: original signal x; blurred and noisy signal b, Gaussian PSF.

SLIDE 5

The Solution: Regularization Is Needed

Figure: the naïve solution versus a regularized solution.

SLIDE 6

Least Squares for Ax = b: A Quick Review

Consider discrete systems: A ∈ R^{m×n}, b ∈ R^m, x ∈ R^n, with b = Ax + e.

Classical approach: linear least squares,

$$x_{LS} = \arg\min_x \|Ax - b\|_2^2$$

Difficulty: x_{LS} is sensitive to changes in the right-hand side b when A is ill-conditioned. For convolutional models the system is numerically ill-posed.

SLIDE 7

Introduce Regularization to Pick a Solution

Weighted fidelity with regularization:

$$x_{RLS} = \arg\min_x \{\|b - Ax\|^2_{W_b} + \lambda^2 R(x)\}$$

• W_b is a weighting matrix.
• R(x) is a regularization term.
• λ is a regularization parameter, which is unknown.

The solution x_{RLS}(λ) depends on λ, on the regularization operator R, and on the weighting matrix W_b.

SLIDE 8

Generalized Tikhonov Regularization with Weighting

$$\hat{x} = \arg\min J(x) = \arg\min \{\|Ax - b\|^2_{W_b} + \lambda^2 \|D(x - x_0)\|^2\}. \quad (1)$$

• D is a suitable operator, often a derivative approximation. Assume N(A) ∩ N(D) = {0}.
• x_0 is a reference solution, often x_0 = 0.
• Usually there is error in b: e is an m-vector of random measurement errors with mean 0 and positive definite covariance matrix C_b = E(ee^T). For uncorrelated measurements C_b is a diagonal matrix of the error variances (colored noise); for white noise C_b = σ² I. Given multiple measurements of the data, C_b can be estimated.
• Weighting the data-fit term by W_b = C_b^{-1} makes the weighted errors ẽ uncorrelated, in theory.

Question: given D and W_b, how do we find λ?
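As a sketch of how (1) is solved in practice (my own formulation, assuming W_b is supplied via its square root), the generalized Tikhonov problem is equivalent to an ordinary least squares problem with stacked blocks:

```python
# Minimal sketch: solve (1) by stacking the weighted fidelity and
# regularization blocks into one ordinary least squares problem:
#   min_x || [W_b^{1/2} A; lam*D] x - [W_b^{1/2} b; lam*D*x0] ||_2^2
import numpy as np

def tikhonov_solve(A, b, D, lam, Wb_sqrt=None, x0=None):
    m, n = A.shape
    Wb_sqrt = np.eye(m) if Wb_sqrt is None else Wb_sqrt   # white-noise default
    x0 = np.zeros(n) if x0 is None else x0
    K = np.vstack([Wb_sqrt @ A, lam * D])
    rhs = np.concatenate([Wb_sqrt @ b, lam * (D @ x0)])
    x_hat, *_ = np.linalg.lstsq(K, rhs, rcond=None)
    return x_hat
```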

SLIDE 9

Example: solutions for increasing λ, with D = I.

SLIDE 13

Choice of λ Is Crucial

Different algorithms yield different solutions. Examples:

• Discrepancy Principle
• Generalized Cross Validation (GCV)
• L-Curve
• Unbiased Predictive Risk Estimator (UPRE)

General difficulties:

• Expensive (GCV, L-curve, UPRE)
• Not necessarily a unique solution (GCV)
• Oversmoothing (Discrepancy)
• No kink in the L-curve

A new statistical approach: the χ² result.

SLIDE 14

Background: Statistics of the Least Squares Problem

Theorem (Rao73: First Fundamental Theorem)
Let r be the rank of A, and let b ∼ N(Ax, σ_b² I) (errors in the measurements are normally distributed with mean 0 and covariance σ_b² I). Then

$$J = \min_x \|Ax - b\|^2 \sim \sigma_b^2\, \chi^2(m - r).$$

J follows a χ² distribution with m − r degrees of freedom: basically the Discrepancy Principle.

Corollary (Weighted Least Squares)
For b ∼ N(Ax, C_b) and W_b = C_b^{-1},

$$J = \min_x \|Ax - b\|^2_{W_b} \sim \chi^2(m - r).$$
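The theorem is easy to check numerically; the following small Monte Carlo sketch (my own, not from the talk) confirms that the minimal residual has mean m − r and variance 2(m − r) for unit-variance white noise, i.e. W_b = I.

```python
# Minimal sketch: Monte Carlo check that J = min ||Ax - b||^2 ~ chi^2(m - r).
import numpy as np

rng = np.random.default_rng(1)
m, n = 50, 10                                  # A full rank, so r = n
A = rng.standard_normal((m, n))
x_true = rng.standard_normal(n)

J = []
for _ in range(5000):
    b = A @ x_true + rng.standard_normal(m)    # sigma_b = 1
    _, res, *_ = np.linalg.lstsq(A, b, rcond=None)
    J.append(res[0])                           # squared residual norm

print(np.mean(J), m - n)                       # both approx. 40 = m - r
print(np.var(J), 2 * (m - n))                  # both approx. 80 = 2(m - r)
```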

SLIDE 15

Extension: Statistics of the Regularized Least Squares Problem

Theorem: χ² distribution of the regularized functional (Renaut/Mead 2008)

$$\hat{x} = \arg\min J_D(x) = \arg\min \{\|Ax - b\|^2_{W_b} + \|x - x_0\|^2_{W_D}\}, \qquad W_D = D^T W_x D. \quad (2)$$

Assume:
• W_b and W_x are symmetric positive definite.
• The problem is uniquely solvable: N(A) ∩ N(D) = {0}.
• The Moore-Penrose generalized inverse of W_D is C_D.
• Statistics: (b − Ax) = e ∼ N(0, C_b) and (x − x_0) = f ∼ N(0, C_D), where x_0 is the mean vector of the model parameters.

Then J_D ∼ χ²(m + p − n).

SLIDE 16

Key Aspects of the Proof I: The Functional J

Algebraic simplifications: rewrite the functional as a quadratic form. The regularized solution is given in terms of the resolution matrix R(W_D):

$$\hat{x} = x_0 + (A^T W_b A + D^T W_x D)^{-1} A^T W_b\, r, \qquad r = b - A x_0, \quad (3)$$
$$\hat{x} = x_0 + R(W_D)\, W_b^{1/2}\, r = x_0 + y(W_D), \quad (4)$$
$$R(W_D) = (A^T W_b A + D^T W_x D)^{-1} A^T W_b^{1/2}. \quad (5)$$

The functional is given in terms of the influence matrix A(W_D):

$$A(W_D) = W_b^{1/2} A\, R(W_D), \quad (6)$$
$$J_D(\hat{x}) = r^T W_b^{1/2} (I_m - A(W_D)) W_b^{1/2} r, \qquad \text{let } \tilde{r} = W_b^{1/2} r, \quad (7)$$
$$J_D(\hat{x}) = \tilde{r}^T (I_m - A(W_D))\, \tilde{r}. \qquad \text{A quadratic form.} \quad (8)$$
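A quick numerical sanity check of (3)–(8) (my own sketch, for the simple case W_b = I, D = I, x_0 = 0, so the factor W_b^{1/2} drops out):

```python
# Minimal sketch: verify J_D at the minimizer equals the quadratic form (8).
import numpy as np

rng = np.random.default_rng(2)
m, n = 12, 8
A = rng.standard_normal((m, n))
D = np.eye(n)                                  # W_D = lam^2 I
lam, b, x0 = 0.5, rng.standard_normal(m), np.zeros(n)

r = b - A @ x0                                 # r = b - A x0
H = A.T @ A + lam**2 * D.T @ D
x_hat = x0 + np.linalg.solve(H, A.T @ r)       # equation (3), with W_b = I

JD = np.sum((A @ x_hat - b)**2) + lam**2 * np.sum((D @ (x_hat - x0))**2)
influence = A @ np.linalg.solve(H, A.T)        # influence matrix (6)
quad = r @ (np.eye(m) - influence) @ r         # quadratic form (8)
print(JD, quad)                                # agree to rounding error
```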

SLIDE 17

Key Aspects of the Proof II: Properties of a Quadratic Form

χ² distribution of quadratic forms x^T P x for normal variables (Fisher-Cochran Theorem):

• Suppose the components x_i are independent normal variables, x_i ∼ N(0, 1), i = 1 : n. A necessary and sufficient condition that x^T P x has a central χ² distribution is that P is idempotent, P² = P. In that case the degrees of freedom of the χ² is rank(P) = trace(P) = n.
• When the means of the x_i are µ_i ≠ 0, x^T P x has a non-central χ² distribution with non-centrality parameter c = µ^T P µ.
• A χ² random variable with n degrees of freedom and non-centrality parameter c has mean n + c and variance 2(n + 2c).

SLIDE 18

Key Aspects of the Proof III: Requires the GSVD

Lemma: Assume invertibility and m ≥ n ≥ p. There exist unitary matrices U ∈ R^{m×m}, V ∈ R^{p×p}, and a nonsingular matrix X ∈ R^{n×n} such that

$$A = U \begin{bmatrix} \Upsilon \\ 0_{(m-n)\times n} \end{bmatrix} X^T, \qquad D = V\, [M, \; 0_{p\times(n-p)}]\, X^T, \quad (9)$$

$$\Upsilon = \operatorname{diag}(\upsilon_1, \ldots, \upsilon_p, 1, \ldots, 1) \in R^{n\times n}, \qquad M = \operatorname{diag}(\mu_1, \ldots, \mu_p) \in R^{p\times p},$$

$$0 \le \upsilon_1 \le \cdots \le \upsilon_p \le 1, \qquad 1 \ge \mu_1 \ge \cdots \ge \mu_p > 0, \qquad \upsilon_i^2 + \mu_i^2 = 1, \quad i = 1, \ldots, p. \quad (10)$$

The Functional with the GSVD: let $\tilde{Q} = \operatorname{diag}(\mu_1, \ldots, \mu_p, 0_{n-p}, I_{m-n})$; then

$$J = \tilde{r}^T (I_m - A(W_D))\, \tilde{r} = \|\tilde{Q} U^T \tilde{r}\|_2^2 = \|k\|_2^2.$$

SLIDE 19

Proof IV: Statistical Distribution of the Weighted Residual

Covariance structure:
• Errors in b are e ∼ N(0, C_b). Since b depends on x through b = Ax + e, we can show b ∼ N(Ax_0, C_b + A C_D A^T) (x_0 is the mean of x).
• Residual: r = b − Ax_0 ∼ N(0, C_b + A C_D A^T).
• r̃ = W_b^{1/2} r ∼ N(0, I + Ã C_D Ã^T), with Ã = W_b^{1/2} A.

Use the GSVD:

$$I + \tilde{A} C_D \tilde{A}^T = U Q^{-2} U^T, \qquad Q = \operatorname{diag}(\mu_1, \ldots, \mu_p, I_{n-p}, I_{m-n}).$$

Now let k = Q U^T r̃; then k ∼ N(0, Q U^T (U Q^{-2} U^T) U Q) = N(0, I_m). But J = ‖Q̃ U^T r̃‖² = ‖k̃‖², where k̃ is the vector k excluding components p + 1 : n. Thus J_D ∼ χ²(m + p − n).

SLIDE 20

Corollary: a priori information not the mean value, e.g. x_0 = 0

Corollary (non-central χ² distribution of the regularized functional):

$$\hat{x} = \arg\min J_D(x) = \arg\min \{\|Ax - b\|^2_{W_b} + \|x - x_0\|^2_{W_D}\}, \qquad W_D = D^T W_x D. \quad (11)$$

Make all the assumptions as before, except that x_0 is no longer the mean vector x̄ of the model parameters. Let

$$c = \|\mathbf{c}\|_2^2 = \|\tilde{Q} U^T W_b^{1/2} A(\bar{x} - x_0)\|_2^2.$$

Then J_D ∼ χ²(m + p − n, c).

SLIDE 21

Implications of the Result

Statistical distribution of the functional: the mean and variance are prescribed,

$$E(J_D) = m + p - n + c, \qquad \operatorname{Var}(J_D) = 2(m + p - n) + 4c.$$

Can we use this? YES:
• Try to find W_D so that E(J_D) = m − n + p + c.
• In particular, find W_D = λ² I.

SLIDE 22

What Do We Need to Apply the Theory?

Requirements:
• Covariance information C_b on the data parameters b (or on the model parameters x!).
• A priori information: x_0 should be the mean x̄. But x̄ (and hence x_0) is not known.

For repeated data measurements C_b can be calculated, and the mean b̄ of b can be found. But E(b) = A E(x) implies b̄ = A x̄. Hence

$$c = \|\mathbf{c}\|_2^2 = \|\tilde{Q} U^T W_b^{1/2} (\bar{b} - A x_0)\|_2^2,$$

$$E(J_D) = E(\|\tilde{Q} U^T W_b^{1/2} (b - A x_0)\|_2^2) = m + p - n + \|\mathbf{c}\|_2^2.$$

Then we can use E(J_D) to find λ.

SLIDE 23

DESIGNING THE ALGORITHM: I

Assume x_0 is the mean (experimentalists do know something about the model parameters).

Recall: if C_b and C_x are good estimates of the covariance, then |J_D(x̂) − (m + p − n)| should be small. Thus, letting m̃ = m + p − n, we want

$$\tilde{m} - \sqrt{2\tilde{m}}\, z_{\alpha/2} < J(x(W_D)) < \tilde{m} + \sqrt{2\tilde{m}}\, z_{\alpha/2}, \quad (12)$$

where z_{α/2} is the relevant z-value for a χ² distribution with m̃ degrees of freedom.

GOAL: find W_D to make (12) tight. Single-variable case: find λ such that J_D(x̂(λ)) ≈ m̃.
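As a sketch (my own, using the normal approximation for the z-value the slide references), the target interval (12) is straightforward to compute:

```python
# Minimal sketch: the target interval (12) for J_D(x(W_D)).
from scipy.stats import norm

def chi2_target_interval(m, n, p, alpha=0.05):
    m_tilde = m + p - n                    # degrees of freedom of J_D
    z = norm.ppf(1 - alpha / 2)            # z_{alpha/2}, about 1.96 at 5%
    half = (2 * m_tilde) ** 0.5 * z
    return m_tilde - half, m_tilde + half

print(chi2_target_interval(m=512, n=512, p=511))   # e.g. the size-512 tests
```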

SLIDE 24

A Newton Line-Search Algorithm to Find λ (Basic Algebra)

Use Newton's method to solve F(σ) = J_D(σ) − m̃ = 0. We use σ = 1/λ, and y(σ^{(k)}) is the current solution, for which x(σ^{(k)}) = y(σ^{(k)}) + x_0. Then

$$\frac{\partial}{\partial \sigma} J(\sigma) = -\frac{2}{\sigma^3} \|D y(\sigma)\|^2 < 0.$$

Hence we have a basic Newton iteration

$$\sigma^{(k+1)} = \sigma^{(k)} \left(1 + \frac{1}{2} \left(\frac{\sigma^{(k)}}{\|D y\|}\right)^2 (J_D(\sigma^{(k)}) - \tilde{m})\right).$$

Add a line search:

$$\sigma^{(k+1)} = \sigma^{(k)} \left(1 + \frac{\alpha^{(k)}}{2} \left(\frac{\sigma^{(k)}}{\|D y\|}\right)^2 (J_D(\sigma^{(k)}) - \tilde{m})\right).$$
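A sketch of the iteration in code (my reading of the slide's update formula; `solve_tikhonov` is a hypothetical helper that returns the update y(σ) and the functional value J_D(σ)):

```python
# Minimal sketch: Newton with line search on F(sigma) = J_D(sigma) - m_tilde.
import numpy as np

def newton_sigma(solve_tikhonov, D, m_tilde, sigma0, alpha=1.0,
                 tol=1e-6, maxit=50):
    # solve_tikhonov(sigma) -> (y, JD): hypothetical helper returning the
    # update y = x(sigma) - x0 and the functional value J_D(sigma).
    sigma = sigma0
    for _ in range(maxit):
        y, JD = solve_tikhonov(sigma)
        F = JD - m_tilde
        if abs(F) < tol:
            break
        t = np.linalg.norm(D @ y) / sigma      # t = ||D y|| / sigma
        sigma *= 1 + alpha * 0.5 * F / t**2    # the slide's update formula
    return sigma
```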

SLIDE 25

Discussion of Convergence

F is monotonically decreasing: F′(σ) = −(2/σ³)‖Dy‖²₂ < 0. Hence:

• Either a solution exists and is unique for positive σ,
• or no solution exists, F(0) < 0, which indicates incorrect statistics in the model.

Theoretically, lim_{σ→∞} F > 0 is also possible. This is equivalent to λ = 0: no regularization is needed.

SLIDE 26

Practical Details of the Algorithm: Finding the Parameter

Step 1: Bracket the root by a logarithmic search on σ, to handle the asymptotes: yields σ_max and σ_min.

Step 2: Calculate the step, with steepness controlled by tol_D. Let t = ‖Dy‖/σ^{(k)}, where y is the current update; then

$$\text{step} = \frac{1}{2}\left(\frac{1}{\max\{t, \text{tol}_D\}}\right)^2 (J_D(\sigma^{(k)}) - \tilde{m}).$$

Step 3: Introduce the line search α^{(k)} in Newton:

$$\sigma_{\text{new}} = \sigma^{(k)}(1 + \alpha^{(k)}\, \text{step}),$$

with α^{(k)} chosen such that σ_new stays within the bracket.
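A sketch of Step 1 (same hypothetical `solve_tikhonov` helper as in the Newton sketch above):

```python
# Minimal sketch: logarithmic search bracketing the root of F(sigma).
import numpy as np

def bracket_root(solve_tikhonov, m_tilde, lo=1e-8, hi=1e8, per_decade=2):
    F_prev = s_prev = None
    for s in np.logspace(np.log10(lo), np.log10(hi),
                         int(per_decade * np.log10(hi / lo)) + 1):
        _, JD = solve_tikhonov(s)
        F = JD - m_tilde
        if F_prev is not None and F * F_prev < 0:
            return s_prev, s              # sign change: sigma_min, sigma_max
        F_prev, s_prev = F, s
    return None                           # no sign change: the F(0) < 0 case
```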

SLIDE 27

Practical Details of the Algorithm: Large-Scale Problems

Algorithm initialization:
• Convert the generalized Tikhonov problem to standard form. (If L is not invertible, you just need to know how to compute Ax and A^T x, and the null space of L.)
• Use the LSQR algorithm to find the bidiagonal matrix for the projected problem.
• Obtain a solution of the bidiagonal problem for the given initial σ.

Subsequent steps:
• Increase the dimension of the space if needed, reusing the existing bidiagonalization. A smaller system may also be used if appropriate.
• Each σ update of the algorithm reuses saved information from the Lanczos bidiagonalization.
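For context, a sketch of the simplest variant (not the talk's method, which reuses one Lanczos bidiagonalization across all σ values): SciPy's LSQR already accepts a Tikhonov damping parameter, so each trial λ = 1/σ can be evaluated with a damped solve in the standard-form case D = I, x_0 = 0:

```python
# Minimal sketch: evaluate J_D(lambda) with SciPy's damped LSQR
# (standard-form Tikhonov: D = I, x0 = 0, W_b = I).
import numpy as np
from scipy.sparse.linalg import lsqr

def JD_of_lambda(A, b, lam):
    x = lsqr(A, b, damp=lam)[0]            # min ||Ax - b||^2 + lam^2 ||x||^2
    JD = np.sum((A @ x - b)**2) + lam**2 * np.sum(x**2)
    return x, JD
```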

SLIDE 28

Comparison with the Standard LSQR Hybrid Algorithm

This algorithm concurrently regularizes and solves the system; the standard hybrid LSQR solves the projected system and then adds regularization.

Advantages and costs:
• Needs only the cost of the standard LSQR algorithm, with some updates to the solution solves for each iterated σ.
• The regularization introduced by the LSQR projection may be useful for preventing problems with the GSVD expansion.
• Makes the algorithm viable for large-scale problems.

SLIDE 29

Illustrating the Results for Problem Size 512: Two Standard Test Problems

Figure: comparison for noise level 10%. On the left D = I; on the right D is the first-derivative operator.

Notice that the L-curve and χ²-LSQR perform well; UPRE does not.

SLIDE 30

Real Data: Seismic Signal Restoration

The data set and goal:
• Real data set of 48 signals of length 3000. The point spread function is derived from the signals. The signal variance is calculated pointwise over all 48 signals.
• Goal: restore the signal x from Ax = b, where A is the PSF matrix and b is a given blurred signal.
• Method of comparison (no exact solution is known): use convergence with respect to downsampling.

SLIDE 31

Comparison: High Resolution, White Noise

Greater contrast with χ². UPRE is insufficiently regularized. The L-curve severely undersmooths (not shown). Parameters are not consistent across resolutions.

SLIDE 32

THE UPRE SOLUTION: x_0 = 0, White Noise

Regularization parameters are consistent: σ = 0.01005 at all resolutions.

SLIDE 33

THE LSQR HYBRID SOLUTION: White Noise

Regularization is quite consistent from resolution 2 to 100: σ = 0.0000029, 0.0000029, 0.0000029, 0.0000057, 0.0000057.

SLIDE 34

Illustrating the Deblurring Result: Problem Size 65536

Example taken from RestoreTools (Nagy et al., 2007-8). The computational cost is minimal: the projected problem size is 15, with λ = 0.58.

SLIDE 35

Problem Grain: 15% noise added, for increasing subproblem size

Signal-to-noise ratio is 10 log₁₀(1/e) for relative error e. The regularization parameter σ is reported in each case.

SLIDE 36

Conclusions

Observations:
• A new statistical method for estimating the regularization parameter.
• Compares favorably with UPRE in performance, and also compared to the L-curve (GCV is not competitive).
• The method can be used for large-scale problems. (Note that the problem size used here is very small.)
• The method is very efficient; the Newton iteration is robust and fast.
• But a priori information is needed. This can be obtained directly from the data, e.g. using local statistical information of the image.

SLIDE 37

Future Work

Other results and future work:
• Preconditioning
• A software package!
• Diagonal weighting schemes
• Edge-preserving regularization: total variation
