
The Total Least Squares Problem with Multiple Right-Hand Sides $AX \approx B$
Martin Plešinger, in collaboration with Iveta Hnětynková, Diana Maria Sima, Zdeněk Strakoš, and Sabine Van Huffel
FP & FM TU Liberec, ICS AS CR, MFF CU Prague, SAM ETH Zurich, ESAT KU Leuven, ...
PANM 16, June 7th, 2012

Outline
Motivation
I. TLS with single right-hand side
  I.1 Full column rank case
  I.2 An example of rank deficient case
  I.3 Core problem reduction
II. TLS with multiple right-hand sides
  II.1 Problem formulation
  II.2 Analysis by Van Huffel & Vandewalle
  II.3 Complete classification
  II.4 Core problem: SVD-based reduction
  II.5 Generalized Golub-Kahan algorithm
  II.6 TLS solution of a core problem
Summary

Motivation (single right-hand side case)
Consider a simple problem
$$A = [t_1, t_2, \ldots, t_m]^T, \quad b = [\ell_1, \ell_2, \ldots, \ell_m]^T, \quad x = v, \quad Ax \approx b,$$
where $\ell_j$ are distances measured at $m$ times $t_j$ and $v$ is an unknown (constant) velocity. Since the measured distances (and also the times) contain errors, the problem is not compatible, $b \notin \mathcal{R}(A)$, and does not have a solution in the classical sense. The goal is to approximate $v$ using some minimization technique, e.g., (ordinary) least squares.

Ordinary Least Squares
Let $Ax \approx b$, where $A \in \mathbb{R}^{m \times n}$, $x \in \mathbb{R}^{n}$, $b \in \mathbb{R}^{m}$. Then
$$x_{LS} \equiv \arg\min_{x \in \mathbb{R}^{n}} \| b - Ax \|,$$
the vector minimizing the residual, is called the least squares (LS) solution. Alternatively,
$$\min_{g \in \mathbb{R}^{m}} \| g \| \quad \text{s.t.} \quad Ax = b + g \quad (\text{or } b + g \in \mathcal{R}(A)).$$
If the LS solution is not unique, then the minimum norm one is considered, $x_{LS} = A^{\dagger} b$.
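The LS solution of the velocity problem can be sketched numerically as follows. This is only an illustration: the times and distances are made-up values, and the variable names are not from the slides. It computes $x_{LS}$ both with a least squares solver and with the pseudoinverse, and checks that the residual is orthogonal to $\mathcal{R}(A)$.

```python
import numpy as np

# Hypothetical measurements (made-up data): times t_j [s], distances l_j [m].
t = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
ell = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

A = t.reshape(-1, 1)   # m x 1 system matrix
b = ell                # right-hand side

# LS solution from a dedicated solver (minimum norm solution if not unique).
x_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)

# The same solution via the Moore-Penrose pseudoinverse, x_LS = A^+ b.
x_pinv = np.linalg.pinv(A) @ b

# The LS residual b - A x_LS is orthogonal to the range of A.
residual = b - A @ x_lstsq
print("velocity estimate:", x_lstsq[0])
print("A^T r:", A.T @ residual)
```

For a single column the result coincides with the closed form $t^T \ell / t^T t$.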

LS leads to minimization of the sum of squares of "vertical" distances:
[Figure: plot with vertical axis ℓ, s [m] and horizontal axis t [s], both ranging from 0 to 16.]

The errors may be in both the $\ell_j$ and the $t_j$; let us try, e.g.:
[Figure: plot with vertical axis ℓ, s [m] and horizontal axis t [s], both ranging from 0 to 16.]

Different Least Squares Approaches
Besides the ordinary least squares (LS, OLS), also called linear regression,
$$\min_{g \in \mathbb{R}^{m}} \| g \| \quad \text{s.t.} \quad Ax = b + g \quad (\text{or } b + g \in \mathcal{R}(A)),$$
one can use the data least squares (DLS),
$$\min_{E \in \mathbb{R}^{m \times n}} \| E \|_F \quad \text{s.t.} \quad (A + E)x = b \quad (\text{or } b \in \mathcal{R}(A + E)),$$
or the total least squares (TLS), also called orthogonal regression or errors-in-variables (EIV) modeling,
$$\min_{g \in \mathbb{R}^{m},\, E \in \mathbb{R}^{m \times n}} \| [g, E] \|_F \quad \text{s.t.} \quad (A + E)x = b + g \quad (\text{or } b + g \in \mathcal{R}(A + E)).$$

Scaling
All three approaches can be unified using the scaled TLS (ScTLS)
$$\min_{g \in \mathbb{R}^{m},\, E \in \mathbb{R}^{m \times n}} \| [g\gamma, E] \|_F \quad \text{s.t.} \quad (A + E)x = b + g,$$
where $\gamma \in (0, \infty)$. The ScTLS problem:
  with $\gamma = 1$ leads to TLS,
  for $\gamma \to 0$ tends to ordinary LS,
  for $\gamma \to \infty$ tends to DLS.
See [Paige, Strakoš, 2002a, 2002b].
Scaling of individual columns of $A$ by $S = \mathrm{diag}(s_1, \ldots, s_n)$, $s_i > 0$, and weighting of individual rows by $W = \mathrm{diag}(w_1, \ldots, w_m)$, $w_j > 0$, is also possible. Instead of $Ax \approx b$ we solve $(WAS)y \approx Wb$ with $x = Sy$; see [Golub, Van Loan, 1980].
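The limiting behaviour of ScTLS can be checked with a small numerical sketch. It rests on two assumptions that are not stated on the slide: the constraint is rewritten as $(A + E)(x\gamma) = b\gamma + g\gamma$, so that ScTLS becomes an ordinary TLS problem for $A y \approx b\gamma$ with $x = y/\gamma$, and that TLS problem is solved from the right singular vector of $[b\gamma \mid A]$ belonging to the smallest singular value (generic case only). The data are made up, and the DLS reference value uses the closed form $b^T b / a^T b$, which holds only for a single column $a$.

```python
import numpy as np

def scaled_tls(A, b, gamma):
    """ScTLS solution via TLS on A y ~ gamma*b, x = y / gamma.

    Sketch for the generic case: assumes the first component of the
    right singular vector for the smallest singular value is nonzero.
    """
    C = np.column_stack([gamma * b, A])
    _, _, Vt = np.linalg.svd(C, full_matrices=False)
    v = Vt[-1, :]              # right singular vector for the smallest sigma
    y = -v[1:] / v[0]          # TLS solution of A y ~ gamma*b
    return y / gamma

# Hypothetical measurements (made-up data).
t = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
ell = np.array([2.1, 3.9, 6.2, 7.8, 10.1])
A, b = t.reshape(-1, 1), ell

x_ls = (t @ ell) / (t @ t)       # ordinary LS, closed form for one column
x_dls = (ell @ ell) / (t @ ell)  # DLS, closed form for one column
x_tls = scaled_tls(A, b, 1.0)[0]
x_small = scaled_tls(A, b, 1e-4)[0]   # gamma -> 0
x_large = scaled_tls(A, b, 1e4)[0]    # gamma -> infinity

print("LS :", x_ls, " ScTLS(gamma=1e-4):", x_small)
print("TLS:", x_tls)
print("DLS:", x_dls, " ScTLS(gamma=1e4): ", x_large)
```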

I. TLS problem (single right-hand side case)
I.1 Full column rank case
Consider a linear approximation problem
$$Ax \approx b, \quad \text{or equivalently} \quad [\, b \mid A \,] \begin{bmatrix} -1 \\ x \end{bmatrix} \approx 0,$$
where $A \in \mathbb{R}^{m \times n}$, $b \in \mathbb{R}^{m}$, $A^T b \neq 0$, $b \notin \mathcal{R}(A)$, $\mathrm{rank}(A) = n$, thus $m \geq n + 1$.
Total least squares (TLS) problem:
$$\min_{g,\, E} \| [\, g \mid E \,] \|_F \quad \text{subject to} \quad (A + E)x = b + g.$$
(If there is more than one solution, we look for the minimum norm one.)

Thus we look for a correction matrix $[\, g \mid E \,]$ such that $[\, b + g \mid A + E \,]$ is:
1) rank deficient, and
2) its null space contains a vector with a nonzero first component,
$$[\, b + g \mid A + E \,] \begin{bmatrix} -1 \\ x \end{bmatrix} = 0.$$
The minimal rank-reducing correction can be obtained from the singular value decomposition (SVD) of the matrix $[\, b \mid A \,]$.

The minimal rank-reducing correction
Consider the singular value decomposition (SVD),
$$[\, b \mid A \,] = U \Sigma V^T = \sum_{j=1}^{n+1} u_j \sigma_j v_j^T.$$
Then
$$[\, g \mid E \,] = -u_{n+1} \sigma_{n+1} v_{n+1}^T, \qquad \| [\, g \mid E \,] \|_F = \sigma_{n+1},$$
is the minimal rank-reducing correction. Since $V^T = V^{-1}$,
$$[\, b + g \mid A + E \,] \, v_{n+1} = 0.$$
Denote $v_{n+1} = [\nu, w^T]^T$. If $\nu \neq 0$, then
$$[\, b + g \mid A + E \,] \begin{bmatrix} -1 \\ -\frac{1}{\nu} w \end{bmatrix} = 0, \quad \text{and} \quad x_{TLS} = -\frac{1}{\nu} w.$$
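The construction above translates almost line by line into NumPy. The following is a minimal sketch for the generic case $\nu \neq 0$ only; the function name, the tolerance, and the data are illustrative assumptions, not part of the slides.

```python
import numpy as np

def tls_single_rhs(A, b, tol=1e-12):
    """Basic TLS solution of A x ~ b from the SVD of [b | A].

    Returns x_TLS, the correction [g | E], and sigma_{n+1}.
    Sketch for the generic case only: raises if nu is (numerically) zero.
    Assumes m >= n + 1.
    """
    C = np.column_stack([b, A])                        # [b | A], m x (n+1)
    U, s, Vt = np.linalg.svd(C, full_matrices=False)
    v = Vt[-1, :]                                      # v_{n+1}
    nu, w = v[0], v[1:]
    if abs(nu) < tol:
        raise ValueError("nu = 0: no basic TLS solution (nongeneric case)")
    x = -w / nu                                        # x_TLS = -(1/nu) w
    correction = -s[-1] * np.outer(U[:, -1], v)        # -u sigma v^T
    return x, correction, s[-1]

# Hypothetical measurements (made-up data).
t = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
ell = np.array([2.1, 3.9, 6.2, 7.8, 10.1])
A, b = t.reshape(-1, 1), ell

x_tls, corr, sigma = tls_single_rhs(A, b)
g, E = corr[:, 0], corr[:, 1:]

# The corrected system is compatible, and the correction has norm sigma_{n+1}.
print("x_TLS:", x_tls)
print("compatibility error:", np.linalg.norm((A + E) @ x_tls - (b + g)))
print("||[g | E]||_F:", np.linalg.norm(corr), " sigma_{n+1}:", sigma)
```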

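Uniqueness
If $\sigma_n = \sigma_{n+1}$, then the smallest singular value is not simple. Let $q + 1$ be the multiplicity of $\sigma_{n+1}$, i.e., $q \geq 0$ and
$$\sigma_{n-q} > \sigma_{n-q+1} = \ldots = \sigma_{n+1}.$$
Consider the partitioning
$$V \equiv \begin{bmatrix} V_{11}^{(q)} & V_{12}^{(q)} \\ V_{21}^{(q)} & V_{22}^{(q)} \end{bmatrix},$$
where the block columns have $n - q$ and $q + 1$ columns, and the block rows have $1$ and $n$ rows, respectively.
If $\sigma_1 = \sigma_{n+1}$, then $\sigma_{n-q}$, $V_{11}^{(q)}$, $V_{21}^{(q)}$ are nonexistent.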

Classical results
• If $V_{12}^{(q)} \neq 0$ with $q = 0$, then the TLS problem has the unique (basic) solution.
• If $V_{12}^{(q)} \neq 0$ with $q > 0$, then the TLS problem has infinitely many solutions; the goal is to find the minimum norm solution.
• If $V_{12}^{(q)} = 0$, then the TLS problem does not have a solution (but the TLS concept can be extended to the so-called nongeneric solution; the classical TLS algorithm).
See [Golub, Van Loan, 1980], [Van Huffel, Vandewalle, 1991].
Recall that here $V_{12}^{(q)} \in \mathbb{R}^{1 \times (q+1)}$.
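The three cases can be distinguished numerically, at least in a rough way. The sketch below is an assumption-laden illustration: in floating point, both the multiplicity $q + 1$ and the test $V_{12}^{(q)} = 0$ depend on a tolerance (here a hypothetical `tol`), so the verdict is only as reliable as that threshold. The second example matrix is made up so that the smallest singular vector of $[b \mid A]$ has a zero first component.

```python
import numpy as np

def classify_tls(A, b, tol=1e-10):
    """Return (q, verdict) for the single right-hand side TLS problem.

    q + 1 is the numerical multiplicity of the smallest singular value of
    [b | A]; V12 is the first row of the last q + 1 columns of V.
    """
    C = np.column_stack([b, A])
    _, s, Vt = np.linalg.svd(C)           # Vt is (n+1) x (n+1)
    V = Vt.T
    # Count singular values numerically equal to the smallest one.
    q = int(np.sum(s - s[-1] <= tol * max(s[0], 1.0))) - 1
    V12 = V[0, V.shape[1] - (q + 1):]
    if np.linalg.norm(V12) <= tol:
        return q, "no TLS solution (nongeneric case)"
    if q == 0:
        return q, "unique TLS solution"
    return q, "infinitely many TLS solutions"

# Generic example with made-up data.
A1 = np.array([[1.0], [2.0], [3.0], [4.0]])
b1 = np.array([1.1, 1.9, 3.2, 3.9])
print(classify_tls(A1, b1))

# Made-up nongeneric example: the smallest singular value 0.1 of [b | A]
# belongs to the right singular vector e_3, whose first component is zero.
A2 = np.array([[1.0, 0.0], [0.0, 0.1], [0.0, 0.0]])
b2 = np.array([1.0, 0.0, 1.0])
print(classify_tls(A2, b2))
```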

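I.2 An example of rank deficient case
Consider the incompatible problem with a rank deficient system matrix
$$\begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} \xi_1 \\ \xi_2 \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \end{bmatrix}.$$
With the correction
$$[\, g \mid E \,] = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & \varepsilon \end{bmatrix}, \quad \varepsilon \neq 0,$$
the problem becomes compatible,
$$\begin{bmatrix} 1 & 0 \\ 0 & \varepsilon \end{bmatrix} \begin{bmatrix} \xi_1 \\ \xi_2 \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \end{bmatrix}.$$
Obviously $\| [\, g \mid E \,] \|_F = |\varepsilon|$, which can be made arbitrarily small but never zero, thus there is no minimal correction! The problem with a rank deficient system matrix does not have a solution; see [Van Huffel, Vandewalle, 1991].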
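A short numerical illustration of this example, using a few arbitrarily chosen values of $\varepsilon$: as the correction norm shrinks, the solution of the corrected system $\xi = [1, 1/\varepsilon]^T$ grows without bound, so no minimizer is attained.

```python
import numpy as np

A = np.array([[1.0, 0.0],
              [0.0, 0.0]])
b = np.array([1.0, 1.0])

results = []
for eps in [1e-1, 1e-3, 1e-6]:         # arbitrary illustrative values
    E = np.array([[0.0, 0.0],
                  [0.0, eps]])         # correction of A; g = 0
    xi = np.linalg.solve(A + E, b)     # corrected system is compatible
    results.append((eps, np.linalg.norm(E, "fro"), xi))
    print(f"eps={eps:g}  ||[g|E]||_F={np.linalg.norm(E, 'fro'):g}  xi={xi}")
```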

I.3 Core problem reduction
Consider the TLS formulation
$$\min \| [\, g \mid E \,] \|_F \quad \text{s.t.} \quad (A + E)x = b + g$$
and an orthogonal transformation
$$\tilde{A} \tilde{x} \equiv (P^T A Q)(Q^T x) \approx (P^T b) \equiv \tilde{b},$$
where $P^T = P^{-1}$, $Q^T = Q^{-1}$. Because
$$\| [\, g \mid E \,] \|_F = \left\| P^T [\, g \mid E \,] \begin{bmatrix} 1 & 0 \\ 0 & Q \end{bmatrix} \right\|_F = \| [\, P^T g \mid P^T E Q \,] \|_F \equiv \| [\, \tilde{g} \mid \tilde{E} \,] \|_F,$$
the TLS formulation is orthogonally invariant.
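The invariance of the Frobenius norm can be verified numerically. In this sketch the matrix standing for $[\, g \mid E \,]$ is random, and the orthogonal matrices $P$, $Q$ are obtained from QR factorizations of random matrices; all of these are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 5, 3

G = rng.standard_normal((m, n + 1))               # stands for [g | E]
P, _ = np.linalg.qr(rng.standard_normal((m, m)))  # random orthogonal P
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))  # random orthogonal Q

# Block diagonal matrix diag(1, Q) acting on the columns of [g | E].
Q_ext = np.eye(n + 1)
Q_ext[1:, 1:] = Q

G_tilde = P.T @ G @ Q_ext                         # [P^T g | P^T E Q]

norm_before = np.linalg.norm(G, "fro")
norm_after = np.linalg.norm(G_tilde, "fro")
print(norm_before, norm_after)
```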

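Let $P$, $Q$ be orthogonal matrices such that
$$P^T [\, b \mid A \,] \begin{bmatrix} 1 & 0 \\ 0 & Q \end{bmatrix} = [\, P^T b \mid P^T A Q \,] = \left[ \begin{array}{c|cc} b_1 & A_{11} & 0 \\ 0 & 0 & A_{22} \end{array} \right] \equiv [\, \tilde{b} \mid \tilde{A} \,].$$
The original problem is decomposed into independent subproblems
$$A_{11} x_1 \approx b_1, \quad \text{and} \quad A_{22} x_2 \approx 0,$$
where the second has a solution $x_2 = 0$, and
$$x = Q \tilde{x} = Q \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = Q \begin{bmatrix} x_1 \\ 0 \end{bmatrix}.$$
The solution of the original problem is fully determined by the solution of the first subproblem.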

Theorem (Core problem)
For any $A$, $b$, $A^T b \neq 0$, $b \notin \mathcal{R}(A)$ there exist orthogonal matrices $P$, $Q$ such that
$$P^T [\, b \mid A \,] \begin{bmatrix} 1 & 0 \\ 0 & Q \end{bmatrix} = [\, P^T b \mid P^T A Q \,] = \left[ \begin{array}{c|cc} b_1 & A_{11} & 0 \\ 0 & 0 & A_{22} \end{array} \right],$$
and:
(P1) $A_{11}$ is of full column rank.
(P2) $A_{11}$ has simple singular values, and $b_1$ has nonzero projections onto all (one-dimensional) left singular vector subspaces of $A_{11}$.
• $[\, b_1 \mid A_{11} \,]$ is of full row rank.
• The subproblem $A_{11} x_1 \approx b_1$, called the core problem, has minimal dimensions.
• The subproblem $A_{11} x_1 \approx b_1$ has the unique TLS solution.

Let $x_1$ be the unique TLS solution of $A_{11} x_1 \approx b_1$.
If the original problem has a TLS solution ($V_{12}^{(q)} \neq 0$), then the vector
$$x \equiv Q \begin{bmatrix} x_1 \\ 0 \end{bmatrix} \qquad (1)$$
is identical to the minimum norm solution (which is unique for $q = 0$).
If the original problem does not have a TLS solution ($V_{12}^{(q)} = 0$), then (1) is identical to the (minimum norm) nongeneric solution (introduced in [Van Huffel, Vandewalle, 1991]).
See [Paige, Strakoš, 2006] (also [Hnětynková, Strakoš, 2007], or [Hnětynková, P., Sima, Strakoš, Van Huffel, 2011]).

Construction of the core problem
The core problem can be obtained in three steps:
Step 1: Decomposition of the system matrix $A$
Step 2: Transformation of the modified right-hand side
Step 3: Final permutation

Step 1: Decomposition of the system matrix $A$
Consider the singular value decomposition (SVD) of $A$,
$$A = U' \Sigma' V'^T, \qquad \Sigma' = \mathrm{diag}(\varsigma'_1 I_{m_1}, \varsigma'_2 I_{m_2}, \ldots, \varsigma'_k I_{m_k}, 0) \in \mathbb{R}^{m \times n},$$
where $\varsigma'_1 > \varsigma'_2 > \ldots > \varsigma'_k > 0$. The original problem is transformed to
$$[\, b \mid A \,] \longrightarrow U'^T [\, b \mid A V' \,] = [\, \tilde{b} \mid \Sigma' \,], \quad \text{where} \quad \tilde{b} \equiv U'^T b.$$
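Step 1 can be sketched with a full SVD in NumPy. The data below are random and purely illustrative; a random matrix generically has simple singular values, so all multiplicities $m_j$ equal one here and the grouping into blocks $\varsigma'_j I_{m_j}$ is trivial.

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 6, 3
A = rng.standard_normal((m, n))    # hypothetical system matrix
b = rng.standard_normal(m)         # hypothetical right-hand side

# Full SVD: U is m x m, Vt is n x n, s holds the singular values of A.
U, s, Vt = np.linalg.svd(A)

b_tilde = U.T @ b                  # transformed right-hand side U'^T b
Sigma = U.T @ A @ Vt.T             # U'^T A V', an m x n "diagonal" matrix

# Reference: singular values on the diagonal, zeros elsewhere.
Sigma_expected = np.zeros((m, n))
Sigma_expected[:n, :n] = np.diag(s)

print(np.round(Sigma, 12))
print(b_tilde)
```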

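Step 2: Transformation of the right-hand side $\tilde{b}$
Split $\tilde{b}$ horizontally with respect to the multiplicities of the singular values of $A$,
$$[\, \tilde{b} \mid \Sigma' \,] = \left[ \begin{array}{c|ccccc} \tilde{b}_1 & \varsigma'_1 I_{m_1} & & & & 0 \\ \tilde{b}_2 & & \varsigma'_2 I_{m_2} & & & 0 \\ \vdots & & & \ddots & & \vdots \\ \tilde{b}_k & & & & \varsigma'_k I_{m_k} & 0 \\ \tilde{b}_{k+1} & 0 & 0 & \cdots & 0 & 0 \end{array} \right].$$
There exist Householder transformation matrices $H_j$ such that
$$H_j^T \tilde{b}_j = \begin{bmatrix} \| \tilde{b}_j \|_2 \\ 0 \\ \vdots \\ 0 \end{bmatrix} \in \mathbb{R}^{m_j}, \quad j = 1, \ldots, k + 1,$$
where $m_{k+1} \equiv m - \mathrm{rank}(A)$.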
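One possible way to build such a Householder matrix is sketched below. The function name and the test vector are illustrative assumptions. The reflection vector is chosen as $u = \tilde{b}_j - \|\tilde{b}_j\|_2 e_1$, which yields a nonnegative first entry as on the slide; production codes often choose the opposite sign to avoid cancellation, which this sketch does not attempt.

```python
import numpy as np

def householder_to_e1(v):
    """Return the Householder matrix H with H^T v = ||v||_2 e_1.

    H = I - 2 u u^T / (u^T u) with u = v - ||v||_2 e_1 is symmetric and
    orthogonal, so H^T = H. If v is already a nonnegative multiple of
    e_1 (including v = 0), the identity is returned.
    """
    v = np.asarray(v, dtype=float)
    u = v.copy()
    u[0] -= np.linalg.norm(v)
    uu = u @ u
    if uu == 0.0:
        return np.eye(len(v))
    return np.eye(len(v)) - 2.0 * np.outer(u, u) / uu

b_j = np.array([3.0, 4.0, 0.0, 12.0])   # hypothetical block of b-tilde
H = householder_to_e1(b_j)
transformed = H.T @ b_j
print(transformed)
```

Applying one such reflection per block $\tilde{b}_j$ leaves the corresponding block $\varsigma'_j I_{m_j}$ unchanged when the same reflection is applied to its columns, since $H_j^T (\varsigma'_j I_{m_j}) H_j = \varsigma'_j I_{m_j}$.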
