Rate-Optimal Perturbation Bounds for Singular Subspaces with - PowerPoint PPT Presentation

Rate-Optimal Perturbation Bounds for Singular Subspaces with Applications to High-Dimensional Statistics Anru Zhang Department of Statistics University of Wisconsin – Madison

Introduction
• Focus: singular value decomposition (SVD)
  $X = U \Sigma_1 V^\top + U_\perp \Sigma_2 V_\perp^\top$
• Due to perturbation, $\hat X = X + Z$, the SVD is altered to
  $\hat X = \hat U \hat\Sigma_1 \hat V^\top + \hat U_\perp \hat\Sigma_2 \hat V_\perp^\top$.
Anru Zhang (UW-Madison), Perturbation Bounds for Singular Subspaces

Introduction
Small perturbation + large signal → $\hat V$ close to $V$ (or $\hat U$ close to $U$).
• Problem: perturbation bounds on singular subspaces
  ◮ How can we quantify the difference between $\hat V$ and $V$ (or $\hat U$ and $U$)?
  ◮ Are there upper bounds for the difference?
  ◮ Are $U$ and $\hat U$, $V$ and $\hat V$ equally different?
• Motivation: spectral methods, which have been used in a wide range of modern high-dimensional statistical problems, rely on this property.

Application 1: Low-rank Matrix Denoising
$\hat X = X + Z$, where $X$ is approximately rank-$r$ and $Z$ has i.i.d. sub-Gaussian$(0, \sigma^2)$ entries.
• Target: $X$, $U$ or $V$.
• Specific applications
  ◮ Magnetic Resonance Imaging (MRI) (Candès, Sing-Long and Trzasko, 2012);
  ◮ Relaxometry (Bydder and Du, 2006)
• Natural estimators for $U$, $V$: $\hat U$, $\hat V$, the first $r$ singular vectors of $\hat X$.
• Q: How do $\hat U$, $\hat V$ perform, respectively?

Application 2: High-dimensional Clustering
• Observe $n$ points $X_1, \ldots, X_n \in \mathbb{R}^p$, $p \ge n$.
• Each point belongs to one of two classes (Jin, Ke and Wang, 2015):
  $X_i = \mu l_i + \varepsilon_i \in \mathbb{R}^p$, $\varepsilon_i \overset{iid}{\sim}$ sub-Gaussian$(0, \sigma^2 I_p)$, $i = 1, \ldots, n$,
  where $\mu \in \mathbb{R}^p$ is the mean and $l_i \in \{-1, 1\}$ are the labels.
• Goal: recover the labels $l$.
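One way a spectral method could be applied to this model is sketched below. This is an illustration under stated assumptions, not necessarily the estimator analyzed in the talk: the observations are stacked as rows of an n × p matrix, and the labels are read off the signs of its leading left singular vector. The dimensions, the mean vector `mu`, the Gaussian noise, and the helper name `spectral_two_class_labels` are all hypothetical choices.

```python
import numpy as np

def spectral_two_class_labels(data):
    """Estimate +/-1 labels from the signs of the leading left singular vector.

    `data` is an (n, p) array whose i-th row is the observation X_i.
    """
    u, _, _ = np.linalg.svd(data, full_matrices=False)
    labels = np.sign(u[:, 0])
    labels[labels == 0] = 1.0  # break exact ties arbitrarily
    return labels

rng = np.random.default_rng(0)
n, p = 100, 200                                  # hypothetical sizes with p >= n
true_labels = rng.choice([-1.0, 1.0], size=n)
mu = np.full(p, 5.0 / np.sqrt(p))                # hypothetical mean with ||mu|| = 5
noise = rng.standard_normal((n, p))              # Gaussian noise with sigma = 1
data = np.outer(true_labels, mu) + noise         # row i is mu * l_i + eps_i

estimated = spectral_two_class_labels(data)
# The labels are identifiable only up to a global sign flip.
error_rate = min(np.mean(estimated != true_labels),
                 np.mean(estimated != -true_labels))
print(f"misclassification rate: {error_rate:.3f}")
```

Here the label vector plays the role of the left singular vector $U$, so how well the labels can be recovered is tied to how close $\hat U$ is to $U$.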

Other Applications
• In addition, spectral methods are often applied to find a "warm start" for more delicate iterative algorithms.
  ◮ phase retrieval (Cai, Li and Ma, 2016)
  ◮ matrix completion (Sun and Luo, 2015)
  ◮ community detection (Jin, 2015)

Other Applications
Other applications of spectral methods include
• community detection
• matrix completion
• principal component analysis
• canonical correlation analysis
• ...
Specific practices include
• collaborative filtering (the Netflix problem)
• multi-task learning
• system identification
• sensor localization
• ...

Perturbation Bounds for Singular Subspaces
Problem Formulation
  $X = U \Sigma_1 V^\top + U_\perp \Sigma_2 V_\perp^\top$
  $\hat X = X + Z = \hat U \hat\Sigma_1 \hat V^\top + \hat U_\perp \hat\Sigma_2 \hat V_\perp^\top$
• Target: measure the difference between $\hat V$ and $V$ ($\hat U$ and $U$).

sin Θ Distance of Singular Subspaces
Definition of sin Θ distances:
• Suppose $V^\top \hat V$ has singular values $\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_r \ge 0$.
• Define the sines of the principal angles as
  $\sin\Theta(V, \hat V) = \operatorname{diag}\left( \sqrt{1 - \sigma_1^2}, \ldots, \sqrt{1 - \sigma_r^2} \right)$.
• Quantitative measure of distance: $\|\sin\Theta(\hat V, V)\|$ and $\|\sin\Theta(\hat V, V)\|_F$.
Good properties:
• Triangle inequality → indeed a distance;
• Many other distances are equivalent → convenient to use.
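The definition above translates directly into a few lines of NumPy. The following is a minimal sketch, assuming both inputs have orthonormal columns; the function names and the random test subspaces are illustrative choices, not part of the talk.

```python
import numpy as np

def sin_theta(v, v_hat):
    """Sines of the principal angles between span(v) and span(v_hat).

    Both inputs are (p, r) arrays with orthonormal columns.
    """
    cosines = np.linalg.svd(v.T @ v_hat, compute_uv=False)
    cosines = np.clip(cosines, 0.0, 1.0)   # guard against round-off above 1
    return np.sqrt(1.0 - cosines**2)

def sin_theta_distances(v, v_hat):
    """Return the spectral-norm and Frobenius-norm sin-Theta distances."""
    sines = sin_theta(v, v_hat)
    return sines.max(), np.linalg.norm(sines)

rng = np.random.default_rng(0)
p, r = 50, 3
v, _ = np.linalg.qr(rng.standard_normal((p, r)))          # orthonormal basis
v_hat, _ = np.linalg.qr(v + 0.1 * rng.standard_normal((p, r)))
spec, frob = sin_theta_distances(v, v_hat)
print(f"spectral: {spec:.4f}, Frobenius: {frob:.4f}")
```

Because $\sqrt{1 - \sigma^2}$ loses precision when $\sigma$ is very close to 1, nearly identical subspaces are resolved only to about the square root of machine precision with this formula.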

Classic Results of Perturbation Bounds
• Perturbation bounds: develop upper bounds for
  $\|\sin\Theta(V, \hat V)\|$, $\|\sin\Theta(U, \hat U)\|$, $\|\sin\Theta(V, \hat V)\|_F$, $\|\sin\Theta(U, \hat U)\|_F$.
• This problem has been widely studied in the literature (Davis and Kahan, 1970; Wedin, 1972; Weyl, 1912; Stewart, 1991, 2006; Yu et al., 2015; Fan, Wang and Zhong, 2016).
• Classical tools:
  ◮ Davis and Kahan (1970): eigenvectors of symmetric matrices;
  ◮ Wedin (1972): singular vectors for asymmetric matrices.

Classic Result: Wedin's sin Θ Theorem
  $X = U \Sigma_1 V^\top + U_\perp \Sigma_2 V_\perp^\top$
  $\hat X = \hat U \hat\Sigma_1 \hat V^\top + \hat U_\perp \hat\Sigma_2 \hat V_\perp^\top$
Wedin's sin Θ Theorem (1972) states that if $\delta := \sigma_{\min}(\hat\Sigma_1) - \sigma_{\max}(\Sigma_2) > 0$, then
  $\max\left\{ \|\sin\Theta(V, \hat V)\|, \|\sin\Theta(U, \hat U)\| \right\} \le \frac{\max\left\{ \|Z \hat V\|, \|\hat U^\top Z\| \right\}}{\delta}$.
• joint upper bound for both $\hat U$ and $\hat V$;
• may be sub-optimal.
Figure: Intuitively, estimating $V$ is more difficult than $U$ for the matrix above.
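Wedin's bound can be checked numerically on a small example. The sketch below assumes an exactly rank-$r$ signal (so $\Sigma_2 = 0$) and standard Gaussian noise; the dimensions and singular values are arbitrary illustrative choices.

```python
import numpy as np

def sin_theta_spectral(a, b):
    """Spectral-norm sin-Theta distance between orthonormal bases a and b."""
    cosines = np.clip(np.linalg.svd(a.T @ b, compute_uv=False), 0.0, 1.0)
    return np.sqrt(1.0 - cosines.min() ** 2)

rng = np.random.default_rng(1)
p1, p2, r = 30, 60, 3
u, _ = np.linalg.qr(rng.standard_normal((p1, r)))
v, _ = np.linalg.qr(rng.standard_normal((p2, r)))
x = u @ np.diag([60.0, 50.0, 40.0]) @ v.T     # exactly rank r, so Sigma_2 = 0
z = rng.standard_normal((p1, p2))             # perturbation
x_hat = x + z

u_full, s_hat, vt_full = np.linalg.svd(x_hat, full_matrices=False)
u_hat, v_hat = u_full[:, :r], vt_full[:r].T   # leading r singular vectors

# delta = sigma_min(Sigma_1 hat) - sigma_max(Sigma_2), with Sigma_2 = 0 here.
delta = s_hat[r - 1]
wedin = max(np.linalg.norm(z @ v_hat, 2),
            np.linalg.norm(u_hat.T @ z, 2)) / delta
loss = max(sin_theta_spectral(v, v_hat), sin_theta_spectral(u, u_hat))
print(f"max sin-Theta loss: {loss:.4f}, Wedin bound: {wedin:.4f}")
```

The single number `wedin` bounds both subspaces at once, which is why it cannot reflect any difference in difficulty between estimating $U$ and estimating $V$.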

Unilateral Perturbation Bound
• Decompose
  $Z = \begin{bmatrix} U & U_\perp \end{bmatrix} \begin{bmatrix} Z_{11} & Z_{12} \\ Z_{21} & Z_{22} \end{bmatrix} \begin{bmatrix} V^\top \\ V_\perp^\top \end{bmatrix}$,
  $Z_{11} = U^\top Z V$, $Z_{21} = U_\perp^\top Z V$, $Z_{12} = U^\top Z V_\perp$, $Z_{22} = U_\perp^\top Z V_\perp$.
  Define $z_{ij} := \|Z_{ij}\|$ for $i, j = 1, 2$.
Theorem (Unilateral Perturbation Bound (Cai & Z., 2016))
Denote $\alpha := \sigma_{\min}(U^\top \hat X V)$, $\beta := \sigma_{\max}(U_\perp^\top \hat X V_\perp)$. If $\alpha^2 > \beta^2 + z_{12}^2 \wedge z_{21}^2$, then
  $\|\sin\Theta(V, \hat V)\| \le \frac{\alpha z_{12} + \beta z_{21}}{\alpha^2 - \beta^2 - z_{21}^2 \wedge z_{12}^2} \wedge 1$,
  $\|\sin\Theta(U, \hat U)\| \le \frac{\alpha z_{21} + \beta z_{12}}{\alpha^2 - \beta^2 - z_{21}^2 \wedge z_{12}^2} \wedge 1$.
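The quantities in the theorem can be evaluated on simulated data. The sketch below is an illustration under assumptions: the signal is rank $r$ with arbitrarily chosen singular values, the noise is standard Gaussian, and the helper names `orth_complement` and `unilateral_bounds` are hypothetical.

```python
import numpy as np

def orth_complement(q):
    """Orthonormal basis of the orthogonal complement of span(q), q of shape (p, r)."""
    _, r = q.shape
    full, _ = np.linalg.qr(q, mode="complete")
    return full[:, r:]

def sin_theta_spectral(a, b):
    cosines = np.clip(np.linalg.svd(a.T @ b, compute_uv=False), 0.0, 1.0)
    return np.sqrt(1.0 - cosines.min() ** 2)

def unilateral_bounds(x_hat, z, u, v):
    """Evaluate the two unilateral bounds; returns (bound for V, bound for U)."""
    u_perp, v_perp = orth_complement(u), orth_complement(v)
    z12 = np.linalg.norm(u.T @ z @ v_perp, 2)
    z21 = np.linalg.norm(u_perp.T @ z @ v, 2)
    alpha = np.linalg.svd(u.T @ x_hat @ v, compute_uv=False).min()
    beta = np.linalg.norm(u_perp.T @ x_hat @ v_perp, 2)
    gap = alpha**2 - beta**2 - min(z12, z21) ** 2
    if gap <= 0:
        raise ValueError("condition alpha^2 > beta^2 + min(z12, z21)^2 fails")
    bound_v = min((alpha * z12 + beta * z21) / gap, 1.0)
    bound_u = min((alpha * z21 + beta * z12) / gap, 1.0)
    return bound_v, bound_u

rng = np.random.default_rng(2)
p1, p2, r = 20, 200, 3                       # wide matrix: V is the harder side
u, _ = np.linalg.qr(rng.standard_normal((p1, r)))
v, _ = np.linalg.qr(rng.standard_normal((p2, r)))
z = rng.standard_normal((p1, p2))
x_hat = u @ np.diag([80.0, 70.0, 60.0]) @ v.T + z

u_full, _, vt_full = np.linalg.svd(x_hat, full_matrices=False)
u_hat, v_hat = u_full[:, :r], vt_full[:r].T
bound_v, bound_u = unilateral_bounds(x_hat, z, u, v)
loss_v, loss_u = sin_theta_spectral(v, v_hat), sin_theta_spectral(u, u_hat)
print(f"V: loss {loss_v:.4f} <= bound {bound_v:.4f}")
print(f"U: loss {loss_u:.4f} <= bound {bound_u:.4f}")
```

With $p_2$ much larger than $p_1$, the block $Z_{12}$ is much larger in norm than $Z_{21}$, so the bound for $V$ comes out larger than the bound for $U$.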

Remark
• Since $\alpha > \beta$, if $z_{12} > z_{21}$ then
  $\frac{\alpha z_{12} + \beta z_{21}}{\alpha^2 - \beta^2 - z_{21}^2 \wedge z_{12}^2} > \frac{\alpha z_{21} + \beta z_{12}}{\alpha^2 - \beta^2 - z_{21}^2 \wedge z_{12}^2}$,
  and vice versa.
• When $\alpha \gg \max(\beta, \|Z\|)$, the upper bounds are approximately
  $\|\sin\Theta(V, \hat V)\| \le \frac{z_{12}}{\alpha}$, $\|\sin\Theta(U, \hat U)\| \le \frac{z_{21}}{\alpha}$.
  In contrast, Wedin's sin Θ law only leads to
  $\|\sin\Theta(V, \hat V)\| \le \frac{\|Z\|}{\alpha}$, $\|\sin\Theta(U, \hat U)\| \le \frac{\|Z\|}{\alpha}$.
• The upper bounds for the sin Θ distance in Frobenius norm can be derived similarly.

Idea Behind
Assume $U = \begin{bmatrix} I_r \\ 0 \end{bmatrix}$, $V = \begin{bmatrix} I_r \\ 0 \end{bmatrix}$. Let us take a look at $\hat X$.
• When estimating $U$, $z_{12}$ becomes "signal" while $z_{21}$ becomes "noise."
• When estimating $V$, $z_{12}$ becomes "noise" while $z_{21}$ becomes "signal."
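Under this choice of $U$ and $V$, and additionally assuming the complements are taken as $U_\perp = \begin{bmatrix} 0 \\ I_{p_1 - r} \end{bmatrix}$ and $V_\perp = \begin{bmatrix} 0 \\ I_{p_2 - r} \end{bmatrix}$ (an assumption made here for illustration), the blocks of $\hat X$ coincide with the blocks of the decomposition of $Z$, which can be written out as:

```latex
\hat X = X + Z
       = \begin{bmatrix} \Sigma_1 & 0 \\ 0 & \Sigma_2 \end{bmatrix}
       + \begin{bmatrix} Z_{11} & Z_{12} \\ Z_{21} & Z_{22} \end{bmatrix}
       = \begin{bmatrix} \Sigma_1 + Z_{11} & Z_{12} \\ Z_{21} & \Sigma_2 + Z_{22} \end{bmatrix},
\qquad
\alpha = \sigma_{\min}(\Sigma_1 + Z_{11}), \quad
\beta = \sigma_{\max}(\Sigma_2 + Z_{22}).
```

In this form $Z_{12}$ sits in the first $r$ rows and $Z_{21}$ in the first $r$ columns, which is the asymmetry the two bullets above describe.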

Lower Bound
Theorem (Perturbation Lower Bound)
Define the class of $p_1 \times p_2$ rank-$r$ matrices and perturbations,
  $\mathcal{F}_{r,\alpha,\beta,z_{21},z_{12}} = \left\{ (X, Z) : \operatorname{rank}(X) = r,\ \sigma_{\min}(U^\top \hat X V) \ge \alpha,\ \|Z_{22}\| \le \beta,\ \|Z_{12}\| \le z_{12},\ \|Z_{21}\| \le z_{21} \right\}$.
Provided that $\alpha^2 > \beta^2 + z_{12}^2 + z_{21}^2$ and $r < \frac{p_1 \wedge p_2}{2}$,
  $\inf_{\tilde V} \sup_{(X, Z) \in \mathcal{F}_{r,\alpha,\beta,z_{21},z_{12}}} \|\sin\Theta(V, \tilde V)\| \ge \frac{1}{2\sqrt{10}} \left( \frac{\alpha z_{12} + \beta z_{21}}{\alpha^2 - \beta^2 - z_{12}^2 \wedge z_{21}^2} \wedge 1 \right)$,
  $\inf_{\tilde U} \sup_{(X, Z) \in \mathcal{F}_{r,\alpha,\beta,z_{21},z_{12}}} \|\sin\Theta(U, \tilde U)\| \ge \frac{1}{2\sqrt{10}} \left( \frac{\alpha z_{21} + \beta z_{12}}{\alpha^2 - \beta^2 - z_{12}^2 \wedge z_{21}^2} \wedge 1 \right)$.

Applications
Application: Matrix Denoising
$\hat X = X + Z$, where $X$ is rank-$r$ and $Z$ has i.i.d. sub-Gaussian$(0, 1)$ entries.
• Target: $U$ or $V$.
• Natural estimators for $U$, $V$: $\hat U$, $\hat V$, the first $r$ singular vectors of $\hat X$.
• Q: How do $\hat U$, $\hat V$ perform, respectively?

Matrix Denoising
• The $r$-th singular value of $X$, $\sigma_r(X)$, is a good characterization of the difficulty of this problem.
• Applying the perturbation bound, we obtain
Theorem
Suppose $X = U \Sigma V^\top \in \mathbb{R}^{p_1 \times p_2}$ is of rank $r$. Then
  $\mathbb{E} \|\sin\Theta(V, \hat V)\|^2 \le \frac{C (p_2 \sigma_r^2(X) + p_1 p_2)}{\sigma_r^4(X)} \wedge 1$,
  $\mathbb{E} \|\sin\Theta(U, \hat U)\|^2 \le \frac{C (p_1 \sigma_r^2(X) + p_1 p_2)}{\sigma_r^4(X)} \wedge 1$.
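A small Monte Carlo experiment can illustrate the asymmetry between the two bounds. The sketch below assumes Gaussian noise, equal nonzero singular values, and arbitrary dimensions; since the constant $C$ is unspecified, the printed rates are only meaningful up to a constant factor, and the function names are hypothetical.

```python
import numpy as np

def sin_theta_sq(a, b):
    """Squared spectral-norm sin-Theta distance between orthonormal bases."""
    cos_min = min(np.linalg.svd(a.T @ b, compute_uv=False).min(), 1.0)
    return 1.0 - cos_min**2

def simulate(p1, p2, r, sigma_r, n_rep, rng):
    """Average squared sin-Theta loss of the leading singular vectors of X + Z."""
    err_u = err_v = 0.0
    for _ in range(n_rep):
        u, _ = np.linalg.qr(rng.standard_normal((p1, r)))
        v, _ = np.linalg.qr(rng.standard_normal((p2, r)))
        x = sigma_r * (u @ v.T)              # all r singular values equal sigma_r
        x_hat = x + rng.standard_normal((p1, p2))
        u_full, _, vt_full = np.linalg.svd(x_hat, full_matrices=False)
        err_u += sin_theta_sq(u, u_full[:, :r])
        err_v += sin_theta_sq(v, vt_full[:r].T)
    return err_u / n_rep, err_v / n_rep

def rate(p_own, p_other, sigma_r):
    """Rate from the theorem without the constant C: own dimension drives the first term."""
    return min((p_own * sigma_r**2 + p_own * p_other) / sigma_r**4, 1.0)

rng = np.random.default_rng(3)
p1, p2, r, sigma_r = 20, 400, 2, 60.0
err_u, err_v = simulate(p1, p2, r, sigma_r, n_rep=20, rng=rng)
print(f"U: empirical {err_u:.4f}, rate {rate(p1, p2, sigma_r):.4f}")
print(f"V: empirical {err_v:.4f}, rate {rate(p2, p1, sigma_r):.4f}")
```

With $p_2 \gg p_1$, the empirical loss for $\hat V$ should come out noticeably larger than that for $\hat U$, in line with the different leading terms $p_2 \sigma_r^2(X)$ and $p_1 \sigma_r^2(X)$.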

Matrix Denoising
Define the following class of low-rank matrices:
  $\mathcal{F}_{r,t} = \left\{ X \in \mathbb{R}^{p_1 \times p_2} : \operatorname{rank}(X) = r,\ \sigma_r(X) \ge t \right\}$.
Theorem (Lower Bound)
If $r \le \frac{p_1}{16} \wedge \frac{p_2}{2}$, then
  $\inf_{\tilde V} \sup_{X \in \mathcal{F}_{r,t}} \mathbb{E} \|\sin\Theta(V, \tilde V)\|^2 \ge c \left( \frac{p_2 t^2 + p_1 p_2}{t^4} \wedge 1 \right)$,
  $\inf_{\tilde U} \sup_{X \in \mathcal{F}_{r,t}} \mathbb{E} \|\sin\Theta(U, \tilde U)\|^2 \ge c \left( \frac{p_1 t^2 + p_1 p_2}{t^4} \wedge 1 \right)$.
To sum up,
  $\inf_{\tilde V} \sup_{X \in \mathcal{F}_{r,t}} \mathbb{E} \|\sin\Theta(V, \tilde V)\|^2 \asymp \frac{p_2 t^2 + p_1 p_2}{t^4} \wedge 1$,
  $\inf_{\tilde U} \sup_{X \in \mathcal{F}_{r,t}} \mathbb{E} \|\sin\Theta(U, \tilde U)\|^2 \asymp \frac{p_1 t^2 + p_1 p_2}{t^4} \wedge 1$.
