SLIDE 1

Uncertainty quantification for nonconvex tensor completion

Yuxin Chen, Electrical Engineering, Princeton University

SLIDE 2

  • Changxiao Cai, Princeton EE
  • H. Vincent Poor, Princeton EE

SLIDE 3

Ubiquity of high-dimensional tensor data

  • computational genomics — fig. credit: Schreiber et al. '19
  • dynamic MRI — fig. credit: Liu et al. '17

SLIDE 4

Challenges in tensor reconstruction

a tensor of interest

SLIDE 5

Challenges in tensor reconstruction

a tensor of interest + missing data

SLIDE 6

Challenges in tensor reconstruction

a tensor of interest + missing data + noise

SLIDE 7

Key to enabling reliable reconstruction from incomplete & noisy data: — exploiting low (CP) rank structure

SLIDE 8

Noisy tensor completion

SLIDE 9

Mathematical model

(figure: ground-truth tensor T⋆ and its partial observations T^obs)

  • unknown rank-r tensor T⋆ ∈ ℝ^{d×d×d}:

        T⋆ = ∑_{i=1}^r u⋆_i ⊗ u⋆_i ⊗ u⋆_i

SLIDE 10

Mathematical model

(figure: ground-truth tensor T⋆ and its partial observations T^obs)

  • unknown rank-r tensor T⋆ ∈ ℝ^{d×d×d}:

        T⋆ = ∑_{i=1}^r u⋆_i ⊗ u⋆_i ⊗ u⋆_i

  • partial observations over a sampling set Ω:

        T^obs_{i,j,k} = T⋆_{i,j,k} + noise,   (i, j, k) ∈ Ω

SLIDE 11

Mathematical model

(figure: ground-truth tensor T⋆ and its partial observations T^obs)

  • unknown rank-r tensor T⋆ ∈ ℝ^{d×d×d}:

        T⋆ = ∑_{i=1}^r u⋆_i ⊗ u⋆_i ⊗ u⋆_i

  • partial observations over a sampling set Ω:

        T^obs_{i,j,k} = T⋆_{i,j,k} + noise,   (i, j, k) ∈ Ω

  • goal: estimate {u⋆_i}_{i=1}^r and T⋆ (a synthetic sketch in code follows)
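To make this model concrete, here is a minimal synthetic sketch in NumPy. The dimension, rank, sampling rate, and noise level are illustrative choices, not values from the talk.

```python
import numpy as np

# illustrative sizes (not the talk's settings)
d, r, p, sigma = 50, 3, 0.2, 0.002
rng = np.random.default_rng(0)

# ground-truth factors u*_1, ..., u*_r as the columns of U_star
U_star = rng.normal(size=(d, r)) / np.sqrt(d)

# rank-r symmetric CP tensor: T* = sum_s u*_s ⊗ u*_s ⊗ u*_s
T_star = np.einsum('is,js,ks->ijk', U_star, U_star, U_star)

# each entry observed independently with probability p, then corrupted by noise
Omega = rng.random((d, d, d)) < p
T_obs = np.where(Omega, T_star + sigma * rng.normal(size=(d, d, d)), 0.0)
```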

SLIDE 12

Prior art

  • convex relaxation
  • sum-of-squares hierarchy
  • spectral methods
  • nonconvex optimization

SLIDE 13

Prior art

  • Gandy, Recht, Yamada ’11
  • Liu, Musialski, Wonka, Ye ’12
  • Kressner, Steinlechner, Vandereycken ’13
  • Xu, Hao, Yin, Su ’13
  • Romera-Paredes, Pontil ’13
  • Jain, Oh ’14
  • Huang, Mu, Goldfarb, Wright ’15
  • Barak, Moitra ’16
  • Zhang, Aeron ’16
  • Yuan, Zhang ’16
  • Montanari, Sun ’16
  • Kasai, Mishra ’16
  • Potechin, Steurer ’17
  • Dong, Yuan, Zhang ’17
  • Xia, Yuan ’19
  • Zhang ’19
  • Cai, Li, Poor, Chen ’19
  • Cai, Li, Chi, Poor, Chen ’19
  • Liu, Moitra ’20
  • . . .

SLIDE 14

A nonconvex approach: Cai et al. (NeurIPS 19)

    minimize_{U=[u_1,···,u_r] ∈ ℝ^{d×r}}   f(U) := ∑_{(i,j,k)∈Ω} ( [∑_{s=1}^r u_s^{⊗3}]_{i,j,k} − T^obs_{i,j,k} )²    — squared loss

  • proper initialization U⁰, obtained in two steps:
      1. estimate the subspace spanned by the low-rank tensor factors — unfolding + spectral methods
      2. successively retrieve each tensor factor from the subspace estimates — random projection + spectral methods
  • gradient descent (nonconvex) with constant learning rates: for t = 0, 1, · · ·

        U^{t+1} = U^t − η_t ∇f(U^t)

(the gradient stage is sketched in code below)
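A hedged sketch of the gradient stage only, reusing the synthetic data above; the spectral initialization is not reproduced here, so `U0` stands in for its output, and the step size is an illustrative choice.

```python
def gradient_descent(T_obs, Omega, U0, eta=0.2, n_iters=300):
    """Plain gradient descent on the squared loss f(U) over the observed entries."""
    U = U0.copy()
    for _ in range(n_iters):
        T_hat = np.einsum('is,js,ks->ijk', U, U, U)   # current rank-r estimate
        R = np.where(Omega, T_hat - T_obs, 0.0)       # residual, supported on Ω
        # ∇f(U): the squared loss differentiated through each of the three modes
        G = 2 * (np.einsum('ajk,js,ks->as', R, U, U)
                 + np.einsum('iak,is,ks->as', R, U, U)
                 + np.einsum('ija,is,js->as', R, U, U))
        U = U - eta * G                               # constant learning rate
    return U
```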

SLIDE 15

A nonconvex approach: Cai et al. (NeurIPS 19)

(figure: estimation error vs. iteration count)

Under mild conditions, this nonconvex algorithm achieves

  • linear convergence
  • minimax-optimal statistical accuracy (up to log factor)

SLIDE 16

One step further: reasoning about uncertainty?


SLIDE 17

One step further: reasoning about uncertainty?


How to assess uncertainty, or “confidence”, of obtained estimates due to imperfect data acquisition?

  • noise
  • incomplete measurements
  • · · ·

SLIDE 18

Challenges

    minimize_{U=[u_1,···,u_r] ∈ ℝ^{d×r}}   f(U) := ∑_{(i,j,k)∈Ω} ( [∑_{s=1}^r u_s^{⊗3}]_{i,j,k} − T^obs_{i,j,k} )²    — squared loss

  • how to pin down distributions of nonconvex solutions?

SLIDE 19

Challenges

    minimize_{U=[u_1,···,u_r] ∈ ℝ^{d×r}}   f(U) := ∑_{(i,j,k)∈Ω} ( [∑_{s=1}^r u_s^{⊗3}]_{i,j,k} − T^obs_{i,j,k} )²    — squared loss

  • how to pin down distributions of nonconvex solutions?
  • how to adapt to unknown noise distributions and heteroscedasticity (i.e. location-varying noise variance)?

SLIDE 20

Challenges

    minimize_{U=[u_1,···,u_r] ∈ ℝ^{d×r}}   f(U) := ∑_{(i,j,k)∈Ω} ( [∑_{s=1}^r u_s^{⊗3}]_{i,j,k} − T^obs_{i,j,k} )²    — squared loss

  • how to pin down distributions of nonconvex solutions?
  • how to adapt to unknown noise distributions and heteroscedasticity (i.e. location-varying noise variance)?
  • existing estimation guarantees are highly insufficient → overly wide confidence intervals

SLIDE 21

Assumptions

    T⋆ = ∑_{i=1}^r u⋆_i ⊗ u⋆_i ⊗ u⋆_i ∈ ℝ^{d×d×d}

  • random sampling: each entry is observed independently with prob. p ≳ polylog(d) / d^{3/2}
  • random noise: independent zero-mean sub-Gaussian, with variances of roughly the same order (but not identical) — a heteroscedastic variant is sketched below
  • ground truth: low-rank (r = O(1)), incoherent (tensor factors are de-localized and nearly orthogonal to each other), and well-conditioned
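The earlier synthetic sketch can be varied to match this heteroscedastic setting, giving each entry its own noise level of roughly the same order; the 0.5–1.5 range is an illustrative choice.

```python
# entrywise standard deviations of comparable order, but not identical
sigmas = sigma * (0.5 + rng.random((d, d, d)))
T_obs_hetero = np.where(Omega, T_star + sigmas * rng.normal(size=(d, d, d)), 0.0)
```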

SLIDE 22

Main results: distributional theory

  • random sampling
  • independent sub-Gaussian noise
  • ground truth: low-rank, incoherent, well-conditioned

(figure: entrywise error distribution of the factor estimate U)

Theorem 1. With high prob., there exists a permutation matrix Π ∈ ℝ^{r×r} s.t.

    UΠ − U⋆ ∼ N(0, Cramér–Rao) + negligible term    — asymptotically optimal

SLIDE 23

Main results: distributional theory

  • random sampling
  • independent sub-Gaussian noise
  • ground truth: low-rank, incoherent, well-conditioned

(figure: entrywise error distribution of the tensor estimate T)

Theorem 2. Consider any (i, j, k) s.t. the corresponding "SNR" is not exceedingly small. Then with high prob.,

    T_{i,j,k} − T⋆_{i,j,k} ∼ N(0, Cramér–Rao) + negligible term    — asymptotically optimal
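Once the Cramér–Rao variance is estimated, Theorem 2 yields textbook Gaussian confidence intervals. A minimal sketch, where `v_hat` stands in for the paper's data-driven variance estimate:

```python
from scipy.stats import norm

def entry_ci(t_hat, v_hat, alpha=0.05):
    """(1 - alpha) confidence interval for one tensor entry, assuming the
    Gaussian approximation T_hat[i,j,k] - T*[i,j,k] ~ N(0, v_hat)."""
    z = norm.ppf(1 - alpha / 2)     # ≈ 1.96 for 95% coverage
    half = z * v_hat ** 0.5
    return t_hat - half, t_hat + half
```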

SLIDE 24
  • Gaussianity and optimality: estimation error of the nonconvex approach is zero-mean Gaussian, whose (co)variance is "minimal"

SLIDE 25
(figures: empirical error distributions of a tensor factor entry and of a tensor entry)

  • Gaussianity and optimality: estimation error of the nonconvex approach is zero-mean Gaussian, whose (co)variance is "minimal"
  • Confidence intervals: error (co)variance can be accurately estimated, leading to valid CI construction

SLIDE 26
(figures: empirical error distributions of a tensor factor entry and of a tensor entry)

  • Gaussianity and optimality: estimation error of the nonconvex approach is zero-mean Gaussian, whose (co)variance is "minimal"
  • Confidence intervals: error (co)variance can be accurately estimated, leading to valid CI construction
  • Adaptivity: our procedure is data-driven, fully adaptive to unknown noise levels and heteroscedasticity

SLIDE 27

Empirical coverage rates (CR)

tensor factor

    (r, σ)       Mean(CR)   Std(CR)
    (2, 10⁻²)    0.9481     0.0201
    (2, 10⁻¹)    0.9477     0.0228
    (2, 1)       0.9478     0.0215
    (4, 10⁻²)    0.9450     0.0218
    (4, 10⁻¹)    0.9472     0.0231
    (4, 1)       0.9462     0.0234

tensor entries

    (r, σ)       Mean(CR)   Std(CR)
    (2, 10⁻²)    0.9494     0.0218
    (2, 10⁻¹)    0.9513     0.0218
    (2, 1)       0.9475     0.0222
    (4, 10⁻²)    0.9434     0.0225
    (4, 10⁻¹)    0.9494     0.0220
    (4, 1)       0.9494     0.0219

d = 100, p = 0.2, heteroscedastic noise
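A hedged outline of how such coverage rates can be computed: over Monte Carlo trials, record how often the nominal 95% interval covers the truth. `make_data` and `fit` are hypothetical stand-ins for the data generator and the full estimation pipeline; `entry_ci` is the sketch from earlier.

```python
def empirical_coverage(n_trials, make_data, fit, idx, alpha=0.05):
    hits = 0
    for _ in range(n_trials):
        T_true, T_obs, Omega = make_data()   # fresh heteroscedastic sample
        T_hat, v_hat = fit(T_obs, Omega)     # point estimate + variance estimate
        lo, hi = entry_ci(T_hat[idx], v_hat[idx], alpha)
        hits += lo <= T_true[idx] <= hi
    return hits / n_trials                   # valid CIs give ≈ 1 - alpha
```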

SLIDE 28

Back to estimation: ℓ2 optimality

Distributional theory in turn allows us to track estimation accuracy

SLIDE 29

Back to estimation: ℓ2 optimality

Distributional theory in turn allows us to track estimation accuracy.

Theorem 3. Suppose the noise is i.i.d. Gaussian. ∃ some permutation π(·) s.t.

    ‖u_{π(l)} − u⋆_l‖₂² = (2 + o(1)) σ²d / (p‖u⋆_l‖₂⁴),   1 ≤ l ≤ r    — Cramér–Rao lower bound

    ‖T − T⋆‖²_F = (6 + o(1)) σ²rd / p    — Cramér–Rao lower bound

SLIDE 30

Back to estimation: ℓ2 optimality

Distributional theory in turn allows us to track estimation accuracy.

Theorem 3. Suppose the noise is i.i.d. Gaussian. ∃ some permutation π(·) s.t.

    ‖u_{π(l)} − u⋆_l‖₂² = (2 + o(1)) σ²d / (p‖u⋆_l‖₂⁴),   1 ≤ l ≤ r    — Cramér–Rao lower bound

    ‖T − T⋆‖²_F = (6 + o(1)) σ²rd / p    — Cramér–Rao lower bound

  • precise characterization of estimation accuracy
  • achieves full statistical efficiency (including pre-constant) — a numerical check is sketched below
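A hedged numerical check of the first display, reusing the homoscedastic synthetic data and the `gradient_descent` sketch from earlier. Initializing near the truth sidesteps the permutation-matching step for simplicity, and agreement is only approximate at these small illustrative sizes.

```python
U_hat = gradient_descent(T_obs, Omega, U_star + 0.01 * rng.normal(size=(d, r)))
for l in range(r):
    theory = 2 * sigma**2 * d / (p * np.linalg.norm(U_star[:, l]) ** 4)
    empirical = np.linalg.norm(U_hat[:, l] - U_star[:, l]) ** 2
    print(f"factor {l}: empirical {empirical:.2e} vs. theory {theory:.2e}")
```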

SLIDE 31

Numerical ℓ2 errors vs. Cramér–Rao bounds

(figures: numerical ℓ2 errors tracking the Cramér–Rao bounds, for tensor factor estimation and for tensor estimation; r = 4, p = 0.2, d = 100)

SLIDE 32

Concluding remarks

nonconvex optimization and uncertainty quantification for tensor estimation

  • near-optimal statistical guarantees
  • asymptotically optimal uncertainty quantification
  • fast, adaptive to unknown noise levels

SLIDE 33

Concluding remarks

nonconvex optimization and uncertainty quantification for tensor estimation

  • near-optimal statistical guarantees
  • asymptotically optimal uncertainty quantification
  • fast, adaptive to unknown noise levels

future directions

  • improve dependency on rank & condition number
  • more general sampling patterns
  • other tensor-type problems

SLIDE 34

Paper:

  • "Uncertainty quantification for nonconvex tensor completion: Confidence intervals, heteroscedasticity and optimality," C. Cai, H. V. Poor, Y. Chen, ICML 2020

Other related papers:

  • "Nonconvex low-rank symmetric tensor completion from noisy data," C. Cai, G. Li, H. V. Poor, Y. Chen, NeurIPS 2019

  • "Subspace estimation from unbalanced and incomplete data matrices: ℓ2,∞ statistical guarantees," C. Cai, G. Li, Y. Chi, H. V. Poor, Y. Chen, accepted to the Annals of Statistics, 2019
