Rotation Invariant Householder Parameterization for Bayesian PCA
Rajbir-Singh Nirwan, Nils Bertschinger
June 11, 2019
Outline
- Probabilistic PCA (PPCA)
- Non-identifiability issue of PPCA
- Conceptual solution to the problem
Probabilistic PCA (PPCA)
- Formulated as a projection from the data space Y to a lower-dimensional latent space X: $Y \in \mathbb{R}^{N \times D} \rightarrow X \in \mathbb{R}^{N \times Q}$
- The latent space maximizes the variance of the projected data and minimizes the reconstruction MSE
- Equivalently viewed as a generative model that maps the latent space X to the data space Y: $X \in \mathbb{R}^{N \times Q} \rightarrow Y \in \mathbb{R}^{N \times D}$
- $Y = X W^T + \epsilon$ with $X \sim \mathcal{N}(0, I)$ and $\epsilon \sim \mathcal{N}(0, \sigma^2 I)$
- Marginalizing out X gives the likelihood $p(Y \mid W) = \prod_{n=1}^{N} \mathcal{N}(Y_{n,:} \mid 0,\, W W^T + \sigma^2 I)$
- Rotation invariance: $W R R^T W^T = W W^T$ for every $R$ with $R R^T = I$, so the likelihood only depends on $W W^T$ (a small numerical check follows below)
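The invariance can be verified numerically. A minimal sketch, assuming NumPy; the matrix sizes, rotation angle, and noise level are illustrative choices, not values from the talk:

```python
import numpy as np

rng = np.random.default_rng(0)
D, Q, sigma = 5, 2, 0.1

W = rng.standard_normal((D, Q))                    # arbitrary loading matrix
theta = 0.7                                        # arbitrary rotation angle
R = np.array([[np.cos(theta), -np.sin(theta)],     # 2x2 rotation, R @ R.T = I
              [np.sin(theta),  np.cos(theta)]])

C_W  = W @ W.T + sigma**2 * np.eye(D)              # marginal covariance from W
C_WR = (W @ R) @ (W @ R).T + sigma**2 * np.eye(D)  # marginal covariance from W R

print(np.allclose(C_W, C_WR))                      # True: the likelihood cannot distinguish W from W R
```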
Non-identifiability issue of PPCA
[Figure: posterior samples of W in the (W1, W2) plane]
- The posterior $p(W \mid Y) = p(Y \mid W)\, p(W) / p(Y)$ is rotation invariant as well (for a rotation-invariant prior $p(W)$)
- Posterior summaries of W are therefore meaningless, and interpretation of the latent space is almost impossible
[Figure: sampled values of W in the (W1, W2) plane]
Outline of procedure
- Decompose $W = U \Sigma V^T$ (SVD), so that $W W^T = (U \Sigma V^T)(U \Sigma V^T)^T = U \Sigma^2 U^T$
- Only $W W^T$ enters the likelihood, so $V$ can be fixed to $V = I$ and $W$ parameterized as $W = U \Sigma$ (checked numerically below)
- Choose the prior $p(U, \Sigma)$ such that the probabilistic model is not changed
- Sample from the posterior $p(U, \Sigma \mid Y)$
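A quick check of this reparameterization, as a sketch assuming NumPy (W is an arbitrary illustrative matrix): the thin SVD yields U and Σ that reproduce $W W^T$ without any reference to V.

```python
import numpy as np

rng = np.random.default_rng(1)
D, Q = 5, 2
W = rng.standard_normal((D, Q))

# Thin SVD: W = U diag(s) V^T with U in R^{DxQ} and s the singular values
U, s, Vt = np.linalg.svd(W, full_matrices=False)

# W W^T does not depend on V ...
print(np.allclose(W @ W.T, U @ np.diag(s**2) @ U.T))    # True

# ... so W can be re-parameterized as U Sigma with V = I
W_reparam = U @ np.diag(s)
print(np.allclose(W_reparam @ W_reparam.T, W @ W.T))    # True
```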
- To keep the model unchanged: under the standard prior $W \sim \mathcal{N}(0, I)$, the matrix $W W^T$ is Wishart distributed
- So we need priors $U \sim\, ?$ and $\Sigma \sim\, ?$ such that $U \Sigma \Sigma^T U^T$ is Wishart distributed
- $U \Sigma^2 U^T$ is the eigendecomposition of $W W^T$ → $U$ is the eigenvector matrix
- $U$ lives on the Stiefel manifold $\mathcal{W}_{Q,D} = \{U \in \mathbb{R}^{D \times Q} \mid U^T U = I\}$
- The eigenvectors of a Wishart matrix are distributed uniformly in the space of orthonormal matrices
- → $U$ must be uniformly distributed on the Stiefel manifold
- The joint density of the eigenvalues $\lambda$ of the Wishart matrix $W W^T$ is known analytically (James & Lee (2014)):
  $p(\lambda) = c\, e^{-\frac{1}{2}\sum_{q=1}^{Q}\lambda_q}\, \prod_{q=1}^{Q}\Big(\lambda_q^{\frac{D-Q-1}{2}} \prod_{q'=q+1}^{Q}(\lambda_q - \lambda_{q'})\Big)$
- The change of variables $\lambda_q = \sigma_q^2$ (Jacobian $\prod_{q=1}^{Q} 2\sigma_q$) gives the density of the singular values (an implementation sketch follows below):
  $p(\sigma_1, \ldots, \sigma_Q) = c\, e^{-\frac{1}{2}\sum_{q=1}^{Q}\sigma_q^2}\, \prod_{q=1}^{Q}\Big(\sigma_q^{D-Q-1} \prod_{q'=q+1}^{Q}(\sigma_q^2 - \sigma_{q'}^2)\Big)\, \prod_{q=1}^{Q} 2\sigma_q$
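A minimal sketch of this density, assuming NumPy; the normalization constant c is omitted and the function name is mine:

```python
import numpy as np

def log_p_sigma_unnormalized(sigma, D):
    """Unnormalized log density of the singular values sigma_1 > ... > sigma_Q
    of a D x Q matrix with i.i.d. standard normal entries (constant c omitted)."""
    sigma = np.asarray(sigma, dtype=float)
    Q = sigma.size
    log_p = -0.5 * np.sum(sigma**2)                  # exp(-1/2 sum_q sigma_q^2)
    log_p += (D - Q - 1) * np.sum(np.log(sigma))     # prod_q sigma_q^(D-Q-1)
    for q in range(Q):                               # prod_{q' > q} (sigma_q^2 - sigma_{q'}^2)
        for qp in range(q + 1, Q):
            log_p += np.log(sigma[q]**2 - sigma[qp]**2)
    log_p += np.sum(np.log(2.0 * sigma))             # Jacobian prod_q 2 sigma_q
    return log_p

print(log_p_sigma_unnormalized([3.0, 1.0], D=5))     # e.g. the values of the synthetic example below
```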
- Want: $U \sim$ uniform on the Stiefel manifold $\mathcal{W}_{Q,D}$, and $\Sigma \sim p(\Sigma)$ (easy, since we know the analytic expression for the density)
How to uniformly sample U on $\mathcal{W}_{Q,D}$ (Mezzadri (2007)):
- Householder reflections: $H_n(v_n) = \begin{pmatrix} I_{D-n} & 0 \\ 0 & \tilde{H}_n(v_n) \end{pmatrix}$ with $\tilde{H}_n(v_n) = -\operatorname{sgn}(v_{n1})\,(I - 2 u_n u_n^T)$ and
  $u_n = \dfrac{v_n + \operatorname{sgn}(v_{n1})\, \lVert v_n \rVert\, e_1}{\lVert v_n + \operatorname{sgn}(v_{n1})\, \lVert v_n \rVert\, e_1 \rVert}$
- For $n = D, \ldots, 1$: draw $v_n \sim$ uniform on the sphere $\mathbb{S}^{n-1}$
- Then $U = H_D(v_D)\, H_{D-1}(v_{D-1}) \cdots H_1(v_1)$ is uniformly (Haar) distributed; a sampling sketch follows below
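A minimal NumPy sketch of this construction (function names are mine, not from the talk's code); for the Stiefel manifold only the Q reflections that touch the first Q columns are needed, and drawing $v_n \sim \mathcal{N}(0, I)$ gives a direction that is uniform on the sphere:

```python
import numpy as np

def householder_matrix(v, D):
    """Embed the n-dimensional reflection H~_n(v_n) into a D x D matrix."""
    n = v.shape[0]
    sgn = np.sign(v[0]) if v[0] != 0 else 1.0
    e1 = np.zeros(n); e1[0] = 1.0
    u = v + sgn * np.linalg.norm(v) * e1
    u = u / np.linalg.norm(u)
    H_tilde = -sgn * (np.eye(n) - 2.0 * np.outer(u, u))
    H = np.eye(D)
    H[D - n:, D - n:] = H_tilde                  # block-diagonal: identity on top, reflection below
    return H

def sample_uniform_stiefel(D, Q, rng=None):
    """Draw U uniformly (Haar) on the Stiefel manifold {U in R^{DxQ} | U^T U = I}."""
    rng = rng if rng is not None else np.random.default_rng()
    U = np.eye(D)
    for n in range(D, D - Q, -1):                # n = D, D-1, ..., D-Q+1
        v = rng.standard_normal(n)               # direction of v_n is uniform on S^{n-1}
        U = U @ householder_matrix(v, D)         # accumulate H_D(v_D) H_{D-1}(v_{D-1}) ...
    return U[:, :Q]                              # first Q columns give the Stiefel element

U = sample_uniform_stiefel(D=5, Q=2)
print(np.allclose(U.T @ U, np.eye(2)))           # orthonormal columns -> True
```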
The full generative model for Bayesian PPCA:
- $v_D, \ldots, v_{D-Q+1} \sim \mathcal{N}(0, I)$ (only the direction of each $v_n$ matters, and a normalized Gaussian vector is uniform on the sphere)
- $\sigma \sim p(\sigma)$, $\mu \sim p(\mu)$, $\sigma_{\text{noise}} \sim p(\sigma_{\text{noise}})$
- $U = \prod_{q=1}^{Q} H_{D-q+1}(v_{D-q+1})$ (first $Q$ columns), $\Sigma = \operatorname{diag}(\sigma)$, $W = U \Sigma$
- $Y \sim \prod_{n=1}^{N} \mathcal{N}(Y_{n,:} \mid \mu,\, W W^T + \sigma_{\text{noise}}^2 I)$
Synthetic Dataset
$(N, D, Q) = (150, 5, 2)$
- $X \sim \mathcal{N}(0, I) \in \mathbb{R}^{N \times Q}$
- $U \sim$ uniform on the Stiefel manifold $\mathcal{W}_{Q,D}$, $\Sigma = \operatorname{diag}(\sigma_1, \sigma_2) = \operatorname{diag}(3.0, 1.0)$, $W = U \Sigma \in \mathbb{R}^{D \times Q}$
- $Y = X W^T + \epsilon$ with $\epsilon \sim \mathcal{N}(0, 0.01) \in \mathbb{R}^{N \times D}$ (reproduced in the sketch below)
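A sketch reproducing this synthetic dataset, assuming NumPy. Here U is drawn by QR-decomposing a Gaussian matrix and sign-correcting the columns, an equivalent Haar-uniform construction also described by Mezzadri (2007); the random seed is arbitrary:

```python
import numpy as np

rng = np.random.default_rng(42)
N, D, Q = 150, 5, 2

# U uniform on the Stiefel manifold: QR of a Gaussian matrix, columns
# sign-corrected so the distribution is exactly Haar-uniform
A = rng.standard_normal((D, Q))
U, R = np.linalg.qr(A)
U = U * np.sign(np.diag(R))

Sigma = np.diag([3.0, 1.0])                 # true singular values
W = U @ Sigma                               # D x Q loading matrix

X = rng.standard_normal((N, Q))             # latent variables, X ~ N(0, I)
eps = 0.1 * rng.standard_normal((N, D))     # noise with variance 0.01
Y = X @ W.T + eps                           # observed data, N x D
```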
[Figure: histogram of samples from p(σ | Y), with the classical PCA estimate and the true values of σ marked]
[Figure: posterior samples of the columns of W in the (W1, W2) plane]
Breast Cancer Wisconsin Dataset
(N, D) = (569, 30)
[Figures: posterior samples of the first two columns of W, plotted as (W1, W2), for the breast cancer data]
Extension to nonlinear kernels (GPLVM)
- $p(Y \mid X) = \prod_{d=1}^{D} \mathcal{N}(Y_{:,d} \mid \mu,\, K + \sigma^2 I)$
- Linear kernel recovers PPCA: $K = X X^T$, i.e. $K_{ij} = X_{i,:}^T X_{j,:} = k(X_{i,:}, X_{j,:})$
- Squared-exponential kernel: $k_{\mathrm{SE}}(x, x') = \sigma_{\mathrm{SE}}^2 \exp\!\big(-0.5\, \lVert x - x' \rVert_2^2 / l^2\big)$ (see the kernel sketch below)
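A sketch of these two kernel choices, assuming NumPy; the latent points, σ_SE, l, and noise level are illustrative values:

```python
import numpy as np

def k_linear(X):
    """Linear kernel K = X X^T (recovers PPCA)."""
    return X @ X.T

def k_se(X, sigma_se=1.0, length_scale=1.0):
    """Squared-exponential kernel k_SE(x, x') = sigma_se^2 exp(-0.5 ||x - x'||^2 / l^2)."""
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    return sigma_se**2 * np.exp(-0.5 * sq_dists / length_scale**2)

# Covariance of each data column Y_{:,d} under the GP likelihood: K + sigma^2 I
X = np.random.default_rng(3).standard_normal((4, 2))   # four latent points in R^2
sigma_noise = 0.1
C = k_se(X) + sigma_noise**2 * np.eye(X.shape[0])
print(C.shape)                                          # (4, 4)
```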
[Figure: posterior samples of the latent space (X1, X2) for three chains, comparing the standard parameterization ("standard - chain: 1-3") with the proposed parameterization ("unique - chain: 1-3")]
- The proposed parameterization carries over to this model as well, at the cost of increased model complexity
Conclusion
- The proposed parameterization identifies principal components even though the likelihood and the posterior are rotationally symmetric
- The probabilistic model is not changed compared to a standard Gaussian prior on W (no Jacobian correction needed)
- It gives direct access to the principal variances
- The GPLVM suffers from the same rotational symmetry, and the approach addresses the problem there as well
Supervisor: Prof. Dr. Nils Bertschinger
Funder: Dr. h. c. Helmut O. Maucher
Poster session: #235
GitHub: https://github.com/RSNirwan/HouseholderBPCA