An Introduction to Tensor-Based Independent Component Analysis - PowerPoint PPT Presentation
- L. De Lathauwer
An Introduction to Tensor-Based Independent Component Analysis
Lieven De Lathauwer K.U.Leuven Belgium
Lieven.DeLathauwer@kuleuven-kortrijk.be
Overview
- Problem definition
- Higher-order statistics
- Basic ICA equations
- Specific prewhitening-based multilinear algorithms
- Application
- Higher-order-only schemes
- Variants for coloured sources
- Dimensionality reduction
- Conclusions
Independent Component Analysis (ICA)
Model:
Y = M X + N
(P × 1) = (P × R)(R × 1) + (P × 1)
[Figure: sources x1, x2, x3 mixed into observations; estimates x̂1, x̂2, x̂3 recovered]
Model:
Y = M X + N
(P × 1) = (P × R)(R × 1) + (P × 1)
Assumptions:
- columns of M are linearly independent
- components of X are statistically independent
Goal: Identification of M and/or reconstruction of X while observing only Y
Independent Component Analysis (ICA)
Disciplines: statistics, neural networks, information theory, linear and multilinear algebra, . . .
Indeterminacies:
- ordering and scaling of the columns of M (Y = MX remains unchanged)
Uncorrelated vs independent:
- X, Y are uncorrelated iff E{XY} = 0
- X, Y are independent iff pXY(x, y) = pX(x) pY(y)
Statistical independence implies:
- the variables are uncorrelated
- additional conditions on the HOS
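The gap between uncorrelatedness and independence is easy to see numerically. A minimal sketch (the variable names are my own): for X uniform on [−1, 1] and Y = X², E{XY} = E{X³} = 0, so the pair is uncorrelated, yet Y is a deterministic function of X.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, 1_000_000)
y = x ** 2                      # deterministic function of x: clearly dependent

# the covariance is (up to sampling error) zero: uncorrelated
cov_xy = np.mean(x * y) - np.mean(x) * np.mean(y)

# a higher-order cross-statistic exposes the dependence: Cov(x^2, y) = Var(x^2) > 0
dep = np.mean(x**2 * y) - np.mean(x**2) * np.mean(y)
```

Here cov_xy sits near zero while dep stays close to Var(x²) = 4/45 ≈ 0.089: exactly the kind of "additional condition on the HOS" that independence imposes beyond uncorrelatedness.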
Algebraic tools:

Condition        Identification       Tool
Xi uncorrelated  column space of M    matrix EVD/SVD
Xi independent   M itself             tensor EVD/SVD

Web site:
http://www.tsi.enst.fr/icacentral/index.html
mailing list, data sets, software
Applications
- Speech and audio
- Image processing
feature extraction, image reconstruction, video
- Telecommunications
OFDM, CDMA, . . .
- Biomedical applications
functional Magnetic Resonance Imaging, electromyogram, electro-encephalogram, (fetal) electrocardiogram, mammography, pulse oximetry, (fetal) magnetocardiogram, . . .
- Other applications
text classification, vibratory signals generated by termites (!), electron energy loss spectra, astrophysics, . . .
HOS definitions
Moments and cumulants of a random variable:

Order  Moment            Cumulant
1      m1^X = E{X}       c1^X = E{X}                          ("mean" mX)
2      m2^X = E{X^2}     c2^X = E{(X − mX)^2}                 ("variance" σX^2; RX)
3      m3^X = E{X^3}     c3^X = E{(X − mX)^3}
4      m4^X = E{X^4}     c4^X = E{(X − mX)^4} − 3 σX^4
Characteristic Functions
First characteristic function:

Φx(ω) := E{e^{jωx}} = ∫_{−∞}^{+∞} px(x) e^{jωx} dx

Generates the moments:

Φx(ω) = Σ_{k=0}^{∞} mk^X (jω)^k / k!   (m0 = 1)

Second characteristic function:

Ψx(ω) := ln Φx(ω)

Generates the cumulants:

Ψx(ω) = Σ_{k=1}^{∞} ck^X (jω)^k / k!
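As a check on these generating functions, consider a zero-mean Gaussian, whose first characteristic function is known in closed form; its second characteristic function is then a pure quadratic, so every cumulant beyond order 2 vanishes:

```latex
\Phi_x(\omega) = e^{-\sigma^2 \omega^2 / 2}
\quad\Rightarrow\quad
\Psi_x(\omega) = \ln \Phi_x(\omega) = -\frac{\sigma^2 \omega^2}{2}
= \sigma^2 \,\frac{(j\omega)^2}{2!}
```

i.e. c2 = σ² and ck = 0 for all k ≥ 3, which is why Gaussian sources carry no higher-order information that ICA could exploit.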
Moments and cumulants of a set of random variables:

Moments:

(Mx^(N))_{i1 i2 ... iN} = Mom(x_{i1}, x_{i2}, ..., x_{iN}) := E{x_{i1} x_{i2} ... x_{iN}}

Cumulants:

(cx)_i = Cum(x_i) := E{x_i}
(Cx)_{i1 i2} = Cum(x_{i1}, x_{i2}) := E{x_{i1} x_{i2}}
(Cx^(3))_{i1 i2 i3} = Cum(x_{i1}, x_{i2}, x_{i3}) := E{x_{i1} x_{i2} x_{i3}}
(Cx^(4))_{i1 i2 i3 i4} = Cum(x_{i1}, x_{i2}, x_{i3}, x_{i4})
    := E{x_{i1} x_{i2} x_{i3} x_{i4}} − E{x_{i1} x_{i2}} E{x_{i3} x_{i4}}
       − E{x_{i1} x_{i3}} E{x_{i2} x_{i4}} − E{x_{i1} x_{i4}} E{x_{i2} x_{i3}}

(From order 2 on, the variables are first centered: xi ← xi − E{xi})
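The order-4 formula translates directly into code. A minimal sketch (the function name cum4 is my own, not from the slides), estimating expectations by sample averages over the rows of a data matrix:

```python
import numpy as np

def cum4(X):
    """Fourth-order cumulant tensor of the rows of X (N variables x T samples),
    using the moment-combination formula for centered variables."""
    X = X - X.mean(axis=1, keepdims=True)      # center (order >= 2 convention)
    T = X.shape[1]
    M4 = np.einsum('it,jt,kt,lt->ijkl', X, X, X, X) / T   # E{x_i x_j x_k x_l}
    C2 = X @ X.T / T                                      # E{x_i x_j}
    return (M4
            - np.einsum('ij,kl->ijkl', C2, C2)
            - np.einsum('ik,jl->ijkl', C2, C2)
            - np.einsum('il,jk->ijkl', C2, C2))
```

For independent uniform sources on [−1, 1], the cross entries vanish in expectation while each diagonal entry approaches −2/15, the uniform-distribution fourth cumulant listed later in the talk.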
Multivariate case, e.g. moments:
[Figure: RX = E{X X^T} and MX^(3) = E{X ∘ X ∘ X} visualized as a matrix and a third-order tensor]
Order 1: mX := E{X} → vector
Order 2: RX := E{X X^T} → matrix
Order 3: MX^(3) := E{X ∘ X ∘ X} → 3rd-order tensor
Order 4: MX^(4) := E{X ∘ X ∘ X ∘ X} → 4th-order tensor
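These stacked definitions map one-to-one onto averaged outer products. A minimal numpy sketch (sample estimates standing in for the exact expectations):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((3, 10_000))          # 3 variables, 10000 samples
T = X.shape[1]

mX = X.mean(axis=1)                           # order 1: E{X}, a vector
RX = X @ X.T / T                              # order 2: E{X X^T}, a matrix
M3 = np.einsum('it,jt,kt->ijk', X, X, X) / T  # order 3: E{X o X o X}, 3rd-order tensor
M4 = np.einsum('it,jt,kt,lt->ijkl', X, X, X, X) / T   # order 4: 4th-order tensor
```

Each einsum averages the N-fold outer product X(t) ∘ ... ∘ X(t) over the samples, so the resulting arrays are symmetric in all indices.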
HOS example
Gaussian distribution:

px(x) = 1/(√(2π) σ) exp(−x^2 / (2σ^2))

n    mx^(n)    cx^(n)
1    0         0
2    σ^2       σ^2
3    0         0
4    3σ^4      0

Uniform distribution on [−a, +a]:

px(x) = 1/(2a)

n    mx^(n)    cx^(n)
1    0         0
2    a^2/3     a^2/3
3    0         0
4    a^4/5     −2a^4/15
ICA: basic equations
Model:
Y = MX
Second order:
C2^Y = E{Y Y^T} = M · C2^X · M^T = C2^X •1 M •2 M

Uncorrelated sources: C2^X is diagonal → "diagonalization by congruence":

C2^Y = σ1^2 M1 M1^T + σ2^2 M2 M2^T + ... + σR^2 MR MR^T
Higher order:
C4^Y = C4^X •1 M •2 M •3 M •4 M

Independent sources: C4^X is diagonal → "CANDECOMP / PARAFAC":

C4^Y = λ1 M1 ∘ M1 ∘ M1 ∘ M1 + λ2 M2 ∘ M2 ∘ M2 ∘ M2 + ... + λR MR ∘ MR ∘ MR ∘ MR
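The equality between the multilinear transform of a diagonal C4^X and the rank-1 (CANDECOMP/PARAFAC) sum can be verified numerically. A minimal sketch with a random mixing matrix (all variable names are my own):

```python
import numpy as np

rng = np.random.default_rng(0)
P, R = 4, 3
M = rng.standard_normal((P, R))              # mixing matrix
lam = rng.standard_normal(R)                 # source kurtoses (diagonal of C4^X)

C4X = np.zeros((R, R, R, R))
for r in range(R):
    C4X[r, r, r, r] = lam[r]                 # independent sources: diagonal tensor

# C4^Y = C4^X •1 M •2 M •3 M •4 M (mode products)
C4Y = np.einsum('abcd,ia,jb,kc,ld->ijkl', C4X, M, M, M, M)

# CANDECOMP/PARAFAC: sum of rank-1 terms lam_r * M_r o M_r o M_r o M_r
cp = sum(lam[r] * np.einsum('i,j,k,l->ijkl', M[:, r], M[:, r], M[:, r], M[:, r])
         for r in range(R))
```

Because C4X is diagonal, the two expressions are algebraically identical, which is exactly why a diagonal source cumulant turns the mixing model into a CP decomposition.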
Prewhitening-based computation
Model:
Y = MX
Second order:
C2^Y = E{Y Y^T} = M · C2^X · M^T

With white sources (C2^X = I):

C2^Y = M · I · M^T = M · M^T = (M · Q) · (M · Q)^T for any orthogonal Q

"Square root": EVD, Cholesky, . . .

Remark (PCA): the SVD of M, M = U · S · V^T, gives

C2^Y = (U S) · (U S)^T = U · S^2 · U^T
Prewhitening-based computation (2)
Matrix factorization:
M = T · Q
Second order:
C2^Y = C2^X •1 M •2 M = T · T^T

Observed r.v.: Y = M X
Whitened r.v.: Z = T^{-1} Y = Q X

Higher order (ICA):

C4^Y = C4^X •1 M •2 M •3 M •4 M  ⇒  C4^Z = C4^X •1 Q •2 Q •3 Q •4 Q

"Multilinear symmetric EVD" / "CANDECOMP/PARAFAC with orthogonality and symmetry constraints".
The source cumulant is theoretically diagonal, but an arbitrary symmetric tensor cannot be exactly diagonalized
⇒ different solution strategies
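The whitening step itself can be sketched with an EVD-based square root (a minimal illustration, not the talk's implementation; variable names are my own):

```python
import numpy as np

rng = np.random.default_rng(0)
R, T = 3, 50_000
X = rng.uniform(-1, 1, (R, T)) * np.sqrt(3)   # independent, unit-variance sources
M = rng.standard_normal((R, R))               # square mixing matrix
Y = M @ X

C2 = Y @ Y.T / T                              # sample covariance C2^Y
w, E = np.linalg.eigh(C2)
Tsqrt = E @ np.diag(np.sqrt(w))               # square root: C2 = Tsqrt Tsqrt^T
Z = np.linalg.solve(Tsqrt, Y)                 # whitened data, Z = T^{-1} Y
```

By construction cov(Z) is the identity; the still-unknown factor Q is orthogonal, so everything beyond second order must come from the higher-order step.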
PCA versus ICA
ICA = higher-order fine-tuning of PCA:

PCA                    ICA
2nd-order              higher-order
matrix EVD             tensor EVD
uncorrelated sources   independent sources
column space of M      M itself
always possible        depends on context

Computational cost: cumulant estimation and diagonalization
Illustration
Observations
Sources estimated with PCA
Sources estimated with ICA
Algorithm 1: maximal diagonality
[Figure: one Jacobi step — C^(k+1) is obtained from C^(k) by applying the rotation Q in every mode]
- Maximize the energy on the diagonal by Jacobi iteration
- Determination of the optimal rotation angle:
  - order 3, real: roots of a polynomial of degree 2
  - order 3, complex: roots of a polynomial of degree 3
  - order 4, real: roots of a polynomial of degree 4
  - order 4, complex: . . .
[Comon '94, De Lathauwer '01]
Algorithm 2: maximal diagonality
[Figure: one Jacobi step — C^(k+1) is obtained from C^(k) by applying the rotation Q in every mode]
- The trace is not rotation invariant
- Maximize the sum of the diagonal entries by Jacobi iteration
- Determination of the optimal rotation angle:
  - order 4, real: roots of a polynomial of degree 2
  - order 4, complex: roots of a polynomial of degree 3
[Comon, Moreau '97]
Algorithm 3: simultaneous EVD
[Figure: CZ decomposed as a sum of terms built from Q1, Q2, ..., QP — a simultaneous EVD]
- Maximize the energy on the diagonals by Jacobi iteration
- Determination of the optimal rotation angle:
  - real: roots of a polynomial of degree 2
  - complex: roots of a polynomial of degree 3
[Cardoso '94 (JADE)]
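The simultaneous-EVD core can be sketched compactly. The following is my own simplified real-symmetric version in the spirit of JADE's Jacobi sweep, not Cardoso's code: for each index pair, the summed squared diagonals are maximized when [cos 2θ, sin 2θ] is the principal eigenvector of a small 2 × 2 matrix built from all slices.

```python
import numpy as np

def joint_diag(mats, sweeps=100, tol=1e-12):
    """Approximate joint diagonalization of real symmetric matrices
    by Jacobi (Givens) rotations."""
    mats = [m.astype(float).copy() for m in mats]
    n = mats[0].shape[0]
    V = np.eye(n)
    for _ in range(sweeps):
        rotated = False
        for p in range(n - 1):
            for q in range(p + 1, n):
                # 2x2 subproblem: the pairwise criterion is a quadratic form in
                # [cos 2θ, sin 2θ]; its maximizer is the principal eigenvector of G
                h = np.array([[m[p, p] - m[q, q], m[p, q] + m[q, p]] for m in mats])
                G = h.T @ h
                w, vec = np.linalg.eigh(G)
                v = vec[:, -1] if vec[0, -1] >= 0 else -vec[:, -1]
                theta = 0.5 * np.arctan2(v[1], v[0])
                c, s = np.cos(theta), np.sin(theta)
                if abs(s) > tol:
                    rotated = True
                    J = np.eye(n)
                    J[p, p] = J[q, q] = c
                    J[p, q], J[q, p] = -s, s
                    mats = [J.T @ m @ J for m in mats]
                    V = V @ J
        if not rotated:
            break
    return V, mats
```

On a set of matrices that share one orthogonal eigenbasis (as the whitened cumulant slices do in theory), the sweeps drive all off-diagonal energy to zero and V recovers that basis up to permutation and sign.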
Application: fetal electrocardiogram extraction
Abdominal and thoracic recordings
[Plot: eight recorded channels over 5 s]
ICA results for FECG extraction
Independent components:
[Plot: eight independent components over 5 s]
A variant for coloured sources
Condition: sources mutually uncorrelated, but individually correlated in time.

Basic equations:

C2^Y(0) = E{Y(t) Y(t)^T} = M · C2^X(0) · M^T

C2^Y(0) = σ1^2 M1 M1^T + σ2^2 M2 M2^T + ... + σR^2 MR MR^T
C2^Y(τ) = E{Y(t) Y(t + τ)^T} = M · C2^X(τ) · M^T

Variants: nonstationary sources, time-frequency representations, Hessian of the second characteristic function, . . .

[Belouchrani et al. '97 (SOBI)], [De Lathauwer and Castaing '08] (overcomplete)
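Estimating these lagged covariances from data is straightforward. A minimal sketch (lagged_cov is a hypothetical helper of my own; the symmetrization enforces the M · C · M^T structure used for diagonalization by congruence):

```python
import numpy as np

def lagged_cov(Y, tau):
    """Sample estimate of C2^Y(tau) = E{Y(t) Y(t+tau)^T} for data Y (P x T),
    symmetrized so each slice has the form M C M^T with C symmetric."""
    Y = Y - Y.mean(axis=1, keepdims=True)
    T = Y.shape[1]
    C = Y[:, :T - tau] @ Y[:, tau:].T / (T - tau)
    return 0.5 * (C + C.T)
```

SOBI-style methods compute lagged_cov(Y, tau) for several lags and jointly diagonalize the whole set, which works even for Gaussian sources provided their spectra differ.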
Large mixtures: more sensors than sources
Applications: EEG, MEG, NMR, hyperspectral image processing, data analysis, . . .

Prewhitening-based algorithms:

Y = M X, with Y (P × 1), M (P × R), X (R × 1) and P ≫ R

M = U · S · V^T, with U (P × R), S (R × R), V (R × R)

Z = S^{-1} · U^T · Y = V^T · X, with Z (R × 1)
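When P ≫ R, the truncated SVD both reduces the dimension and whitens. The slide factors the unknown M; in practice one applies the truncated SVD to the data matrix instead, as in this minimal sketch (all names are my own):

```python
import numpy as np

rng = np.random.default_rng(0)
P, R, T = 20, 3, 5_000
X = rng.uniform(-1, 1, (R, T))               # R sources
M = rng.standard_normal((P, R))              # tall mixing matrix, P >> R
Y = M @ X                                    # P x T observations

U, s, Vt = np.linalg.svd(Y, full_matrices=False)
U, s = U[:, :R], s[:R]                       # keep the R dominant components
Z = (U / s).T @ Y                            # Z = S^{-1} U^T Y: R x T, whitened
```

Z has orthonormal rows (Z Z^T = I_R), so what remains of the ICA problem is only the small R × R orthogonal factor.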
Large mixtures: more sensors than sources (2)
Algorithms without prewhitening: best multilinear rank approximation
[Figure: A (I1 × I2 × I3) ≈ S •1 U^(1) •2 U^(2) •3 U^(3)]
Tucker decomposition: [Tucker ’64], [De Lathauwer ’00]
Large data sets: [Mahoney et al. '06], [Tyrtyshnikov et al. '06], [Oseledets et al. '08]
Orthogonal iteration: [Kroonenberg '83], [De Lathauwer '00]
Optimization on manifolds:
- Newton: [Eldén and Savas '06], [Ishteva et al. '08]
- Quasi-Newton: [Savas and Lim '08]
- Trust region: [Ishteva et al. '09]
- Conjugate gradient: [Ishteva et al. '09]
Krylov method: [Savas and Eldén '08]
Conclusion
- PCA: directions of extremal oriented energy
  ICA: directions of statistically independent contributions
- Independence is a stronger condition than uncorrelatedness → unique solution
- Solution by means of multilinear algebra:
- maximal diagonality
- simultaneous EVD
- CANDECOMP/PARAFAC with symmetry constraint
- Broad application domain
- Generalizations for convolutive mixtures