Tensor Canonical Correlation Analysis and Its Applications - PowerPoint PPT Presentation


  1. Tensor Canonical Correlation Analysis and Its Applications. Presenter: Yong LUO. This work was done while Yong LUO was a Research Fellow at Nanyang Technological University, Singapore.

  2. Outline
  • Y. Luo, D. C. Tao, R. Kotagiri, C. Xu, and Y. G. Wen, "Tensor Canonical Correlation Analysis for Multi-view Dimension Reduction," IEEE Transactions on Knowledge and Data Engineering (T-KDE), vol. 27, no. 11, pp. 3111-3124, 2015.
  • Y. Luo, Y. G. Wen, and D. C. Tao, "On Combining Side Information and Unlabeled Data for Heterogeneous Multi-task Metric Learning," International Joint Conference on Artificial Intelligence (IJCAI), pp. 1809-1815, 2016.

  3. Multi-view dimension reduction (MVDR)
  • Dimension reduction (DR)
    • Find a low-dimensional representation for high-dimensional data
    • Benefits: reduce the chance of over-fitting, reduce computational cost, etc.
    • Approaches: feature selection (IG, MI, sparse learning, etc.), feature transformation (PCA, LDA, LE, etc.)

  4. MVDR
  • Real-world objects usually contain information from multiple sources, and different kinds of features can be extracted from them.
  • Traditional DR methods cannot effectively handle multiple types of features.
  [Figure: feature concatenation of the per-view features into a single long vector]

  5. MVDR
  • Multi-view learning
    • Learn to fuse multiple distinct feature representations
    • Families: weighted view combination, multi-view dimension reduction, view agreement exploration
  • Multi-view dimension reduction
    • Multi-view feature selection
    • Multi-view subspace learning: seek a low-dimensional common subspace to compactly represent the heterogeneous data; one of the most representative models is CCA

  6. Canonical correlation analysis (CCA)
  • Objective of CCA
    • Correlation maximization on the common subspace: given two views $X_1$ and $X_2$, the canonical variables are $\mathbf{z}_1 = X_1^T \mathbf{h}_1$ and $\mathbf{z}_2 = X_2^T \mathbf{h}_2$ (i.e., $z_{1i} = \mathbf{h}_1^T \mathbf{x}_{1i}$ and $z_{2i} = \mathbf{h}_2^T \mathbf{x}_{2i}$), and CCA solves

      $\operatorname*{argmax}_{\mathbf{h}_1, \mathbf{h}_2} \rho = \operatorname{corr}(\mathbf{z}_1, \mathbf{z}_2) = \frac{\mathbf{h}_1^T C_{12} \mathbf{h}_2}{\sqrt{\mathbf{h}_1^T C_{11} \mathbf{h}_1} \sqrt{\mathbf{h}_2^T C_{22} \mathbf{h}_2}}$

      where $C_{12}$ is the between-view covariance matrix and $C_{11}$, $C_{22}$ are the within-view covariance matrices
  H. Hotelling, "Relations between two sets of variates," Biometrika, 1936.
  D. P. Foster, et al., "Multi-view dimensionality reduction via canonical correlation analysis," Tech. Rep., 2008.
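
As a concrete reference point, here is a minimal NumPy sketch of two-view CCA solved via the whitened SVD. The function name `cca`, the regularizer `reg`, and the columns-as-samples layout are our own choices, not from the slides:

```python
import numpy as np

def cca(X1, X2, reg=1e-6):
    """Two-view CCA: find h1, h2 maximizing corr(X1^T h1, X2^T h2).

    X1: (d1, M) and X2: (d2, M), columns are samples (centered below).
    Returns the leading canonical vectors and the canonical correlation rho.
    """
    M = X1.shape[1]
    X1 = X1 - X1.mean(axis=1, keepdims=True)
    X2 = X2 - X2.mean(axis=1, keepdims=True)
    C11 = X1 @ X1.T / M + reg * np.eye(X1.shape[0])
    C22 = X2 @ X2.T / M + reg * np.eye(X2.shape[0])
    C12 = X1 @ X2.T / M
    # Whiten both views: rho is the top singular value of W1 C12 W2^T,
    # where W1, W2 are inverse Cholesky factors of C11, C22.
    W1 = np.linalg.inv(np.linalg.cholesky(C11))
    W2 = np.linalg.inv(np.linalg.cholesky(C22))
    U, s, Vt = np.linalg.svd(W1 @ C12 @ W2.T)
    h1 = W1.T @ U[:, 0]      # satisfies h1^T C11 h1 = 1
    h2 = W2.T @ Vt[0, :]     # satisfies h2^T C22 h2 = 1
    return h1, h2, s[0]
```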

  7. Generalizations of CCA to several views
  • CCA-MAXVAR
    • Generalizes CCA to $N \geq 2$ views:

      $\operatorname*{argmin}_{\mathbf{z}, \boldsymbol{\beta}, \{\mathbf{h}_n\}} \frac{1}{N} \sum_{n=1}^{N} \left\| \mathbf{z} - \beta_n \mathbf{z}_n \right\|_2^2, \quad \text{s.t. } \|\mathbf{z}\|_2 = 1$

    • $\mathbf{z}_n = X_n^T \mathbf{h}_n$ is the vector of canonical variables for the $n$'th view, and $\mathbf{z}$ is a centroid representation
    • Solutions can be obtained using the SVD of $X_n$
  J. R. Kettenring, "Canonical analysis of several sets of variables," Biometrika, 1971.
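
A minimal sketch of the SVD-based MAXVAR solution follows; the function name `cca_maxvar` and the least-squares recovery of each $\mathbf{h}_n$ are our own simplifications, and the inputs are assumed centered:

```python
import numpy as np

def cca_maxvar(Xs):
    """MAXVAR generalization of CCA (Kettenring, 1971) -- a minimal sketch.

    Xs: list of view matrices X_n of shape (d_n, M), columns = centered samples.
    Returns the unit-norm centroid z and per-view canonical vectors h_n.
    """
    M = Xs[0].shape[1]
    # Projector onto the row space of each view, from its thin SVD.
    P = np.zeros((M, M))
    for X in Xs:
        _, _, Vt = np.linalg.svd(X, full_matrices=False)
        P += Vt.T @ Vt
    # The centroid z is the top eigenvector of the averaged projector.
    _, eigvecs = np.linalg.eigh(P / len(Xs))
    z = eigvecs[:, -1]                      # ||z||_2 = 1 by construction
    # Recover each h_n by least squares so that X_n^T h_n approximates z.
    hs = [np.linalg.lstsq(X.T, z, rcond=None)[0] for X in Xs]
    return z, hs
```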

  8. Generalizations of CCA to several views
  • CCA-LS

      $\operatorname*{argmin}_{\{\mathbf{h}_n\}} \frac{1}{2N(N-1)} \sum_{q,r=1}^{N} \left\| X_q^T \mathbf{h}_q - X_r^T \mathbf{h}_r \right\|_2^2, \quad \text{s.t. } \frac{1}{N} \sum_{n=1}^{N} \mathbf{h}_n^T C_{nn} \mathbf{h}_n = 1$

  • Equivalent to CCA-MAXVAR, but can be solved efficiently and adaptively based on LS regression (see the sketch below)
  J. Via et al., "A learning algorithm for adaptive canonical correlation analysis of several data sets," Neural Networks, 2007.
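
A minimal sketch of the LS view of this objective, assuming centered inputs: each $\mathbf{h}_n$ is refit by ridge regression of the average of the other views' canonical variables onto $X_n$, followed by a global rescaling toward the variance constraint. The function name, iteration count, and rescaling step are our own simplifications, not the adaptive algorithm of Via et al.:

```python
import numpy as np

def cca_ls(Xs, n_iter=100, reg=1e-6):
    """Iterative least-squares multiset CCA (cf. Via et al., 2007) -- a sketch."""
    N, M = len(Xs), Xs[0].shape[1]
    hs = [np.random.randn(X.shape[0]) for X in Xs]
    for _ in range(n_iter):
        zs = [X.T @ h for X, h in zip(Xs, hs)]
        for n, X in enumerate(Xs):
            # Regress the mean of the other views' variables onto view n.
            target = np.mean([zs[q] for q in range(N) if q != n], axis=0)
            A = X @ X.T / M + reg * np.eye(X.shape[0])
            hs[n] = np.linalg.solve(A, X @ target / M)
        # Enforce (1/N) sum_n h_n^T C_nn h_n = 1 by a global rescaling.
        scale = np.mean([h @ (X @ X.T / M) @ h for X, h in zip(Xs, hs)])
        hs = [h / np.sqrt(scale) for h in hs]
    return hs
```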

  9. The proposed TCCA framework
  • Main drawback of CCA-MAXVAR and CCA-LS
    • Only the statistics (correlation information) between pairs of features are explored, while high-order statistics are ignored
  • Tensor CCA
    • Directly maximize the high-order correlation between all views
  [Figure: pairwise correlations between views $x_1$, $x_2$, $x_3$ vs. the high-order tensor correlation among all three views]

  10. The proposed TCCA framework for MVDR
  [Figure: TCCA pipeline for MVDR — multiple features (e.g., LAB, WT, SIFT) are extracted as $X_1$, $X_2$, $X_3$; their covariance tensor $\mathcal{C}_{123}$ is approximated by a sum of rank-1 terms $\sum_{l=1}^{r} \rho_l \, \mathbf{u}_1^l \circ \mathbf{u}_2^l \circ \mathbf{u}_3^l$, yielding mappings $U_1$, $U_2$, $U_3$ that project each view into the common subspace as $Z_1$, $Z_2$, $Z_3$]

  11. Tensor basics
  • A tensor is an $n$-dimensional array; it generalizes scalars, vectors, and matrices
    • Scalar: order-0 tensor
    • Vector: order-1 tensor
    • Matrix: order-2 tensor
    • Order-3 tensor: a three-dimensional array
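
To make the order terminology concrete, a tiny NumPy illustration (variable names are ours):

```python
import numpy as np

s = np.float64(3.0)       # scalar: order-0 tensor
v = np.zeros(4)           # vector: order-1 tensor
A = np.zeros((4, 5))      # matrix: order-2 tensor
T = np.zeros((4, 5, 6))   # order-3 tensor
print(T.ndim)             # ndim gives the tensor order: 3
```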

  12. Tensor basics
  • Tensor-matrix multiplication
    • The $n$-mode product of an $I_1 \times I_2 \times \cdots \times I_N$ tensor $\mathcal{A}$ and a $J_n \times I_n$ matrix $V$ is a tensor $\mathcal{B} = \mathcal{A} \times_n V$ of size $I_1 \times \cdots \times I_{n-1} \times J_n \times I_{n+1} \times \cdots \times I_N$ with the elements

      $\mathcal{B}(i_1, \cdots, i_{n-1}, j_n, i_{n+1}, \cdots, i_N) = \sum_{i_n=1}^{I_n} \mathcal{A}(i_1, i_2, \cdots, i_N) \, V(j_n, i_n)$

    • The product of $\mathcal{A}$ and a sequence of matrices $\{V_n \in \mathbb{R}^{J_n \times I_n}\}_{n=1}^{N}$ is $\mathcal{B} = \mathcal{A} \times_1 V_1 \times_2 V_2 \cdots \times_N V_N$
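
The $n$-mode product is a one-liner in NumPy; a minimal sketch (the function name `mode_n_product` is ours):

```python
import numpy as np

def mode_n_product(A, V, n):
    """Mode-n product B = A x_n V of a tensor A with a matrix V.

    A: tensor of shape (I_1, ..., I_N); V: matrix of shape (J_n, I_n).
    Returns B of shape (I_1, ..., I_{n-1}, J_n, I_{n+1}, ..., I_N).
    """
    # Contract V's second axis with A's n-th axis; tensordot puts the new
    # axis first, so move it back to position n.
    return np.moveaxis(np.tensordot(V, A, axes=(1, n)), 0, n)
```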

  13. Tensor basics
  • Tensor-vector multiplication
    • The contracted $n$-mode product of $\mathcal{A}$ and an $I_n$-vector $\mathbf{v}$ is an $I_1 \times \cdots \times I_{n-1} \times I_{n+1} \times \cdots \times I_N$ tensor $\mathcal{B} = \mathcal{A} \bar{\times}_n \mathbf{v}$ of order $N-1$ with the entries

      $\mathcal{B}(i_1, \cdots, i_{n-1}, i_{n+1}, \cdots, i_N) = \sum_{i_n=1}^{I_n} \mathcal{A}(i_1, i_2, \cdots, i_N) \, \mathbf{v}(i_n)$

  • Tensor-tensor multiplication
    • Outer product, contracted product, inner product
  • Frobenius norm of a tensor

      $\|\mathcal{A}\|_F^2 = \langle \mathcal{A}, \mathcal{A} \rangle = \sum_{i_1=1}^{I_1} \sum_{i_2=1}^{I_2} \cdots \sum_{i_N=1}^{I_N} \mathcal{A}(i_1, i_2, \cdots, i_N)^2$
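
Both operations map directly onto NumPy primitives; a short sketch (function name ours):

```python
import numpy as np

def contracted_mode_n_product(A, v, n):
    """Contracted mode-n product B = A xbar_n v: sums out the n-th axis,
    turning an order-N tensor into an order-(N-1) tensor."""
    return np.tensordot(A, v, axes=(n, 0))

# Frobenius norm: ||A||_F^2 = <A, A> = sum of squared entries.
A = np.random.randn(3, 4, 5)
assert np.isclose(np.sqrt((A ** 2).sum()), np.linalg.norm(A))
```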

  14. Tensor basics
  • Matricization
    • The mode-$n$ matricization of $\mathcal{A}$ is the $I_n \times (I_1 \cdots I_{n-1} I_{n+1} \cdots I_N)$ matrix $A_{(n)}$, obtained by arranging the mode-$n$ fibers of $\mathcal{A}$ as columns
  [Figure: mode-1 (horizontal), mode-2 (frontal), and mode-3 matricizations $A_{(1)}$, $A_{(2)}$, $A_{(3)}$ of an order-3 tensor, obtained by row-wise and column-wise vectorizing its slices]
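
A minimal unfolding sketch; note that papers differ in how they order the columns, and this is just one common convention (the function name `unfold` is ours):

```python
import numpy as np

def unfold(A, n):
    """Mode-n matricization: an I_n x (product of the other dims) matrix
    whose columns are the mode-n fibers of A (one common convention)."""
    return np.moveaxis(A, n, 0).reshape(A.shape[n], -1)
```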

  15. Tensor basics
  • Matricization property
    • The $n$-mode multiplication $\mathcal{B} = \mathcal{A} \times_n V$ can be manipulated as matrix multiplication by storing the tensors in matricized form, i.e., $B_{(n)} = V A_{(n)}$
    • A series of $n$-mode products can be expressed via Kronecker products:

      $\mathcal{B} = \mathcal{A} \times_1 V_1 \times_2 V_2 \cdots \times_N V_N \iff B_{(n)} = V_n A_{(n)} \left( V_{d_1} \otimes V_{d_2} \otimes \cdots \otimes V_{d_{N-1}} \right)^T$

      where $(d_1, d_2, \cdots, d_{N-1}) = (n+1, n+2, \cdots, N, 1, 2, \cdots, n-1)$ is a forward cyclic ordering of the tensor dimension indices
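
The single-mode identity is easy to verify numerically with the two sketches above (repeated here so the snippet is self-contained); the Kronecker form of the full series additionally requires the unfolding's column ordering to match the cyclic convention, so we only check the basic identity:

```python
import numpy as np

def mode_n_product(A, V, n):          # as in the earlier sketch
    return np.moveaxis(np.tensordot(V, A, axes=(1, n)), 0, n)

def unfold(A, n):                     # as in the earlier sketch
    return np.moveaxis(A, n, 0).reshape(A.shape[n], -1)

# Numerically check B = A x_n V  <=>  B_(n) = V @ A_(n).
A = np.random.randn(3, 4, 5)
V = np.random.randn(6, 4)             # acts on mode n = 1
B = mode_n_product(A, V, 1)
assert np.allclose(unfold(B, 1), V @ unfold(A, 1))
```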

  16. TCCA formulation
  • Optimization problem
    • Maximize the correlation between the canonical variables $\mathbf{z}_n = X_n^T \mathbf{h}_n$, $n = 1, \cdots, N$:

      $\operatorname*{argmax}_{\{\mathbf{h}_n\}} \rho = \operatorname{corr}(\mathbf{z}_1, \mathbf{z}_2, \cdots, \mathbf{z}_N) = (\mathbf{z}_1 \odot \mathbf{z}_2 \odot \cdots \odot \mathbf{z}_N)^T \mathbf{e}, \quad \text{s.t. } \mathbf{z}_n^T \mathbf{z}_n = 1, \ n = 1, \cdots, N$

      where $\odot$ is the element-wise product and $\mathbf{e}$ is the all-ones vector
  • Equivalent formulation

      $\operatorname*{argmax}_{\{\mathbf{h}_n\}} \rho = \mathcal{C}_{12 \cdots N} \, \bar{\times}_1 \mathbf{h}_1^T \, \bar{\times}_2 \mathbf{h}_2^T \cdots \bar{\times}_N \mathbf{h}_N^T, \quad \text{s.t. } \mathbf{h}_n^T (C_{nn} + \varepsilon I) \mathbf{h}_n = 1, \ n = 1, \cdots, N$

  • Covariance tensor: $\mathcal{C}_{12 \cdots N} = \frac{1}{M} \sum_{i=1}^{M} \mathbf{x}_{1i} \circ \mathbf{x}_{2i} \circ \cdots \circ \mathbf{x}_{Ni}$, where $M$ is the number of samples
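
Building the covariance tensor is a loop of outer products over samples; a minimal sketch, assuming the columns of each view matrix are mean-centered (the function name is ours):

```python
import numpy as np
from functools import reduce

def covariance_tensor(Xs):
    """Order-N covariance tensor C = (1/M) sum_i x_{1i} o x_{2i} o ... o x_{Ni}.

    Xs: list of N view matrices of shape (d_n, M) with centered columns.
    """
    M = Xs[0].shape[1]
    C = np.zeros([X.shape[0] for X in Xs])
    for i in range(M):
        # Outer product of the i-th sample across all views.
        C += reduce(np.multiply.outer, [X[:, i] for X in Xs])
    return C / M
```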

  17. TCCA formulation
  • Reformulation
    • Let $\mathcal{M} = \mathcal{C}_{12 \cdots N} \times_1 \tilde{C}_{11}^{-1/2} \times_2 \tilde{C}_{22}^{-1/2} \cdots \times_N \tilde{C}_{NN}^{-1/2}$ and $\mathbf{u}_n = \tilde{C}_{nn}^{1/2} \mathbf{h}_n$, where $\tilde{C}_{nn} = C_{nn} + \varepsilon I$:

      $\operatorname*{argmax}_{\{\mathbf{u}_n\}} \rho = \mathcal{M} \, \bar{\times}_1 \mathbf{u}_1^T \, \bar{\times}_2 \mathbf{u}_2^T \cdots \bar{\times}_N \mathbf{u}_N^T, \quad \text{s.t. } \mathbf{u}_n^T \mathbf{u}_n = 1, \ n = 1, \cdots, N$

  • Main solution
    • If we define $\hat{\mathcal{M}} = \rho \, \mathbf{u}_1 \circ \mathbf{u}_2 \circ \cdots \circ \mathbf{u}_N$, the problem becomes the best rank-1 approximation [Lathauwer et al., 2000]

      $\operatorname*{argmin}_{\{\mathbf{u}_n\}} \left\| \mathcal{M} - \hat{\mathcal{M}} \right\|_F^2$

    • Solved by alternating least squares (ALS), the higher-order power method (HOPM), etc. (a minimal sketch follows)
  L. De Lathauwer et al., "On the best rank-1 and rank-(R1, R2, ..., RN) approximation of higher-order tensors," SIAM J. Matrix Anal. Appl., 2000.
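
A minimal HOPM sketch for the best rank-1 approximation: each factor is updated by contracting every other mode and normalizing. The function name, the random initialization, and the fixed iteration count are our simplifications (in practice one would monitor the convergence of rho):

```python
import numpy as np

def hopm_rank1(M_tensor, n_iter=100):
    """Higher-order power method: M ~ rho * u_1 o u_2 o ... o u_N
    (cf. De Lathauwer et al., 2000) -- a minimal sketch."""
    N = M_tensor.ndim
    us = [np.random.randn(d) for d in M_tensor.shape]
    us = [u / np.linalg.norm(u) for u in us]
    rho = 0.0
    for _ in range(n_iter):
        for n in range(N):
            # Contract every mode except n with the current factors,
            # going from the last axis down so axis indices stay valid.
            T = M_tensor
            for q in range(N - 1, -1, -1):
                if q != n:
                    T = np.tensordot(T, us[q], axes=(q, 0))
            rho = np.linalg.norm(T)   # current correlation estimate
            us[n] = T / rho
    return rho, us
```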

  18. TCCA solution
  • Solutions
    • Remaining solutions: recursively maximize the same correlation as in the main TCCA problem
    • All solutions: the best sum of rank-1 approximations, i.e., the rank-$r$ CP decomposition of $\mathcal{M}$:

      $\mathcal{M} \approx \sum_{l=1}^{r} \rho_l \, \mathbf{u}_1^l \circ \mathbf{u}_2^l \circ \cdots \circ \mathbf{u}_N^l$

  • Projected data

      $Z_n = X_n^T \tilde{C}_{nn}^{-1/2} U_n, \quad U_n = [\mathbf{u}_n^1, \cdots, \mathbf{u}_n^r]$
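
Given the CP factors, the projection step only needs an inverse matrix square root per view; a minimal sketch following the slides' notation (function names and the eps default are ours, and the views are assumed centered):

```python
import numpy as np

def inv_sqrt_psd(C):
    """Inverse square root of a symmetric positive-definite matrix."""
    w, Q = np.linalg.eigh(C)
    return Q @ np.diag(1.0 / np.sqrt(w)) @ Q.T

def tcca_project(Xs, Us, eps=1e-3):
    """Map each view to the common subspace: Z_n = X_n^T Ctilde_nn^{-1/2} U_n.

    Us[n] stacks the r factors u_n^l from the rank-r CP decomposition as
    columns; Xs[n] is the centered (d_n, M) view matrix.
    """
    M = Xs[0].shape[1]
    Zs = []
    for X, U in zip(Xs, Us):
        C = X @ X.T / M + eps * np.eye(X.shape[0])   # Ctilde_nn
        Zs.append(X.T @ inv_sqrt_psd(C) @ U)
    return Zs
```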

  19. KTCCA formulation
  • Non-linear extension
    • Non-linear feature mapping $\phi$: $\phi(X_n) = [\phi(\mathbf{x}_{n1}), \phi(\mathbf{x}_{n2}), \cdots, \phi(\mathbf{x}_{nM})]$
    • Canonical variables: $\mathbf{z}_n = \phi(X_n)^T \mathbf{h}_n$
    • Representer theorem: $\mathbf{h}_n = \phi(X_n) \boldsymbol{\alpha}_n$
  • Optimization problem

      $\operatorname*{argmax}_{\{\boldsymbol{\alpha}_n\}} \rho = \mathcal{K}_{12 \cdots N} \, \bar{\times}_1 \boldsymbol{\alpha}_1^T \, \bar{\times}_2 \boldsymbol{\alpha}_2^T \cdots \bar{\times}_N \boldsymbol{\alpha}_N^T, \quad \text{s.t. } \boldsymbol{\alpha}_n^T (K_{nn}^2 + \varepsilon K_{nn}) \boldsymbol{\alpha}_n = 1, \ n = 1, \cdots, N$

      where $K_{nn} = \phi(X_n)^T \phi(X_n)$ is the Gram matrix of the $n$'th view
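
Substituting the representer expression gives $\mathbf{z}_n = K_{nn} \boldsymbol{\alpha}_n$, so the kernel analogue of the covariance tensor can be assembled from Gram-matrix columns. A sketch under that reading (the function name is ours, and centering of the Gram matrices is left to the caller):

```python
import numpy as np
from functools import reduce

def kernel_tensor(Ks):
    """Kernel analogue of the covariance tensor: since z_n = K_nn alpha_n,
    the objective contracts T = (1/M) sum_i k_{1i} o ... o k_{Ni}, where
    k_{ni} is the i-th column of the Gram matrix K_nn."""
    M = Ks[0].shape[0]
    T = np.zeros([K.shape[0] for K in Ks])
    for i in range(M):
        T += reduce(np.multiply.outer, [K[:, i] for K in Ks])
    return T / M
```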

  20. KTCCA solution
  • Reformulation
    • Let $\mathcal{T} = \mathcal{K}_{12 \cdots N} \times_1 \tilde{K}_{11}^{-1/2} \times_2 \tilde{K}_{22}^{-1/2} \cdots \times_N \tilde{K}_{NN}^{-1/2}$ and $\mathbf{c}_n = \tilde{K}_{nn}^{1/2} \boldsymbol{\alpha}_n$, where $\tilde{K}_{nn} = K_{nn}^2 + \varepsilon K_{nn}$:

      $\operatorname*{argmax}_{\{\mathbf{c}_n\}} \rho = \mathcal{T} \, \bar{\times}_1 \mathbf{c}_1^T \, \bar{\times}_2 \mathbf{c}_2^T \cdots \bar{\times}_N \mathbf{c}_N^T, \quad \text{s.t. } \mathbf{c}_n^T \mathbf{c}_n = 1, \ n = 1, \cdots, N$

    • Solved by ALS
  • Projected data:

      $Z_n = K_{nn} \tilde{K}_{nn}^{-1/2} C_n, \quad C_n = [\mathbf{c}_n^1, \cdots, \mathbf{c}_n^r], \quad n = 1, \cdots, N$
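
The projection mirrors the linear case with Gram matrices in place of data; a minimal sketch under that reading of the slide (function names and the eps default are ours, and each $\tilde{K}_{nn}$ is assumed positive definite):

```python
import numpy as np

def inv_sqrt_psd(C):                   # as in the TCCA projection sketch
    w, Q = np.linalg.eigh(C)
    return Q @ np.diag(1.0 / np.sqrt(w)) @ Q.T

def ktcca_project(Ks, Cs, eps=1e-3):
    """KTCCA projection: with Ktilde_nn = K_nn^2 + eps*K_nn and the CP
    factors c_n^l stacked as the columns of Cs[n], map each view as
    Z_n = K_nn @ Ktilde_nn^{-1/2} @ C_n."""
    Zs = []
    for K, C in zip(Ks, Cs):
        Kt = K @ K + eps * K           # Ktilde_nn (assumed positive definite)
        Zs.append(K @ inv_sqrt_psd(Kt) @ C)
    return Zs
```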
