Multi-View Representation Learning: Algorithms and Applications
Changqing Zhang, Tianjin University, China
2019-10-23
Outline
- 1. Background: Multi-View Learning
- 2. Multi-View Subspace Representation
- 3. Multi-View Complete Representation
- 4. Applications
- 5. Conclusion
Why Multi-View Learning?
Background: Multi-View Learning
- Synthetic multi-view data: ground truth, View 1, View 2, View 3
- Multi-view data in the real world: video surveillance, medical analysis, self-driving car
Why Multi-View Representation Learning?
Diagnosis Representation Learning
- Application: Intelligent Medical Diagnosis
- Challenge: Multi-modal Integration
Medical Data
Multi-Modal Medical Data Analysis
Representation: The Key for Applications!
Why Multi-View Representation Learning?
CCA: Correlation Maximization!
CCA-based multi-view representation learning: CCA (1936) -> KCCA (2006) -> DCCA (2013)
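To make "correlation maximization" concrete, here is a minimal sketch of linear CCA for two paired views, written with NumPy. The function name `linear_cca`, the ridge term `reg`, and the toy data are assumptions made for illustration; the slides do not give an implementation, and KCCA and DCCA replace the linear projections with kernel or network mappings.

```python
import numpy as np

def linear_cca(X, Y, reg=1e-6):
    """Linear CCA for two paired views (rows are samples).

    Returns the canonical correlations (descending) and the projection
    matrices Wx, Wy whose columns are the canonical directions.
    """
    n = X.shape[0]
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    # Covariance blocks; `reg` is a small ridge added for invertibility (assumption).
    Sxx = Xc.T @ Xc / (n - 1) + reg * np.eye(X.shape[1])
    Syy = Yc.T @ Yc / (n - 1) + reg * np.eye(Y.shape[1])
    Sxy = Xc.T @ Yc / (n - 1)

    def inv_sqrt(S):
        # Inverse matrix square root of a symmetric positive-definite matrix.
        w, V = np.linalg.eigh(S)
        return V @ np.diag(1.0 / np.sqrt(w)) @ V.T

    Kx, Ky = inv_sqrt(Sxx), inv_sqrt(Syy)
    # Singular values of the whitened cross-covariance are the canonical correlations.
    U, s, Vt = np.linalg.svd(Kx @ Sxy @ Ky, full_matrices=False)
    return s, Kx @ U, Ky @ Vt.T

# Toy data: both views are noisy linear functions of one shared latent variable.
rng = np.random.default_rng(0)
z = rng.normal(size=(500, 1))
X = z @ np.array([[1.0, -0.5, 2.0]]) + 0.1 * rng.normal(size=(500, 3))
Y = z @ np.array([[0.5, 1.0, -1.0, 2.0]]) + 0.1 * rng.normal(size=(500, 4))
corrs, Wx, Wy = linear_cca(X, Y)
print(corrs)  # the first canonical correlation should dominate
```

Because the two toy views share a single latent factor, only the first canonical pair should carry a strong correlation.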
High-order Multi-View Representation Learning
Self-expression-based Subspace Representation
Multi-View Subspace Representation
Figure labels: multiple subspaces; self-reconstruction; subspace representation
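Self-expression represents each sample as a combination of the other samples, X ≈ XZ, and reads the cluster structure off the coefficient matrix Z. The sketch below uses a ridge (Frobenius-norm) regularizer because it has a closed form; that choice, the names `self_representation` and `affinity`, and the toy data are assumptions for illustration, not necessarily the regularizer used in the talk.

```python
import numpy as np

def self_representation(X, lam=1e-2):
    """Solve min_Z ||X - X Z||_F^2 + lam * ||Z||_F^2 for X of shape (d, n).

    Columns of X are samples; the closed form is Z = (X^T X + lam I)^{-1} X^T X.
    """
    G = X.T @ X
    n = G.shape[0]
    return np.linalg.solve(G + lam * np.eye(n), G)

def affinity(Z):
    # Symmetric, non-negative affinity that can be passed to spectral clustering.
    return 0.5 * (np.abs(Z) + np.abs(Z.T))

# Toy data: 10 points in span(e1, e2) and 10 points in span(e3, e4) of R^6.
rng = np.random.default_rng(0)
basis = np.eye(6)
X1 = basis[:, :2] @ rng.normal(size=(2, 10))
X2 = basis[:, 2:4] @ rng.normal(size=(2, 10))
X = np.hstack([X1, X2])
A = affinity(self_representation(X))
print(np.max(A[:10, 10:]))  # cross-subspace affinities vanish for orthogonal subspaces
```

For these orthogonal subspaces the affinity matrix is block diagonal, which is the property spectral clustering exploits.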
High-order Multi-View Representation Learning
Find the correlation in a global view!
- Pairwise correlation: $\mathrm{corr}(X^{(v)}, X^{(w)})$
- High-order correlation: $\mathrm{corr}(X^{(1)}, \ldots, X^{(V)})$
- 1. What is high-order correlation?
- 2. How does it differ from the pairwise approach?
[ICCV’15] Changqing Zhang, Huazhu Fu, Si Liu, Guangcan Liu, Xiaochun Cao, Low-Rank Tensor Constrained Multiview Subspace Clustering, ICCV 2015
High-order Multi-View Representation Learning
Key observation: the self-representation matrices of different views are aligned in (1) dimensionality and (2) semantics
Figure labels: multi-view features; subspace representation; high-order correlation
High-order Multi-View Representation Learning
How to define the rank of a third-order tensor?
Unfolding of a third-order tensor
Low-rank
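A minimal sketch of the unfolding idea: stack the per-view self-representation matrices into a third-order tensor, unfold it along each mode, and measure low-rankness on the unfolded matrices. The sum of nuclear norms used here is a common convex surrogate and is an assumption for illustration; the slide does not show the exact regularizer or its weights, and the helper names are hypothetical.

```python
import numpy as np

def unfold(T, mode):
    """Mode-`mode` unfolding: move that axis to the front and flatten the rest."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def sum_of_nuclear_norms(T, weights=None):
    """Weighted sum of the nuclear norms of all mode unfoldings (assumed surrogate)."""
    weights = np.ones(T.ndim) if weights is None else np.asarray(weights, dtype=float)
    return sum(w * np.linalg.norm(unfold(T, m), ord="nuc")
               for m, w in enumerate(weights))

# Toy tensor: three views that share the same rank-1 self-representation matrix.
rng = np.random.default_rng(0)
u = rng.normal(size=(8, 1))
Z = u @ u.T
T = np.stack([Z, Z, Z], axis=2)  # shape (n, n, V) = (8, 8, 3)
mode_ranks = [int(np.linalg.matrix_rank(unfold(T, m))) for m in range(3)]
print(mode_ranks, sum_of_nuclear_norms(T))
```

When the views agree, as in this toy case, every unfolding has low rank, so a penalty on the unfoldings encourages consistency across all views at once rather than pair by pair.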
High-order Multi-View Representation Learning
[ICCV’15] Changqing Zhang, Huazhu Fu, Si Liu, Guangcan Liu, Xiaochun Cao, Low-Rank Tensor Constrained Multiview Subspace Clustering, ICCV 2015
[IJCV’18] Yuan Xie, Dacheng Tao, Wensheng Zhang, Yan Liu, Lei Zhang, Yanyun Qu, On Unifying Multi-View Self-Representation for Clustering by Tensor Multi-Rank Minimization, IJCV 2018
Modeling high-order correlation is effective!
Diversity-induced Multi-View Representation Learning
[CVPR’15] Xiaochun Cao, Changqing Zhang*, Huazhu Fu, Si Liu, Hua Zhang, Diversity-induced Multiview Subspace Clustering, CVPR 2015
Which group is better?
Figure: two groups, each showing View-1 and View-2
Diversity-induced Multi-View Representation Learning
- Independence maximization for complementarity
HSIC: Hilbert-Schmidt Independence Criterion
- HSIC = 0.53, ρ = 0.81
- HSIC = 0.41, ρ = 0
- HSIC = 0.14, ρ = 0
- HSIC = 0, ρ = 0
HSIC: (1) captures complex correlation; (2) has a closed-form solution
Complementarity -> Diversity -> Independence
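The following sketch shows why HSIC can detect dependence that linear (Pearson) correlation misses. It uses the standard biased empirical estimator tr(KHLH)/(n-1)^2 with Gaussian kernels; the kernel choice, the bandwidth `sigma=0.3`, and the toy data are assumptions for illustration, so the values will not match the figures quoted on the slide.

```python
import numpy as np

def gaussian_gram(a, sigma):
    """Gram matrix of a Gaussian kernel for n samples (1-D or 2-D input)."""
    a = np.asarray(a, dtype=float).reshape(len(a), -1)
    sq_dists = np.sum((a[:, None, :] - a[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq_dists / (2.0 * sigma ** 2))

def hsic(a, b, sigma=0.3):
    """Biased empirical HSIC: tr(K H L H) / (n - 1)^2 with centering matrix H."""
    n = len(a)
    K = gaussian_gram(a, sigma)
    L = gaussian_gram(b, sigma)
    H = np.eye(n) - np.ones((n, n)) / n
    return np.trace(K @ H @ L @ H) / (n - 1) ** 2

n = 300
x = np.linspace(-1.0, 1.0, n)
y_dependent = x ** 2                      # fully determined by x, yet uncorrelated with it
rng = np.random.default_rng(0)
y_independent = rng.uniform(size=n)       # unrelated to x

pearson = np.corrcoef(x, y_dependent)[0, 1]
hsic_dep = hsic(x, y_dependent)
hsic_ind = hsic(x, y_independent)
print(pearson, hsic_dep, hsic_ind)
```

In a diversity-induced objective, a term of this kind is minimized between the representations of different views so that they carry complementary rather than redundant information.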
Diversity-induced Multi-View Representation Learning
Ensemble-learning-like idea: good and diverse representations in a better space
- Make the voters diverse
- Better feature space
- Objective terms: reconstruction in latent space, information preservation in latent space, diversity regularization, smooth term
Diversity-induced Multi-View Representation Learning
Ablation Experiment for Diversity Term
Latent Multi-View Subspace Clustering
Multi-View Complete Representation
[CVPR’17/Spotlight] Changqing Zhang, Qinghua Hu, Huazhu Fu, Pengfei Zhu, Xiaochun Cao, Latent Multi-View Subspace Clustering, CVPR 2017.
An intuitive explanation
- Typical (correlation maximization): $\| f_v(x^{(v)}) - h \|_2^2$
- A flexible way: $\| g_v(h) - x^{(v)} \|_2^2$
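The "flexible way" treats each view as generated from a shared latent representation h, so h is learned by reconstructing every view from it. Below is a minimal sketch with linear mappings g_v(h) = P_v h fitted by alternating least squares. The linear form, the function name `fit_latent`, and the toy data are assumptions for illustration; the full method also learns a subspace representation on top of the latent codes, which is omitted here.

```python
import numpy as np

def fit_latent(views, k, n_iter=10, seed=0):
    """Alternating least squares for min_{H, P_v} sum_v ||P_v H - X_v||_F^2.

    views: list of (d_v, n) arrays with samples in columns.
    Returns the latent codes H of shape (k, n) and the per-view mappings P_v.
    """
    rng = np.random.default_rng(seed)
    X = np.vstack(views)                      # stacking views sums their losses
    H = rng.normal(size=(k, X.shape[1]))
    for _ in range(n_iter):
        P = X @ np.linalg.pinv(H)             # update all mappings g_v jointly
        H = np.linalg.pinv(P) @ X             # update the shared latent codes
    splits = np.cumsum([v.shape[0] for v in views])[:-1]
    return H, np.split(P, splits, axis=0)

# Toy data: two views generated from the same 3-dimensional latent codes.
rng = np.random.default_rng(1)
Z_true = rng.normal(size=(3, 100))
views = [rng.normal(size=(5, 3)) @ Z_true, rng.normal(size=(7, 3)) @ Z_true]
H, Ps = fit_latent(views, k=3)
rel_err = (sum(np.linalg.norm(P @ H - X) ** 2 for P, X in zip(Ps, views))
           / sum(np.linalg.norm(X) ** 2 for X in views))
print(rel_err)
```

Unlike projecting each view into a common space, this direction lets the latent code hold information that appears in only some of the views.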
Generalized Latent Multi-View Subspace Learning
[TPAMI’18] Changqing Zhang, Huazhu Fu, Qinghua Hu, Xiaochun Cao, Yuan Xie, Dacheng Tao, Dong Xu, Generalized Latent Multi-View Subspace Clustering, IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI) 2018.
- General Correlation
- Complete Representation
- Deep “Matrix Factorization”
Degradation networks mimic data transmission
Generalized Latent Multi-View Subspace Learning
Figure labels: degradation networks; subspace representation
Generalized Latent Multi-View Subspace Learning
Performance by using each single view and multiple views
Effectiveness of gLMSC in integrating Multiple Views
CPM-Nets: Cross Partial Multi-View Networks
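Challenges of Classification on Partial Multi-View Data
- The missing patterns in multi-view data are complex. How can preprocessing or manual intervention (e.g., imputing in advance, discarding data, or grouping samples by missing pattern) be avoided?
- There are many feature types and modalities, and different samples miss different modalities (a combinatorial problem); the missing pattern of a test sample may even differ from that of every training sample.
- How can full use of the information be guaranteed in theory (information completeness)? How can the classifier generalize better, especially in the small-sample case?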
CPM-Nets: Cross Partial Multi-View Networks
[NeurIPS’19/Spotlight] Changqing Zhang, Zongbo Han, Yajie Cui, Huazhu Fu, Joey Tianyi Zhou, Qinghua Hu, CPM-Nets: Cross Partial Multi-View Networks, Neural Information Processing Systems (NeurIPS) 2019.
- 1. Flexibility: adapts to samples with arbitrary, complex view-missing patterns
- 2. Complete representation: a compact unified representation with full information, with a theoretical guarantee of information completeness
- 3. Structured representation: simplifies the classifier and aids interpretability
Our Algorithm for Classification on Partial Multi-View Data
CPM-Nets: Cross Partial Multi-View Networks
Framework of CPM-Nets
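- Backward encoding: guarantees information completeness and adapts to missing views
- Structured unified representation: simplified classifier and interpretability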
CPM-Nets: Cross Partial Multi-View Networks
Framework of CPM-Nets
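- Information-complete representation: all observed views (partial views) are encoded into the unified representation
- Structured representation: a clustering-like supervised loss makes the unified representation structured and non-parametric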
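One way to read "encode all observed views into the unified representation" is a reconstruction loss that is masked by view availability, so missing views contribute nothing and no imputation is needed. The sketch below uses linear decoders and plain gradient steps on the latent codes; the linear decoders, the function names, the learning rate, and the toy data are assumptions for illustration, and the clustering-like classification loss is not included.

```python
import numpy as np

def partial_reconstruction_loss(H, decoders, views, mask):
    """Sum of squared reconstruction errors over observed views only.

    H: (n, k) latent codes; decoders[v]: (k, d_v); views[v]: (n, d_v);
    mask: (n, V) with 1 where view v of sample i is observed, else 0.
    """
    loss = 0.0
    for v, (W, X) in enumerate(zip(decoders, views)):
        resid = (H @ W - X) * mask[:, [v]]    # zero out rows of missing views
        loss += np.sum(resid ** 2)
    return loss

def latent_gradient_step(H, decoders, views, mask, lr=0.005):
    """One gradient-descent step on the latent codes with decoders held fixed."""
    grad = np.zeros_like(H)
    for v, (W, X) in enumerate(zip(decoders, views)):
        resid = (H @ W - X) * mask[:, [v]]
        grad += 2.0 * resid @ W.T
    return H - lr * grad

rng = np.random.default_rng(0)
n, k = 20, 3
decoders = [rng.normal(size=(k, 4)), rng.normal(size=(k, 5))]
H_true = rng.normal(size=(n, k))
views = [H_true @ W for W in decoders]
mask = np.ones((n, 2))
mask[::2, 0] = 0     # view 0 is missing for even-indexed samples
mask[1::4, 1] = 0    # view 1 is missing for samples 1, 5, 9, ...

H = rng.normal(size=(n, k))
loss_start = partial_reconstruction_loss(H, decoders, views, mask)
for _ in range(200):
    H = latent_gradient_step(H, decoders, views, mask)
loss_end = partial_reconstruction_loss(H, decoders, views, mask)
print(loss_start, loss_end)
```

Because the mask is applied per sample and per view, the same code handles any missing pattern, including patterns that never occurred during training.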
CPM-Nets: Cross Partial Multi-View Networks
Theoretical Analysis
CPM-Nets: Cross Partial Multi-View Networks
Comparison under Different Missing Rates
- CCA-based methods: CCA, Kernelized CCA, Deep CCA
- Matrix factorization-based method: Deep MF
- Metric learning methods: LMNN, ITML
CPM-Nets: Cross Partial Multi-View Networks
Comparison with Completion Methods
CRA (CVPR’17) [1]; Mean: fill in the missing values with the mean of the observed values in the same class.
[1] Missing modalities imputation via cascaded residual autoencoder. CVPR, 2017.
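The "Mean" completion baseline can be sketched in a few lines: for one view, every missing sample is filled with the mean of the observed samples from the same class. The function name `class_mean_impute` and the array layout are assumptions for illustration.

```python
import numpy as np

def class_mean_impute(X, observed, labels):
    """Fill missing rows of one view with the per-class mean of observed rows.

    X: (n, d) features of a single view; observed: (n,) boolean mask;
    labels: (n,) class labels. Rows of classes with no observed sample are left as-is.
    """
    X = np.array(X, dtype=float)              # work on a copy
    observed = np.asarray(observed, dtype=bool)
    labels = np.asarray(labels)
    for c in np.unique(labels):
        in_class = labels == c
        seen = in_class & observed
        if seen.any():
            X[in_class & ~observed] = X[seen].mean(axis=0)
    return X

X = np.array([[1.0, 1.0], [3.0, 3.0], [0.0, 0.0], [10.0, 10.0], [0.0, 0.0]])
observed = np.array([True, True, False, True, False])
labels = np.array([0, 0, 0, 1, 1])
X_filled = class_mean_impute(X, observed, labels)
print(X_filled)
```

Note that this baseline needs class labels, so it can only be applied directly to labeled training samples.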
CPM-Nets: Cross Partial Multi-View Networks
Visualization under Missing Rate: η = 0.5
Unsupervised Case Supervised Case
AE^2-Nets: Autoencoder in Autoencoder Networks
[CVPR’19] Changqing Zhang, Yeqing Liu, Huazhu Fu, AE^2-Nets: Autoencoder in Autoencoder Networks, IEEE Conference on Computer Vision and Pattern Recognition (CVPR, Oral Paper) 2019.
- 1. Inner encoding: preserves the intrinsic information of each single view and reduces redundancy and noise
- 2. Outer encoding: fuses the intrinsic information of all views to ensure the quality of the unified representation
- 3. Jointly performs within-view encoding and unified multi-view encoding
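AE^2-Nets: Autoencoder in Autoencoder Networks
- AE-nets: we optimize the view-specific autoencoder networks to satisfy the self-reconstruction constraint (output $z_i^{(M,v)}$) and to make each single view encodable:
$$\mathcal{L}_{ae}^{(v)} = \frac{1}{2}\sum_{i=1}^{n}\left(\left\|x_i^{(v)} - z_i^{(M,v)}\right\|^2 + \lambda\left\|z_i^{(M/2,v)} - g_i^{(L,v)}\right\|^2\right)$$
- Degradation nets: we optimize each degradation network under the supervision of the corresponding single-view code $z_i^{(M/2,v)}$:
$$\mathcal{L}_{dg}^{(v)} = \frac{1}{2}\sum_{i=1}^{n}\left\|z_i^{(M/2,v)} - g_i^{(L,v)}\right\|^2$$
- Latent representation h: we optimize the latent representation $h_i$ to encode the information from multiple views:
$$\mathcal{L}_{h}^{(v)} = \frac{1}{2}\sum_{i=1}^{n}\left\|z_i^{(M/2,v)} - g_i^{(L,v)}\right\|^2$$
Figure labels: inputs $x_i^{(1)}, x_i^{(2)}$; autoencoder outputs $z_i^{(M,1)}, z_i^{(M,2)}$; codes $z_i^{(M/2,1)}, z_i^{(M/2,2)}$; degradation-network outputs $g_i^{(L,1)}, g_i^{(L,2)}$; latent representation $h_i$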
AE^2-Nets: Autoencoder in Autoencoder Networks
- Fig. 1 Visualization of original and latent features on handwritten
- Fig. 2 Visualization of original and latent features on Caltech101-7
- Fig. 3 Clustering performance comparison
- Fig. 4 Classification performance comparison
AE^2-Nets: Autoencoder in Autoencoder Networks
- Compactness & completeness analysis
- Empirical convergence
Infant Brain Development Prediction
Applications
[IEEE TMI’18] Changqing Zhang (张长青), Ehsan Adeli, Zhengwang Wu, Gang Li, Weili Lin, Dinggang Shen, Infant Brain Development Prediction with Latent Partial Multi-View Representation Learning, IEEE Transactions on Medical Imaging (TMI), 2018.
- View-missing
- Small sample size
- Multi-view & multi-task
Infant Brain Development Prediction
- Flexible for missing views
- Jointly uses all samples
- Jointly uses all views
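$$\min_{\{W, H, P_t\}_{t=1}^{T}} \; \underbrace{\|WH - Y\|_1}_{\text{prediction error}} + \alpha \sum_{t=1}^{T} \omega_t^2 \underbrace{\left\|\mathcal{P}_{O_t}(P_t H - X_t)\right\|_{2,1}}_{\text{reconstruction error}} + \beta \underbrace{\|W\|_*}_{\text{task correlation}}$$
$$\text{s.t.} \;\; \sum_{t=1}^{T} \omega_t = 1,\; \omega_t \ge 0;\quad P_t^{\top} P_t = I,\; t = 1, \ldots, T.$$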
Video Face Clustering
[IEEE TIP’15] Xiaochun Cao, Changqing Zhang (张长青) *, Chengju Zhou, Huazhu Fu, and Hassan Foroosh, Video Face Clustering via Constrained Sparse Representation and Multi-View Spectral Clustering, IEEE Transactions on Image Processing (TIP), 2015.
Multiple cues: (1) Multiple features; (2) Prior knowledge automatically extracted