Graph Refinement for Clustering

Graph Refinement for Clustering Zhenyue Zhang Zhejiang University - PowerPoint PPT Presentation

  1. Graph Refinement for Clustering
     Zhenyue Zhang, Zhejiang University
     Joint work with Limin Li, Jiayun Mao, Zheng Zhai
     MLA 2017 · Beijing Jiaotong University

  2. Graphs
     The roles of graphs:
     • feature selection, dimensionality reduction
     • clustering
     • smart messaging, as in Allo (Google)
     • lots of applications ...

  3. Many graph-based methods suffer from graph noise because of
     • incorrect connections or weights
     • missing information
     • noisy data if graphs are constructed from data points
     • unsuitable measurement used for graph construction
     • conflicting information from multi-view data sets, view distortion
     • different magnitude, neighborhoods, distribution, and noise process
     • different view-specific graphs

  4. Graph modification
     • data cleaning
     • graph approximation in a special form
     • graph fusion for multi-view learning
     • graph coarsening

  5. We will talk about three issues for graph modification:
     • Uniform feature selection/projection for multi-view data
       • UMCD for multiple dissimilarity matrices/kernels
       • UCA for multiple similarity matrices
     • Uniform neighborhood graph from multi-view data
       • Construct a uniform sparse graph for all views
       • Modify view-specific graphs for multi-view learning methods
     • Graph refinement
       • Improve methods in manifold learning (LLE, LE, LPP), subspace learning (SSC, LRR), multi-view learning (CRCS, MKkC)

  6. Part I: Uniform Feature Selection

  7. Multi-view observations (column vectors)
     $x_i^v \in X^v \subset \mathbb{R}^{d_v}$, $i = 1, \dots, n$, $v = 1, \dots, m$
     • Webpages: contents or hyperlinks of contents
     • Multiple-language environment: documents in multiple languages
     • Images: pixels or text captions (labels)
     • Publications: contents (key words), and citations
     • Gene representations: gene sequences, expressions in different cellular environments, or the status of their somatic mutations in different tumors
     • ....

  8. View distortion
     Questions:
     1. Can we simulate view distortion in terms of latent "uniform true features"?
     2. How can we approximately retrieve the features from multiple noisy graphs?

  9. Given observed vectors $x_i^v \in X^v \subset \mathbb{R}^{d_v}$ in view $v$, we model the view distortion as a nonlinear mapping of the noisy features,
     $x_i^v = f_v(y_i, \epsilon_i^v)$, $i = 1, \dots, n$,
     • $f_v$: view-specific distortion function
     • $\{y_i\}$: uniform latent features in a low-dimensional space
     • $\{\epsilon_i^v\}$: view-specific noise vectors

  10. A simple form
      $x_i^v = g_v(G_v y_i + \epsilon_i^v)$, $y_i \in \mathbb{R}^d$, or $x_i^v = (\phi_v \circ g_v)(G_v y_i + \epsilon_i^v)$
      $g_v$: a nonlinear mapping, $G_v$: an affine transformation.
      Figure: Left: intact 3D samples $\{y_j\}$. The right three: $\{x_j^v\}$ with $x_j^v = \exp(G_v y_j + \epsilon_j^v)$, $G_v = D Q_v^T$.

  11. Two models
      Model I: UMDS for multiple squared dissimilarity matrices $\{D_v\}$:
      $\min_{Y Y^T = I,\ \{W_v\}} \sum_v \|A_v - Y^T W_v Y\|_F^2 / \|A_v\|_F^2$   (1)
      The input matrices $\{A_v\}$ could be
      • $A_v = -\frac{1}{2} H D_v H$ for a squared dissimilarity matrix $D_v$ in view $v$, where $H = I - \frac{1}{n} e e^T$
      • $A_v = H K_v H$ for a kernel $K_v$
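The two input choices for $A_v$ can be illustrated with a minimal NumPy sketch. The function names and the toy point set below are assumptions made for illustration, not part of the talk. The demo checks the standard identity that double-centering squared Euclidean distances recovers the Gram matrix of the centered points.

```python
import numpy as np

def centering_matrix(n):
    # H = I - (1/n) e e^T, with e the all-ones vector
    return np.eye(n) - np.ones((n, n)) / n

def gram_from_sq_dissimilarity(D):
    # A = -1/2 H D H for a squared dissimilarity matrix D
    H = centering_matrix(D.shape[0])
    return -0.5 * H @ D @ H

def center_kernel(K):
    # A = H K H for a kernel matrix K
    H = centering_matrix(K.shape[0])
    return H @ K @ H

# Toy data (hypothetical): 20 points in R^2 stored as columns.
rng = np.random.default_rng(0)
X = rng.standard_normal((2, 20))
sq_norms = np.sum(X**2, axis=0)
D = sq_norms[:, None] + sq_norms[None, :] - 2.0 * X.T @ X  # squared Euclidean distances

A_from_D = gram_from_sq_dissimilarity(D)
A_from_K = center_kernel(X.T @ X)            # linear kernel
X_centered = X - X.mean(axis=1, keepdims=True)
print(np.allclose(A_from_D, X_centered.T @ X_centered))
```

For squared Euclidean distances both constructions give the same symmetric positive semidefinite matrix; for other dissimilarities $A_v$ may be indefinite.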

  13. Model II: UCA for multiple similarity matrices $\{S_v\}$:
      $\max_{U^T U = I} \sum_v \tau_v^{-2} \|U^T S_v U\|_F^2$   (2)
      where $S_v$ is a view-specific similarity matrix in view $v$.
      Basic idea:
      • $S_v = \tau_v (S + B_v)$, $B_v$: view-deviation
      • Factorization: $S = U D U^T$, $B_v = (U, U_\perp) \begin{pmatrix} B^v_{11} & B^v_{12} \\ B^v_{21} & B^v_{22} \end{pmatrix} (U, U_\perp)^T$
      • minimize the deviation blocks $B^v_{12}$, $B^v_{21}$, and $B^v_{22}$.
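One way to read the link between objective (2) and the deviation blocks is the following worked step. It is a sketch that assumes $(U, U_\perp)$ is an orthogonal matrix, so the Frobenius norm is preserved under the change of basis; the slides do not spell this argument out.

```latex
\begin{align*}
\tau_v^{-1}\,(U, U_\perp)^T S_v\,(U, U_\perp)
  &= \begin{pmatrix} D + B^v_{11} & B^v_{12} \\ B^v_{21} & B^v_{22} \end{pmatrix}, \\
\tau_v^{-2}\,\|U^T S_v U\|_F^2
  &= \|D + B^v_{11}\|_F^2 \\
  &= \tau_v^{-2}\,\|S_v\|_F^2
     - \|B^v_{12}\|_F^2 - \|B^v_{21}\|_F^2 - \|B^v_{22}\|_F^2 .
\end{align*}
```

Since $\tau_v^{-2}\|S_v\|_F^2$ does not depend on $U$, maximizing the sum in (2) is the same as minimizing the total Frobenius norm of the off-diagonal and trailing deviation blocks.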

  16. Equivalences
      The uniform model:
      $\max_{U^T U = I} \sum_v \|U^T A_v U\|_F^2$   (3)
      KKT condition (first order necessary condition): $U$ solves the nonlinear eigenvalue problem
      $C(U)\, U = U \Lambda$, $U^T U = I$, with $C(U) = \sum_v A_v U U^T A_v$.
      It can be solved by
      • Eigen-subspace iteration (small scale), or
      • Subspace extension (large scale)
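The equivalence and the KKT condition can be checked by hand. The derivation below is a sketch under two assumptions that the slides do not state explicitly: every $A_v$ is symmetric, and the per-view scalings ($\|A_v\|_F$ in Model I, $\tau_v$ in Model II) are absorbed into $A_v$.

```latex
\begin{align*}
% Model I: eliminate W_v for fixed Y with Y Y^T = I
W_v^\star &= \arg\min_{W_v} \|A_v - Y^T W_v Y\|_F^2 = Y A_v Y^T, \\
\|A_v - Y^T W_v^\star Y\|_F^2 &= \|A_v\|_F^2 - \|Y A_v Y^T\|_F^2, \\
\text{so (1)} \iff{}& \max_{U^T U = I} \sum_v
  \bigl\| U^T \tfrac{A_v}{\|A_v\|_F} U \bigr\|_F^2, \qquad U = Y^T. \\[4pt]
% Model II: rescale the similarity matrices
\text{(2)} \iff{}& \max_{U^T U = I} \sum_v
  \bigl\| U^T \tfrac{S_v}{\tau_v} U \bigr\|_F^2 . \\[4pt]
% KKT condition of the uniform model
f(U) &= \sum_v \operatorname{tr}\!\bigl(U^T A_v U\, U^T A_v U\bigr), \qquad
\nabla f(U) = 4 \sum_v A_v U U^T A_v U = 4\, C(U)\, U, \\
\mathcal{L}(U, \Lambda) &= f(U) - 2 \operatorname{tr}\!\bigl(\Lambda (U^T U - I)\bigr),
\qquad \Lambda = \Lambda^T, \\
\nabla_U \mathcal{L} = 0 &\iff C(U)\, U = U \Lambda .
\end{align*}
```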

  17. Convergence:
      (a) $\{f(U_\ell)\}$ is convergent.
      (b) Any accumulation point $U_*$ satisfies $C_* U_* = U_* \Lambda_*$.
      (c) If $\lambda_d(C_*) > \lambda_{d+1}(C_*)$, then $\{P_{\ell+1} - P_\ell\}$ tends to zero.
      (d) $P_\ell \to P_*$ if $\{P_\ell\}$ has an isolated accumulation point $P_*$.
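The slides do not spell out the update rule of the eigen-subspace iteration. The sketch below shows one plausible reading, a self-consistent iteration that replaces $U_\ell$ by the top-$d$ eigenvectors of $C(U_\ell)$, and tracks the quantities from the convergence statements. The function names, the stopping rule, the toy matrices, and the interpretation $P_\ell = U_\ell U_\ell^T$ are all assumptions.

```python
import numpy as np

def objective(U, A_list):
    # f(U) = sum_v ||U^T A_v U||_F^2
    return sum(np.linalg.norm(U.T @ A @ U, "fro") ** 2 for A in A_list)

def eigen_subspace_iteration(A_list, d, max_iter=500, tol=1e-10, seed=0):
    rng = np.random.default_rng(seed)
    n = A_list[0].shape[0]
    U, _ = np.linalg.qr(rng.standard_normal((n, d)))  # random orthonormal start
    history = [objective(U, A_list)]
    for _ in range(max_iter):
        C = sum(A @ U @ U.T @ A for A in A_list)       # C(U) = sum_v A_v U U^T A_v
        _, vecs = np.linalg.eigh(C)                    # eigenvalues in ascending order
        U_new = vecs[:, -d:]                           # top-d eigenvectors
        # Assumed meaning of P_l: the projector U_l U_l^T
        step = np.linalg.norm(U_new @ U_new.T - U @ U.T, "fro")
        U = U_new
        history.append(objective(U, A_list))
        if step < tol:
            break
    return U, history

# Toy views (hypothetical): a shared rank-2 PSD part plus a small PSD perturbation.
rng = np.random.default_rng(1)
n, d = 30, 2
B = rng.standard_normal((n, d))
A_list = []
for _ in range(3):
    G = 0.3 * rng.standard_normal((n, 3))
    A_list.append(B @ B.T + G @ G.T)

U, history = eigen_subspace_iteration(A_list, d)
C = sum(A @ U @ U.T @ A for A in A_list)
kkt_residual = np.linalg.norm(C @ U - U @ (U.T @ C @ U)) / np.linalg.norm(C)
print(len(history) - 1, kkt_residual)
```

For positive semidefinite $A_v$, as in this toy example, $f$ is convex in the projector $U U^T$, so each step of this particular update cannot decrease $f$. The sketch makes no such claim for indefinite inputs.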

  18. Synthetic data
      $x_j^v = \exp(D Q_v^T y_j + \epsilon_j^v)$, $y_j \in \mathbb{R}^3$,
      • Each $Q_v$ has two orthonormal columns
      • $D = \mathrm{diag}(1, s)$ with $s \in (0, 1)$: measuring singularity
      • $\epsilon_j^v \sim N(0, \sigma)$, $\sigma$: noise level
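The synthetic model above can be simulated in a few lines of NumPy. This is a minimal sketch under stated assumptions: the latent samples are drawn uniformly from the unit cube, each $Q_v$ comes from a QR factorization of a random $3 \times 2$ matrix, and $\sigma$ is treated as the standard deviation of i.i.d. Gaussian noise. The slides do not specify any of these choices, and `make_views` is a hypothetical helper name.

```python
import numpy as np

def make_views(Y, m=3, s=0.5, sigma=0.1, seed=0):
    """Generate m distorted 2D views x_j^v = exp(D Q_v^T y_j + eps_j^v).

    Y holds the latent samples y_j in R^3 as columns (shape 3 x n).
    """
    rng = np.random.default_rng(seed)
    D = np.diag([1.0, s])                                   # D = diag(1, s)
    views = []
    for _ in range(m):
        Q, _ = np.linalg.qr(rng.standard_normal((3, 2)))    # two orthonormal columns
        noise = sigma * rng.standard_normal((2, Y.shape[1]))  # assumed: sigma is a std
        views.append(np.exp(D @ Q.T @ Y + noise))           # elementwise exponential
    return views

rng = np.random.default_rng(42)
Y = rng.uniform(0.0, 1.0, size=(3, 50))      # assumed latent distribution
views = make_views(Y, m=3, s=0.5, sigma=0.1)
clean = make_views(Y, m=3, s=0.5, sigma=0.0)  # noise-free views for comparison
print([X.shape for X in views])
```

A smaller `s` shrinks the second coordinate before the exponential, which makes the second latent direction harder to recover from any single view.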

  19. [Figure: comparison of MDS and UMDS in eight panels, vertical axis ticks from 0.9 to 1. Top row: d = 4; bottom row: d = 6. In each row the panels are σ = 0.084211 and σ = 0.13684 (horizontal axis s from 0 to 0.8), then s = 0 and s = 0.8 (horizontal axis σ from 0 to 0.2).]

  20. Real-world data
      • News stories in six topics from BBC, Reuters, and Guardian
      • Reuters Multilingual data: documents over 6 categories written in English, French, German, Spanish, and Italian
      • UCIDigit: handwritten digits in Fourier coefficient, profile correlation, and average of local pixels in a 2 x 3 window
      • Webpages on Cornell, Texas, Washington, Wisconsin in content or link, across course, project, student, faculty or staff
      • BBCnews on business, entertainment, politics, sport, tech
      • BBCsports on athletics, cricket, football, rugby, or tennis
      • Cora: research papers (absence/presence or link) in 7 classes
