Graph Refinement for Clustering
Zhenyue Zhang, Zhejiang University
Joint work with Limin Li, Jiayun Mao, Zheng Zhai
MLA 2017 · Beijing Jiaotong University
Graphs
The roles of graphs:
• feature selection, dimensionality reduction
• clustering
• smart messaging, as in Allo (Google)
• many other applications ...
Many graph-based methods suffer from graph noise because of:
• incorrect connections or weights
• missing information
• noisy data, if graphs are constructed from data points
• unsuitable measurements used for graph construction
• conflicting information from multi-view data sets (view distortion)
  • different magnitudes, neighborhoods, distributions, and noise processes
  • different view-specific graphs
Graph modification
• data cleaning
• graph approximation in a special form
• graph fusion for multi-view learning
• graph coarsening
We will talk about three issues in graph modification:
• Uniform feature selection/projection for multi-view data
  • UMDS for multiple dissimilarity matrices/kernels
  • UCA for multiple similarity matrices
• Uniform neighborhood graph from multi-view data
  • Construct a uniform sparse graph for all views
  • Modify view-specific graphs for multi-view learning methods
• Graph refinement
  • Improve methods in manifold learning (LLE, LE, LPP), subspace learning (SSC, LRR), multi-view learning (CRCS, MKkC)
Part I: Uniform Feature Selection
Multi-view observations (column vectors):
x_i^v ∈ X^v ⊂ R^{d_v},  i = 1, …, n,  v = 1, …, m
• Webpages: contents or hyperlinks of contents
• Multiple-language environment: documents in multiple languages
• Images: pixels or text captions (labels)
• Publications: contents (key words) and citations
• Gene representations: gene sequences, expressions in different cellular environments, or the status of somatic mutations in different tumors
• ....
View distortion
Question:
1. Can we simulate view distortion in terms of latent "uniform true features"?
2. How can we retrieve the features approximately from multiple noisy graphs?
Given observed vectors x_i^v ∈ X^v ⊂ R^{d_v} in view v, we model the view distortion as a nonlinear mapping of the noisy features,
x_i^v = f_v(y_i, ε_i^v),  i = 1, …, n,
• f_v: view-specific distortion function
• {y_i}: uniform latent features in a low-dimensional space
• {ε_i^v}: view-specific noise vectors
A simple form:
x_i^v = g_v(G_v y_i + ε_i^v),  y_i ∈ R^d,  or  x_i^v = (φ_v ∘ g_v)(G_v y_i + ε_i^v)
g_v: a nonlinear mapping; G_v: an affine transformation.
Figure: Left: intact 3D samples {y_j}. The right three panels: {x_j^v} with x_j^v = exp(G_v y_j + ε_j), G_v = D Q_v^T.
Two models
Model I: UMDS for multiple squared dissimilarity matrices {D^v}:
min_{Y Y^T = I, {W_v}}  Σ_v ‖A^v − Y^T W_v Y‖_F² / ‖A^v‖_F²    (1)
The input matrices {A^v} could be:
• A^v = −(1/2) H D^v H for a squared dissimilarity matrix D^v in view v, where H = I − (1/n) e e^T
• A^v = H K^v H for a kernel K^v
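The double-centering step that turns a squared dissimilarity matrix D^v into an input A^v can be sketched as follows (a minimal illustration; the data and names are hypothetical, not from the talk):

```python
import numpy as np

def double_center(D):
    """Double-center a squared dissimilarity matrix:
    A = -1/2 * H @ D @ H with H = I - (1/n) e e^T."""
    n = D.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return -0.5 * H @ D @ H

# Toy check: points on a line, squared pairwise distances.
y = np.array([0.0, 1.0, 3.0])
D = (y[:, None] - y[None, :]) ** 2
A = double_center(D)

# Classical-MDS identity: A equals the Gram matrix of the centered points.
yc = y - y.mean()
print(np.allclose(A, np.outer(yc, yc)))  # True
```

The same routine applies per view, after which each A^v is fed into model (1).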
Model II: UCA for multiple similarity matrices {S^v}:
max_{U^T U = I}  Σ_v τ_v^{-2} ‖U^T S^v U‖_F²    (2)
where S^v is the view-specific similarity matrix in view v.
Basic idea:
• S^v = τ_v (S + B^v),  B^v: view-deviation
• Factorization: S = U D U^T,  B^v = (U, U_⊥) [ B^v_11  B^v_12 ; B^v_21  B^v_22 ] (U, U_⊥)^T
• minimize the deviation blocks B^v_12, B^v_21, and B^v_22
Equivalences
The uniform model:
max_{U^T U = I}  Σ_v ‖U^T A^v U‖_F²    (3)
KKT condition (first-order necessary condition): U solves the nonlinear eigenvalue problem
C(U) U = U Λ,  U^T U = I,  with  C(U) = Σ_v A^v U U^T A^v.
It can be solved by
• eigen-subspace iteration (small scale), or
• subspace extension (large scale)
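The small-scale eigen-subspace iteration can be sketched as a fixed-point loop: at each step, replace U by the top-d eigenvectors of C(U). This is a sketch under the stated KKT condition; the function name and toy data are illustrative, not from the talk.

```python
import numpy as np

def eigen_subspace_iteration(As, d, iters=200, tol=1e-10, seed=0):
    """Fixed-point iteration for max_{U^T U = I} sum_v ||U^T A_v U||_F^2:
    each step sets U to the top-d eigenvectors of C(U) = sum_v A_v U U^T A_v."""
    n = As[0].shape[0]
    rng = np.random.default_rng(seed)
    U, _ = np.linalg.qr(rng.standard_normal((n, d)))  # random orthonormal start
    for _ in range(iters):
        C = sum(A @ U @ U.T @ A for A in As)
        _, V = np.linalg.eigh(C)        # eigenvalues in ascending order
        U_new = V[:, -d:]               # top-d eigen-subspace of C(U)
        done = np.linalg.norm(U_new @ U_new.T - U @ U.T) < tol
        U = U_new
        if done:
            break
    return U

# Toy usage: one symmetric view; the iterate should recover the
# top-2 eigen-subspace of A (here: the first two coordinates).
A = np.diag([4.0, 2.0, 0.5, 0.1])
U_hat = eigen_subspace_iteration([A], d=2)
```

With a single view, C(U) = A U U^T A has range A·span(U), so the loop reduces to orthogonal (power) iteration with A, which makes the toy case easy to verify.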
Convergence:
(a) {f(U_ℓ)} is convergent.
(b) Any accumulation point U* satisfies C* U* = U* Λ*.
(c) If λ_d(C*) > λ_{d+1}(C*), then {P_{ℓ+1} − P_ℓ} tends to zero.
(d) P_ℓ → P* if {P_ℓ} has an isolated accumulation point P*.
Synthetic data
x_j^v = exp(D Q_v^T y_j + ε_j^v),  y_j ∈ R³
• Each Q_v has two orthonormal columns
• D = diag(1, s) with s ∈ (0, 1): measuring singularity
• ε_j^v ∼ N(0, σ),  σ: noise level
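A generator following this recipe might look as below (a sketch; the sample size, number of views, and function name are assumptions for illustration):

```python
import numpy as np

def make_views(n=200, m=3, s=0.5, sigma=0.05, seed=0):
    """Synthetic multi-view data as in the experiment:
    x_j^v = exp(D Q_v^T y_j + eps_j^v), with D = diag(1, s) and Q_v
    having two orthonormal columns, so each view is a distorted
    2D image of the latent samples y_j in R^3."""
    rng = np.random.default_rng(seed)
    Y = rng.standard_normal((3, n))          # latent samples y_j in R^3
    D = np.diag([1.0, s])                    # singularity parameter s
    views = []
    for _ in range(m):
        Q, _ = np.linalg.qr(rng.standard_normal((3, 2)))  # orthonormal columns
        eps = sigma * rng.standard_normal((2, n))         # view-specific noise
        views.append(np.exp(D @ Q.T @ Y + eps))           # elementwise exp
    return Y, views

Y, views = make_views()
```

Pairwise squared distances of each view then give the inputs {D^v} for UMDS.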
Figure: Accuracy of MDS vs. UMDS on the synthetic data, for d = 4 (top row) and d = 6 (bottom row): varying the singularity s at fixed noise levels σ ≈ 0.084 and σ ≈ 0.137, and varying σ at fixed s = 0 and s = 0.8.
Real-world data
• News stories in six topics from BBC, Reuters, and Guardian
• Reuters Multilingual data: documents over 6 categories written in English, French, German, Spanish, and Italian
• UCIDigit: handwritten digits via Fourier coefficients, profile correlations, and averages of local pixels in a 2×3 window
• Webpages on Cornell, Texas, Washington, Wisconsin in content or link, labeled course, project, student, faculty, or staff
• BBCnews on business, entertainment, politics, sport, tech
• BBCsports on athletics, cricket, football, rugby, or tennis
• Cora: research papers (absence/presence or link) in 7 classes