Co-clustering documents and words using Bipartite Spectral Graph Partitioning

Inderjit S. Dhillon
Presenter: Lei Tang
16th April 2006

Outline: Introduction; Review of Spectral Graph Partitioning; Bipartite Extension; Summary

Introduction: Problem

Past work focuses on clustering along a single axis (either documents or words):

Document clustering: agglomerative clustering, k-means, LSA, self-organizing maps, multidimensional scaling, etc.
Word clustering: distributional clustering, information bottleneck, etc.

Co-clustering clusters words and documents simultaneously.

Introduction: Bipartite Graph Model

Adjacency matrix:

$$M_{ij} = \begin{cases} E_{ij}, & \text{if there is an edge } \{i, j\} \\ 0, & \text{otherwise} \end{cases}$$

Cut between two vertex sets:

$$\mathrm{cut}(V_1, V_2) = \sum_{i \in V_1,\; j \in V_2} M_{ij}$$

The bipartite graph is $G = (D, W, E)$, where $D$ is the set of documents, $W$ is the set of words, and $E$ is the set of edges, each representing a word occurring in a document. Its adjacency matrix is

$$M = \begin{bmatrix} 0 & A_{|D| \times |W|} \\ A^T & 0 \end{bmatrix}$$

There are no links between documents and no links between words.
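As a minimal sketch of the bipartite model, the block below builds $M$ from a small document-by-word matrix and evaluates a cut. The toy matrix `A`, the use of raw counts as edge weights, and the helper names `bipartite_adjacency` and `cut` are assumptions made for illustration; they do not come from the slides.

```python
# Hypothetical toy corpus: 3 documents (rows) by 4 words (columns).
# A[i][j] is the weight of the edge between document i and word j
# (raw counts here; the weighting scheme is an assumption).
A = [
    [2, 1, 0, 0],
    [1, 2, 1, 0],
    [0, 0, 1, 3],
]

def bipartite_adjacency(A):
    """Return M = [[0, A], [A^T, 0]] with documents first, then words."""
    n_docs, n_words = len(A), len(A[0])
    n = n_docs + n_words
    M = [[0] * n for _ in range(n)]
    for i in range(n_docs):
        for j in range(n_words):
            M[i][n_docs + j] = A[i][j]   # upper-right block: A
            M[n_docs + j][i] = A[i][j]   # lower-left block: A^T
    return M

def cut(M, V1, V2):
    """Sum of M[i][j] over i in V1 and j in V2."""
    return sum(M[i][j] for i in V1 for j in V2)

M = bipartite_adjacency(A)

# Vertices 0..2 are documents, vertices 3..6 are words.
V1 = {0, 1, 3, 4}   # documents 0, 1 together with words 0, 1
V2 = {2, 5, 6}      # document 2 together with words 2, 3

print(cut(M, V1, V2))  # 1: only the edge (document 1, word 2) is cut
```

Each part of the partition mixes documents and words, which is what makes the result a co-clustering rather than a clustering of one axis.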

Introduction: Duality of word and document clustering

Disjoint document clusters: $D_1, D_2, \dots, D_k$
Disjoint word clusters: $W_1, W_2, \dots, W_k$

Idea: document clusters determine word clusters; word clusters in turn determine (better) document clusters. (Seem familiar? Recall the authority/hub computation in HITS.)

The "best" partition is the minimum k-way cut of the bipartite graph:

$$\mathrm{cut}(W_1 \cup D_1, \dots, W_k \cup D_k) = \min_{V_1, \dots, V_k} \mathrm{cut}(V_1, \dots, V_k)$$

Solution: spectral graph partitioning.

Review of Spectral Graph Partitioning: Minimum Cut

2-partition problem: partition a graph (not necessarily bipartite) into two parts with minimum between-cluster weight. This is the problem of finding a minimum cut of the graph.

Drawback: the minimum cut tends to be unbalanced. The weight of a cut grows with the number of edges crossing it, so splitting off a few weakly connected vertices is usually the cheapest option.
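The imbalance can be seen on a small example. The sketch below finds a minimum cut by brute force on a hypothetical 7-vertex graph (two triangles joined by two edges, plus one pendant vertex); the graph and the helper `cut_weight` are illustrative assumptions, and exhaustive enumeration is only feasible at toy sizes.

```python
from itertools import combinations

# Hypothetical graph: triangles {0,1,2} and {3,4,5} joined by two edges,
# plus a pendant vertex 6 attached to vertex 0. All edge weights are 1.
edges = {
    (0, 1): 1, (0, 2): 1, (1, 2): 1,
    (3, 4): 1, (3, 5): 1, (4, 5): 1,
    (2, 3): 1, (1, 4): 1,
    (0, 6): 1,
}
n = 7

def cut_weight(part):
    """Total weight of edges with exactly one endpoint in `part`."""
    return sum(w for (i, j), w in edges.items() if (i in part) != (j in part))

# Enumerate every non-empty proper vertex subset (exponential in n).
candidates = [set(c) for r in range(1, n) for c in combinations(range(n), r)]
best = min(candidates, key=cut_weight)

# The intuitively "natural" split separates the two triangles.
balanced = {0, 1, 2, 6}

print(sorted(best), cut_weight(best))          # [6] 1
print(sorted(balanced), cut_weight(balanced))  # [0, 1, 2, 6] 2
```

The minimum cut isolates the single pendant vertex, while the split between the two triangles costs more.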

Review of Spectral Graph Partitioning: Weighted Cut

An effective heuristic:

$$\mathrm{WeightedCut}(A, B) = \frac{\mathrm{cut}(A, B)}{\mathrm{weight}(A)} + \frac{\mathrm{cut}(A, B)}{\mathrm{weight}(B)}$$

If $\mathrm{weight}(A) = |A|$, this is the ratio cut; if $\mathrm{weight}(A) = \mathrm{cut}(A, B) + \mathrm{within}(A)$, it is the normalized cut.

Example with the normalized-cut weights (graph figure not reproduced):

$$\begin{aligned}
\mathrm{cut}(A, B) &= w(3,4) + w(2,4) + w(2,5) \\
\mathrm{weight}(A) &= w(1,3) + w(1,2) + w(2,3) + w(3,4) + w(2,4) + w(2,5) \\
\mathrm{weight}(B) &= w(4,5) + w(3,4) + w(2,4) + w(2,5)
\end{aligned}$$
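The example can be evaluated numerically. Reading the partition off the edge lists gives $A = \{1, 2, 3\}$ and $B = \{4, 5\}$, which is an inference since the figure is missing. The sketch below additionally assumes every edge weight is 1, and it counts each internal edge once in `within`, as the expansion of weight(A) in the example does. The function names are hypothetical.

```python
# Edges of the slide's example graph; unit weights are an assumption.
edges = {
    (1, 2): 1, (1, 3): 1, (2, 3): 1,   # inside A
    (3, 4): 1, (2, 4): 1, (2, 5): 1,   # crossing between A and B
    (4, 5): 1,                         # inside B
}
A, B = {1, 2, 3}, {4, 5}   # partition inferred from the edge lists

def cut(S, T):
    """Total weight of edges with one endpoint in S and the other in T."""
    return sum(w for (i, j), w in edges.items()
               if (i in S and j in T) or (i in T and j in S))

def within(S):
    """Total weight of edges with both endpoints in S (each counted once)."""
    return sum(w for (i, j), w in edges.items() if i in S and j in S)

def weighted_cut(S, T, weight):
    """cut(S,T)/weight(S) + cut(S,T)/weight(T) for a pluggable weight."""
    c = cut(S, T)
    return c / weight(S, T) + c / weight(T, S)

# weight(S) = |S| gives the ratio cut.
ratio_cut = weighted_cut(A, B, lambda S, other: len(S))
# weight(S) = cut(S, other) + within(S) gives the normalized cut.
normalized_cut = weighted_cut(A, B, lambda S, other: cut(S, other) + within(S))

print(ratio_cut)       # 2.5   = 3/3 + 3/2
print(normalized_cut)  # 1.25  = 3/6 + 3/4
```

Passing the weight as a function keeps one objective and swaps only the balancing term, which mirrors how the slide presents ratio cut and normalized cut as two choices of weight.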

Solution

Finding the weighted cut boils down to solving a generalized eigenvalue problem:

$$Lz = \lambda W z$$

where $L$ is the Laplacian matrix, $W$ is a diagonal weight matrix, and $z$ denotes the cut.

Laplacian Matrix for $G(V, E)$:

$$L_{ij} = \begin{cases} \sum_k E_{ik}, & i = j \\ -E_{ij}, & i \neq j \text{ and there is an edge } \{i, j\} \\ 0, & \text{otherwise} \end{cases}$$

Properties

$L = D - M$, where $M$ is the adjacency matrix and $D$ is the diagonal "degree" matrix with $D_{ii} = \sum_k E_{ik}$.

$L = I_G I_G^T$, where $I_G$ is the $|V| \times |E|$ incidence matrix. For edge $(i, j)$, the corresponding column of $I_G$ is 0 except for the $i$-th and $j$-th entries, which are $\sqrt{E_{ij}}$ and $-\sqrt{E_{ij}}$ respectively.

$L \hat{1} = 0$

$x^T L x = \sum_{(i,j) \in E} E_{ij} (x_i - x_j)^2$

$(\alpha x + \beta \hat{1})^T L (\alpha x + \beta \hat{1}) = \alpha^2 x^T L x$
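These properties are easy to check numerically. The sketch below does so on a hypothetical 4-vertex weighted graph; the graph, the test vector `x`, the constants `alpha` and `beta`, and the helper names are all assumptions chosen for illustration. Weights are perfect squares so that the incidence-matrix entries are exact.

```python
import math

# Hypothetical weighted graph on 4 vertices: {(i, j): E_ij}.
edges = {(0, 1): 4, (1, 2): 1, (2, 3): 9, (0, 2): 1}
n = 4

def laplacian(edges, n):
    """L = D - M: degrees on the diagonal, -E_ij off the diagonal."""
    L = [[0] * n for _ in range(n)]
    for (i, j), w in edges.items():
        L[i][i] += w
        L[j][j] += w
        L[i][j] -= w
        L[j][i] -= w
    return L

def quad(L, x):
    """Quadratic form x^T L x."""
    return sum(x[i] * L[i][j] * x[j] for i in range(len(x)) for j in range(len(x)))

L = laplacian(edges, n)

# L * 1 = 0: every row of L sums to zero.
row_sums = [sum(row) for row in L]

# x^T L x equals the weighted sum of squared differences over edges.
x = [3, -1, 2, 5]
edge_form = sum(w * (x[i] - x[j]) ** 2 for (i, j), w in edges.items())

# Shifting by beta * 1 changes nothing; scaling by alpha scales by alpha^2.
alpha, beta = 2, 7
y = [alpha * xi + beta for xi in x]

# Incidence matrix I_G: one column per edge, +sqrt(E_ij) and -sqrt(E_ij).
I_G = [[0.0] * len(edges) for _ in range(n)]
for col, ((i, j), w) in enumerate(edges.items()):
    I_G[i][col] = math.sqrt(w)
    I_G[j][col] = -math.sqrt(w)
gram = [[sum(I_G[r][c] * I_G[s][c] for c in range(len(edges)))
         for s in range(n)] for r in range(n)]

print(row_sums)                               # [0, 0, 0, 0]
print(quad(L, x) == edge_form)                # True
print(quad(L, y) == alpha ** 2 * quad(L, x))  # True
```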

Let $p$ be a vector denoting a cut:

$$p_i = \begin{cases} +1, & i \in A \\ -1, & i \in B \end{cases}$$

So

$$p^T L p = \sum_{(i,j) \in E} E_{ij} (p_i - p_j)^2 = 4\, \mathrm{cut}(A, B)$$

Introduce another vector $q$ such that

$$q_i = \begin{cases} +\sqrt{\dfrac{\mathrm{weight}(B)}{\mathrm{weight}(A)}}, & i \in A \\[2ex] -\sqrt{\dfrac{\mathrm{weight}(A)}{\mathrm{weight}(B)}}, & i \in B \end{cases}$$

Then, writing $w_A = \mathrm{weight}(A)$ and $w_B = \mathrm{weight}(B)$,

$$q = \frac{w_A + w_B}{2\sqrt{w_A w_B}}\, p + \frac{w_B - w_A}{2\sqrt{w_A w_B}}\, \hat{1}$$

$$\begin{aligned}
q^T L q &= \frac{(w_A + w_B)^2}{4 w_A w_B}\, p^T L p \quad (\text{as } L \hat{1} = 0) \\
&= \frac{(w_A + w_B)^2}{w_A w_B} \cdot \mathrm{cut}(A, B)
\end{aligned}$$

Property of $q$

$q^T W e = 0$, where $e = \hat{1}$ is the all-ones vector

$q^T W q = \mathrm{weight}(V) = w_A + w_B$

Then

$$\begin{aligned}
\frac{q^T L q}{q^T W q} &= \frac{\dfrac{(w_A + w_B)^2}{w_A w_B} \cdot \mathrm{cut}(A, B)}{w_A + w_B} \\
&= \frac{w_A + w_B}{w_A w_B} \cdot \mathrm{cut}(A, B) \\
&= \frac{\mathrm{cut}(A, B)}{\mathrm{weight}(A)} + \frac{\mathrm{cut}(A, B)}{\mathrm{weight}(B)} \\
&= \mathrm{WeightedCut}(A, B)
\end{aligned}$$
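The identity can be sanity-checked on a small graph. The sketch below is a numerical check under stated assumptions, not the presenter's code: the 5-vertex graph, the unit edge weights, the diagonal entries in `vertex_weight`, and the partition are all hypothetical. It evaluates $q^T L q$ through the edge-sum form of the Laplacian quadratic form rather than building $L$ explicitly.

```python
import math

# Hypothetical graph on vertices 0..4 with unit edge weights.
edges = {
    (0, 1): 1.0, (0, 2): 1.0, (1, 2): 1.0,
    (2, 3): 1.0, (1, 3): 1.0, (1, 4): 1.0,
    (3, 4): 1.0,
}
# Hypothetical positive vertex weights: the diagonal of W.
vertex_weight = [1.0, 2.0, 1.5, 0.5, 3.0]
A, B = {0, 1, 2}, {3, 4}
n = 5

cut = sum(w for (i, j), w in edges.items() if (i in A) != (j in A))
w_A = sum(vertex_weight[i] for i in A)
w_B = sum(vertex_weight[i] for i in B)

# Generalized partition vector q from the derivation.
q = [math.sqrt(w_B / w_A) if i in A else -math.sqrt(w_A / w_B)
     for i in range(n)]

# q^T L q via the edge-sum form: sum of E_ij * (q_i - q_j)^2.
qLq = sum(w * (q[i] - q[j]) ** 2 for (i, j), w in edges.items())
qWq = sum(vertex_weight[i] * q[i] ** 2 for i in range(n))
qWe = sum(vertex_weight[i] * q[i] for i in range(n))

rayleigh = qLq / qWq
weighted_cut = cut / w_A + cut / w_B

print(math.isclose(qWe, 0.0, abs_tol=1e-12))   # True: q is W-orthogonal to e
print(math.isclose(qWq, w_A + w_B))            # True
print(math.isclose(rayleigh, weighted_cut))    # True
```

Because the check uses arbitrary positive vertex weights, it illustrates that the identity does not depend on a particular choice of $W$.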
