A modularity-based spectral graph analysis



  1. A modularity-based spectral graph analysis. Dario Fasino (Udine), Francesco Tudisco (Roma TV). Cagliari, VDM60. D. Fasino, F. Tudisco, Modularity-based spectral graph analysis, 1/18.

  2. Introduction: Graphs and networks. A complex network is a (di)graph found in the real world. Figure: small complex networks (dolphins, USAir97, Householder93).

  3. Introduction: Graphs and networks. Outline: (1) elements of algebraic graph theory; (2) two problems on complex networks: graph partitioning (Laplacian matrices) and community detection (modularity matrices); (3) spectral analysis of modularity matrices; (4) complements, comments, conclusion. Reference: D. Fasino, F. Tudisco. An algebraic analysis of the graph modularity. Preprint (2013).

  4. Introduction: Graphs and networks. Notation: $G = (V, E)$ is an undirected graph with vertices $V = \{1, \dots, n\}$ and edges $E \subseteq V \times V$. A subset $S \subseteq V$ induces a subgraph with edge set $E(S)$ and edge boundary $\partial S$; $\bar S$ denotes the complement of $S$ and $|S|$ its cardinality. The degree of vertex $i$ is $d_i = \deg(i)$. The volume of $S \subseteq V$ is $\operatorname{vol} S = \sum_{i \in S} d_i$, so that $\operatorname{vol} S = 2|E(S)| + |\partial S|$.
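
The volume identity on this slide can be checked numerically. The following is a minimal NumPy sketch on a hypothetical six-vertex graph (two triangles joined by one edge) that does not appear in the slides; the helper names are illustrative as well.

```python
import numpy as np

# Hypothetical example (not from the slides): two triangles joined by the edge (2, 3).
EDGES = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]


def adjacency(n, edges):
    """Symmetric 0/1 adjacency matrix of an undirected graph."""
    A = np.zeros((n, n), dtype=int)
    for i, j in edges:
        A[i, j] = A[j, i] = 1
    return A


def volume(A, S):
    """vol S: sum of the degrees of the vertices in S."""
    return int(A[sorted(S)].sum())


def inner_edges(A, S):
    """|E(S)|: number of edges with both endpoints in S."""
    idx = sorted(S)
    return int(A[np.ix_(idx, idx)].sum()) // 2


def boundary(A, S):
    """Size of the edge boundary: edges with exactly one endpoint in S."""
    idx = sorted(S)
    rest = [i for i in range(A.shape[0]) if i not in S]
    return int(A[np.ix_(idx, rest)].sum())


A = adjacency(6, EDGES)
S = {0, 1, 2}
# Both printed numbers should agree: vol S = 2|E(S)| + |boundary of S|.
print(volume(A, S), 2 * inner_edges(A, S) + boundary(A, S))
```

Each edge inside S contributes to two degrees and each boundary edge to one, which is what the block sums of the adjacency matrix count.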

  5. Introduction: Graphs and networks. A few special matrices are usually associated with a graph $G$: the adjacency matrix $A$ and the graph Laplacian $L = \operatorname{Diag}(d_1, \dots, d_n) - A$. Figure: example graph $G$ on vertices 1, 2, 3, 4. For this graph, $d = \begin{pmatrix} 3 \\ 2 \\ 2 \\ 1 \end{pmatrix}$, $A = \begin{pmatrix} 0 & 1 & 1 & 1 \\ 1 & 0 & 1 & 0 \\ 1 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \end{pmatrix}$, $L = \begin{pmatrix} 3 & -1 & -1 & -1 \\ -1 & 2 & -1 & 0 \\ -1 & -1 & 2 & 0 \\ -1 & 0 & 0 & 1 \end{pmatrix}$. Note: $L \mathbf{1} = 0$. Reference: M. Fiedler. Algebraic connectivity of graphs. Czech. Math. J., 23 (1973), 298–305.
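
As a quick check of the example on this slide, the sketch below rebuilds $d$ and $L$ from $A$ with NumPy. It assumes the (4,4) entry of $A$ is 0, which is what $L = \operatorname{Diag}(d) - A$ and $d_4 = 1$ require; vertices are indexed from 0 in the code.

```python
import numpy as np

# Adjacency matrix of the 4-vertex example: vertex 1 is joined to 2, 3, 4,
# and vertices 2 and 3 are adjacent (0-based indices in the code).
# Assumption: the (4,4) entry is 0, consistent with L = Diag(d) - A.
A = np.array([
    [0, 1, 1, 1],
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
])

d = A.sum(axis=1)        # degree vector
L = np.diag(d) - A       # graph Laplacian

# Every row of L sums to zero, i.e. L @ 1 = 0.
row_sums = L @ np.ones(4, dtype=int)
print(d, row_sums)
```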

  6. Graph partitioning. Graph partitioning problem: find a partitioning of the vertices into clusters which minimizes the total weight (e.g., number) of intercluster edges. The number and size of the subsets are fixed, at least roughly. The most familiar quality measure of a cut $\{S, \bar S\}$ is the conductance of $S$, $h(S) = \dfrac{|\partial S|}{\min\{|S|, |\bar S|\}}$. Minimizing $h(S)$ is NP-hard, which motivates spectral techniques. Let $\mathbf{1}_S$ denote the characteristic vector of $S$. Then $|\partial S| = \mathbf{1}_S^T L \mathbf{1}_S$ and $|S| = \mathbf{1}_S^T \mathbf{1}_S$.
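
The two quadratic-form identities give a direct way to evaluate $h(S)$. The following is a minimal NumPy sketch; the six-vertex test graph (two triangles joined by one edge) and the function names are assumptions made for illustration, not part of the slides.

```python
import numpy as np

# Hypothetical example (not from the slides): two triangles joined by the edge (2, 3).
EDGES = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]


def adjacency(n, edges):
    """Symmetric 0/1 adjacency matrix of an undirected graph."""
    A = np.zeros((n, n), dtype=int)
    for i, j in edges:
        A[i, j] = A[j, i] = 1
    return A


def indicator(n, S):
    """Characteristic vector 1_S of a vertex subset S."""
    x = np.zeros(n)
    x[sorted(S)] = 1.0
    return x


def conductance(A, S):
    """h(S) = |dS| / min(|S|, |complement of S|) for a nonempty proper subset S."""
    n = A.shape[0]
    L = np.diag(A.sum(axis=1)) - A
    x = indicator(n, S)
    cut = x @ L @ x          # boundary size as 1_S^T L 1_S
    size = x @ x             # |S| as 1_S^T 1_S
    return cut / min(size, n - size)


A = adjacency(6, EDGES)
h = conductance(A, {0, 1, 2})   # one cut edge, three vertices on each side
print(h)
```

The function divides by zero when S is empty or equal to V, so callers are expected to pass a nonempty proper subset.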

  7. Graph partitioning. Spectral partitioning technique: instead of $\min_S h(S)$, solve $\min_{v^T \mathbf{1} = 0} \dfrac{v^T L v}{v^T v}$, then set $S = \{i : v_i \ge \sigma\}$ for a threshold $\sigma$. The solution is the Fiedler vector $f$: $Lf = a(G) f$, where $a(G)$ is the smallest positive eigenvalue of $L$, the algebraic connectivity of $G$.
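
The spectral partitioning technique can be sketched in a few lines of NumPy. This is an illustration under stated assumptions, not the authors' implementation: the graph is assumed connected (so the second-smallest eigenvalue of $L$ is $a(G)$), and the test graph and function name are hypothetical.

```python
import numpy as np

# Hypothetical example (not from the slides): two triangles joined by the edge (2, 3).
EDGES = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]


def adjacency(n, edges):
    """Symmetric 0/1 adjacency matrix of an undirected graph."""
    A = np.zeros((n, n), dtype=int)
    for i, j in edges:
        A[i, j] = A[j, i] = 1
    return A


def spectral_bisection(A, sigma=0.0):
    """Return a(G), a Fiedler vector f, and the level set S = {i : f_i >= sigma}.

    Assumes G is connected, so eigenvalue 0 of L is simple and the
    second-smallest eigenvalue is the algebraic connectivity a(G).
    """
    L = np.diag(A.sum(axis=1)) - A
    eigvals, eigvecs = np.linalg.eigh(L)    # eigenvalues in ascending order
    a_G, f = eigvals[1], eigvecs[:, 1]
    S = {int(i) for i in np.flatnonzero(f >= sigma)}
    return a_G, f, S


A = adjacency(6, EDGES)
a_G, f, S = spectral_bisection(A)
print(a_G, sorted(S))
```

The sign of an eigenvector is arbitrary, so the returned set may be either side of the cut; on this graph the two sides are the two triangles.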

  8. Level sets of Fiedler vectors. Theorem: let $G$ be a connected graph such that $a(G)$ is a simple eigenvalue, and let $Lf = a(G) f$. For $\sigma \le 0$, let $S = \{i : f_i \ge \sigma\}$. Then $S$ induces a connected subgraph. Figure: spectral bisection of the dolphins network. Left: Fiedler vector. Right: level sets, $\sigma = 0$.

  9. Level sets of Fiedler vectors (continued). More generally, if $\lambda_i(L)$ is simple and $\sigma = 0$, then the connected components of the subgraphs induced by $S$ and $\bar S$ number no more than $i + 1$. Analogous results also hold for Schrödinger operators on weighted graphs, i.e., $\operatorname{Diag}(v) - A$. Reference: Davies, Gladwell, Leydold, Stadler. Discrete nodal domain theorems. Lin. Alg. Appl., 336 (2001), 51–60.

  10. Community detection. How to partition a graph into "communities"? Many answers are available. There is a trade-off between intercluster edges (few) and intracluster edges (many), and the number and size of clusters are not specified a priori. Idea [Newman, Girvan 06]: "A good division of a network into communities (...) is one in which there are fewer than expected edges between communities." Reference: M. Newman, M. Girvan. Finding and evaluating community structure in networks. Phys. Rev. E, 69 (2006), 026113.

  11. Community detection: modularity. We need a null model to define the expected number of edges in a subgraph, e.g., the Erdős–Rényi random graph model. A better choice is the Chung–Lu random graph model: given fixed integers $d_1, \dots, d_n$, the probability that the edge $(i, j)$ exists is $d_i d_j / \sum_k d_k$. Accordingly, the expected number of edges supported in $S \subseteq V$ is $\sum_{i, j \in S} \dfrac{d_i d_j}{\sum_k d_k} = \dfrac{(\operatorname{vol} S)^2}{\operatorname{vol} G}$. The difference between that number and $|E(S)|$ is a quality measure for $S$ as a "community".
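
The closed form for the expected edge count follows because the double sum factors into $(\sum_{i \in S} d_i)^2$. The sketch below compares the two expressions numerically; the degree sequence and function names are hypothetical, chosen only for illustration.

```python
import numpy as np


def expected_edges_sum(d, S):
    """Sum of d_i d_j / vol G over all ordered pairs (i, j) in S x S.

    As on the slide, the sum runs over ordered pairs (including i = j),
    so it is the counterpart of 2|E(S)| in the modularity formula.
    """
    d = np.asarray(d, dtype=float)
    idx = sorted(S)
    P = np.outer(d, d) / d.sum()        # Chung-Lu edge probabilities
    return P[np.ix_(idx, idx)].sum()


def expected_edges_closed_form(d, S):
    """(vol S)^2 / vol G."""
    d = np.asarray(d, dtype=float)
    return d[sorted(S)].sum() ** 2 / d.sum()


# Hypothetical degree sequence (two triangles joined by one edge).
d = [2, 2, 3, 3, 2, 2]
S = {0, 1, 2}
lhs = expected_edges_sum(d, S)
rhs = expected_edges_closed_form(d, S)
print(lhs, rhs)
```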

  12. Community detection: modularity. Modularity of $S \subseteq V$: $Q(S) = 2|E(S)| - \dfrac{(\operatorname{vol} S)^2}{\operatorname{vol} G} = \dfrac{\operatorname{vol} S \, \operatorname{vol} \bar S}{\operatorname{vol} G} - |\partial S| = Q(\bar S)$. What is a "community"? A community is a subset $S \subset V$ having positive modularity. Introduce the modularity matrix $M = A - d d^T / \operatorname{vol} G$. Then $Q(S) = \mathbf{1}_S^T M \mathbf{1}_S$. Indeed, $\mathbf{1}_S^T A \mathbf{1}_S = 2|E(S)|$ and $\mathbf{1}_S^T d = \operatorname{vol} S$. Note: $M \mathbf{1} = 0$.
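
A minimal NumPy sketch of the modularity matrix and of $Q(S) = \mathbf{1}_S^T M \mathbf{1}_S$ follows. The test graph (two triangles joined by one edge) and the function names are assumptions for illustration; the slides do not prescribe an implementation.

```python
import numpy as np

# Hypothetical example (not from the slides): two triangles joined by the edge (2, 3).
EDGES = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]


def adjacency(n, edges):
    """Symmetric 0/1 adjacency matrix of an undirected graph."""
    A = np.zeros((n, n), dtype=int)
    for i, j in edges:
        A[i, j] = A[j, i] = 1
    return A


def modularity_matrix(A):
    """M = A - d d^T / vol G."""
    d = A.sum(axis=1).astype(float)
    return A - np.outer(d, d) / d.sum()


def modularity(A, S):
    """Q(S) = 1_S^T M 1_S."""
    x = np.zeros(A.shape[0])
    x[sorted(S)] = 1.0
    return x @ modularity_matrix(A) @ x


A = adjacency(6, EDGES)
M = modularity_matrix(A)
Q_S = modularity(A, {0, 1, 2})       # 2*3 - 7**2/14
Q_comp = modularity(A, {3, 4, 5})    # modularity of the complement
print(Q_S, Q_comp)
```

On this graph each triangle has positive modularity, so each qualifies as a community in the sense of the slide.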

  13. Algebraic modularity. Community detection problem (simplified: just one cluster): find $S \subset V$ which maximizes the modularity $Q(S)$. Instead of $\max_{S \subset V} Q(S)$ (NP-hard), solve $m(G) := \max_{v^T \mathbf{1} = 0} \dfrac{v^T M v}{v^T v}$, then set $S = \{i : v_i \ge \sigma\}$. This is by far the most popular and successful heuristic for community detection [Newman'06, Fortunato'10, VanDooren+'12...]. The solution satisfies $Mv = m(G) v$ with $v^T \mathbf{1} = 0$; $m(G)$ is the algebraic modularity of $G$ and, very informally, $v$ is the Newman vector.
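
One way to compute $m(G)$ and a Newman vector is sketched below. The constraint $v^T \mathbf{1} = 0$ is handled by shifting the eigenvalue of the eigenvector $\mathbf{1}$ far to the left before calling a symmetric eigensolver; this shift trick, the test graph, and the function name are choices made here for illustration and are not taken from the slides.

```python
import numpy as np

# Hypothetical example (not from the slides): two triangles joined by the edge (2, 3).
EDGES = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]


def adjacency(n, edges):
    """Symmetric 0/1 adjacency matrix of an undirected graph."""
    A = np.zeros((n, n), dtype=int)
    for i, j in edges:
        A[i, j] = A[j, i] = 1
    return A


def newman_vector(A, sigma=0.0):
    """Return m(G), an eigenvector v orthogonal to 1, and S = {i : v_i >= sigma}."""
    n = A.shape[0]
    d = A.sum(axis=1).astype(float)
    M = A - np.outer(d, d) / d.sum()
    # M @ 1 = 0, so subtracting shift * (1 1^T)/n moves that eigenvalue to -shift
    # and leaves the spectrum on the orthogonal complement of 1 unchanged.
    # By Gershgorin, -shift lies below every eigenvalue of M.
    shift = np.abs(M).sum() + 1.0
    eigvals, eigvecs = np.linalg.eigh(M - shift * np.ones((n, n)) / n)
    m_G, v = eigvals[-1], eigvecs[:, -1]
    S = {int(i) for i in np.flatnonzero(v >= sigma)}
    return m_G, v, S


A = adjacency(6, EDGES)
m_G, v, S = newman_vector(A)
print(m_G, sorted(S))
```

Without the shift, a graph whose modularity matrix has no positive eigenvalue would return the eigenvalue 0 with eigenvector $\mathbf{1}$, which violates the constraint.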

  14. Spectral properties of $M$. We have $Q(S) = \mathbf{1}_S^T M \mathbf{1}_S = \operatorname{trace}(M \mathbf{1}_S \mathbf{1}_S^T)$. Owing to $Q(S) = Q(\bar S)$, $Q(S) = \alpha Q(S) + (1 - \alpha) Q(\bar S) = \operatorname{trace}(MB)$ for all $0 \le \alpha \le 1$, where $B = \alpha \mathbf{1}_S \mathbf{1}_S^T + (1 - \alpha) \mathbf{1}_{\bar S} \mathbf{1}_{\bar S}^T$. Let $\alpha = |\bar S| / n$. From the Wielandt–Hoffman theorem, $Q(S) \le \lambda_1(M) \lambda_1(B) + \lambda_2(M) \lambda_2(B) = (\lambda_1(M) + \lambda_2(M)) \dfrac{|S| |\bar S|}{n} \le \lambda_1(M) \dfrac{n}{4}$, independently of $S$. Owing to $M \mathbf{1} = 0$, we can replace $\lambda_1(M)$ by $m(G)$.
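
The final bound $Q(S) \le m(G)\, n/4$ can be checked by brute force on a small graph. The sketch below also checks the intermediate bound $Q(S) \le m(G)\, |S| |\bar S| / n$; that bound is derived here from $M \mathbf{1} = 0$ (writing $Q(S) = w^T M w$ with $w = \mathbf{1}_S - (|S|/n) \mathbf{1}$, which is orthogonal to $\mathbf{1}$ and has $\|w\|^2 = |S||\bar S|/n$), which is not necessarily the route taken on the slide. The step to $n/4$ needs $m(G) \ge 0$, which holds for the hypothetical test graph used here.

```python
import itertools
import numpy as np

# Hypothetical example (not from the slides): two triangles joined by the edge (2, 3).
EDGES = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]


def adjacency(n, edges):
    """Symmetric 0/1 adjacency matrix of an undirected graph."""
    A = np.zeros((n, n), dtype=int)
    for i, j in edges:
        A[i, j] = A[j, i] = 1
    return A


def algebraic_modularity(M):
    """Largest eigenvalue of M on the orthogonal complement of the vector 1."""
    n = M.shape[0]
    shift = np.abs(M).sum() + 1.0      # pushes the eigenvalue of 1 below the rest
    return np.linalg.eigvalsh(M - shift * np.ones((n, n)) / n)[-1]


def check_bounds(A, tol=1e-9):
    """Enumerate all nonempty proper subsets S and test Q(S) <= m(G)|S||S^c|/n."""
    n = A.shape[0]
    d = A.sum(axis=1).astype(float)
    M = A - np.outer(d, d) / d.sum()
    m = algebraic_modularity(M)
    best, ok = -np.inf, True
    for k in range(1, n):
        for S in itertools.combinations(range(n), k):
            x = np.zeros(n)
            x[list(S)] = 1.0
            q = x @ M @ x
            ok = ok and bool(q <= m * k * (n - k) / n + tol)
            best = max(best, q)
    return best, m, ok


A = adjacency(6, EDGES)
best_Q, m_G, all_ok = check_bounds(A)
print(best_Q, m_G * A.shape[0] / 4, all_ok)
```

The enumeration is exponential in $n$ and is only meant for very small graphs.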

  15. Spectral properties of $M$. Let $G_0 = (V, V \times V, \omega_0)$ be the null model weighted graph with $\omega_0(i, j) = d_i d_j / \operatorname{vol} G$, and let $L_0$ be its Laplacian: $(L_0)_{ij} = \begin{cases} -\omega_0(i, j) & i \ne j \\ \sum_{k \ne i} \omega_0(i, k) & i = j. \end{cases}$ Then $L_0 = D - d d^T / \operatorname{vol} G$, where $D = \operatorname{Diag}(d_1, \dots, d_n)$. Moreover, $M = A - D + D - d d^T / \operatorname{vol} G = L_0 - L$. We also obtain $d_{\min} - a(G) \le a(G_0) - a(G) \le m(G) \le d_{\max} - a(G)$. In particular, $m(G) \ge -d_{\min} / (n - 1)$, and this bound is optimal.
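
The identity $M = L_0 - L$ and the chain of inequalities can be verified numerically on a small example. The following NumPy sketch assumes a connected graph without isolated vertices, so that $a(G)$ and $a(G_0)$ are the second-smallest eigenvalues of $L$ and $L_0$; the test graph and variable names are hypothetical.

```python
import numpy as np

# Hypothetical example (not from the slides): two triangles joined by the edge (2, 3).
EDGES = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]


def adjacency(n, edges):
    """Symmetric 0/1 adjacency matrix of an undirected graph."""
    A = np.zeros((n, n), dtype=int)
    for i, j in edges:
        A[i, j] = A[j, i] = 1
    return A


A = adjacency(6, EDGES)
n = A.shape[0]
d = A.sum(axis=1).astype(float)
vol = d.sum()

D = np.diag(d)
L = D - A                              # Laplacian of G
L0 = D - np.outer(d, d) / vol          # Laplacian of the null model G_0
M = A - np.outer(d, d) / vol           # modularity matrix

# Second-smallest eigenvalues (G connected, all degrees positive).
a_G = np.linalg.eigvalsh(L)[1]
a_G0 = np.linalg.eigvalsh(L0)[1]

# m(G): largest eigenvalue of M on the orthogonal complement of 1,
# obtained by shifting the eigenvalue of the vector 1 below all others.
shift = np.abs(M).sum() + 1.0
m_G = np.linalg.eigvalsh(M - shift * np.ones((n, n)) / n)[-1]

print(d.min() - a_G, a_G0 - a_G, m_G, d.max() - a_G)
```

The four printed numbers should appear in nondecreasing order, matching the chain of inequalities on the slide.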

  16. Level sets of Newman vectors. Theorem: let $Mv = m(G) v$, where $m(G)$ is a simple eigenvalue and $d^T v \ge 0$. For all $\sigma \le 0$, $S = \{i : v_i \ge \sigma\}$ induces a connected subgraph. Proof (sketch, $\sigma = 0$): $m(G) v = Mv = Av - (d^T v / \operatorname{vol} G) d \le Av$. By contradiction, assume that $S$ consists of 2 disjoint subgraphs, and reorder the entries of $v$ according to this partitioning. Figure: $G$ split into $S$ (blocks $v_1$, $v_2$) and $\bar S$ (block $v_3$).

  17. Level sets of Newman vectors, proof (sketch, $\sigma = 0$), continued. Recall that $m(G) v = Mv = Av - (d^T v / \operatorname{vol} G) d \le Av$ and that, by contradiction, $S$ is assumed to consist of 2 disjoint subgraphs. Reorder and partition $A$, $M$, $v$ consistently. Then $m(G) \begin{pmatrix} v_1 \\ v_2 \\ v_3 \end{pmatrix} \le \begin{pmatrix} A_{11} & O & * \\ O & A_{22} & * \\ * & * & * \end{pmatrix} \begin{pmatrix} v_1 \\ v_2 \\ v_3 \end{pmatrix} \le \begin{pmatrix} A_{11} v_1 \\ A_{22} v_2 \\ * \end{pmatrix}$. By nonnegativity and eigenvalue interlacing, $A$ has at least 2 eigenvalues $> m(G)$, which is absurd.

  18. Nodal domains: examples. Figure: the dolphins network. Left: Fiedler vector. Right: Newman vector. Figure: a small graph. Left: Fiedler vector. Right: Newman vector.

  19. The Householder93 collaboration graph. Figure: spectral distribution of $M$. Figure: community detection in Householder93.
